Here are the example specs for the new VMWare Server:
AMD x2 4200+ 65nm CPU
4GB PC2-6400 DDR2 RAM (Dual Channel)
2x Seagate Barracuda 7200RPM 250GB SATA 3.0Gb drives
ASRock ALiveNF6P-VSTA System Board
580W Power Supply
Operating System: Ubuntu 7.10 Server 64bit
Drives configured as follows:
/dev/sda
1 - Swap (1GB)
2 - md0 (250MB)
3 - md1 (231GB)
/dev/sdb
1 - Swap (1GB)
2 - md0 (250MB)
3 - md1 (231GB)
/dev/hda
1 - ext3 (100GB)
All MD devices are configured as RAID-1 using Linux Software Raid
Device md0 was formatted as ext3 and exists as /boot
Device md1 was added to volume group vg00, then Logical Volume ROOT and mounted on /
LILO was used, and it is written to both sda and sdb (so the system remains bootable if sda fails)
Using Logical Volume Manager (LVM) for / provides flexibility: I can add two more 250GB drives as md2 and grow / to 500GB with a few simple commands (vgextend, lvextend, resize2fs). LVM also makes it possible to mirror or snapshot volumes. Using Software Raid instead of the integrated NVidia MediaShield provides better RAID management flexibility. Both are CPU backed, since there is no internal RAID processor on the MCP chips, so there is really no performance gain from using the integrated controller.
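A minimal sketch of the grow procedure mentioned above, written as a dry run that only prints the commands. The partition names /dev/sdc1 and /dev/sdd1 are assumptions for the two new drives, and the mdadm and pvcreate steps are the usual prerequisites rather than something spelled out in this post; adjust everything to your own layout before running any of it.

```shell
#!/bin/sh
# Dry run: print the commands that would grow / onto a new RAID-1 pair.
# Nothing is executed. /dev/sdc1 and /dev/sdd1 are hypothetical partitions
# on the two additional drives; vg00 and ROOT match the layout above.
plan_grow_root() {
    vg=${1:-vg00}
    lv=${2:-ROOT}
    md=${3:-/dev/md2}
    cat <<EOF
mdadm --create $md --level=1 --raid-devices=2 /dev/sdc1 /dev/sdd1
pvcreate $md
vgextend $vg $md
lvextend -l +100%FREE /dev/$vg/$lv
resize2fs /dev/$vg/$lv
EOF
}
```

Calling plan_grow_root with no arguments prints the five commands in order, so they can be reviewed before being run by hand as root.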
Tuning:
After building the system and installing VMWare, I started creating Virtual Machines. I immediately noticed that during even small IO, wait states climbed to 100% CPU and network dropoffs occurred. Connections to the web server VM dropped even for static content, and performance was absolutely terrible. I initially thought that the problem was the Software Raid devices, but I quickly found that the problem child was actually problem children, and that the RAID wasn't one of them. I installed a product called monitorix and started collecting data, which was instrumental in identifying the performance bottlenecks.
Virtual Machine configuration:
When creating virtual machines on this platform I elected to use pre-allocated disks. Using pre-allocated disks reduces disk fragmentation and improves overall performance. Additionally, I always remove the floppy device. Here is an overview of the configurations:
Generic base configuration:
Linux 32bit and 64bit VMs:
384MB Ram, 20GB SCSI disk (pre-allocated), 1 Ethernet device (bridged)
Windows VMs:
256MB Ram, 8GB SCSI disk (pre-allocated), 1 Ethernet device (bridged)
It's recommended by VMWare that Windows VMs be configured to use IDE; however, in my experience the Virtual IDE devices use far more CPU time than the SCSI device. This is due to the level of emulation involved and the lack of I/O threading in VMWare's IDE controller. I have to assume that this is a problem with IDE in general, as it's never been very good at multithreaded I/O (this is one big reason it's never been used for servers). Additionally, I recommend using the LSILogic controller, as it supports multithreaded IO while the Buslogic controller doesn't.
One of my virtual machines was destined to become a print server, printing to a USB printer directly connected to the VMWare server. Ubuntu 7.10 doesn't configure USBFS out of the box. This can be corrected by editing a few files:
Add to fstab:
usbfs /proc/bus/usb usbfs auto 0 0
Edit /etc/init.d/mountdevsubfs.sh, and uncomment the following lines:
#mkdir -p /dev/bus/usb/.usbfs
#domount usbfs "" /dev/bus/usb/.usbfs -obusmode=0700,devmode=0600,listmode=0644
#ln -s .usbfs/devices /dev/bus/usb/devices
#mount --rbind /dev/bus/usb /proc/bus/usb
This is done by removing the # from the front of each line. Once this is done, go ahead and run the script.
/etc/init.d/mountdevsubfs.sh start
In the Virtual Machine configuration, I needed to ensure that the printer was always connected on startup so I inserted the following configuration into that Virtual Machine's VMX file:
usb.present = "TRUE"
usb.generic.autoconnect = "FALSE"
usb.autoConnect.device0 = "0x0000:0x0000"
usb.autoConnect.device1 = "0x04e8:0x327e"
usb.generic.skipsetconfig = "TRUE"
You can get the IDs for your devices by running lsusb on the VMWare Server. Its output will look similar to the following:
Bus 002 Device 002: ID 04e8:327e Samsung Electronics Co., Ltd
Bus 002 Device 001: ID 0000:0000
Bus 001 Device 001: ID 0000:0000
Additionally, I had to blacklist usblp on the VMWare Server so that the host didn't claim the printer, which would have made it unavailable to the guest.
echo "blacklist usblp" >>/etc/modprobe.d/blacklist
As you can see above, I have used the device ID for the Samsung printer as well as the USB hub that it's connected to (0000:0000). Now, whenever the Virtual Machine restarts, it scans the USB bus on the host and automatically connects those devices. Printing now "just works" after rebooting. Of course for it to "just work" you also need to configure CUPS or Windows printer sharing, but that is out of the scope of this article ;-).
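The mapping from lsusb output to VMX entries can be sketched as a small filter. This is only an illustration: the function name lsusb_to_vmx is made up here, it numbers the devices in the order lsusb lists them (the VMX above lists the hub first), and it emits a line for every device it sees, so trim the result to the devices you actually want passed to the guest.

```shell
#!/bin/sh
# Read lsusb output on stdin and print one usb.autoConnect.deviceN line
# per device. The vendor:product pair is the field right after "ID".
# lsusb_to_vmx is a hypothetical helper name, not a VMware tool.
lsusb_to_vmx() {
    awk '{
        for (i = 1; i < NF; i++) {
            if ($i == "ID") {
                split($(i + 1), id, ":")
                printf "usb.autoConnect.device%d = \"0x%s:0x%s\"\n", n++, id[1], id[2]
                break
            }
        }
    }'
}
```

Typical use would be lsusb | lsusb_to_vmx, then pasting the relevant lines into the Virtual Machine's VMX file.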
After I installed monitorix, I identified multiple significant performance bottlenecks. I resolved these bottlenecks by tuning both VMWare and the Linux kernel itself. The difference in performance is astronomical, and it's a very visible improvement in the graphs as seen below.


To resolve the bottlenecks, I made all of the following changes to VMWare Server, the Virtual Machines and the Linux host itself:
Add each of these to /etc/vmware/config:
mainMem.useNamedFile tells VMWare where to put its temporary workspace file. This file contains the content of the Virtual Machine memory which is not used. By default it is placed in the directory with the virtual machine; however, that can seriously impact performance, so we'll turn it off.
mainMem.useNamedFile = FALSE
tmpDirectory is the default path for any temp files. We need to change that to be a shared memory filesystem (in RAM).
tmpDirectory="/dev/shm"
prefvmx.useRecommendedLockedMemSize and prefvmx.minVmMemPct tell VMWare to either use a fixed-size memory chunk or balloon and shrink memory as needed. Since I have 4GB of memory in this "server", I want to make sure that I use a fixed chunk of memory to reduce disk IO.
prefvmx.useRecommendedLockedMemSize="TRUE"
prefvmx.minVmMemPct="100"
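To confirm that all four host-wide settings made it into the file, a quick check along these lines may help. It is a sketch only: missing_vmware_keys is a hypothetical helper, and it checks just that each key is present, not what value it has.

```shell
#!/bin/sh
# Print every expected key that is absent from the given config file
# (default: /etc/vmware/config). Keys are compared exactly, ignoring
# whitespace around the name. Returns non-zero if anything is missing.
missing_vmware_keys() {
    file=${1:-/etc/vmware/config}
    rc=0
    for key in mainMem.useNamedFile tmpDirectory \
               prefvmx.useRecommendedLockedMemSize prefvmx.minVmMemPct; do
        if ! awk -F= -v k="$key" '
                { name = $1; gsub(/[ \t]/, "", name); if (name == k) found = 1 }
                END { exit !found }' "$file"; then
            echo "$key"
            rc=1
        fi
    done
    return $rc
}
```

No output and a zero exit status means every key was found.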
To tune each Virtual Machine, I installed VMWare tools and then made the following changes to each VMX file:
Set the time in the Virtual Machine to the host's time (I use NTP on the host):
tools.syncTime = "TRUE"
When I reboot the host, I want to gracefully stop each VM instead of just powering it off:
autostop = "softpoweroff"
I don't care about collapsing memory into a shared pool. These settings tell the VM not to share, which saves CPU cycles:
mem.ShareScanTotal=0
mem.ShareScanVM=0
mem.ShareScanThreshold=4096
sched.mem.maxmemctl=0
sched.mem.pshare.enable = "FALSE"
This basically performs the same action as the configuration I put in /etc/vmware/config by telling the VM to eliminate the temp files and not to balloon and shrink memory. It doesn't hurt anything to have it in both locations:
mainMem.useNamedFile = "FALSE"
MemTrimRate = "0"
MemAllowAutoScaleDown = "FALSE"
Additionally, by default Ubuntu writes an access time stamp to every inode that's accessed. This is pretty excessive and known to cause bottlenecks in high I/O scenarios. Turning it off doesn't negatively impact the filesystem unless you care about access time stamps, so in each VM and on the VMWare host I add "noatime" as an option to all of my mounted disks in /etc/fstab.
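As a rough illustration of the fstab change, the filter below appends noatime to the options of ext2/ext3/ext4 entries and leaves everything else alone. The function name add_noatime and the choice to limit it to ext filesystems are assumptions made for this sketch; it writes to stdout, so the real /etc/fstab is never modified unless you copy the result over it yourself.

```shell
#!/bin/sh
# Read fstab-formatted lines on stdin and print them with ",noatime"
# appended to the options (4th field) of ext2/3/4 entries that lack it.
# Comments, blank lines, swap and pseudo filesystems pass through as-is.
# Note: rewritten lines come out with single spaces between fields.
add_noatime() {
    awk '
        /^[ \t]*#/ || NF < 4 { print; next }
        $3 ~ /^ext[234]$/ && $4 !~ /(^|,)noatime(,|$)/ { $4 = $4 ",noatime" }
        { print }
    '
}
```

For example, add_noatime < /etc/fstab > /tmp/fstab.new produces a candidate file that can be compared against the original with diff before replacing it.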
In order for the VMWare configuration to work properly with shared memory, you'll need to increase the default shared memory size for tmpfs to match the amount of memory in your system. This can be done by editing /etc/default/tmpfs:
SHM_SIZE=4G
You can use 'mount -o remount /dev/shm' and 'df -h' to implement and verify the change.
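If you would rather derive the value than hard-code it, something like the following could work. It is a sketch under the assumption that MemTotal from /proc/meminfo is a good enough stand-in for installed memory; the kernel reserves a little for itself, so the result is typically slightly below the nominal size. suggest_shm_size is a hypothetical helper name.

```shell
#!/bin/sh
# Print a SHM_SIZE line sized to the MemTotal value (reported in kB)
# found in the given meminfo file, defaulting to /proc/meminfo.
suggest_shm_size() {
    awk '/^MemTotal:/ { printf "SHM_SIZE=%dM\n", $2 / 1024 }' "${1:-/proc/meminfo}"
}
```

The printed line can be pasted into /etc/default/tmpfs in place of the fixed 4G value.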
Next, I configure /etc/sysctl.conf on the VMWare Server, which tunes the kernel to perform better as a Virtual Server, by inserting the following configuration:
vm.swappiness = 0
vm.overcommit_memory = 1
vm.dirty_background_ratio = 5
vm.dirty_ratio = 10
vm.dirty_expire_centisecs = 1000
dev.rtc.max-user-freq = 1024
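After running sysctl -p (or rebooting), it can be worth checking that the running kernel actually picked the values up. The sketch below compares each "key = value" line in a file against the matching entry under /proc/sys. check_sysctl is a hypothetical helper written for this post, and the optional second argument exists only so it can be pointed at a different directory for testing.

```shell
#!/bin/sh
# Compare "key = value" lines in a sysctl-style file with the live values
# under /proc/sys (dots in the key become path separators).
# Prints one line per key and returns non-zero on any mismatch.
check_sysctl() {
    conf=$1
    root=${2:-/proc/sys}
    status=0
    while IFS='=' read -r key value; do
        key=$(echo "$key" | tr -d ' \t')
        value=$(echo "$value" | tr -d ' \t')
        case $key in ''|'#'*) continue ;; esac   # skip blanks and comments
        path="$root/$(echo "$key" | tr . /)"
        if [ -r "$path" ]; then
            actual=$(tr -d ' \t\n' < "$path")
        else
            actual="(missing)"
        fi
        if [ "$actual" = "$value" ]; then
            echo "ok       $key = $value"
        else
            echo "MISMATCH $key: want $value, have $actual"
            status=1
        fi
    done < "$conf"
    return $status
}
```

Running check_sysctl /etc/sysctl.conf on the host lists each setting as ok or MISMATCH; a key that does not exist on your kernel shows up as "(missing)".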
Lastly, I disable the tickless kernel option in kernel 2.6.22. This further reduces the Virtual Machine I/O constraints by reverting to a regular timer tick, which is better supported by VMWare. This can be done by adding the following option to the kernel options line in /boot/grub/menu.lst or /etc/lilo.conf:
nohz=off
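Once the host has been rebooted, the kernel command line in /proc/cmdline shows whether the option took effect. A minimal check, with has_nohz_off being a name invented for this sketch:

```shell
#!/bin/sh
# Succeed if the kernel command line (default: /proc/cmdline) contains
# nohz=off as a separate, space-delimited option.
has_nohz_off() {
    case " $(cat "${1:-/proc/cmdline}") " in
        *" nohz=off "*) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example: has_nohz_off && echo "tickless mode is disabled".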
With all of these options configured, the VMWare server now performs wonderfully at under 20% host CPU utilization with 6 Virtual Machines all running various flavors of Windows and Linux.
15 comments:
Best post ever. Seriously Windows XP was completely unusable on my desktop before and now appears to be working much better. Thanks a lot!
What version of VMWare Server are you running?
VMWare Server 1.0.6
Great info! Thanks! Tried those settings and it really makes difference!
Seriously, thank you, thank you, thank you.
I was almost ready to abandon VM Server on Linux and buy a stupid Windows Server after all the searching I did online.
I finally found your post and applied the changes you mention. Although my server was Fedora, the concepts were the same and for the most part even the syntax.
This solved my problems 100%. Thank you again. I hardly ever leave posts like this, but you really deserve it. Thanks.
Rob Simmermon
TEREX Utilities
rob.simmermon@terex.com
Should these settings work for vmplayer as well? I am running an XP guest with vmplayer.
Thanks - G
Some of your recommendations could put people in worse shape, IMHO. For example, useNamedFile = FALSE places more memory management burden on the physical RAM of the host. The option was really meant to relieve slowdowns if someone was running a VM off a USB drive. If you engage that option and don't have enough physical RAM to cover the difference, performance will be worse than if it had been left alone.
Some links to VM KB articles so people can make their own judgement calls might be prudent.
My $.02
useNamedFile = false when backed by /dev/shm forces the swap to be stored in physical memory, and only dumped to the host swap file when necessary. When paired with vm.swappiness = 0 it won't use any of the server's swap unless it's out of memory.
Of course it offloads disk based memory management to physical memory, that's exactly what you want to do! You never want to read or write to disk unless you are putting or getting persistent data (IE: OS files, data files, etc). Since I've set the configuration options to lock VM memory size and not de-dupe memory between virtual machines, there is a gigantic reduction in the overhead of memory management.
If anyone runs their system out of memory regardless of their configuration they are going to have serious performance issues (or trip OOM killer). It doesn't matter if they are writing to a Linux swap file or to a VMware swap file it's going to be a source of IO contention when you are writing to many virtual machine files (VMDK, VMEM) on the same spindles. Actually, if your system is paging to and from VMWare swap (VMEM) while reading or writing to virtual disks (VMDKs) you have to multiply your IO contention by the number of active virtual disks + the number of VMEM files (these files are located in the virtual machine directory by default) in addition to the contention caused by physical swapping or other disk activity of the host OS.
So, for a host with 4 VMs running concurrently there could be more than 8 concurrent IO operations to your VMDK or VMEM files. In addition, if you are using the LSILogic disk controller, those could each be multithreaded operations, causing an even greater level of contention. When properly configured, there should only be data operations going to disk spindles and memory operations staying in memory, significantly reducing IO contention.
When there's a heavy load on the system this makes a HUGE difference in process response time. Imagine trying to write 1GB to VMEM while reading 512MB from VMEM while concurrently doing a table scan of a MySQL database in another VM which is suddenly ballooning memory to support the requirement of the operation. Look at my graphs for a fine example of IO contention with the default configuration.
Instead, what I proposed here and have been running in practice for a significant amount of time is that you don't balloon memory, and you don't write idle memory out to disk, so when that table scan occurs the only disk IO is the data operation.
As you can see from the first paragraph this computer is a dedicated VMWare server, and nothing else. These settings do not harm other systems (I even run them on my laptop). They will absolutely need to be altered for lower memory desktops also running VMWare server.
The useNamedFile configuration item was not designed for use when booting virtual machines from Thumb drives, it's been a VMWare feature for many years, long before thumb drives were large enough to boot from.
I don't have any links handy, but if I do come across any that help I'll definitely update the blog entry.
Um. Fantastic. Thank you.
I also added two more kernel parameters.
On the host:
elevator = deadline
on each linux guest:
elevator = noop
This works, but I have not done the sort of extensive testing that you have.
Your settings really made a difference, thanks!
However, the most performance was gained by assigning the virtual machines one processor instead of two. I learned that it's a very bad idea trying to give a W2K3 server (32bit) guest two cpus, if the host is a 64 bit ubuntu 8.10 server (if you're using vmware server 2).
Unfortunately, removing one cpu in the virtual machine was the last thing I tried - just before I was going to wipe vmware off the disk and start over with kvm (which, by the way, would have been a better choice as I know by now).
You mentioned that performance improves when you pre-allocate the disk space. Have you noticed any difference in performance if you split the disks into 2GB files vs. storing them as a single file?
I prefer to not split them. The reason for this is that splitting them increases fragmentation at rest, and potentially forces a lot of seeking which increases contention.
I haven't tested that theory though.
I'm running Ubuntu 9.04 and VMWare Server 2.0.1. When I try to implement the recommendations here, my VMNet0 bridged adapter stops working for my XP guest VM. Any suggestions?
Did not research the proposed changes before doing them (I know: brainless) but the result speaks for itself - I went from 12 hours for backup and verify a Win SBS 2003 R2 to 1 hour and 12 minutes.
Thanks a lot for this great post.
kudos!