Excerpt from the Scalable Storage Performance for ESX 3.5 document
Things that affect scalability
Throughput
- Fibre Channel link speed
- Number of outstanding I/O requests
- Number of disk spindles
- RAID type
- SCSI reservations
- Caching or prefetching algorithms
Latency
- Queue depth or capacity at various levels
- I/O request size
- Disk properties such as rotational, seek, and access delays
- SCSI reservations
- Caching or prefetching algorithms
Factors affecting scalability of ESX storage
Number of active commands
- SCSI device drivers have a configurable parameter called the LUN queue depth, which determines how many commands can be active on a given LUN at any one time.
- QLogic Fibre Channel HBAs support up to 256 outstanding commands; Emulex HBAs support up to 128.
- The default value in ESX is 32 for both.
- Any excess commands are queued in the VMkernel, which increases latency.
- When VMs share a LUN, the total number of outstanding commands permitted from all VMs to that LUN is governed by Disk.SchedNumReqOutstanding. If this is exceeded, commands are queued in the VMkernel. The maximum recommended value is 64. For a LUN with a single VM this setting does not apply and the HBA queue depth is used instead.
- Disk.SchedNumReqOutstanding should be set to the same value as the LUN queue depth.
- n = maximum recommended number of outstanding I/Os per LUN for the array (this figure should be obtained with help from the storage vendor)
- a = average number of active SCSI commands per VM to the shared VMFS
- d = LUN queue depth on each ESX host
- Max number of VMs per ESX host on the shared VMFS = d/a
- Max number of VMs on the shared VMFS (all hosts combined) = n/a
- To establish a, look at the queue statistics (QSTATS) in esxtop and add the active commands to the queued commands to get the total number of outstanding commands (a command sketch and a worked example of the formula follow below).
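A hedged sketch of how these values might be set from the ESX 3.5 service console, plus a worked example of the formula. The driver module names and the example numbers (n = 256, a = 4, d = 64) are assumptions for illustration, not values from the original document; check the loaded module name with vmkload_mod -l and confirm the syntax against your driver's documentation before changing anything.

# Worked example with assumed numbers: n = 256, a = 4, d = 64
#   Max VMs per ESX host on the shared VMFS = d/a = 64/4 = 16
#   Max VMs on the shared VMFS (all hosts)  = n/a = 256/4 = 64

# QLogic: set the LUN queue depth to 64 (module name varies by driver version)
esxcfg-module -s ql2xmaxqdepth=64 qla2300_707
# Emulex: the equivalent option (again, module/option names vary by driver)
esxcfg-module -s lpfc0_lun_queue_depth=64 lpfc_740
esxcfg-boot -b      # rebuild the boot configuration, then reboot the host

# Match Disk.SchedNumReqOutstanding to the LUN queue depth
esxcfg-advcfg -s 64 /Disk/SchedNumReqOutstanding
esxcfg-advcfg -g /Disk/SchedNumReqOutstanding    # verify the new value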
SCSI Reservations
- Reservations are created by creating/deleting virtual disks, extending a VMFS volume, and creating/deleting snapshots; all of these result in metadata updates to the file system using locks.
- The recommendation is to minimise these activities during the working day.
- Perform these tasks on the ESX host that runs the I/O-intensive VMs. Because the SCSI reservations are issued by that same host, it sees no reservation conflicts; I/O-intensive VMs on other hosts will still be affected for the duration of the task.
- Limit the use of snapshots. It is not recommended to run many virtual machines from multiple servers that are using virtual disk snapshots on the same VMFS. Snapshot files grow in 16MB chunks, so for VMDKs with lots of changes the snapshot file grows quickly, and every 16MB chunk it grows by triggers a SCSI reservation (a worked example follows below).
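As a rough illustration of the snapshot point (the numbers are made up for the example, not taken from the original document): if a VM writes 2 GB of changed blocks while a snapshot is open, the delta file has to grow by 2048 / 16 = 128 chunks, which means roughly 128 SCSI reservations against that VMFS volume, each one briefly locking the LUN for every other host sharing it.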
Total available link bandwidth
- Make sure you have enough FC links with enough capacity (1/2/4 Gbps) to serve all VMFS volumes.
- e.g. with each VM on a different VMFS volume and each pushing 45 MB/s, a 2 Gbps link will be saturated by 4 VMs.
- Balancing the VMs across 2 links means 8 VMs will be able to perform at that rate.
- It is recommended to balance VMFS LUNs across the HBA links (the arithmetic is worked through below).
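The arithmetic behind the example above: a 2 Gbps FC link provides roughly 200 MB/s of usable bandwidth, so 4 VMs each pushing 45 MB/s add up to 180 MB/s, which is already close to saturating a single link. Spreading the load across two links roughly doubles the available bandwidth, which is why about 8 VMs at the same rate can still perform.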
Spanned VMFS Volumes
- A VMFS volume is spanned if it includes multiple LUNs.
- This is done by using extents.
- Useful for adding storage to an existing VMFS datastore on the fly.
- Performance is hard to calculate.
- SCSI reservations only lock the first LUN in a spanned volume, therefore potentially improving performance.
Multipathing
For active/active arrays, it is important to find out whether they are asymmetric or symmetric. Some really good information here: http://frankdenneman.wordpress.com/2009/02/09/hp-continuous-access-and-the-use-of-lun-balancing-scripts/
For asymmetric active/active (ALUA) arrays (such as the HP EVA 4x00, 6x00 & 8x00; see here for more info), multipathing should be configured on the host so that the "owning" storage processor is on the primary preferred path for each LUN on all hosts. The non-owning processor is then only used as a backup, no cross-talk between the SPs occurs, and the latency of requests is reduced.
Here is another good article that helps: http://virtualgeek.typepad.com/virtual_geek/2009/02/are-you-stuck-with-a-single-really-busy-array-port-when-using-esx-script-for-balancing-multipathing-in-esx-3x.html (a hedged esxcfg-mpath sketch follows below)
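A rough sketch of how path preference might be inspected and pinned from the ESX 3.5 service console. The vmhba path names below are placeholders, and the exact esxcfg-mpath options are from memory and should be verified against esxcfg-mpath --help on the host before use.

# List all paths and see which one is active/preferred for each LUN
esxcfg-mpath -l

# Placeholder example: use a fixed policy and prefer the path that goes
# through the owning storage processor for this LUN
esxcfg-mpath --policy=fixed --lun=vmhba1:0:12
esxcfg-mpath --preferred --path=vmhba1:0:12 --lun=vmhba1:0:12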
Unload unnecessary drivers
- Unload the VMFS-2 driver if it is not required: vmkload_mod -u vmfs2
- Unload the NFS client driver if it is not required: vmkload_mod -u nfsclient (see the sketch below)
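A small sketch of checking what is loaded before unloading anything; vmkload_mod -l simply lists the modules currently loaded in the VMkernel.

# Show loaded VMkernel modules and confirm vmfs2 / nfsclient are unused
vmkload_mod -l

# Unload the legacy VMFS-2 driver and the NFS client if nothing needs them
vmkload_mod -u vmfs2
vmkload_mod -u nfsclient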
How many VMs per LUN?
It depends on ...
- The LUN queue depth limit, i.e. the sum of all active SCSI commands from all VMs on a single ESX host sharing the same LUN, should not consistently exceed the LUN queue depth.
- Determine the maximum number of outstanding I/O commands to the shared LUN; the array vendor may be able to help supply this value. A latency of 50 milliseconds is the tipping point; you don't really want it any higher (see the esxtop sketch below).
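A hedged sketch of watching this with esxtop on the ESX 3.5 service console. The field names below are the usual disk-screen counters, but check them on your own host since the exact layout varies between builds.

esxtop              # start esxtop interactively, then press 'd' for the disk screen
#   ACTV     - commands currently active against the LUN
#   QUED     - commands queued in the VMkernel (should stay near 0)
#   DAVG/cmd - latency attributed to the device/array, in milliseconds
#   KAVG/cmd - latency added by the VMkernel (grows when commands queue up)
#   GAVG/cmd - total latency seen by the guest (DAVG + KAVG); keep it well under the 50 ms tipping point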
Zoning
- A single-initiator hard zone is what I'd recommend. For a description of what this means, see here and here