Monday, August 13, 2012

VMware on NetApp NFS

This information was taken from a NetApp white paper by Bikash Roy Choudhury – TR-3839. I am just pulling out snippets that I thought were very interesting. Not all of it is specific to NetApp, but most of my consulting work is on NetApp.
  1. Deduplication - A VMware datastore has a high level of data duplication. This is because it contains many instances of the same guest operating systems, application software, and so on. NetApp deduplication technology can eliminate this duplicated data, dramatically reducing the amount of storage required by your VMware environment. Space savings range from 50% to 90%, with 70% being typical.
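    To see what this looks like on the storage side, here is a minimal sketch of enabling deduplication on the volume backing a datastore from the Data ONTAP 7-Mode command line (the volume name is just an example):
    • sis on /vol/vmware_ds1
    • sis start -s /vol/vmware_ds1
    • df -s /vol/vmware_ds1
    The first command enables deduplication on the volume, the second scans and deduplicates the data already in it, and df -s reports the space saved.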
  2. Strongly consider using Cat6 cables rather than Cat5/5e. GbE will work over Cat5 cable, and TCP retransmissions will recover from any errors that occur, but those retransmissions have a much larger impact on IP storage traffic than they do on general-purpose LAN traffic.
  3. NFS NETWORKS AND STORAGE PERFORMANCE - There are three primary measures of storage performance: bandwidth (MBps), throughput in I/O operations per second (IOPS), and latency (ms). Bandwidth and throughput are related: bandwidth in MBps equals IOPS multiplied by the I/O size, where the I/O size is the size of the I/O operation from the host's perspective.
    IOPS are usually determined by the back-end storage configuration. If the workload is cached, IOPS are determined by the cache response; most often they are determined by the spindle configuration behind the storage object. In the case of NFS datastores, the storage object is the file system, and on a NetApp storage system the achievable IOPS are primarily determined by the number of disk drives in the aggregate.
    You should also know that each NFS datastore mounted by ESX (including vSphere) uses just one TCP session to the datastore, carrying both NFS control information and NFS data. Because of this, the upper limit on throughput from a single ESX host to a single datastore, regardless of link aggregation, is a single link. With 1GbE, a reasonable expectation is a unidirectional workload of ~80-100MB/sec (GbE is full duplex, so a mixed workload can reach ~160MB/sec bidirectionally). Higher total throughput on an ESX server can be achieved by leveraging multiple datastores, scaling out across links with link aggregation and routing mechanisms.
    The performance described above is sufficient for many use cases. Based on these numbers, NFS is appropriate for various VMware scenarios:
    • A shared datastore supporting many VMs with an aggregate throughput requirement within the guidelines above (a large number of IOPS, but generally not large-block I/O)
    • A single busy VM as long as its I/O load can be served by a single GbE link
    With small-block I/O (8K), a GbE connection can deliver roughly 12,500 IOPS, comparable to the performance of about 70 15K-RPM spindles. A SharePoint VM, on the other hand, tends to use I/O sizes of 256K or larger; at 256K the same link yields only about 390 IOPS, which would likely be a problem. Under such circumstances, 10GbE may be the best performance option. With 10GbE a single TCP session is still used per datastore, but far more throughput is available for demanding workloads. If 10GbE isn't an option, you can use NFS for some VMs over 1GbE and FC or iSCSI for others, depending on each application's throughput and latency requirements.
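    The arithmetic behind those IOPS numbers is simple division, assuming roughly 100MB/sec (about 100,000KB/sec) of usable throughput on a single GbE link:
    • 100,000KB/sec ÷ 8KB per I/O ≈ 12,500 IOPS
    • 100,000KB/sec ÷ 256KB per I/O ≈ 390 IOPS
    The same division works for any I/O size, which is why large-block workloads hit the bandwidth ceiling of a single GbE link long before the back-end spindles run out of IOPS.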
  4. Partition Alignment - With NFS there is no VMFS layer involved, so only the guest VM file system inside the VMDK needs to be aligned to the NetApp storage array. The correct alignment for a new VM can be set either by using diskpart to format the partition with the correct offset or by using fdisk from the ESX service console. In practice, to avoid creating more misaligned VMs, build properly aligned templates so that every new virtual machine deployed from them is aligned as well. NetApp provides a tool, mbralign, to check and correct the alignment of existing virtual machines.
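    For a new Windows guest, a minimal sketch of creating an aligned partition with diskpart (the disk number and the 32KB alignment value are illustrative; use the offset from NetApp's alignment guidance for your OS version):
    • diskpart
    • DISKPART> select disk 1
    • DISKPART> create partition primary align=32
    The align parameter is specified in KB, so align=32 starts the partition on a 32,768-byte boundary, which divides evenly by the 4KB WAFL block size.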
  5. THIN PROVISIONING - As mentioned above, the ability to take best advantage of thin provisioning is a major benefit of using NetApp NFS. VMware thin provisions the VMDK files on NFS datastores by default, but there are two types of thin-provisioned virtual disk files available:
    • “Thick” type thin-provisioned virtual disk. This type of virtual disk file is created by default on NFS datastores during the virtual machine creation process. It has the following properties:
      • Creates a flat .VMDK file; does not occupy actual disk blocks (thin provisioned) until there is a physical write from the guest OS
      • Guaranteed disk space reservation
      • Cannot oversubscribe the disk space on the NFS datastore
    • “Thin” type thin-provisioned virtual disk. You must create this type of virtual disk file using the vmkfstools command. Its properties are:
      • Creates a flat .VMDK file; does not occupy actual disk blocks (thin provisioned) until there is a physical write from the guest OS
      • No guaranteed disk space reservation
      • Can oversubscribe the disk space on the NFS datastore
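    For reference, a sketch of creating this “thin” type of disk with vmkfstools from the ESX service console (the 20GB size, folder, and disk name are illustrative):
    • # vmkfstools -c 20g -d thin /vmfs/volumes/<NFS Datastore Name>/myvm/myvm_data.vmdk
    Attach the resulting .vmdk to a VM as an existing disk; the vdf command below will then show how much space the datastore actually consumes as the guest writes data.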
    You can run the following command against an NFS datastore to show its actual disk space utilization:
    • # vdf -h /vmfs/volumes/<NFS Datastore Name>
    Using the “thin” type of virtual disk you have the option to oversubscribe the storage capacity of a datastore, allocating more space than a datastore actually contains on the assumption that VMs will not all use the capacity allocated to them. This can be a very attractive option; however, it has a few limitations that you must know about before you implement it.
    Should an oversubscribed datastore encounter an out-of-space condition, all of the running VMs will become unavailable. The VMs simply “pause” waiting for space, but applications running inside of VMs may fail if the out-of-space condition isn’t addressed in a short period of time. For example, Oracle databases will remain active for 180 seconds; after that time has elapsed the database will fail.
  6. High Availability and Disaster Recovery - NetApp recommends increasing the default ESX NFS failover timeout settings so that VMs are not disconnected during a failover event. The NetApp Virtual Storage Console (VSC) can configure these settings automatically. The settings NetApp recommends (applied consistently across all ESX hosts) are:
    • NFS.HeartbeatFrequency (NFS.HeartbeatDelta in vSphere) = 12
    • NFS.HeartbeatTimeout = 5 (default)
    • NFS.HeartbeatMaxFailures = 10
    When the number of NFS datastores is increased, NetApp also recommends increasing the TCP/IP heap values: Net.TcpipHeapSize to 30 and Net.TcpipHeapMax to 120.
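    If you are not using VSC to push these values, here is a sketch of setting them by hand from the ESX service console (on vSphere the first option is named HeartbeatDelta rather than HeartbeatFrequency, and changes to the TCP/IP heap values typically require a host reboot to take effect):
    • # esxcfg-advcfg -s 12 /NFS/HeartbeatFrequency
    • # esxcfg-advcfg -s 5 /NFS/HeartbeatTimeout
    • # esxcfg-advcfg -s 10 /NFS/HeartbeatMaxFailures
    • # esxcfg-advcfg -s 30 /Net/TcpipHeapSize
    • # esxcfg-advcfg -s 120 /Net/TcpipHeapMax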
    Every “HeartbeatFrequency” (12 seconds with the settings above) the ESX server checks that the NFS datastore is reachable. Each heartbeat expires after “HeartbeatTimeout” (5 seconds), after which another heartbeat is sent. If “HeartbeatMaxFailures” (10 heartbeats) fail in a row, the datastore is marked as unavailable and the VMs crash.
    This means that the NFS datastore can be unreachable for a maximum of 125 seconds (10 failed heartbeats sent 12 seconds apart, plus the 5-second timeout on the last one) before being marked unavailable, which covers the large majority of NetApp failover events.
    During this time period, a guest sees a non-responsive SCSI disk on the vSCSI adapter. The disk timeout controls how long the guest OS will wait while the disk is non-responsive. Use the following procedure to set the disk timeout for Windows guests to 190 seconds, comfortably above the 125-second datastore maximum described above:
    1. Back up your Windows registry.
    2. Select Start>Run, type regedit.exe, and click OK.
    3. In the left‐panel hierarchy view, double-click HKEY_LOCAL_MACHINE, then System, then CurrentControlSet, then Services, and then Disk.
    4. Select the TimeOutValue and set the data value to 190 (decimal).
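    The same change can be made from a command prompt (or scripted across many guests) with reg.exe; this writes the identical TimeOutValue described in the steps above:
    • reg add HKLM\SYSTEM\CurrentControlSet\Services\Disk /v TimeOutValue /t REG_DWORD /d 190 /f
    The /d value is interpreted as decimal by default, matching step 4.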
