February 10, 2014

Top Performance Tuning Tips for Windows Server 2012 R2 Hyper-V

Microsoft has published their Performance And Tuning Guide For Windows Server 2012 R2 on TechNet. This guide includes information for different workloads and roles, including Hyper-V. In this article, I will summarize and discuss some of the included suggestions on enhancing performance in Windows Server 2012 R2 Hyper-V.

Windows Server 2012 R2 Hyper-V Performance Tips

Host Management OS

The host management OS should be the management OS and nothing but the management OS of that host. It should not be a domain controller, it should not be a file server, it should not be a database server… you get the idea. If you need additional services to run on a physical server, then install them into virtual machines that will run on the host.

You can install management features such as monitoring agents, virtualization management, job management, backup agents, and so on, as long as those agents will not be heavy consumers of processor, storage IO, or memory.

Limit who can log into a host. Local administrator rights on virtualization hosts (any vendor) should be limited to a subset of the IT department, just as the domain admin right is. If you need to grant rights, then use System Center Virtual Machine Manager and delegate the subset of rights that are required to an Active Directory group that contains the selected people.

You should also try to avoid logging directly into a host. Use tools that are installed onto your PC, such as the free Remote Server Administration Tools (RSAT) or the System Center consoles.

When installing the paid-for version of Windows Server you can choose to follow Microsoft’s recommendation that you use a core installation with no GUI. They say it has:
  • A smaller attack surface: Are you surfing the web from your host? Is your host naked on the Internet? If so, you deserve to be hacked.
  • Fewer patches: I’m not concerned with the number of patches; I’m concerned about the number of reboots which is probably still one per host per month. Cluster Aware Updating (CAU) removes that concern and this issue.
  • Consumes less RAM: You’ll probably save a couple of hundred MB, which is nothing when your host has 128 GB to 4 TB RAM.
  • Consumes less disk space: Your management OS is installed on a pair of internal disks (RAID1) in the host. I’m not concerned about saving maybe 6 GB of disk space when the smallest disk I can probably still buy is 300 GB in size. The rest of the space is not going to be used.
I prefer to use a full installation of Windows Server for the management OS. This means that if something goes wrong on the host I can still use the OEM’s tools to determine the cause and I don’t need to search the web for command prompt or PowerShell commands.

Microsoft does allow us to switch between core and full GUIs, but this requires a reboot, and reboots of physical servers take a long time.

Host Processor

Try to purchase a processor with a large L2/L3 cache. Haswell processors will include an L4 or Last Level Cache (LLC). These caches can make a large difference in workload performance, offering more than pure gigahertz horsepower will add.

Each host processor has a number of logical processors (LPs). For example, an Intel Xeon with 8 cores and Hyperthreading enabled will have 16 LPs. A 16 core AMD processor will have 16 LPs (no Hyperthreading). WS2012 R2 Hyper-V allows for a total of 320 logical processors in a host. Note that just 64 will be used by and be visible in the management OS, but the others are actually used by virtual machines, despite not being visible in the host. The old virtual processor (VP) to LP ratio limits of 8:1 (server workloads) or 12:1 (VDI workloads) are no longer applicable since WS2012; instead you can keep placing virtual machines onto a host until you reach either the maximum number of virtual machines (1,024) or the host’s maximum performance capacity (dependent on the virtual machines’ loads), whichever comes first.
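The LP arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function names and the dictionary keys are made up for this example, and the 320 and 64 figures are simply the limits quoted in the paragraph above.

```python
# Limits quoted in the article for WS2012 R2 Hyper-V.
HOST_LP_LIMIT = 320      # maximum logical processors per host
MGMT_OS_LP_LIMIT = 64    # logical processors visible in the management OS


def logical_processors(sockets: int, cores_per_socket: int, threads_per_core: int = 1) -> int:
    """Return the number of logical processors (LPs) in a host.

    threads_per_core is 2 when Hyperthreading is enabled, otherwise 1.
    """
    return sockets * cores_per_socket * threads_per_core


def summarize_host(sockets: int, cores_per_socket: int, threads_per_core: int = 1) -> dict:
    """Hypothetical helper: compare a host's LP count with the quoted limits."""
    lps = logical_processors(sockets, cores_per_socket, threads_per_core)
    return {
        "logical_processors": lps,
        "within_host_limit": lps <= HOST_LP_LIMIT,
        "visible_in_management_os": min(lps, MGMT_OS_LP_LIMIT),
    }


if __name__ == "__main__":
    # One 8-core Xeon with Hyperthreading, then one 16-core AMD processor without it.
    print(logical_processors(1, 8, 2))   # 16
    print(logical_processors(1, 16, 1))  # 16
    print(summarize_host(4, 15, 2))
```

A four-socket host with 15 cores per socket and Hyperthreading has 120 LPs, so it is within the host limit, but the management OS will only show 64 of them.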

Host Memory

RAM is usually the first bottleneck for adding virtual machines to a host. You can have up to 4 TB RAM in a single host. That is theoretical for the vast majority of us! Enabling Dynamic Memory in appropriate virtual machines will improve the utilization of RAM and reduce host capital and operational costs. You might consider low-power RAM if electrical costs are a driving force for your organization. However, most will opt for faster RAM to get the most from their hosts.

It makes sense from a cost perspective to add as much RAM into a host as possible. However, this can mean using larger, more expensive DIMMs. It can also mean that you will put too many eggs into one basket. Try to balance capital costs against the risk of placing redundant VMs in too few hosts.

Some tricks from older versions of Hyper-V should not be applied since WS2012:
  • Paging File: The management OS of a Hyper-V host will not need a large paging file, even though the host might have up to 4 TB RAM. Do not manually configure this file; leave it with the default settings. VMware admins: please note that Hyper-V does not do second-level paging like ESXi does, so there is no need to put the paging file onto special disks.
  • Host Memory Reserve: In WS2008 R2 SP1 hosts we often reserved RAM on the host to prevent Dynamic Memory from draining the management OS of RAM. WS2012 and later will manage this dynamically. WS2012 R2 is actually more conservative. Do not manually configure this registry setting. SCVMM admins: Note that the host memory reserve that you have is a different setting and has to do with the placement of VMs on hosts and nothing to do with Dynamic Memory; that is, it is irrelevant to this advice.

Host Disk

The disks (RAID1) that you need for the management OS and paging file of the host do not need to be fast. Their only purpose is to boot up the management OS and page for the RAM that is used by the management OS.

On the other hand, the disks used to store the virtual machines should be fast. RAID10 is strongly recommended for performance. Please don’t be the person who would install database log files onto RAID5 arrays. In fact, in Storage Spaces, you should use 2-way mirrored or 3-way mirrored virtual disks (storage spaces or LUNs); parity disks are only supported for archive workloads.
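To see why RAID10 is preferred over RAID5 for write-heavy workloads, here is a rough Python sketch using the commonly quoted write penalties (2 back-end I/Os per write for RAID10, 4 for RAID5). The disk count, per-disk IOPS, and read/write mix are invented for illustration, and the formula ignores controller caches, so treat the output as a rule-of-thumb estimate and not a benchmark.

```python
# Commonly quoted back-end I/Os generated per front-end write.
WRITE_PENALTY = {"RAID10": 2, "RAID5": 4}


def effective_iops(disks: int, iops_per_disk: float, write_fraction: float, raid_level: str) -> float:
    """Estimate the front-end IOPS an array can deliver.

    Each read costs one back-end I/O; each write costs WRITE_PENALTY[raid_level].
    """
    if not 0.0 <= write_fraction <= 1.0:
        raise ValueError("write_fraction must be between 0 and 1")
    raw_iops = disks * iops_per_disk
    read_fraction = 1.0 - write_fraction
    cost_per_io = read_fraction + write_fraction * WRITE_PENALTY[raid_level]
    return raw_iops / cost_per_io


if __name__ == "__main__":
    # Hypothetical array: 8 disks at 150 IOPS each, 50% writes.
    for level in ("RAID10", "RAID5"):
        print(level, round(effective_iops(8, 150, 0.5, level)))
```

With these assumed numbers the same eight disks deliver roughly 800 IOPS as RAID10 and roughly 480 IOPS as RAID5, and the gap widens as the write fraction grows.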

On non-clustered hosts, the management OS should be on a different array (RAID1) to the virtual machines (ideally RAID10). This gives you flexibility (for when Murphy’s Law comes a-calling) and performance.

Host Networking

1 Gbps networking is fine for many. 10 Gbps is better again. And RDMA-capable networking is the best.

If you want to aggregate NICs and have network path fault tolerance then you should use NIC teaming. Third-party NIC teaming is not supported by Microsoft, so take all your issues (and expect them) to your hardware vendor if you use it instead of the completely supported NIC teaming that has been included since WS2012. In WS2012 R2 you should use the Dynamic load-balancing mode because it offers the best of Hyper-V Port and Address Hashing.

The tips for best performance and stability:
  • Get the latest drivers from the manufacturer. This is Dell if you bought the NIC from Dell, HP if you got the NIC from HP, Chelsio if you got it from Chelsio, and so on. Do not use generic Intel/Broadcom drivers, and do not use the driver that ships with Windows Server – that driver exists only to let you get the correct latest and most stable driver.
  • Install all the latest firmware. I don’t care if you just bought the server directly from the manufacturer. Check for a newer version of firmware for everything that is included in the server, the extra HBA, the additional third-party NIC, or whatever. Firmware updates really do make a difference.
Doing the above prevents the majority of network related issues in all versions of Hyper-V.



Virtual Machine Processors

You should not add lots of virtual processors (vCPUs) to virtual machines just because you can. A virtual machine with 16 vCPUs will occupy 16 LPs on the host even if it only has one thread to execute. Assign a small number of vCPUs and monitor the performance of the VM. A good metric to watch in virtual machines with multi-threaded services (such as SQL Server) is CPU Run Queue Length (the System\Processor Queue Length counter in Windows guests). Sustained high values indicate that threads in the guest OS are waiting for processor time. Adding more vCPUs will usually resolve this issue.
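As a rough illustration of the "start small and monitor" approach, the Python sketch below averages a set of run queue length samples and flags a VM when the average per vCPU stays above a threshold. The function name is hypothetical, and the threshold of 2 per vCPU is an assumed rule of thumb, not a figure from Microsoft’s guide, so tune it for your own workloads.

```python
from statistics import mean
from typing import Sequence


def needs_more_vcpus(queue_samples: Sequence[float], vcpus: int, per_vcpu_threshold: float = 2.0) -> bool:
    """Return True if the average run queue length per vCPU exceeds the threshold.

    queue_samples: run queue length readings collected inside the guest over time.
    per_vcpu_threshold: assumed rule-of-thumb limit, not an official value.
    """
    if vcpus < 1:
        raise ValueError("a virtual machine has at least one vCPU")
    if not queue_samples:
        return False  # no data, so no recommendation
    return mean(queue_samples) / vcpus > per_vcpu_threshold


if __name__ == "__main__":
    busy_vm = [10, 12, 9, 11]   # hypothetical samples from a 2 vCPU guest
    quiet_vm = [1, 0, 2, 1]
    print(needs_more_vcpus(busy_vm, vcpus=2))   # True
    print(needs_more_vcpus(quiet_vm, vcpus=2))  # False
```

Averaging over a collection window matters here: a single spike in the queue does not justify another vCPU, but a sustained backlog does.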



Virtual Machine RAM

You should try to use Dynamic Memory as often as possible, but make sure that the services and guest OS will support it. For example, older versions of SQL Server Standard and all versions of the Exchange mailbox role do not support Dynamic Memory.

I try to set the Startup RAM in my templates to 1024 MB. This is typically the minimum amount of RAM required to install software. Starting VMs with this amount of RAM will avoid needless calls to helpdesk when delegated users (such as in a cloud) try to install SQL Server and the setup only detects 728 MB RAM (preventing the install). Set the maximum RAM to be a realistic value; the default of 1 TB is crazy; applications with memory leaks will eat up that memory (up to the maximum the host can offer), and this will cause issues in clouds where cross-charging is enabled.
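The template advice above can be expressed as a small sanity check. This Python sketch is an assumption-laden illustration: the class, the function, and the warning texts are invented for the example, the 1024 MB floor is the author’s template preference, and the only hard rule encoded is that minimum, startup, and maximum RAM must be in ascending order.

```python
from dataclasses import dataclass
from typing import List

RECOMMENDED_STARTUP_MB = 1024  # the template value suggested in the article


@dataclass
class DynamicMemorySettings:
    startup_mb: int
    minimum_mb: int
    maximum_mb: int


def review_settings(settings: DynamicMemorySettings, host_ram_mb: int) -> List[str]:
    """Return a list of warnings for a VM template's Dynamic Memory settings."""
    warnings = []
    if not settings.minimum_mb <= settings.startup_mb <= settings.maximum_mb:
        warnings.append("minimum <= startup <= maximum does not hold")
    if settings.startup_mb < RECOMMENDED_STARTUP_MB:
        warnings.append("startup RAM is below 1024 MB; installers may refuse to run")
    if settings.maximum_mb > host_ram_mb:
        warnings.append("maximum RAM exceeds the host's physical RAM; set a realistic cap")
    return warnings


if __name__ == "__main__":
    host_ram_mb = 128 * 1024  # hypothetical 128 GB host
    sensible = DynamicMemorySettings(startup_mb=1024, minimum_mb=512, maximum_mb=8192)
    defaults = DynamicMemorySettings(startup_mb=512, minimum_mb=512, maximum_mb=1024 * 1024)
    print(review_settings(sensible, host_ram_mb))
    for warning in review_settings(defaults, host_ram_mb):
        print("WARNING:", warning)
```

The second template keeps the 1 TB default maximum, so it is flagged both for the low startup value and for a maximum that no 128 GB host could ever satisfy.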

WS2012 (and later) hosts support virtual machines with up to 1 TB RAM. When you have large memory workloads you need to consider how the guest (or physical) OS will schedule processes and assign RAM. Physical hardware has hard Non-Uniform Memory Access (NUMA) boundaries. Spanning those boundaries degrades performance. Hyper-V has the ability to do virtual NUMA where the guest OS (Linux or Windows) is made aware of the NUMA hardware that it is occupying. This allows the guest OS to schedule processes efficiently, placing them in the same NUMA node as their assigned RAM. This is important for large memory workloads that require great performance. Virtual NUMA is disabled if you enable Dynamic Memory in a virtual machine. Therefore you should use static memory in large memory virtual machines if NUMA node spanning will unacceptably degrade the performance of those virtual machines.
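To get a feel for when a virtual machine is likely to span NUMA nodes, here is a simplified Python sketch. It assumes the host’s RAM is divided evenly between its NUMA nodes and ignores the RAM used by the management OS and by other virtual machines, so it is a best-case estimate. The function names are made up for this example.

```python
import math


def numa_nodes_needed(vm_ram_gb: float, host_ram_gb: float, host_numa_nodes: int) -> int:
    """Estimate how many NUMA nodes a VM's RAM will occupy.

    Assumes host RAM is split evenly across nodes and that each node is empty.
    """
    if host_numa_nodes < 1:
        raise ValueError("a host has at least one NUMA node")
    node_size_gb = host_ram_gb / host_numa_nodes
    return math.ceil(vm_ram_gb / node_size_gb)


def spans_numa_nodes(vm_ram_gb: float, host_ram_gb: float, host_numa_nodes: int) -> bool:
    """Return True if the VM cannot fit inside a single NUMA node."""
    return numa_nodes_needed(vm_ram_gb, host_ram_gb, host_numa_nodes) > 1


if __name__ == "__main__":
    # Hypothetical host: 256 GB RAM across 2 NUMA nodes (128 GB per node).
    print(spans_numa_nodes(96, 256, 2))   # False
    print(spans_numa_nodes(200, 256, 2))  # True
```

A virtual machine flagged by a check like this is a candidate for static memory, because that keeps virtual NUMA enabled and lets the guest OS place processes next to their RAM.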

Virtual Hard Disks

Always use VHDX files instead of VHD files. There is one exception to this: Windows Azure IaaS does not support VHDX files yet, so you will need to stick with legacy VHD if you plan on moving virtual machines to Microsoft’s public cloud.

If you are using Hyper-V Replica then move the paging file of Windows guest OSs to a dedicated virtual hard disk. This will allow you to avoid needless replication of this constantly changing file.

Be sure to place all data in a virtual machine into virtual hard disks that are attached to the virtual machine’s virtual SCSI controller. This will allow the best possible performance and manageability. For example, live VHDX resizing is only possible with VHDX files that are connected to SCSI controllers.

There is a split in the Hyper-V community regarding the type of virtual hard disk to use: fixed size vs. dynamic. Fixed consumes more space but is less prone to fragmentation, and that matters because performing a defrag on a CSV requires redirected I/O. I always prefer to use fixed size virtual hard disks for production server workloads.

You can configure Storage QoS to cap the number of IOPS that a virtual hard disk can consume, so that it cannot starve other virtual hard disks of storage bandwidth.

Virtual Machine Configuration Tuning

There are some tweaks you can do to your virtual machine templates to reduce the amount of hardware that is included when a new machine is deployed. This will have a small beneficial effect on the host, but for some, every little counts. There are also some tips for improving the performance of the guest OS and the virtual device drivers. According to Microsoft you should:

    * Install the latest version of the Virtual Machine Integration Services.
    * Remove the emulated network adapter through the virtual machine settings dialog box (use the Microsoft Hyper-V-specific adapter).
    * Remove unused devices such as the CD-ROM and COM port, or disconnect their media.
    * Keep the Windows guest operating system on the sign-in screen when it is not being used and disable the screen saver.
    * Review the scheduled tasks and services that are enabled by default.
    * Review the ETW trace providers that are on by default by running logman.exe query -ets
    * Improve server applications to reduce periodic activity (such as timers).
    * Close Server Manager on both the host and guest operating systems.
    * Don’t leave Hyper-V Manager running since it constantly refreshes the virtual machine’s thumbnail.

In the case of VDI workloads, Microsoft also says that you should:

    * Disable background services such as SuperFetch and Windows Search
    * Disable scheduled tasks such as Scheduled Defrag
    * Disable Aero glass and other user interface effects

Guest OS Type

The choice of full versus core installation is more relevant when it comes to the virtual machine guest OS. The sum total of your virtual machines consumes lots of RAM. They also consume lots of disk space, and that’s probably expensive disk space in the case of a Hyper-V cluster. If you have the skills and your services support it, then using a core installation for the guest OS can offer great savings. Just be sure that you know or have documented how to resolve issues if they happen.

Just like with the host’s management OS, you should limit your direct console activity with the guest OS, and use tools that are installed on your PC such as the Remote Server Administration Tools (RSAT) or System Center. It only takes a few minutes to install the tools and it makes day-to-day administration a lot easier.
