Enabling ‘Latency-Sensitivity’ option on VMs – Should I do it?
I have had a number of customers ask me about this setting in the past few weeks, including a sit-down with a customer at VMworld Las Vegas. I then saw another question about it on Twitter and decided I should finish my post about this setting so everyone can benefit.
What is ‘Latency-Sensitivity’ and where do I set it?
This setting is found in the VM settings under ‘Edit Settings > VM Options > Latency Sensitivity’.
This setting is meant for workloads that are so latency sensitive that they require an additional setting to squeeze as much performance and as little latency out of the hardware as possible.
Using this setting will provide lower latency for the specific VM, and it will also ensure that the VM is not vMotioned, either by DRS or manually by users (more on this below).
Are my applications Latency Sensitive?
I’ve honestly only seen a small handful of applications that were truly latency sensitive. The majority of those were in specific financial customers’ data centers, and telcos. These applications were so sensitive that the vMotion stuns affected them greatly, and the admins wanted to also prioritize these workloads over non-latency-sensitive workloads.
To be clear, a regular VM’s response time is roughly 512 microseconds. Sometimes you’ll see 1 millisecond, but usually you are under 1 ms. Below is a chart from the vSphere Performance Best Practices slide deck showing the difference in response times between a regular VM, a VM with SR-IOV enabled, and a VM with both Latency Sensitivity and SR-IOV enabled. As you can see, there is a decent difference between the three, but once again, this is in MICROseconds. If your application needs those additional 300 microseconds to run correctly, then I would venture to say it should use this setting.
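To put those microseconds in perspective, here is a minimal Python sketch that expresses the roughly 300 microsecond gain as a share of an application's latency budget. The 300 figure is the one quoted above; the two budgets and the function name are hypothetical and only there for illustration.

```python
SAVING_US = 300  # approximate gain quoted above, in microseconds


def saving_share_of_budget(budget_us, saving_us=SAVING_US):
    """Return the saving as a fraction of the application's latency budget."""
    if budget_us <= 0:
        raise ValueError("budget_us must be positive")
    return saving_us / budget_us


# Hypothetical latency budgets, in microseconds.
examples = [
    ("web request with a 200 ms budget", 200_000),
    ("trading loop with a 1 ms budget", 1_000),
]

for label, budget_us in examples:
    share = saving_share_of_budget(budget_us)
    print(f"{label}: saving is {share:.2%} of the budget")
```

If the saving is a fraction of a percent of the budget, the setting is unlikely to be worth its cost. If it is a large slice of the budget, the application is probably one of the few that are truly latency sensitive.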
This sounds awesome. I think I’ll do it for all my VMs!
Well, it is awesome. But there is a catch. And this catch is one that you DEFINITELY should know about before enabling this setting. Let’s walk through what happens…
Here’s the catch
When you enable Latency-Sensitivity for a VM, vSphere dedicates physical cores to that VM’s vCPUs, which significantly reduces execution latency and jitter for that VM. However, CPU cores assigned to that VM cannot and will not be used by any other VM’s vCPUs EVEN IF THEY ARE CURRENTLY IDLE. This means that your VM density on that host is going to decrease, and the performance of the remaining VMs on that host may be affected too, depending on how many VMs there are and what they are running.
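The density hit is easy to estimate with some simple core accounting. The sketch below is plain Python, not a vSphere API; the `VM` class, the `host_core_budget` function, and the host and VM sizes are all hypothetical. It assumes one dedicated physical core per vCPU of a latency-sensitive VM, as described above.

```python
from dataclasses import dataclass


@dataclass
class VM:
    name: str
    vcpus: int
    latency_sensitive: bool = False


def host_core_budget(physical_cores, vms):
    """Return (exclusive_cores, shared_cores, vcpus_per_shared_core)."""
    # Assumption: every vCPU of a latency-sensitive VM takes one core for itself.
    exclusive = sum(vm.vcpus for vm in vms if vm.latency_sensitive)
    if exclusive > physical_cores:
        raise ValueError("not enough physical cores for exclusive assignment")
    shared = physical_cores - exclusive
    shared_vcpus = sum(vm.vcpus for vm in vms if not vm.latency_sensitive)
    if shared_vcpus == 0:
        ratio = 0.0
    elif shared == 0:
        ratio = float("inf")
    else:
        ratio = shared_vcpus / shared
    return exclusive, shared, ratio


# Hypothetical 16-core host: one 8-vCPU VM plus six 4-vCPU VMs.
regular = [VM(f"app{i}", 4) for i in range(6)]

before = host_core_budget(16, [VM("trading", 8)] + regular)
after = host_core_budget(16, [VM("trading", 8, latency_sensitive=True)] + regular)

print(before)  # (0, 16, 2.0)
print(after)   # (8, 8, 3.0)
```

In this made-up example, flagging a single VM moves the other six VMs from 2 vCPUs per core to 3 vCPUs per core, even if the flagged VM sits idle.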
VMs with dedicated physical cores can no longer be vMotioned, either by an admin or by DRS. This may be beneficial to the admin, but at the same time, it’s good to know what is going on and what the repercussions may be.
From the latency-sensitivity whitepaper:
“If a latency-sensitive VM uses VNICs, VMkernel I/O threads are used for network processing. Since VMkernel I/O threads are not allowed to run on the PCPUs exclusively owned by the VM, they become to share the remaining PCPUs with others. If those VMkernel I/O threads do not get enough CPU due to contention, the VM can experience a negative performance impact, even when all VCPUs are given exclusive access to PCPUs.”
So, not only does the VM get dedicated physical cores for its workload, but the VM’s overhead, such as those VMkernel I/O threads, also has to run on the other PCPUs (physical cores).
When VMXNET3 is used instead of SR-IOV with Latency-Sensitivity enabled, Latency-Sensitivity automatically disables VNIC interrupt coalescing and LRO. This helps decrease latency for the VM, but it can add CPU cost on the host, because the packet rate increases and processing those packets requires resources.
And finally, you are told to set the memory and CPU reservations to 100% to provide better performance.
Best Practices (From the referenced white paper)
- Set Latency Sensitivity to High. This requires 100% memory reservation.
- Consider reserving 100% of the CPU. This guarantees exclusive PCPU access, which in turn helps to reduce VCPU halt/wake-up cost.
- Consider overprovisioning PCPUs. This reduces the impact of sharing the LLC (last level cache) and also helps to improve the performance of latency-sensitive VMs that use VNICs for network I/O.
- Consider using a pass-through mechanism such as SR-IOV to bypass the network virtualization layer, if the hardware supports one and virtualization features such as vMotion and FaultTolerance are not needed.
- Consider using a separate PNIC for latency sensitive VMs to avoid network contention.
- Consider using NetIOC, if a pass-through mechanism is not used and there is contention for network bandwidth.
- Consider disabling all power management in both the BIOS and vSphere.
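A few of the items above can be turned into a quick pre-flight check. The following is a minimal Python sketch, not a vSphere API: the `VMSettings` fields and the `review` function are hypothetical names, and you would have to fill in the values from your own inventory.

```python
from dataclasses import dataclass


@dataclass
class VMSettings:
    latency_sensitivity: str      # e.g. "normal" or "high"
    memory_mb: int
    memory_reservation_mb: int
    cpu_reservation_pct: int      # 0-100
    uses_sriov: bool
    needs_vmotion: bool


def review(vm):
    """Return a list of findings based on the best practices listed above."""
    findings = []
    if vm.latency_sensitivity != "high":
        findings.append("Latency Sensitivity is not set to High")
    if vm.memory_reservation_mb < vm.memory_mb:
        findings.append("memory reservation is below 100%")
    if vm.cpu_reservation_pct < 100:
        findings.append("consider reserving 100% of the CPU")
    if vm.uses_sriov and vm.needs_vmotion:
        findings.append("SR-IOV is only advised when vMotion is not needed")
    return findings


# Hypothetical VM that has the flag set but only half of its memory reserved.
candidate = VMSettings(
    latency_sensitivity="high",
    memory_mb=16384,
    memory_reservation_mb=8192,
    cpu_reservation_pct=100,
    uses_sriov=False,
    needs_vmotion=False,
)

for finding in review(candidate):
    print(finding)
```

An empty list only means these four checks passed. The PNIC, NetIOC, and power management items still need to be reviewed on the host itself.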
Have I scared you away from using this setting? Hopefully not. Hopefully I’ve just helped you understand the implications of this feature, so that you do not try to use it for applications that really, truly are not latency-sensitive. If you do have these types of VMs, use this setting sparingly and realize that it can take more than just checking this box to get the performance you are looking for.