VMworld 2016 Roundup: Day 3
vSphere 6.x Host Resource Deep Dive [INF8430]
Frank Denneman and Niels Hagoort delivered a refreshingly deep look at host resources in vSphere. When Frank was working for PernixData (quick aside: if you haven’t heard yet, Frank has recently re-joined VMware), he had access to 90k hosts’ worth of data and was able to identify some trends and patterns in it. For instance, the most common number of sockets in a host was two, and there were a lot of 512GB and 384GB memory configurations. It’s important to remember that the DIMMs themselves are connected to CPU memory channels, so each DIMM is “local” to its connected CPU. If a CPU accesses memory connected to another CPU, this is considered remote access. The latency between CPU and memory differs depending on whether the memory is local or remote to the CPU. This is the basis for NUMA, or Non-Uniform Memory Access. It’s non-uniform because of the latency difference between local and remote access.
From here on Frank went rapid-fire through a series of points, all of which seemed very relevant to me; however, I wasn’t able to absorb it all quickly enough to take comprehensive notes. I’ll need to brush up on Frank’s blog posts for that.
Where I came back in, and things started clicking again, was during Frank’s discussion of memory channels. Most enterprise CPUs these days, well, at least Intel’s, have a maximum of three DIMMs per channel. If you populate DIMM slots one and two on a channel, everything will work as you expect it to. When you begin to populate the third DIMM slot on a channel, you’ll see a reduction in bus speed, which varies from vendor to vendor. Frank has determined that, based on this, the best price/performance ratio you can achieve on a system with a pair of 12-core processors is 16 x 32GB DIMMs, for a total of 512GB. This configuration allows you to populate only the first two DIMM slots on each memory channel, thus avoiding any reduction in performance. This also allows you to purchase RDIMMs and not LRDIMMs, which can result in a potentially significant cost savings.
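The arithmetic behind that 16 x 32GB recommendation can be sketched in a few lines. Note that the four-channels-per-socket figure is my own assumption, inferred from the numbers above (16 DIMMs across two sockets at two DIMMs per channel) rather than something stated in the session, and the function name and the 24 x 16GB comparison are purely illustrative.

```python
def host_memory(sockets, channels_per_socket, dimms_per_channel, dimm_size_gb):
    """Return (dimm_count, total_gb) for a uniformly populated host."""
    dimm_count = sockets * channels_per_socket * dimms_per_channel
    return dimm_count, dimm_count * dimm_size_gb

# Assumed topology: 2 sockets, 4 memory channels per socket (inferred, not from the talk).
two_dpc = host_memory(sockets=2, channels_per_socket=4, dimms_per_channel=2, dimm_size_gb=32)

# Hypothetical alternative: filling all three slots per channel with smaller DIMMs.
three_dpc = host_memory(sockets=2, channels_per_socket=4, dimms_per_channel=3, dimm_size_gb=16)

print(two_dpc)    # (16, 512)
print(three_dpc)  # (24, 384)
```

Under those assumptions, the 512GB layout stays at two DIMMs per channel, while the hypothetical 24 x 16GB layout would populate the third slot and run into the bus speed reduction described above.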
As a best practice, you should align your vCPU configuration with the pCPU topology and core count. Use the numa.vcpu.preferHT advanced setting if need be, and make sure you use cores per socket correctly.
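One way to reason about that alignment advice is a quick check of whether a VM’s vCPU count fits inside a single physical NUMA node. This is only a sketch: the function name is hypothetical, and the simple rule used here (one NUMA node per socket, two hardware threads per core when hyperthreads are counted) is an illustrative assumption, not the ESXi scheduler’s actual logic.

```python
def fits_in_numa_node(vcpus, cores_per_socket, count_hyperthreads=False):
    """Rough check: can all of the VM's vCPUs be placed within one NUMA node?"""
    # Assumption: one NUMA node per socket, two hardware threads per core.
    capacity = cores_per_socket * (2 if count_hyperthreads else 1)
    return vcpus <= capacity

# Example host from the session: 12-core processors.
print(fits_in_numa_node(12, 12))        # True
print(fits_in_numa_node(16, 12))        # False
print(fits_in_numa_node(16, 12, True))  # True
```

The third call illustrates why a preferHT-style setting can matter: a 16 vCPU VM does not fit on 12 cores, but it could stay within one node if hyperthreads are counted toward capacity.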
The next resource deep dive was networking. Note that VXLAN adds an additional layer of packet processing. If it’s supported, VXLAN offload can speed things up by accelerating the packet processing, but make sure that you test it sufficiently first to ensure that it works with your hardware. For instance, there was a scenario previously where a VMware driver caused problems when VXLAN offload was enabled. Also be sure to use vendor-specific drivers for your network interfaces when available, as they’ll likely be more efficient than generic drivers.
If you want to take advantage of NetQueue, you need to have VMDq available and enabled on your pNIC. RSS provides similar performance improvements to NetQueue, but it follows a different approach. Note that RSS also requires hardware support. If you don’t have RSS for VXLAN enabled, you will be restricted to a single thread per pNIC. With RSS you can have multiple threads per pNIC. This has to be manually configured, as it requires a solid understanding of the hardware capabilities in use and a determination of whether it will actually provide a performance boost in your environment. The particular chipset on the pNICs can make a large difference. For example, the Intel X520/540 scales RSS on VXLAN outer header info, while the Intel X710 scales RSS on VXLAN inner and outer header info. This will impact the performance gains available. You can also configure a second transmit (Tx) thread on your vNIC, again manually.
So while your transmit (Tx) and receive (NetPoll) threads can be scaled, you’ll need to take the extra CPU cycles required to help process the additional network I/O into account to see if these potential efficiencies are valuable in your environment.
Both Frank and Niels were engaging and knowledgeable speakers. They have a forthcoming book called “vSphere 6.x: Host resources deep-dive” that will dive further into these topics. Be sure to keep an eye out for it.
vSphere Encryption Deep Dive: Technology Preview [INF8856]
Mike Foley & Salil Suri gave us the inside scoop on a feature that may, or may not, make it into a future version of vSphere. vSphere encryption is considered a technical preview, so there are no guarantees and VMware’s not ready to make any commitments one way or another. Personally though, I’d be surprised if this feature didn’t show up relatively soon. Say, within a future release of vSphere.
More positions are open on the security team than anywhere else in the whole company. Security of company assets is job #1.
– VMware Customer
Based on the above quote, it’s clear that security is a major priority for VMware customers. Maybe not to the exact same extent as the quoted customer, but still extremely important. When building out a security strategy, you have to think about defense in depth. VMware’s putting R&D efforts against secure access, secure infrastructure, and secure data. The vision is for comprehensive security, where access, infrastructure, and data are secured for both traditional apps and cloud-native apps, and end-to-end data center protection is achieved. This has to be accomplished in a way that makes security easy to manage. For example, implementing encryption “below” the guest, rather than within the guest.
vSphere encryption will be applied via storage policies. Define a storage policy that specifies that encryption is required, and once that policy is applied to a VM, CPU-level encryption will be enforced on that VM. The current approach leverages AES-NI and XTS-AES256, ensuring that modern ciphers and protocols are in use. It’s important to note that, as this is “below” the guest, the guest OS is ignorant of the encryption. It doesn’t matter which guest OS is in use, or the hardware underneath.
So how does it work? vCenter requests a tenant key from the customer’s third-party KMS solution. The tenant key is passed to the hosts and is used to generate per-VM keys for those VMs whose storage policy says they should be encrypted. The tenant key is distributed to all hosts in an HA cluster, ensuring that if there’s a host failure that a target host will be able to read and restart an encrypted VM. The VM keys are generated per-VM and are themselves encrypted and stored in the appropriate VM’s VMX file.
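The two-level key hierarchy described above can be sketched as follows. To be clear, this is a toy illustration of the idea only (a tenant key wrapping a per-VM key, with only the wrapped form persisted). The XOR-with-HMAC-keystream wrap is my own stand-in so the example runs with nothing but the standard library; it is not the cipher or key-wrap scheme vSphere uses, and all names here are hypothetical.

```python
import hashlib
import hmac
import secrets

def _keystream(tenant_key: bytes, nonce: bytes) -> bytes:
    # 32 bytes of keystream derived from the tenant key (toy construction).
    return hmac.new(tenant_key, nonce, hashlib.sha256).digest()

def wrap_vm_key(tenant_key: bytes, vm_key: bytes) -> bytes:
    """Toy wrap: XOR the 32-byte VM key with a tenant-key-derived keystream."""
    nonce = secrets.token_bytes(16)
    stream = _keystream(tenant_key, nonce)
    return nonce + bytes(a ^ b for a, b in zip(vm_key, stream))

def unwrap_vm_key(tenant_key: bytes, wrapped: bytes) -> bytes:
    """Reverse the toy wrap using the same tenant key."""
    nonce, body = wrapped[:16], wrapped[16:]
    stream = _keystream(tenant_key, nonce)
    return bytes(a ^ b for a, b in zip(body, stream))

tenant_key = secrets.token_bytes(32)  # stands in for the key fetched from the KMS
vm_key = secrets.token_bytes(32)      # stands in for the per-VM key
stored_in_vmx = wrap_vm_key(tenant_key, vm_key)  # only the wrapped form is persisted

# Any host holding the tenant key can recover the VM key, e.g. after an HA restart.
recovered = unwrap_vm_key(tenant_key, stored_in_vmx)
print(recovered == vm_key)
```

The point of the sketch is the shape of the design: the per-VM key never needs to be stored in the clear, and distributing a single tenant key to every host in the cluster is enough for any of them to restart an encrypted VM.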
Note that the granularity with which you can apply encryption to a VM from a storage policy increases significantly if you use PowerCLI versus the GUI. With PowerCLI you can set policy against just the VM home, for instance, and not the VM’s VMDKs. This shows the flexibility available with vSphere encryption, and could be useful for use cases where in-guest encryption is already in use, rendering “double-encryption” redundant.
The model behind the design of vSphere encryption assumes that a company’s security team will be responsible for the third-party KMS, and that a subset of administrators will be responsible for managing guest encryption within vSphere. The right to encrypt and decrypt can be assigned or denied within vSphere to limit which admins or sub-groups of admins are allowed to create and apply encryption-based storage policies. Due to the model in use, only the holders of the tenant key, in this case the security team, will be able to decrypt VMs away from the vSphere environment. This approach should provide a reasonable degree of security and tenant key integrity, with the security team responsible for deciding how best to protect the tenant key. While vSphere encryption is fully KMIP 1.1 compliant, there will be a number of key management servers that are explicitly tested and likely added to the HCL for this feature.
So what happens when keys expire? New encryption operations will be blocked; however, VMs encrypted with expired keys will continue to work while they remain powered on. To be on the safe side, once a key expires on the KMS you should immediately re-key. Note that because of the approach used by vSphere encryption, your third-party KMS solution is going to have to be highly available. If the KMS service crashes, vSphere hosts can no longer boot encrypted VMs; however, already running encrypted VMs will continue to run and HA will continue to work. Be sure to check with your KMS vendor to see what options and architectures are available for availability, clustering, DR, etc.
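My reading of those rules can be summarized as a small decision table. This is a sketch of how I understood the session, not documented product behavior. In particular, blocking power-on for an expired key is my inference from “while they remain powered on”, and blocking new encryption while the KMS is down is my assumption that no key can be fetched. The function and field names are hypothetical.

```python
def allowed_operations(kms_available: bool, key_expired: bool, vm_running: bool) -> dict:
    """Summarize which operations this reading of the session would allow."""
    return {
        # Running VMs keep running regardless of KMS state or key expiry.
        "keep_running": vm_running,
        # Powering on needs the key from the KMS (the expired-key rule is inferred).
        "power_on": (not vm_running) and kms_available and not key_expired,
        # New encryption operations are blocked once the key has expired.
        "new_encryption": kms_available and not key_expired,
    }

# KMS outage while the VM is already running: it stays up, but nothing new can start.
print(allowed_operations(kms_available=False, key_expired=False, vm_running=True))
# {'keep_running': True, 'power_on': False, 'new_encryption': False}
```

Laid out this way, it’s easy to see why the KMS needs to be highly available: an outage doesn’t hurt what’s already running, but it blocks anything that needs to be powered on.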
Some points of note: core dumps must now always be bundled with a password, as they will contain sensitive data related to encryption. Also, SAN-based backups are unsupported in their current approaches. If your backup solution has a proxy VM, it will have to be encrypted in order to access encrypted VMs. The backup user must also have the Cryptographer.DirectAccess permission (which will be available if/when vSphere encryption becomes available) in order to read those encrypted VMs. Note that once the backup solution picks up the data from an encrypted VM, vSphere encryption is no longer enforced. It’s up to the backup solution to encrypt the backup data as necessary. Make sure that there’s a storage policy ready to re-encrypt any restored VMs coming back out of your backup solution.
As a best practice, do not encrypt your VC, VCSA, or PSC VMs. This would create a chicken-and-egg situation: vCenter could be unavailable to participate in the vSphere encryption process, and therefore the encrypted vCenter VM itself could not be read and started.
Last, but not least, vSphere encryption will support vMotion encryption. You can choose, via policy again, to encrypt data during a vMotion. This is not the network connection being encrypted, mind you, only the data traversing the network as part of a vMotion. The vMotion encryption can be specified independently of VM encryption, so you can choose how to apply each based on your needs. Similarly to the VM encryption, a vMotion-specific key is generated. In this case, however, the key is specific to each vMotion session, and once the vMotion is complete the key is destroyed immediately. This means that post-vMotion there’s little risk of someone decrypting the encrypted vMotion traffic.
Based on the nature of how vSphere encryption is engineered, I’m hopeful that we will see this feature relatively soon. It stands to allow companies a way to easily and transparently apply encryption at rest for VMs and in flight for vMotion that should ease tensions between the security and virtualization operations teams as it will be easy and relatively painless to do the right thing.