After using NFS-based storage in my home lab for a number of years, I wanted to try out vSAN, both to play with the technology itself and to make my storage a little faster where needed for some VMs.
I decided to go all-in and put together a small all-flash configuration, with every host using a 120 GB SSD for the cache tier and a 500 GB SSD for the capacity tier in a single disk group. This went swimmingly, and the majority of my VMs now live on vSAN.
Something I noticed after implementing vSAN is that I was starting to get intermittent memory usage alerts for the hosts, usually when the host had the memory-heavy VCSA sitting on it. During my initial quick look into the alert, I was a little baffled, as vCenter was reporting less than 8 GB of VM memory usage on the host with the alert. As the hosts in my home lab all have 16 GB of RAM, there was no obvious reason that a host with what appeared to be 8 GB of free RAM should generate a memory usage alert.
Initially, I thought that there must be a physical memory issue and so I ran a number of memory testing utilities such as Memtest86+. All of these tests came back successful with no errors reported.
I then SSH’d directly into a host exhibiting the memory usage alert and used esxtop to see whether any processes or drivers had unusually large memory usage that might indicate a memory leak or something similar. Nothing appeared out of the ordinary here, so I was back to square one.
As the only major change I had recently made was vSAN, I started looking into vSAN kernel memory usage and came across VMware KB article 2113954, which details vSAN memory usage in ESXi 6.0U3 and later. The majority of VMware hosts I deal with in my professional life have 768 GB or more of RAM, so I had never really worried about the (comparatively) small amount of RAM used by the various kernel modules running on ESXi. What this KB article showed is that in a home lab environment with only 16 GB per host, I really need to be more aware of this.
The calculation from the KB article to determine memory usage is:
BaseConsumption + (NumDiskGroups * (DiskGroupBaseConsumption + (SSDMemOverheadPerGB * SSDSize))) + (NumCapacityDisks * CapacityDiskBaseConsumption)
To put the equation in real terms for my home lab, with one disk group containing a 120 GB cache SSD and one capacity disk:
5426 MB + (1 * (636 MB + (14 MB * 120))) + (1 * 70 MB)
5426 MB + 2386 MB = 7812 MB
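For anyone who wants to plug in their own disk layout, the calculation above can be sketched in a few lines of Python. The constants are simply the figures from the worked example (including the 14 MB per GB overhead used for this all-flash setup); the function and parameter names are my own and not anything from VMware.

```python
# Constants taken from the worked example above (all values in MB).
BASE_CONSUMPTION_MB = 5426
DISK_GROUP_BASE_CONSUMPTION_MB = 636
SSD_MEM_OVERHEAD_PER_GB_MB = 14  # per GB of cache SSD, as used in this all-flash example
CAPACITY_DISK_BASE_CONSUMPTION_MB = 70


def vsan_memory_mb(num_disk_groups, cache_size_gb, num_capacity_disks):
    """Estimate vSAN memory consumption in MB using the formula above.

    The function name and arguments are illustrative, not part of any VMware tool.
    """
    per_disk_group = (
        DISK_GROUP_BASE_CONSUMPTION_MB
        + SSD_MEM_OVERHEAD_PER_GB_MB * cache_size_gb
    )
    return (
        BASE_CONSUMPTION_MB
        + num_disk_groups * per_disk_group
        + num_capacity_disks * CAPACITY_DISK_BASE_CONSUMPTION_MB
    )


if __name__ == "__main__":
    # One disk group, 120 GB cache SSD, one capacity disk.
    print(vsan_memory_mb(num_disk_groups=1, cache_size_gb=120, num_capacity_disks=1))  # 7812
```

Treat the result as an estimate; the KB article remains the reference for the constants that apply to a given ESXi version and configuration.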
As you can see, vSAN is taking up roughly 7.6 GB, which is almost half of the host’s 16 GB of RAM! I had now found why the host memory usage alerts were being generated: combining the vSAN memory usage with a bunch of VMs meant the hosts were sometimes hitting the default vCenter warning and alert levels of 90% and 95% respectively.
I also came across a wonderful vSAN Memory Consumption Calculator, created by Marco van Baggum after he ran into a similar issue, which helps take the hassle out of the above calculations.
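To see why less than 8 GB of VM memory was enough to trip the alarm, here is a rough sketch of the headroom left on a 16 GB host once the 7812 MB vSAN figure is taken out. It is a simplification: it treats the full 16 GB as usable and ignores the rest of the hypervisor's own overhead, so the real headroom will be somewhat lower. The function name is hypothetical.

```python
HOST_RAM_MB = 16 * 1024  # 16 GB host, assuming all of it is usable
VSAN_MB = 7812           # vSAN estimate from the calculation above


def headroom_mb(total_mb, reserved_mb, threshold_pct):
    """Memory left for everything else before usage reaches threshold_pct of total_mb."""
    return total_mb * threshold_pct / 100 - reserved_mb


if __name__ == "__main__":
    for label, pct in (("warning", 90), ("alert", 95)):
        left = headroom_mb(HOST_RAM_MB, VSAN_MB, pct)
        print(f"{label} at {pct}%: about {left:.0f} MB left for VMs and everything else")
```

With these assumptions the warning level is reached with about 6934 MB of other usage and the alert level with about 7753 MB, which lines up with vCenter showing just under 8 GB of VM memory on the alerting host.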