Best Practices for Oversubscription of CPU, Memory and Storage in vSphere Virtual Environments
How far can oversubscription be taken safely?
Written by Scott D. Lowe
Abstract
One of the benefits of virtualization is that it enables
administrators to efficiently share host resources among
different applications. In fact, administrators often
oversubscribe the physical resources on a host in order to
maximize the number of workloads that can run on a host. But
how much oversubscription is too much? This paper discusses
what oversubscription is and why it is used, explores the pros
and cons of the practice, and proposes some ideas about the
point at which oversubscription becomes dangerous.
Introduction
Virtualization enables data centers to focus on the needs of
business applications, but it can make resource allocation
more challenging.
One of the great features of virtualization is the ability to run
many disparate workloads on a single host server, thereby
maximizing the utilization of that host server. In doing so,
organizations have been able to reinvent the modern data
center. Whereas data centers of ten years ago tended to be
server-centric places, modern data centers revolve around the
needs of line-of-business applications, including ensuring that
those applications remain highly available and able to survive
the loss of host servers.
Virtualization has changed the data center dynamic in many
other ways as well. While workloads used to be confined to
the hardware on which they were originally installed, in a
modern data center, workloads are fluid; they flow from host
to host based on sets of administrator-defined rules as well
as in reaction to changes in the host environment. The fluidic
nature of the modern data center has added new challenges to
resource allocation, but over the years, both free and paid tools
have been introduced to assist administrators in their resource
planning efforts.
Sizing resources in a
virtualized environment
Virtualization makes hardware sizing
more complex—but offers opportunities
for improving data center efficiency.
The rise of virtualization has also enabled
the use of hardware in ways that were
never envisioned even just ten years ago.
In those days, administrators purchased
servers sized to support the peak needs
of a single application, and that sizing
included a projection of the resources
the application would likely need over
the life of the server hardware. Because
many servers were deployed with just
a single application, resource planning
was relatively simple. In modern data
center environments, which are heavily
virtualized, resource planning takes on
new complexity: because a wide array
of I/O patterns will be present on single
pieces of hardware, administrators need
insight into how individual applications
interact with the rest of the environment.
This blending of I/O in a heavily
virtualized environment has also created
a significant opportunity for efficiency in
the data center. Whereas administrators
used to size individual servers based
on the needs of a single application,
the mixed nature of I/O in a virtual
environment enables sharing of resources
with different peak needs. As a result,
there is opportunity for administrators
to very efficiently share host resources
among different applications.
Overprovisioning is useful—but how
much is too much?
Even with virtualization, administrators
still sometimes overprovision resources
and size individual virtual machines
(VMs) to meet peak demands, so there
are often resources that go unused in
a virtual machine. vSphere provides a
number of powerful methods for sharing
idle resources with other running
workloads. In addition, administrators
can oversubscribe the physical resources
on a host in order to maximize the
number of workloads that can run on
a host. In other words, they can assign
to virtual machines, in aggregate, more
resources than are actually available on
the host. For example, suppose a host
has 96 GB of physical RAM. Under the
right circumstances, an administrator
might assign 128 GB of RAM to all of the
virtual machines running on that host.
But just how far can this oversubscription
be taken? The limits depend on a
number of factors. This paper discusses
oversubscription in general, explores
its pros and cons, and proposes
some ideas about the point at which
oversubscription becomes dangerous.
Resource management and
oversubscription
What is oversubscription?
Oversubscription in vSphere refers
to various methods by which more
resources than are available on the
physical host can be assigned to the
virtual servers that are supported by
that host. In general, administrators
have the ability to oversubscribe
processing, memory and storage
resources in virtual machines.
Refusing to oversubscribe resources is the
safest choice, but often wastes resources.
Different administrators have
different opinions on the wisdom of
oversubscribing physical resources.
Many administrators prefer to assign
only those resources that are physically
available to support all of the running
workloads. This is the safest option as
it ensures that, in general, all running
virtual machines will always have the
resources they need.
However, in the days of physical servers, it was not uncommon to find that physical
servers rarely made use of all of their
resources. From a processor standpoint,
utilization averaged only 5–15 percent,
meaning that there was a whole lot of
room for growth.
Oversubscription maximizes the value
of resources—but introduces risk of the
host not having enough resources to
service all its VMs.
While virtual machines are generally
more right-sized than their physical
counterparts were in the past, there is
still room to grow built in, especially
when particular workloads are idle.
Many administrators see this as an
opportunity to make use of those idle
resources in order to maximize virtual
machine density on a host. However,
with oversubscription, administrators are
basically assigning to virtual machines
more resources than are actually
available on the host. In other words,
if all of the virtual machines suddenly
requested access to all of their allocated
resources, the host would not have
enough resources to service the needs.
Resource oversubscription, while it does
increase virtual machine density, carries
with it some risks. Once an oversubscribed resource is exhausted, stability issues can occur and major performance problems can be introduced, affecting all of the workloads running on the host server.
Before we discuss oversubscription further, it’s important to understand how vSphere manages the three basic types of resources:
• Processing resources
• Memory resources
• Storage resources
How vSphere manages
processing resources
How physical resources are represented
on a vSphere host
In vSphere, administrators assign CPUs
to virtual machines in order to support
the workload needs of each individual
virtual machine. These virtual processing
resources are pulled from the host’s
available physical CPUs. The number of physical CPUs present in a host depends on a couple of factors.
In vSphere, a physical CPU (pCPU)
refers to:
• When hyperthreading is not present or not enabled: a single physical CPU core
• When hyperthreading is present and enabled: a single logical CPU core (hardware thread)
Here are two examples:
• If a host has two eight-core processors and hyperthreading is either not supported or not enabled, that host has sixteen physical CPUs (8 cores x 2 processors).
• If a host has two eight-core processors and hyperthreading is enabled, that host has thirty-two physical CPUs (8 cores x 2 processors x 2 threads per core).
How those resources are presented to
virtual machines
In a virtual machine, processors are
referred to as virtual CPUs (vCPUs).
When an administrator adds vCPUs to a
virtual machine, each of those vCPUs is
assigned to a pCPU, although the actual
pCPU may not always be the same.
There must be enough pCPUs available
to support the number of vCPUs
assigned to an individual virtual machine
or that virtual machine will not boot.
However, that doesn’t mean that
administrators are limited to just the
number of pCPUs in the host. On the
contrary, there is no 1:1 ratio between
the number of vCPUs that can be
assigned to virtual machines and the
number of physical CPUs in the host.
In fact, as of vSphere 5.0, there is a
maximum of 25 vCPUs per physical core,
and administrators can allocate up to
2,048 vCPUs to virtual machines on a
single host.
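To make those configuration maximums concrete, the hypothetical helper below checks a planned aggregate vCPU allocation against the two vSphere 5.0 ceilings cited above; the function is a sketch for illustration, not a VMware API:

def vcpu_headroom(physical_cores, vcpus_allocated):
    # vSphere 5.0 configuration maximums cited in the text:
    # 25 vCPUs per physical core and 2,048 vCPUs per host.
    limit = min(physical_cores * 25, 2048)
    return limit - vcpus_allocated  # negative means over the ceiling

# A 16-core host with 120 vCPUs allocated across its VMs.
print(vcpu_headroom(16, 120))  # 280 vCPUs of headroom remain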
How vSphere manages
memory resources
vSphere uses a number of techniques
to maximize the use of RAM in a
virtual environment:
• Transparent page sharing (TPS)—In most
virtual environments, administrators run
many copies of the same operating system.
In these cases, there is a lot of duplication
of memory pages in host memory.
Transparent page sharing is basically a
form of memory deduplication—vSphere
combines multiple identical memory pages
into just one and frees the remaining
pages up for other uses. TPS has an almost
imperceptible impact on the performance
of the host.
• Memory ballooning—When the VMware
Tools are installed inside a guest virtual
machine, a memory balloon driver is
installed as well. This driver acts as an ordinary process inside the guest OS, which can use its normal memory management techniques to assign idle or unused memory pages to the driver. The balloon driver then “pins”
those pages and reports this back to the
hypervisor. If the host becomes low on
physical memory, guest memory pages are
assigned to the balloon driver, and the host
can then reclaim these memory pages in
order to address the needs of other virtual
machines that may need the RAM.
In this way, when a particular virtual
machine has RAM to spare, it can
transparently share that RAM with other
virtual machines on the same host,
enabling the host to achieve yet higher
levels of VM density. Whereas TPS is a
memory deduplication technique, the
ballooning process brings to RAM a sort of
thin provisioning capability. The ballooning
process does require some processing
overhead, which is usually imperceptible
in the performance of the guest and host.
However, in extreme cases, ballooning can
cause swapping inside the OS.
• Memory compression—Introduced in vSphere 4.1, memory compression can, in some cases, replace the costly swapping process. With this technique, rather than memory pages being swapped to disk on a per-VM basis, the pages are compressed and placed into a compression cache in main memory. When a page that would otherwise have been swapped out is needed again, it is retrieved from the cache and decompressed. While this process is far less costly than swapping to disk, it still carries something of a performance hit.
• Swapping to disk—Swapping to disk
is the hypervisor’s last-ditch effort to
retrieve enough physical RAM to satisfy
the needs of workloads running on a
host. Swapping is a process by which the
hypervisor moves the least used memory
pages to disk. Those memory pages are
still accessible, but when they are required,
they must be retrieved from disk. Swapping
will noticeably degrade the overall
performance of the host.
Because vSphere’s other memory
management techniques are so good,
swapping usually takes place only on
seriously overcommitted hosts, although
swapping can also be caused by resource
pool constraints or memory limits
configured on a virtual machine. In addition,
if a VM does not have VMware Tools
installed or VMware Tools is not running,
the ballooning process is skipped entirely and the host goes straight to swapping.
It should be noted that neither swapping nor compression takes place unless
there is a memory contention issue on
the host, or in the situations discussed
with regard to swapping. In most
environments, memory contention
issues that result in swapping or
compression should be avoided since
this situation means that the host has
basically run out of RAM.
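To tie this back to the introduction’s 96 GB/128 GB example, the small sketch below (names and structure are illustrative assumptions) expresses aggregate VM memory allocation as a percentage of physical RAM; anything over 100 percent is the overcommitted portion that TPS, ballooning, compression and, in the worst case, swapping must cover under contention:

def memory_overcommit_pct(physical_gb, allocated_gb):
    # Over 100% means the host is oversubscribed; the excess is what
    # vSphere's memory management techniques must absorb if all VMs
    # demand their full allocation at once.
    return allocated_gb / physical_gb * 100

# The introduction's example: 128 GB assigned on a 96 GB host.
print(f"{memory_overcommit_pct(96, 128):.0f}%")  # 133%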
How vSphere manages
storage resources
Storage is the third piece of the
resource puzzle in a vSphere
environment. Storage resources can
also be oversubscribed through what
have become very common resource
allocation techniques.
Thin provisioning is the most common
technique for overprovisioning
storage resources.
The most common technique for
overprovisioning storage is a process
known as thin provisioning. In many
cases, when an administrator allocates
storage to a virtual machine, more
storage than is absolutely necessary
is allocated. After all, it’s reasonable
to expect that the virtual machine will
continue to need additional disk space
as time goes on.
Thin provisioning operates as follows:
When an administrator provisions the
total disk space for the virtual machine,
the virtual machine is told that it has
access to the entirety of the allocated
space. In reality, however, vSphere gives
the virtual machine only the space that it
is actually consuming. For instance, if an
administrator allocates 200 GB to a new
virtual machine, but that virtual machine
is using only 40 GB, the remaining
160 GB remain available for allocation
to other virtual machines. As a virtual
machine requires more space, vSphere
provides additional chunks to that virtual
machine, up to the size of the disk that
was originally allocated.
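The thin provisioning arithmetic can be sketched as follows. This is an illustrative model only; the function and its inputs are assumptions, not vSphere objects. It uses the 200 GB/40 GB example from the text:

def datastore_summary(capacity_gb, vms):
    # vms maps a VM name to (provisioned_gb, consumed_gb). With thin
    # provisioning, total provisioned space may exceed capacity, but
    # total consumed space must never reach it.
    provisioned = sum(p for p, _ in vms.values())
    consumed = sum(c for _, c in vms.values())
    return {
        "provisioned_gb": provisioned,
        "consumed_gb": consumed,
        "free_gb": capacity_gb - consumed,
        "pct_of_provisioned": round(consumed / provisioned * 100),
    }

# The text's example VM (200 GB allocated, 40 GB used) plus a second.
print(datastore_summary(300, {"vm1": (200, 40), "vm2": (200, 60)}))
# {'provisioned_gb': 400, 'consumed_gb': 100, 'free_gb': 200,
#  'pct_of_provisioned': 25}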
By using thin provisioning, administrators
can create virtual machines with
virtual disks of a size that is necessary
in the long term, without having to
immediately commit the total disk
space that is necessary to support that
allocation. In many tests, it has been
shown that thin provisioning carries with
it only a very slight—almost negligible—
performance impact. Accordingly, thin
provisioning has become a common,
acceptable and often recommended
method for managing storage capacity.
Some storage devices have additional
features, such as data compression and
deduplication, that enable additional
levels of oversubscription. For the
purposes of this paper, however, the
focus is on the hypervisor, so only thin
provisioning will be discussed.
Getting insight into resource usage
in your environment
Now that we have seen how resources
are managed in a vSphere environment,
let’s move on to oversubscribing
those resources. In order to determine
whether resources are overcommitted,
the administrator needs a monitoring
tool such as the free vOPS™ Server
Explorer tool from Dell®. One of its
several utilities is Environment Explorer,
which provides administrators with a
high-level view of resource usage in
the environment. As shown in Figure 1,
Environment Explorer shows resource
utilization as a percentage of actual
physical resources, making the utility
a perfect fit for exploring the topic of
resource over-commitment.
Now let’s explore some guidelines for
oversubscribing each of the three types
of resources: processing, memory
and storage.
Figure 1. Environment Explorer (part of the free vOPS Server
Explorer tool) provides a high-level view of resource usage.
Oversubscribing processing
resources
The common wisdom
As mentioned earlier, in vSphere 5, every
physical processor core can support
up to 25 vCPUs. However, for every
additional workload beyond a 1:1 vCPU
to pCPU ratio, the vSphere hypervisor
needs to invoke processor scheduling
in order to distribute processor time
to virtual machines that need it. For
example, if an administrator has created
a vCPU to pCPU ratio of 5:1, then each
processor is supporting five vCPUs.
Experts in the field disagree about the
proper rules of thumb when it comes
to vCPU to pCPU ratio. However, there
are two items on which just about
everyone agrees:
• Start with one vCPU per virtual machine. Most experts agree that administrators should create new virtual machines with just one vCPU and add vCPUs only as needs dictate. Whenever a virtual machine needs to perform an operation, it must wait for a number of physical CPUs equal to its number of assigned vCPUs to become available. So, as administrators add more vCPUs to a virtual machine, there is an increased risk of poorer overall performance.
• The vCPU to pCPU ratio is workload
dependent. While 1:1 vCPU to pCPU
assignment is sometimes advocated, other
ratios are common. Although vSphere
5 supports a ratio of up to 25:1, your
ability to achieve a high ratio will depend
on the kinds of workloads you need to
support. If the host is supporting lots of
virtual machines, each with only meager
processing needs, the vCPU to pCPU ratio
could be quite high. If, however, the host
is running a number of processor intensive
workloads, the ratio may be much smaller.
Metrics to watch
The following metrics will help you
maintain a vCPU to pCPU ratio that
makes efficient use of resources while
still allowing workloads to run well:
• Inside virtual machines:
• CPU Utilization—When average CPU
usage remains high, it’s time to add an
additional vCPU to the VM.
• On the host:
• CPU Ready—CPU Ready is, by far, the most important gauge of overall host health with regard to CPU.
CPU Ready indicates the length of time
that a VM is waiting for enough physical
processors to become available in order
to meet its demands. For example, if a
VM is allocated four vCPUs, this metric
will show the length of time that the VM
waited for four corresponding pCPUs to
become available at the same time.
• CPU Utilization—The overall CPU
usage on the host server is also critical
because it enables an administrator to
understand just how much work the
host server is doing.
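vCenter’s real-time charts expose CPU Ready as a summation value in milliseconds per 20-second sample, which is awkward to compare against percentage-based guidance. The conversion below reflects VMware’s published formula as I understand it; verify the sampling interval against your vSphere version:

def cpu_ready_percent(summation_ms, interval_seconds=20):
    # Real-time charts sample every 20 seconds; the summation value is
    # the milliseconds of ready time accumulated in that sample.
    return summation_ms / (interval_seconds * 1000) * 100

# 1,000 ms of ready time in one 20-second sample equals 5 percent,
# the best-practice ceiling discussed below.
print(cpu_ready_percent(1000))  # 5.0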
Real-world observations and advice
Figure 2. A lab environment with a vCPU to pCPU ratio of 2:1 (virtual resources: 4 vCPUs = 200% of actual cores; 3 GB memory = 37% of physical; 26.6 GB storage = 13% of provisioned)
Online forums are filled with questions
from users requesting insight into
acceptable vCPU to pCPU ratios in a
real-world environment. While some
responses continue to advocate for a
1:1 ratio, from a pure density standpoint,
1:1 should be considered a worst-case
scenario. Figure 2 shows a lab with a
ratio of 2:1.
Some respondents indicate that they
have received guidance that suggests
no more than a 1.5:1 vCPU to pCPU
ratio, but guidance from industry experts
suggests that vSphere real-world
numbers are in the 10:1 to 15:1 range.
Still others indicate that VMware itself has
a recommended ratio range of 6:1 to 8:1.
The Dell white paper, “Demystifying
CPU Ready (% RDY) as a Performance
Metric,” establishes the following
vCPU:pCPU guidelines:
• 1:1 to 3:1 is no problem.
• 3:1 to 5:1 may begin to cause
performance degradation.
• 6:1 or greater is often going to cause
a problem.
In addition, keeping the CPU Ready
metric at 5 percent or below is
considered a best practice.
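Those bands are easy to encode as a quick triage check. The sketch below is an illustrative helper, not part of any Dell or VMware tool; note that the paper leaves the 5:1 to 6:1 range unaddressed, so the sketch treats it as part of the middle band:

def assess_vcpu_ratio(vcpus, pcpus):
    # Bands from "Demystifying CPU Ready (% RDY) as a Performance
    # Metric": <=3:1 fine, 3:1-5:1 possible degradation, >=6:1 trouble.
    ratio = vcpus / pcpus
    if ratio <= 3:
        return ratio, "no problem"
    if ratio < 6:
        return ratio, "may begin to cause performance degradation"
    return ratio, "often going to cause a problem"

print(assess_vcpu_ratio(vcpus=64, pcpus=16))
# (4.0, 'may begin to cause performance degradation')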
The actual achievable ratio in a specific
environment will depend on a number
of factors:
• vSphere version—The vSphere CPU
scheduler is always being improved. The
newer the version of vSphere, the more
consolidation that should be possible.
• Processor age—Newer processors are
much more robust than older ones, so
organizations with newer processors
should be able to achieve higher
processor ratios.
• Workload type—Different kinds of
workloads on the host will result in different
optimal ratios.
vScope Explorer, another utility included
in vOPS Server Explorer, can help
you investigate performance metrics,
including CPU Ready, at both the host
and virtual machine levels, so you can
determine whether your vCPU to pCPU
ratio is too high.
In addition, Environment Explorer
identifies where host processor
resources are overcommitted, so you’ll know where to perform additional analysis to determine whether that over-commitment
is causing performance issues. As the
“% of actual cores” metric begins to
surpass 500 percent, carefully monitor
CPU Ready and general workload
performance to ensure that business
needs are being met.
Oversubscribing memory resources
The common wisdom
Oversubscribing RAM is one of the more
controversial resource oversubscription
options. Whereas CPU and storage
resources are often overcommitted,
there seems to be some conservatism
when it comes to overcommitting RAM.
Metrics to watch
On a host server, administrators need to
watch the amount of RAM actually in use
by virtual machines. As the actual RAM in
use approaches 100 percent, either add
additional RAM to the server or migrate
workloads to hosts that have more
available RAM.
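A minimal watchdog for that rule of thumb might look like the following sketch. The 90 percent trigger is an illustrative assumption, since the text only advises acting as actual usage approaches 100 percent:

def check_host_memory(used_gb, physical_gb, threshold_pct=90):
    # Threshold is an assumption for illustration; tune it to give
    # yourself time to react before the host runs out of RAM.
    pct = used_gb / physical_gb * 100
    if pct >= threshold_pct:
        return f"{pct:.0f}% in use: add RAM or migrate workloads away"
    return f"{pct:.0f}% in use: OK"

print(check_host_memory(88, 96))  # 92% in use: add RAM or migrate ...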
Real-world observations and advice
In order to maximize VM density and
ensure that the environment remains
operational, it’s important to monitor
the actual memory utilization. Note that Environment Explorer displays RAM usage (the “% of physical” metric) using the amount of RAM provisioned to each virtual machine, rather than the amount of RAM actually being consumed once all of vSphere’s various memory management techniques are taken into consideration.
The level of over-commitment possible
depends on one primary factor: how
much memory deduplication can take
place by virtue of the fact that there are
many similar workloads running on the
host. The greater the level of disparity
between running workloads, the less
memory consolidation that can take place
and the less density that can be enjoyed.
Here are some observations about what
others are doing and recommending with
regard to memory over-commitment:
• Many administrators refuse to
oversubscribe RAM at all.
• Some administrators prefer to not exceed
125 percent of physical memory, feeling
that going beyond that limit carries
unacceptable risk.
• If every workload on the server is identical,
much higher over-commitment levels
are possible.
• Many other administrators simply spot-check host memory usage but don’t regularly scan for over-commitment levels.
Oversubscribing storage resources
The common wisdom
It has become commonplace to
oversubscribe storage resources using
thin provisioning. This technique offers
many benefits; the primary one is
maximizing the use of the organization’s
storage capacity. Thin provisioning also helps IT in two ways. First, it enables
administrators to give a virtual machine
all of the storage it will ever need
without having to constantly watch
to see if it needs more space. Second,
thin provisioning can reduce conflicts
between IT and other teams: application
owners can request all of the storage
they like and storage administrators—
knowing full well that the request is
too high—can simply grant the request
without worrying about wasting that
over-requested storage.
However, thin provisioning also carries
with it some challenges. While it can
make life easier on a daily basis, it
does add some complexity, and if
administrators aren’t careful, they can
introduce major availability issues. If the
storage oversubscription results in the
storage volume running out of space,
the VMs will still think they have available
disk space to use but there won’t be any
space available. This can cause serious outages that can result in data loss
and costly recovery. So if you use thin
provisioning, be sure to monitor carefully.
Metrics to watch
To mitigate the risks associated with
thin provisioning, you need to keep a
close eye on the amount of free space
in a datastore. As a datastore gets low
on space, proactively add space to the
datastore or use Storage vMotion to
move one of the virtual machines to
a different datastore that has enough
available capacity to serve the needs of
the workloads.
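That guidance reduces to a simple low-water-mark check, sketched below. The 15 percent threshold is an illustrative assumption; pick a value that gives you time to react in your environment:

def check_datastore_space(free_gb, capacity_gb, low_water_pct=15):
    # Running a thin-provisioned datastore completely out of space can
    # take down every VM on it, so act well before free space hits zero.
    free_pct = free_gb / capacity_gb * 100
    if free_pct <= low_water_pct:
        return (f"{free_pct:.0f}% free: grow the datastore or "
                f"Storage vMotion a VM to another datastore")
    return f"{free_pct:.0f}% free: OK"

print(check_datastore_space(free_gb=60, capacity_gb=500))
# 12% free: grow the datastore or Storage vMotion a VM ...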
Real-world observations and advice
Thin provisioning is well represented
in Environment Explorer, although it’s
displayed only in aggregate. In the
example in Figure 2, 26.6 GB of storage
is currently in use—13 percent of what’s
actually provisioned to the three virtual
machines that are powered on. As the
“% of provisioned” metric approaches
100 percent, take care to ensure that
additional physical resources are
made available.
Conclusion
Overprovisioning of processing,
memory and storage can help you
maximize resource utilization in your
virtual environment. But you want to
overprovision in such a way that you can
also maintain high performance. Using
the techniques and real-world advice
presented in this white paper, along with
the right tools, you can balance these
needs and overprovision wisely.
For More Information
If you have any questions regarding your potential use of this material, contact:

Dell Software
5 Polaris Way
Aliso Viejo, CA 92656
www.dell.com

Refer to our Web site for regional and international office information.

About Dell
Dell Inc. (NASDAQ: DELL) listens to customers and delivers worldwide innovative technology, business solutions and services they trust and value. For more information, visit www.dell.com.

© 2013 Dell, Inc. ALL RIGHTS RESERVED. This document contains proprietary information protected by copyright. No part of this document may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying and recording for any purpose without the written permission of Dell, Inc. (“Dell”).

Dell, Dell Software, the Dell Software logo and products—as identified in this document—are registered trademarks of Dell, Inc. in the U.S.A. and/or other countries. All other trademarks and registered trademarks are property of their respective owners.

The information in this document is provided in connection with Dell products. No license, express or implied, by estoppel or otherwise, to any intellectual property right is granted by this document or in connection with the sale of Dell products. EXCEPT AS SET FORTH IN DELL’S TERMS AND CONDITIONS AS SPECIFIED IN THE LICENSE AGREEMENT FOR THIS PRODUCT, DELL ASSUMES NO LIABILITY WHATSOEVER AND DISCLAIMS ANY EXPRESS, IMPLIED OR STATUTORY WARRANTY RELATING TO ITS PRODUCTS INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTY OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, OR NON-INFRINGEMENT. IN NO EVENT SHALL DELL BE LIABLE FOR ANY DIRECT, INDIRECT, CONSEQUENTIAL, PUNITIVE, SPECIAL OR INCIDENTAL DAMAGES (INCLUDING, WITHOUT LIMITATION, DAMAGES FOR LOSS OF PROFITS, BUSINESS INTERRUPTION OR LOSS OF INFORMATION) ARISING OUT OF THE USE OR INABILITY TO USE THIS DOCUMENT, EVEN IF DELL HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. Dell makes no representations or warranties with respect to the accuracy or completeness of the contents of this document and reserves the right to make changes to specifications and product descriptions at any time without notice. Dell does not make any commitment to update the information contained in this document.

Whitepaper-vSphere-Environ-US-KS-2013-03-28