Resource Management Guide

Red Hat Enterprise Linux 7 Resource Management Guide
Managing system resources on Red Hat Enterprise Linux 7

Milan Navrátil, Red Hat Customer Content Services (mnavrati@redhat.com)
Eva Majoršinová, Red Hat Customer Content Services
Peter Ondrejka, Red Hat Customer Content Services
Douglas Silas, Red Hat Customer Content Services
Martin Prpič, Red Hat Product Security
Rüdiger Landmann, Red Hat Customer Content Services
Legal Notice
Copyright © 2016 Red Hat, Inc.
This document is licensed by Red Hat under the Creative Commons Attribution-ShareAlike 3.0 Unported License. If you distribute this document, or a modified version of it, you must provide attribution to Red Hat, Inc. and provide a link to the original. If the document is modified, all Red Hat trademarks must be removed.
Red Hat, as the licensor of this document, waives the right to enforce, and agrees not to assert, Section 4d of CC-BY-SA to the fullest extent permitted by applicable law.
Red Hat, Red Hat Enterprise Linux, the Shadowman logo, JBoss, OpenShift, Fedora, the Infinity logo, and RHCE are trademarks of Red Hat, Inc., registered in the United States and other countries.
Linux ® is the registered trademark of Linus Torvalds in the United States and other countries.
Java ® is a registered trademark of Oracle and/or its affiliates.
XFS ® is a trademark of Silicon Graphics International Corp. or its subsidiaries in the United States and/or other countries.
MySQL ® is a registered trademark of MySQL AB in the United States, the European Union and other countries.
Node.js ® is an official trademark of Joyent. Red Hat Software Collections is not formally related to or endorsed by the official Joyent Node.js open source or commercial project.
The OpenStack ® Word Mark and OpenStack logo are either registered trademarks/service marks or trademarks/service marks of the OpenStack Foundation, in the United States and other countries and are used with the OpenStack Foundation's permission. We are not affiliated with, endorsed or sponsored by the OpenStack Foundation, or the OpenStack community.
All other trademarks are the property of their respective owners.

Abstract
Managing system resources on Red Hat Enterprise Linux 7.
Table of Contents

Chapter 1. Introduction to Control Groups (Cgroups)
  1.1. What are Control Groups
  1.2. Default Cgroup Hierarchies
  1.3. Resource Controllers in Linux Kernel
  1.4. Additional Resources
Chapter 2. Using Control Groups
  2.1. Creating Control Groups
  2.2. Removing Control Groups
  2.3. Modifying Control Groups
  2.4. Obtaining Information about Control Groups
  2.5. Additional Resources
Chapter 3. Using libcgroup Tools
  3.1. Mounting a Hierarchy
  3.2. Unmounting a Hierarchy
  3.3. Creating Control Groups
  3.4. Removing Control Groups
  3.5. Setting Cgroup Parameters
  3.6. Moving a Process to a Control Group
  3.7. Starting a Process in a Control Group
  3.8. Obtaining Information about Control Groups
  3.9. Additional Resources
Chapter 4. Control Group Application Examples
  4.1. Prioritizing Database I/O
  4.2. Prioritizing Network Traffic
Appendix A. Revision History
Chapter 1. Introduction to Control Groups (Cgroups)
1.1. What are Control Groups
Control groups, abbreviated as cgroups in this guide, are a Linux kernel feature that allows you to
allocate resources — such as CPU time, system memory, network bandwidth, or combinations of
these resources — among hierarchically ordered groups of processes running on a system. By using
cgroups, system administrators gain fine-grained control over allocating, prioritizing, denying,
managing, and monitoring system resources. Hardware resources can be smartly divided up among
applications and users, increasing overall efficiency.
Control Groups provide a way to hierarchically group and label processes, and to apply resource
limits to them. Traditionally, all processes received similar amounts of system resources that the
administrator could modulate with the process niceness value. With this approach, applications that
involved a large number of processes received more resources than applications with few processes,
regardless of the relative importance of these applications.
Red Hat Enterprise Linux 7 moves the resource management settings from the process level to the
application level by binding the system of cgroup hierarchies with the systemd unit tree. Therefore,
you can manage system resources with systemctl commands, or by modifying systemd unit files.
See Chapter 2, Using Control Groups for details.
In previous versions of Red Hat Enterprise Linux, system administrators built custom cgroup hierarchies with the cgconfig command from the libcgroup package. This package is now deprecated, and using it is not recommended since it can easily create conflicts with the default cgroup hierarchy. However, libcgroup is still available to cover certain specific cases where systemd is not yet applicable, most notably for using the net-prio subsystem. See Chapter 3, Using libcgroup Tools.
The aforementioned tools provide a high-level interface to interact with cgroup controllers (also known as subsystems) in the Linux kernel. The main cgroup controllers for resource management are cpu, memory, and blkio; see Available Controllers in Red Hat Enterprise Linux 7 for the list of controllers enabled by default. For a detailed description of resource controllers and their configurable parameters, refer to Controller-Specific Kernel Documentation.
1.2. Default Cgroup Hierarchies
By default, systemd automatically creates a hierarchy of slice, scope and service units to provide a unified structure for the cgroup tree. With the systemctl command, you can further modify this structure by creating custom slices, as shown in Section 2.1, “Creating Control Groups”. Also, systemd automatically mounts hierarchies for important kernel resource controllers (see Available Controllers in Red Hat Enterprise Linux 7) in the /sys/fs/cgroup/ directory.
Warning
The deprecated cgconfig tool from the libcgroup package is available to mount and handle hierarchies for controllers not yet supported by systemd (most notably the net-prio controller). Never use libcgroup tools to modify the default hierarchies mounted by systemd, since that would lead to unexpected behavior. The libcgroup library will be removed in future versions of Red Hat Enterprise Linux. For more information on how to use cgconfig, see Chapter 3, Using libcgroup Tools.
Systemd Unit Types
All processes running on the system are child processes of the systemd init process. Systemd provides three unit types that are used for the purpose of resource control (for a complete list of systemd's unit types, see the chapter called Managing Services with systemd in the Red Hat Enterprise Linux 7 System Administrator's Guide):
Service — A process or a group of processes that systemd started based on a unit configuration file. Services encapsulate the specified processes so that they can be started and stopped as one set. Services are named in the following way:
name.service
Where name stands for the name of the service.
Scope — A group of externally created processes. Scopes encapsulate processes that are started and stopped by arbitrary processes via the fork() function and then registered by systemd at runtime. For instance, user sessions, containers, and virtual machines are treated as scopes. Scopes are named as follows:
name.scope
Here, name stands for the name of the scope.
Slice — A group of hierarchically organized units. Slices do not contain processes; they organize a hierarchy in which scopes and services are placed. The actual processes are contained in scopes or in services. In this hierarchical tree, every name of a slice unit corresponds to the path to a location in the hierarchy. The dash ("-") character acts as a separator of the path components. For example, if the name of a slice looks as follows:
parent-name.slice
it means that the slice called parent-name.slice is a subslice of parent.slice. This slice can have its own subslice named parent-name-name2.slice, and so on.
There is one root slice denoted as:
-.slice
Service, scope, and slice units directly map to objects in the cgroup tree. When these units are activated, they map directly to cgroup paths built from the unit names. For example, the ex.service residing in test-waldo.slice is mapped to the cgroup test.slice/test-waldo.slice/ex.service/.
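The naming-to-path rule above can be sketched as a short script. The mapping rule (dash-separated slice names expand to nested .slice path components) is as documented; the helper function names are illustrative, not a systemd API:

```python
def slice_path(slice_name):
    """Expand a slice unit name into its cgroup path, e.g.
    "parent-name.slice" -> "parent.slice/parent-name.slice"."""
    if slice_name == "-.slice":  # the root slice maps to the tree root
        return ""
    stem = slice_name[: -len(".slice")]
    parts = stem.split("-")
    # Each dash-separated prefix becomes one nested .slice component.
    return "/".join("-".join(parts[: i + 1]) + ".slice" for i in range(len(parts)))

def unit_cgroup_path(unit_name, slice_name):
    """Path of a service/scope unit placed inside the given slice."""
    base = slice_path(slice_name)
    return (base + "/" if base else "") + unit_name

print(unit_cgroup_path("ex.service", "test-waldo.slice"))
# test.slice/test-waldo.slice/ex.service
```

This reproduces the example from the text: ex.service in test-waldo.slice lands under test.slice/test-waldo.slice/.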
Services, scopes, and slices are created manually by the system administrator or dynamically by
programs. By default, the operating system defines a number of built-in services that are necessary to
run the system. Also, there are four slices created by default:
-.slice — the root slice;
system.slice — the default place for all system services;
user.slice — the default place for all user sessions;
machine.slice — the default place for all virtual machines and Linux containers.
Note that all user sessions are automatically placed in a separate scope unit, as are virtual machine and container processes. Furthermore, all users are assigned an implicit subslice.
Besides the above default configuration, the system administrator can define new slices and assign
services and scopes to them.
The following tree is a simplified example of a cgroup tree. This output was generated with the systemd-cgls command described in Section 2.4, “Obtaining Information about Control Groups”:
├─1 /usr/lib/systemd/systemd --switched-root --system --deserialize 20
├─user.slice
│ └─user-1000.slice
│   └─session-1.scope
│     ├─11459 gdm-session-worker [pam/gdm-password]
│     ├─11471 gnome-session --session gnome-classic
│     ├─11479 dbus-launch --sh-syntax --exit-with-session
│     ├─11480 /bin/dbus-daemon --fork --print-pid 4 --print-address 6 --session
│     ...
└─system.slice
  ├─systemd-journald.service
  │ └─422 /usr/lib/systemd/systemd-journald
  ├─bluetooth.service
  │ └─11691 /usr/sbin/bluetoothd -n
  ├─systemd-localed.service
  │ └─5328 /usr/lib/systemd/systemd-localed
  ├─colord.service
  │ └─5001 /usr/libexec/colord
  ├─sshd.service
  │ └─1191 /usr/sbin/sshd -D
  ...
As you can see, services and scopes contain processes and are placed in slices that do not contain processes of their own. The only exception is PID 1, which is located in the special systemd.slice. Also note that -.slice is not shown, as it is implicitly identified with the root of the entire tree.
Service and slice units can be configured with persistent unit files as described in Section 2.3.2, “Modifying Unit Files”, or created dynamically at runtime via API calls to PID 1 (see Section 1.4, “Online Documentation” for the API reference). Scope units can be created only dynamically at runtime. Units created dynamically with API calls are transient and exist only during runtime. Transient units are released automatically as soon as they finish, get deactivated, or the system is rebooted.
1.3. Resource Controllers in Linux Kernel
A resource controller, also called a cgroup subsystem, represents a single resource, such as CPU time or memory. The Linux kernel provides a range of resource controllers that are mounted automatically by systemd. Find the list of currently mounted resource controllers in /proc/cgroups, or use the lssubsys monitoring tool. In Red Hat Enterprise Linux 7, systemd mounts the following controllers by default:
Available Controllers in Red Hat Enterprise Linux 7
blkio — sets limits on input/output access to and from block devices;
cpu — uses the CPU scheduler to provide cgroup tasks access to the CPU. It is mounted together with the cpuacct controller on the same mount;
cpuacct — creates automatic reports on CPU resources used by tasks in a cgroup. It is mounted together with the cpu controller on the same mount;
cpuset — assigns individual CPUs (on a multicore system) and memory nodes to tasks in a cgroup;
devices — allows or denies access to devices for tasks in a cgroup;
freezer — suspends or resumes tasks in a cgroup;
memory — sets limits on memory use by tasks in a cgroup and generates automatic reports on memory resources used by those tasks;
net_cls — tags network packets with a class identifier (classid) that allows the Linux traffic controller (the tc command) to identify packets originating from a particular cgroup task. A subsystem of net_cls, net_filter (iptables), can also use this tag to perform actions on such packets. net_filter tags network sockets with a firewall identifier (fwid) that allows the Linux firewall (the iptables command) to identify packets (skb->sk) originating from a particular cgroup task;
perf_event — enables monitoring cgroups with the perf tool;
hugetlb — allows the use of large virtual memory pages and the enforcement of resource limits on these pages.
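The /proc/cgroups file mentioned above uses a simple four-column layout (subsystem name, hierarchy ID, number of cgroups, enabled flag). A minimal sketch of reading that layout, using sample data rather than a live system (the sample rows are illustrative):

```python
# Sample in the standard /proc/cgroups format; on a real system,
# read the contents of "/proc/cgroups" instead.
SAMPLE = """\
#subsys_name\thierarchy\tnum_cgroups\tenabled
cpuset\t1\t1\t1
cpu\t2\t1\t1
net_cls\t5\t1\t0
"""

def enabled_controllers(text):
    """Return the names of controllers whose 'enabled' column is 1."""
    controllers = []
    for line in text.splitlines():
        if line.startswith("#") or not line.strip():
            continue  # skip the header line and blanks
        name, _hierarchy, _num_cgroups, enabled = line.split()
        if enabled == "1":
            controllers.append(name)
    return controllers

print(enabled_controllers(SAMPLE))
# ['cpuset', 'cpu']
```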
The Linux kernel exposes a wide range of tunable parameters for resource controllers that can be configured with systemd. See the kernel documentation (list of references in the Controller-Specific Kernel Documentation section) for a detailed description of these parameters.
1.4. Additional Resources
To find more information about resource control under systemd, the unit hierarchy, as well as the kernel resource controllers, refer to the materials listed below:
Installed Documentation
Cgroup-Related Systemd Documentation
The following manual pages contain general information on the unified cgroup hierarchy under systemd:
systemd.resource-control(5) — describes the configuration options for resource control shared by system units.
systemd.unit(5) — describes common options of all unit configuration files.
systemd.slice(5) — provides general information about .slice units.
systemd.scope(5) — provides general information about .scope units.
systemd.service(5) — provides general information about .service units.
Controller-Specific Kernel Documentation
The kernel-doc package provides detailed documentation of all resource controllers. This package is included in the Optional subscription channel. Before subscribing to the Optional channel, please see the Scope of Coverage Details for Optional software, then follow the steps documented in the article called How to access Optional and Supplementary channels, and -devel packages using Red Hat Subscription Manager (RHSM)? on the Red Hat Customer Portal. To install kernel-doc from the Optional channel, type as root:
yum install kernel-doc
After the installation, the following files will appear under the /usr/share/doc/kernel-doc-<kernel_version>/Documentation/cgroups/ directory:
blkio subsystem — blkio-controller.txt
cpuacct subsystem — cpuacct.txt
cpuset subsystem — cpusets.txt
devices subsystem — devices.txt
freezer subsystem — freezer-subsystem.txt
memory subsystem — memory.txt
net_cls subsystem — net_cls.txt
Additionally, refer to the following files for further information about the cpu subsystem:
Real-Time scheduling — /usr/share/doc/kernel-doc-<kernel_version>/Documentation/scheduler/sched-rt-group.txt
CFS scheduling — /usr/share/doc/kernel-doc-<kernel_version>/Documentation/scheduler/sched-bwc.txt
Online Documentation
Red Hat Enterprise Linux 7 System Administrator's Guide — The System Administrator's Guide documents relevant information regarding the deployment, configuration, and administration of Red Hat Enterprise Linux 7. This guide contains a detailed explanation of the systemd concepts as well as instructions for service management with systemd.
The D-Bus API of systemd — The reference material for D-Bus API commands used to interact with systemd.
Chapter 2. Using Control Groups
The following sections provide an overview of tasks related to the creation and management of control groups. This guide focuses on the utilities provided by systemd, which are the preferred way of managing cgroups and will be supported in the future. Previous versions of Red Hat Enterprise Linux used the libcgroup package for creating and managing cgroups. This package is still available to assure backward compatibility (see Warning), but it will not be supported in future versions of Red Hat Enterprise Linux.
2.1. Creating Control Groups
From systemd's perspective, a cgroup is bound to a system unit configurable with a unit file and manageable with systemd's command-line utilities. Depending on the type of application, your resource management settings can be transient or persistent.
To create a transient cgroup for a service, start the service with the systemd-run command. This way, it is possible to set limits on resources consumed by the service during its runtime. Applications can create transient cgroups dynamically by using API calls to systemd. See Section 2.5, “Online Documentation” for the API reference. A transient unit is removed automatically as soon as the service is stopped.
To assign a persistent cgroup to a service, edit its unit configuration file. The configuration is preserved after a system reboot, so it can be used to manage services that are started automatically. Note that scope units cannot be created in this way.
2.1.1. Creating Transient Cgroups with systemd-run
The systemd-run command is used to create and start a transient service or scope unit and run a custom command in the unit. Commands executed in service units are started asynchronously in the background, where they are invoked from the systemd process. Commands run in scope units are started directly from the systemd-run process and thus inherit the execution environment of the caller. Execution in this case is synchronous.
To run a command in a specified cgroup, type as root:
systemd-run --unit=name --scope --slice=slice_name command
The name stands for the name you want the unit to be known under. If --unit is not specified, a unit name will be generated automatically. It is recommended to choose a descriptive name, since it will represent the unit in the systemctl output. The name has to be unique during the runtime of the unit.
Use the optional --scope parameter to create a transient scope unit instead of the service unit that is created by default.
With the --slice option, you can make your newly created service or scope unit a member of a specified slice. Replace slice_name with the name of an existing slice (as shown in the output of systemctl -t slice), or create a new slice by passing a unique name. By default, services and scopes are created as members of system.slice.
Replace command with the command you wish to execute in the service unit. Place this command at the very end of the systemd-run syntax, so that its parameters are not mistaken for parameters of systemd-run.
Besides the above options, there are several other parameters available for systemd-run. For example, --description creates a description of the unit, and --remain-after-exit allows you to collect runtime information after the service's process has terminated. The --machine option executes the command in a confined container. See the systemd-run(1) manual page to learn more.
Example 2.1. Starting a New Service with systemd-run
Use the following command to run the top utility in a service unit in a new slice called test. Type as root:
~]# systemd-run --unit=toptest --slice=test top -b
The following message is displayed to confirm that you started the service successfully:
Running as unit toptest.service
Now, the name toptest.service can be used to monitor or modify the cgroup with systemctl commands.
2.1.2. Creating Persistent Cgroups
To configure a unit to be started automatically on system boot, execute the systemctl enable command (see the chapter called Managing Services with systemd in the Red Hat Enterprise Linux 7 System Administrator's Guide). Running this command automatically creates a unit file in the /usr/lib/systemd/system/ directory. To make persistent changes to the cgroup, add or modify configuration parameters in its unit file. For more information, see Section 2.3.2, “Modifying Unit Files”.
2.2. Removing Control Groups
Transient cgroups are released automatically as soon as the processes they contain finish. By passing the --remain-after-exit option to systemd-run, you can keep the unit running after its processes have finished in order to collect runtime information. To stop the unit gracefully, type:
systemctl stop name.service
Replace name with the name of the service you wish to stop. To terminate one or more of the unit's processes, type as root:
systemctl kill name.service --kill-who=PID,... --signal=signal
Replace name with the name of the unit, for example httpd.service. Use --kill-who to select which processes from the cgroup you wish to terminate. To kill multiple processes at the same time, pass a comma-separated list of PIDs. Replace signal with the type of POSIX signal you wish to send to the specified processes. The default is SIGTERM. For more information, see the systemd.kill manual page.
Persistent cgroups are released when the unit is disabled and its configuration file is deleted by running:
systemctl disable name.service
where name stands for the name of the service to be disabled.
2.3. Modifying Control Groups
Each persistent unit supervised by systemd has a unit configuration file in the /usr/lib/systemd/system/ directory. To change the parameters of a service unit, modify this configuration file. This can be done either manually or from the command-line interface by using the systemctl set-property command.
2.3.1. Setting Parameters from the Command-Line Interface
The systemctl set-property command allows you to persistently change resource control settings during application runtime. To do so, use the following syntax as root:
systemctl set-property name parameter=value
Replace name with the name of the systemd unit you wish to modify, parameter with the name of the parameter to be changed, and value with the new value you want to assign to this parameter.
Not all unit parameters can be changed at runtime, but most of those related to resource control can; see Section 2.3.2, “Modifying Unit Files” for a complete list. Note that systemctl set-property allows you to change multiple properties at once, which is preferable over setting them individually.
The changes are applied instantly and written into the unit file so that they are preserved after reboot. You can change this behavior by passing the --runtime option, which makes your settings transient:
systemctl set-property --runtime name property=value
Example 2.2. Using systemctl set-property
To limit the CPU and memory usage of httpd.service from the command line, type:
~]# systemctl set-property httpd.service CPUShares=600 MemoryLimit=500M
To make this a temporary change, add the --runtime option:
~]# systemctl set-property --runtime httpd.service CPUShares=600 MemoryLimit=500M
2.3.2. Modifying Unit Files
Systemd service unit files provide a number of high-level configuration parameters useful for resource management. These parameters communicate with the Linux cgroup controllers, which have to be enabled in the kernel. With these parameters, you can manage CPU and memory consumption, block IO, as well as some more fine-grained unit properties.
Managing CPU
The cpu controller is enabled by default in the kernel, and consequently every system service receives
the same amount of CPU time, regardless of how many processes it contains. This default behavior can be changed with the DefaultControllers parameter in the /etc/systemd/system.conf configuration file. To manage CPU allocation, use the following directive in the [Service] section of the unit configuration file:
CPUShares=value
Replace value with a number of CPU shares. The default value is 1024. By increasing the number, you assign more CPU time to the unit. This parameter implies that CPUAccounting is turned on in the unit file.
The CPUShares parameter controls the cpu.shares control group parameter. See the description of the cpu controller in Controller-Specific Kernel Documentation for other CPU-related control parameters.
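Because cpu.shares is a relative weight, the CPU time a unit actually receives under contention depends on its siblings' shares, not on the absolute number. A minimal sketch of that arithmetic (the helper is hypothetical, not a systemd API):

```python
def cpu_fraction(shares, sibling_shares):
    """Fraction of CPU time a cgroup gets when it and all of its
    sibling cgroups are fully busy: its shares divided by the total."""
    return shares / (shares + sum(sibling_shares))

# A unit with the default 1024 shares competing with one 1500-share unit
# gets roughly 40% of the CPU; the 1500-share unit gets the remaining 60%.
print(round(cpu_fraction(1024, [1500]), 3))
# 0.406
```

Note this is only the behavior under contention; an otherwise idle CPU is not withheld from a low-share unit.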
Example 2.3. Limiting CPU Consumption of a Unit
To assign the Apache service 1500 CPU shares instead of the default 1024, modify the CPUShares setting in the /usr/lib/systemd/system/httpd.service unit file:
[Service]
CPUShares=1500
To apply the changes, reload systemd's configuration and restart Apache so that the modified service file is taken into account:
~]# systemctl daemon-reload
~]# systemctl restart httpd.service
Managing Memory
To enforce limits on the unit's memory consumption, use the following directive in the [Service] section of the unit configuration file:
MemoryLimit=value
Replace value with a limit on the maximum memory usage of the processes executed in the cgroup. Use the K, M, G, or T suffixes to identify kilobytes, megabytes, gigabytes, or terabytes as the unit of measurement. Also, the MemoryAccounting parameter has to be enabled for the unit.
The MemoryLimit parameter controls the memory.limit_in_bytes control group parameter. For more information, see the description of the memory controller in Controller-Specific Kernel Documentation.
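The K, M, G, and T suffixes are interpreted to the base of 1024, so a value such as 500M is 500 × 1024², not 500 × 10⁶. A sketch of the conversion (the helper is hypothetical, for illustration only):

```python
def parse_size(value):
    """Convert a MemoryLimit-style value such as "500M" or "1G" to bytes.
    Suffixes are base-1024 multiples; a bare number is taken as bytes."""
    multipliers = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}
    if value and value[-1] in multipliers:
        return int(value[:-1]) * multipliers[value[-1]]
    return int(value)

print(parse_size("1G"))    # 1073741824
print(parse_size("500M"))  # 524288000
```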
Example 2.4. Limiting Memory Consumption of a Unit
To assign a 1GB memory limit to the Apache service, modify the MemoryLimit setting in the /usr/lib/systemd/system/httpd.service unit file:
[Service]
MemoryLimit=1G
To apply the changes, reload systemd's configuration and restart Apache so that the modified service file is taken into account:
~]# systemctl daemon-reload
~]# systemctl restart httpd.service
Managing Block IO
To manage block IO, use the following directives in the [Service] section of the unit configuration file. The directives listed below assume that the BlockIOAccounting parameter is enabled:
BlockIOWeight=value
Replace value with a new overall block IO weight for the executed processes. Choose a single value between 10 and 1000; the default setting is 1000.
BlockIODeviceWeight=device_name value
Replace value with a block IO weight for the device specified with device_name. Replace device_name either with a name or with a path to a device. As with BlockIOWeight, it is possible to set a single weight value between 10 and 1000.
BlockIOReadBandwidth=device_name value
This directive allows you to limit a specific bandwidth for a unit. Replace device_name with the name of a device or with a path to a block device node; value stands for the bandwidth rate. Use the K, M, G, or T suffixes to specify units of measurement. A value with no suffix is interpreted as bytes per second.
BlockIOWriteBandwidth=device_name value
Limits the write bandwidth for the specified device. Accepts the same arguments as BlockIOReadBandwidth.
Each of the aforementioned directives controls a corresponding cgroup parameter. See the description of the blkio controller in Controller-Specific Kernel Documentation.
Note
Currently, the blkio resource controller does not support buffered write operations. It is primarily targeted at direct I/O, so services that use buffered writes will ignore the limits set with BlockIOWriteBandwidth. On the other hand, buffered read operations are supported, and BlockIOReadBandwidth limits are applied correctly to both direct and buffered reads.
Example 2.5. Limiting Block IO of a Unit
To lower the block IO weight for the Apache service accessing the /home/jdoe/ directory, add the following text into the /usr/lib/systemd/system/httpd.service unit file:
[Service]
BlockIODeviceWeight=/home/jdoe 750
To set the maximum bandwidth for Apache reading from the /var/log/ directory to 5MB per second, use the following syntax:
[Service]
BlockIOReadBandwidth=/var/log 5M
To apply your changes, reload systemd's configuration and restart Apache so that the modified service file is taken into account:
~]# systemctl daemon-reload
~]# systemctl restart httpd.service
Managing Other System Resources
There are several other directives that can be used in the unit file to facilitate resource management:
DeviceAllow=device_name options
This option controls access to specific device nodes. Here, device_name stands for a path to a device node or a device group name as specified in /proc/devices. Replace options with a combination of r, w, and m to allow the unit to read, write, or create device nodes.
DevicePolicy=value
Here, value is one of: strict (only allows the types of access explicitly specified with DeviceAllow), closed (allows access to standard pseudo devices including /dev/null, /dev/zero, /dev/full, /dev/random, and /dev/urandom), or auto (allows access to all devices if no explicit DeviceAllow is present, which is the default behavior).
Slice=slice_name
Replace slice_name with the name of the slice to place the unit in. The default is system.slice. Scope units cannot be arranged in this way, since they are tied to their parent slices.
ExecStartPost=command
Currently, systemd supports only a subset of cgroup features. However, as a workaround, you can use the ExecStartPost= option along with setting the memory.memsw.limit_in_bytes parameter in order to prevent any swap usage for a service. For more information on ExecStartPost=, see the systemd.service(5) man page.
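Combining the DeviceAllow and DevicePolicy directives above, a hypothetical unit-file fragment that locks a service down to the standard pseudo devices plus read-only access to one disk might look like this (the device path is an assumption chosen for illustration):

```ini
[Service]
# Only devices listed in DeviceAllow, plus the standard pseudo devices
# (/dev/null, /dev/zero, and so on), are reachable by the unit.
DevicePolicy=closed
# Grant read-only (r) access to one block device; illustrative path.
DeviceAllow=/dev/sda r
```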
Example 2.6. Configuring Cgroup Options
Imagine that you wish to change the memory.memsw.limit_in_bytes setting to the same value as the unit's MemoryLimit= in order to prevent any swap usage for a given example service.
ExecStartPost=/bin/bash -c "echo 1G > /sys/fs/cgroup/memory/system.slice/example.service/memory.memsw.limit_in_bytes"
To apply the change, reload systemd configuration and restart the service so that the modified
setting is taken into account:
~]# systemctl daemon-reload
~]# systemctl restart example.service
2.4. Obtaining Information about Control Groups
Use the systemctl command to list system units and to view their status. Also, the systemd-cgls
command is provided to view the hierarchy of control groups and systemd-cgtop to monitor their
resource consumption in real time.
2.4.1. Listing Units
Use the following command to list all active units on the system:
systemctl list-units
The list-units option is executed by default, which means that you will receive the same output
when you omit this option and execute just:
systemctl
UNIT                LOAD   ACTIVE SUB     DESCRIPTION
abrt-ccpp.service   loaded active exited  Install ABRT coredump hook
abrt-oops.service   loaded active running ABRT kernel log watcher
abrt-vmcore.service loaded active exited  Harvest vmcores for ABRT
abrt-xorg.service   loaded active running ABRT Xorg log watcher
...
The output displayed above contains five columns:
UNIT — the name of the unit that also reflects the unit's position in the cgroup tree. As mentioned
in Section 1.2, “Systemd Unit Types”, three unit types are relevant for resource control: slice,
scope, and service. For a complete list of systemd's unit types, see the chapter called Managing
Services with systemd in the Red Hat Enterprise Linux 7 System Administrator's Guide.
LOAD — indicates whether the unit configuration file was properly loaded. If the unit file failed to
load, the field contains the state error instead of loaded. Other unit load states are: stub, merged,
and masked.
ACTIVE — the high-level unit activation state, which is a generalization of SUB.
SUB — the low-level unit activation state. The range of possible values depends on the unit type.
DESCRIPTION — the description of the unit's content and functionality.
By default, systemctl lists only active units (in terms of the high-level activation state in the ACTIVE
field). Use the --all option to see inactive units too. To limit the amount of information in the output
list, use the --type (-t) parameter that requires a comma-separated list of unit types such as service
and slice, or unit load states such as loaded and masked.
Example 2.7. Using systemctl list-units
To view a list of all slices used on the system, type:
~]$ systemctl -t slice
To list all active masked services, type:
~]$ systemctl -t service,masked
To list all unit files installed on your system and their status, type:
systemctl list-unit-files
2.4.2. Viewing the Control Group Hierarchy
The aforementioned listing commands do not go beyond the unit level to show the actual processes
running in cgroups. Also, the output of systemctl does not show the hierarchy of units. You can
achieve both by using the systemd-cgls command that groups the running processes according to
cgroups. To display the whole cgroup hierarchy on your system, type:
systemd-cgls
When systemd-cgls is issued without parameters, it returns the entire cgroup hierarchy. The
highest level of the cgroup tree is formed by slices and can look as follows:
├─system
│ ├─1 /usr/lib/systemd/systemd --switched-root --system --deserialize 20
│ ...
│
├─user
│ ├─user-1000
│ │ └─ ...
│ ├─user-2000
│ │ └─ ...
│ ...
│
└─machine
├─machine-1000
│ └─ ...
...
Note that the machine slice is present only if you are running a virtual machine or a container. For
more information on the cgroup tree, see Section 1.2, “Systemd Unit Types”.
To reduce the output of systemd-cgls, and to view a specified part of the hierarchy, execute:
systemd-cgls name
Replace name with a name of the resource controller you want to inspect.
As an alternative, use the systemctl status command to display detailed information about a
system unit. A cgroup subtree is a part of the output of this command.
systemctl status name
To learn more about systemctl status, see the chapter called Managing Services with systemd in
the Red Hat Enterprise Linux 7 System Administrator's Guide.
Example 2.8. Viewing the Control Group Hierarchy
To see a cgroup tree of the memory resource controller, execute:
~]$ systemd-cgls memory
memory:
├─    1 /usr/lib/systemd/systemd --switched-root --system --deserialize 23
├─  475 /usr/lib/systemd/systemd-journald
...
The output of the above command lists the services that interact with the selected controller. A
different approach is to view a part of the cgroup tree for a certain service, slice, or scope unit:
~]# systemctl status httpd.service
httpd.service - The Apache HTTP Server
   Loaded: loaded (/usr/lib/systemd/system/httpd.service; enabled)
   Active: active (running) since Sun 2014-03-23 08:01:14 MDT; 33min ago
  Process: 3385 ExecReload=/usr/sbin/httpd $OPTIONS -k graceful (code=exited, status=0/SUCCESS)
 Main PID: 1205 (httpd)
   Status: "Total requests: 0; Current requests/sec: 0; Current traffic: 0 B/sec"
   CGroup: /system.slice/httpd.service
           ├─1205 /usr/sbin/httpd -DFOREGROUND
           ├─3387 /usr/sbin/httpd -DFOREGROUND
           ├─3388 /usr/sbin/httpd -DFOREGROUND
           ├─3389 /usr/sbin/httpd -DFOREGROUND
           ├─3390 /usr/sbin/httpd -DFOREGROUND
           └─3391 /usr/sbin/httpd -DFOREGROUND
...
Besides the aforementioned tools, systemd also provides the machinectl command dedicated to
monitoring Linux containers.
2.4.3. Viewing Resource Controllers
The aforementioned systemctl commands enable monitoring the higher-level unit hierarchy, but do
not show which resource controllers in the Linux kernel are actually used by which processes. This
information is stored in dedicated process files; to view it, type as root:
cat /proc/PID/cgroup
Where PID stands for the ID of the process you wish to examine. By default, the list is the same for all
units started by systemd, since it automatically mounts all default controllers. See the following
example:
~]# cat /proc/27/cgroup
10:hugetlb:/
9:perf_event:/
8:blkio:/
7:net_cls:/
6:freezer:/
5:devices:/
4:memory:/
3:cpuacct,cpu:/
2:cpuset:/
1:name=systemd:/
By examining this file, you can determine if the process has been placed in the desired cgroups as
defined by the systemd unit file specifications.
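That check can be scripted with a small helper (hypothetical, not part of any package) that pulls the cgroup path for one controller out of such a file; it is demonstrated on a saved sample rather than a live /proc entry:

```shell
# cgroup_path FILE CONTROLLER — print the cgroup path recorded for a
# controller in a /proc/PID/cgroup-style file. Each cgroup-v1 line has
# the form "hierarchy-ID:controller-list:cgroup-path".
cgroup_path() {
  awk -F: -v c="$2" '$2 ~ ("(^|,)" c "(,|$)") { print $3 }' "$1"
}

# Demonstration on a saved sample (a live check would read /proc/PID/cgroup):
sample=$(mktemp)
printf '4:memory:/system.slice/httpd.service\n3:cpuacct,cpu:/\n' > "$sample"
cgroup_path "$sample" memory   # -> /system.slice/httpd.service
cgroup_path "$sample" cpu      # -> /
```

The regular expression anchors the controller name between commas, so asking for cpu does not accidentally match the cpuacct entry on the same line.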
2.4.4. Monitoring Resource Consumption
The systemd-cgls command provides a static snapshot of the cgroup hierarchy. To see a
dynamic account of currently running cgroups ordered by their resource usage (CPU, memory, and
I/O), use:
systemd-cgtop
The behavior, provided statistics, and control options of systemd-cgtop are akin to those of the
top utility. See the systemd-cgtop(1) manual page for more information.
2.5. Additional Resources
For more information on how to use systemd and related tools to manage system resources on Red
Hat Enterprise Linux, refer to the sources listed below:
Installed Documentation
Man Pages of Cgroup-Related Systemd Tools
systemd-run(1) — The manual page lists all command-line options of the systemd-run utility.
systemctl(1) — The manual page of the systemctl utility that lists available options and
commands.
systemd-cgls(1) — This manual page lists all command-line options of the systemd-cgls
utility.
systemd-cgtop(1) — The manual page contains the list of all command-line options of the
systemd-cgtop utility.
machinectl(1) — This manual page lists all command-line options of the machinectl utility.
systemd.kill(5) — This manual page provides an overview of kill configuration options for
system units.
Controller-Specific Kernel Documentation
The kernel-doc package provides detailed documentation of all resource controllers. This package is
included in the Optional subscription channel. Before subscribing to the Optional channel, please
see the Scope of Coverage Details, then follow the steps documented in the article called
How to access Optional and Supplementary channels, and -devel packages using Red Hat
Subscription Manager (RHSM)? on the Red Hat Customer Portal. To install kernel-doc from the
Optional channel, type as root:
yum install kernel-doc
After the installation, the following files will appear under the
/usr/share/doc/kernel-doc-<kernel_version>/Documentation/cgroups/ directory:
blkio subsystem — blkio-controller.txt
cpuacct subsystem — cpuacct.txt
cpuset subsystem — cpusets.txt
devices subsystem — devices.txt
freezer subsystem — freezer-subsystem.txt
memory subsystem — memory.txt
net_cls subsystem — net_cls.txt
Additionally, refer to the following files for further information about the cpu subsystem:
Real-Time scheduling — /usr/share/doc/kernel-doc-<kernel_version>/Documentation/scheduler/sched-rt-group.txt
CFS scheduling — /usr/share/doc/kernel-doc-<kernel_version>/Documentation/scheduler/sched-bwc.txt
Online Documentation
Red Hat Enterprise Linux 7 System Administrator's Guide — The System Administrator's Guide
documents relevant information regarding the deployment, configuration and administration of
Red Hat Enterprise Linux 7. It is oriented towards system administrators with a basic
understanding of the system.
The D-Bus API of systemd — The reference for D-Bus API commands for accessing systemd.
Chapter 3. Using libcgroup Tools
The libcgroup package, which was the main tool for cgroup management in previous versions of Red
Hat Enterprise Linux, is now deprecated. To avoid conflicts, do not use libcgroup tools for the default
resource controllers (listed in Available Controllers in Red Hat Enterprise Linux 7) that are now an
exclusive domain of systemd. This leaves limited space for applying the libcgroup tools; use them
only when you need to manage controllers not currently supported by systemd, such as net_prio.
The following sections describe how to use libcgroup tools in relevant scenarios without conflicting
with the default hierarchy.
Note
In order to use libcgroup tools, first ensure the libcgroup and libcgroup-tools packages are
installed on your system. To install them, run as root:
~]# yum install libcgroup
~]# yum install libcgroup-tools
Note
The net_prio controller is not compiled into the kernel like the rest of the controllers; rather, it is
a module that has to be loaded before attempting to mount it. To load this module, type as
root:
modprobe netprio_cgroup
3.1. Mounting a Hierarchy
To use a kernel resource controller that is not mounted automatically, you have to create a hierarchy
that will contain this controller. Add or detach the hierarchy by editing the mount section of the
/etc/cgconfig.conf configuration file. This method makes the controller attachment persistent,
which means your settings will be preserved after system reboot. As an alternative, use the mount
command to create a transient mount only for the current session.
Using the cgconfig Service
The cgconfig service installed with the libcgroup-tools package provides a way to mount hierarchies
for additional resource controllers. By default, this service is not started automatically. When you
start cgconfig, it applies the settings from the /etc/cgconfig.conf configuration file. The
configuration is therefore recreated from session to session and becomes persistent. Note that if you
stop cgconfig, it unmounts all the hierarchies that it mounted.
The default /etc/cgconfig.conf file installed with the libcgroup package does not contain any
configuration settings, only information that systemd mounts the main resource controllers
automatically.
Entries of three types can be created in /etc/cgconfig.conf — mount, group, and template. Mount
entries are used to create and mount hierarchies as virtual file systems, and attach controllers to
those hierarchies. In Red Hat Enterprise Linux 7, default hierarchies are mounted automatically to the
/sys/fs/cgroup/ directory; cgconfig is therefore used solely to attach non-default controllers.
Mount entries are defined using the following syntax:
mount {
    controller_name = /sys/fs/cgroup/controller_name;
    …
}
Replace controller_name with a name of the kernel resource controller you wish to mount to the
hierarchy. See Example 3.1, “Creating a mount entry” for an example.
Example 3.1. Creating a mount entry
To attach the net_prio controller to the default cgroup tree, add the following text to the
/etc/cgconfig.conf configuration file:
mount {
    net_prio = /sys/fs/cgroup/net_prio;
}
Then restart the cgconfig service to apply the setting:
systemctl restart cgconfig.service
Group entries in /etc/cgconfig.conf can be used to set the parameters of resource controllers.
See Section 3.5, “Setting Cgroup Parameters” for more information about group entries.
Template entries in /etc/cgconfig.conf can be used to create a group definition applied to all
processes.
Using the mount Command
Use the mount command to temporarily mount a hierarchy. To do so, first create a mount point in the
/sys/fs/cgroup/ directory where systemd mounts the main resource controllers. Type as root:
mkdir /sys/fs/cgroup/name
Replace name with a name of the new mount destination; usually, the name of the controller is used.
Next, execute the mount command to mount the hierarchy and simultaneously attach one or more
subsystems. Type as root:
mount -t cgroup -o controller_name none /sys/fs/cgroup/controller_name
Replace controller_name with a name of the controller to specify both the device to be mounted as
well as the destination folder. The -t cgroup parameter specifies the type of mount.
Example 3.2. Using the mount command to attach controllers
To mount a hierarchy for the net_prio controller with use of the mount command, first create the
mount point:
~]# mkdir /sys/fs/cgroup/net_prio
Then mount net_prio to the destination you created in the previous step:
~]# mount -t cgroup -o net_prio none /sys/fs/cgroup/net_prio
You can verify whether you attached the hierarchy correctly by listing all available hierarchies
along with their current mount points using the lssubsys command (see Section 3.8, “Listing
Controllers”):
~]# lssubsys -am
cpuset /sys/fs/cgroup/cpuset
cpu,cpuacct /sys/fs/cgroup/cpu,cpuacct
memory /sys/fs/cgroup/memory
devices /sys/fs/cgroup/devices
freezer /sys/fs/cgroup/freezer
net_cls /sys/fs/cgroup/net_cls
blkio /sys/fs/cgroup/blkio
perf_event /sys/fs/cgroup/perf_event
hugetlb /sys/fs/cgroup/hugetlb
net_prio /sys/fs/cgroup/net_prio
3.2. Unmounting a Hierarchy
If you mounted a hierarchy by editing the /etc/cgconfig.conf configuration file, you can
unmount it simply by removing the configuration directive from the mount section of this configuration
file. Then restart the service to apply the new configuration.
Similarly, you can unmount a hierarchy by executing the following command as root:
~]# umount /sys/fs/cgroup/controller_name
Replace controller_name with the name of the hierarchy that contains the resource controller you wish
to detach.
Warning
Make sure that you use umount to remove only hierarchies that you mounted yourself
manually. Detaching a hierarchy that contains a default controller (listed in Available
Controllers in Red Hat Enterprise Linux 7) will most probably lead to complications requiring a
system reboot.
3.3. Creating Control Groups
Use the cgcreate command to create transient cgroups in hierarchies you created yourself. The
syntax for cgcreate is:
cgcreate -t uid:gid -a uid:gid -g controllers:path
where:
-t (optional) — specifies a user (by user ID, uid) and a group (by group ID, gid) to own the
tasks pseudo-file for this cgroup. This user can add tasks to the cgroup.
Note
Note that the only way to remove a process from a cgroup is to move it to a different cgroup.
To be able to move a process, the user has to have write access to the destination cgroup;
write access to the source cgroup is not necessary.
-a (optional) — specifies a user (by user ID, uid) and a group (by group ID, gid) to own all
pseudo-files other than tasks for this cgroup. This user can modify the access to system
resources for tasks in this cgroup.
-g — specifies the hierarchy in which the cgroup should be created, as a comma-separated list of
the controllers associated with hierarchies. The list of controllers is followed by a colon and the
path to the child group relative to the hierarchy. Do not include the hierarchy mount point in the
path.
Because all cgroups in the same hierarchy have the same controllers, the child group has the same
controllers as its parent.
As an alternative, you can create a child of the cgroup directly. To do so, use the mkdir command:
~]# mkdir /sys/fs/cgroup/controller/name/child_name
For example:
~]# mkdir /sys/fs/cgroup/net_prio/lab1/group1
3.4. Removing Control Groups
Remove cgroups with the cgdelete command that has syntax similar to that of cgcreate. Run the
following command as root:
cgdelete controllers:path
where:
controllers is a comma-separated list of controllers.
path is the path to the cgroup relative to the root of the hierarchy.
For example:
~]# cgdelete net_prio:/test-subgroup
cgdelete can also recursively remove all subgroups when the -r option is specified.
Note that when you delete a cgroup, all its processes move to its parent group.
3.5. Setting Cgroup Parameters
Modify the parameters of the control groups by editing the /etc/cgconfig.conf configuration
file, or by using the cgset command. Changes made to /etc/cgconfig.conf are preserved after
reboot, while cgset changes the cgroup parameters only for the current session.
Modifying /etc/cgconfig.conf
You can set the controller parameters in the Groups section of /etc/cgconfig.conf. Group
entries are defined using the following syntax:
group name {
[permissions]
controller {
param_name = param_value;
…
}
…
}
Replace name with the name of your cgroup; controller stands for the name of the controller you wish
to modify. You should modify only controllers you mounted yourself, not any of the default controllers
mounted automatically by systemd. Replace param_name and param_value with the controller
parameter you wish to change and its new value. Note that the permissions (perm) section is optional.
To define permissions for a group entry, use the following syntax:
perm {
task {
uid = task_user;
gid = task_group;
}
admin {
uid = admin_name;
gid = admin_group;
}
}
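A complete group entry combining both parts might look as follows. The group name, users, and parameter value here are hypothetical, and net_prio is assumed to be a manually mounted controller, per the guidance above:

```
group daemons/www {
    perm {
        task {
            uid = root;
            gid = webmaster;
        }
        admin {
            uid = root;
            gid = root;
        }
    }
    net_prio {
        net_prio.ifpriomap = "eth0 5";
    }
}
```

With this entry, users in the webmaster group may add tasks to the cgroup, while only root may change its resource parameters.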
Note
Restart the cgconfig service for the changes in the /etc/cgconfig.conf to take effect.
Restarting this service rebuilds hierarchies specified in the configuration file but does not
affect all mounted hierarchies. You can restart a service by executing the systemctl
restart command; however, it is recommended to first stop the cgconfig service:
~]# systemctl stop cgconfig
Then open and edit the configuration file. After saving your changes, you can start cgconfig
again with the following command:
~]# systemctl start cgconfig
Using the cgset Command
Set controller parameters by running the cgset command from a user account with permission to
modify the relevant cgroup. Use this only for controllers you mounted manually.
The syntax for cgset is:
cgset -r parameter=value path_to_cgroup
where:
parameter is the parameter to be set, which corresponds to the file in the directory of the given
cgroup;
value is the value for the parameter;
path_to_cgroup is the path to the cgroup relative to the root of the hierarchy.
The values that can be set with cgset might depend on values set higher in a particular hierarchy.
For example, if group1 is limited to use only CPU 0 on a system, you cannot set
group1/subgroup1 to use CPUs 0 and 1, or to use only CPU 1.
It is also possible to use cgset to copy the parameters of one cgroup into another, existing cgroup.
The syntax to copy parameters with cgset is:
cgset --copy-from path_to_source_cgroup path_to_target_cgroup
where:
path_to_source_cgroup is the path to the cgroup whose parameters are to be copied, relative to the
root group of the hierarchy;
path_to_target_cgroup is the path to the destination cgroup, relative to the root group of the
hierarchy.
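Conceptually, --copy-from reads each parameter pseudo-file of the source cgroup and writes the value into the same-named file of the target. The following is a rough sketch of that behavior, simulated with scratch directories standing in for cgroup paths; the function name and the simulation are illustrative, not how cgset is actually implemented:

```shell
# copy_params SOURCE_DIR TARGET_DIR — copy the contents of each
# same-named, writable pseudo-file from source to target, skipping the
# task list, which is not a copyable parameter.
copy_params() {
  local f name
  for f in "$1"/*; do
    name=$(basename "$f")
    [ "$name" = tasks ] && continue
    [ -f "$f" ] && [ -w "$2/$name" ] && cat "$f" > "$2/$name"
  done
}

# Demonstration with scratch directories standing in for two cgroups:
src=$(mktemp -d); dst=$(mktemp -d)
echo "0-1" > "$src/cpuset.cpus"
: > "$dst/cpuset.cpus"
copy_params "$src" "$dst"
cat "$dst/cpuset.cpus"   # -> 0-1
```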
3.6. Moving a Process to a Control Group
Move a process into a cgroup by running the cgclassify command:
cgclassify -g controllers:path_to_cgroup pidlist
where:
controllers is a comma-separated list of resource controllers, or * to launch the process in the
hierarchies associated with all available subsystems. Note that if there are multiple cgroups of the
same name, the -g option moves the processes into each of those groups.
path_to_cgroup is the path to the cgroup within the hierarchy;
pidlist is a space-separated list of process identifiers (PIDs).
If the -g option is not specified, cgclassify automatically searches /etc/cgrules.conf and
uses the first applicable configuration line. According to this line, cgclassify determines the
hierarchies and cgroups to move the process under. Note that for the move to be successful, the
destination hierarchies have to exist. The subsystems specified in /etc/cgrules.conf also have
to be properly configured for the corresponding hierarchy in /etc/cgconfig.conf.
You can also add the --sticky option before the pid to keep any child processes in the same
cgroup. If you do not set this option and the cgred service is running, child processes will be
allocated to cgroups based on the settings found in /etc/cgrules.conf. The process itself,
however, will remain in the cgroup in which you started it.
It is also possible to use the cgred service (which starts the cgrulesengd daemon) that moves tasks
into cgroups according to parameters set in the /etc/cgrules.conf file. Use cgred only to
manage manually attached controllers. Entries in the /etc/cgrules.conf file can take one of
two forms:
user subsystems control_group;
user:command subsystems control_group.
For example:
maria    net_prio    /usergroup/staff
This entry specifies that any processes that belong to the user named maria access the net_prio
subsystem according to the parameters specified in the /usergroup/staff cgroup. To associate
particular commands with particular cgroups, add the command parameter, as follows:
maria:ftp    devices    /usergroup/staff/ftp
The entry now specifies that when the user named maria uses the ftp command, the process is
automatically moved to the /usergroup/staff/ftp cgroup in the hierarchy that contains the
devices subsystem. Note, however, that the daemon moves the process to the cgroup only after the
appropriate condition is fulfilled. Therefore, the ftp process can run for a short time in an incorrect
group. Furthermore, if the process quickly spawns children while in the incorrect group, these
children might not be moved.
Entries in the /etc/cgrules.conf file can include the following extra notation:
@ — when prefixed to user, indicates a group instead of an individual user. For example,
@admins are all users in the admins group.
* — represents "all". For example, * in the subsystem field represents all subsystems.
% — represents an item the same as the item on the line above. For example:
@adminstaff    net_prio    /admingroup
@labstaff      %           %
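The first-match lookup that cgclassify and cgred perform against this file can be sketched with a short helper (hypothetical; it handles plain user, user:command, and the * wildcard in the user field, but not the @ group or % notation):

```shell
# match_rule FILE USER [COMMAND] — print the subsystem and cgroup of the
# first rule whose user (or user:command) field matches, mimicking the
# first-applicable-line lookup described above.
match_rule() {
  awk -v u="$2" -v c="$3" '
    /^#/ || NF < 3 { next }              # skip comments and short lines
    {
      n = split($1, f, ":")
      if (f[1] == u || f[1] == "*")
        if (n == 1 || f[2] == c) { print $2, $3; exit }
    }' "$1"
}

# Demonstration on a scratch rules file:
rules=$(mktemp)
printf 'maria:ftp  devices   /usergroup/staff/ftp\n' >  "$rules"
printf 'maria      net_prio  /usergroup/staff\n'     >> "$rules"
match_rule "$rules" maria ftp   # -> devices /usergroup/staff/ftp
match_rule "$rules" maria vi    # -> net_prio /usergroup/staff
```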
3.7. Starting a Process in a Control Group
Launch processes in a manually created cgroup by running the cgexec command. The syntax for
cgexec is:
cgexec -g controllers:path_to_cgroup command arguments
where:
controllers is a comma-separated list of controllers, or * to launch the process in the hierarchies
associated with all available subsystems. Note that, as with the cgset command described in
Section 3.5, “Setting Cgroup Parameters”, if cgroups of the same name exist, the -g option
creates processes in each of those groups.
path_to_cgroup is the path to the cgroup relative to the hierarchy;
command is the command to be executed in the cgroup;
arguments are any arguments for the command.
It is also possible to add the --sticky option before the command to keep any child processes in
the same cgroup. If you do not set this option and the cgred service is running, child processes will
be allocated to cgroups based on the settings found in /etc/cgrules.conf. The process itself,
however, will remain in the cgroup in which you started it.
3.8. Obtaining Information about Control Groups
The libcgroup-tools package contains several utilities for obtaining information about controllers,
control groups, and their parameters.
Listing Controllers
To find the controllers that are available in your kernel and information on how they are mounted
together to hierarchies, execute:
cat /proc/cgroups
Alternatively, to find the mount points of particular subsystems, execute the following command:
lssubsys -m controllers
Here controllers stands for a list of the subsystems in which you are interested. Note that the
lssubsys -m command returns only the top-level mount point for each hierarchy.
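For example, the enabled controllers can be filtered out of the /proc/cgroups listing (columns: subsys_name, hierarchy, num_cgroups, enabled) with a one-line awk filter; it is shown here against a saved sample so it runs anywhere:

```shell
# Saved sample of /proc/cgroups output (tab-separated columns):
sample=$(mktemp)
printf '#subsys_name\thierarchy\tnum_cgroups\tenabled\n'  > "$sample"
printf 'cpuset\t2\t4\t1\n'                               >> "$sample"
printf 'debug\t0\t1\t0\n'                                >> "$sample"
printf 'memory\t4\t40\t1\n'                              >> "$sample"

# Print only the controllers whose "enabled" flag is 1; on a live system
# the same filter would read /proc/cgroups directly.
awk '!/^#/ && $4 == 1 { print $1 }' "$sample"   # -> cpuset, then memory
```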
Finding Control Groups
To list the cgroups on a system, execute as root:
lscgroup
To restrict the output to a specific hierarchy, specify a controller and a path in the format
controller:path. For example:
~]$ lscgroup cpuset:adminusers
The above command lists only subgroups of the adminusers cgroup in the hierarchy to which the
cpuset controller is attached.
Displaying Parameters of Control Groups
To display the parameters of specific cgroups, run:
~]$ cgget -r parameter list_of_cgroups
where parameter is a pseudo-file that contains values for a controller, and list_of_cgroups is a list of
cgroups separated with spaces.
If you do not know the names of the actual parameters, use a command similar to:
~]$ cgget -g cpuset /
3.9. Additional Resources
The definitive documentation for cgroup commands can be found in the manual pages provided with
the libcgroup package.
Installed Documentation
The libcgroup-related Man Pages
cgclassify(1) — the cgclassify command is used to move running tasks to one or more
cgroups.
cgclear(1) — the cgclear command is used to delete all cgroups in a hierarchy.
cgconfig.conf(5) — cgroups are defined in the cgconfig.conf file.
cgconfigparser(8) — the cgconfigparser command parses the cgconfig.conf file and
mounts hierarchies.
cgcreate(1) — the cgcreate command creates new cgroups in hierarchies.
cgdelete(1) — the cgdelete command removes specified cgroups.
cgexec(1) — the cgexec command runs tasks in specified cgroups.
cgget(1) — the cgget command displays cgroup parameters.
cgsnapshot(1) — the cgsnapshot command generates a configuration file from existing
subsystems.
cgred.conf(5) — cgred.conf is the configuration file for the cgred service.
cgrules.conf(5) — cgrules.conf contains the rules used for determining when tasks
belong to certain cgroups.
cgrulesengd(8) — the cgrulesengd service distributes tasks to cgroups.
cgset(1) — the cgset command sets parameters for a cgroup.
lscgroup(1) — the lscgroup command lists the cgroups in a hierarchy.
lssubsys(1) — the lssubsys command lists the hierarchies containing the specified
subsystems.
Chapter 4. Control Group Application Examples
This chapter provides application examples that take advantage of the cgroup functionality.
4.1. Prioritizing Database I/O
Running each instance of a database server inside its own dedicated virtual guest allows you to
allocate resources per database based on their priority. Consider the following example: a system is
running two database servers inside two KVM guests. One of the databases is a high priority
database and the other one a low priority database. When both database servers are run
simultaneously, the I/O throughput is decreased to accommodate requests from both databases
equally; Figure 4.1, “I/O throughput without resource allocation” indicates this scenario — once the
low priority database is started (around time 45), I/O throughput is the same for both database
servers.
Figure 4.1. I/O throughput without resource allocation
To prioritize the high priority database server, it can be assigned to a cgroup with a high number of
reserved I/O operations, whereas the low priority database server can be assigned to a cgroup with a
low number of reserved I/O operations. To achieve this, follow the steps in Procedure 4.1, “I/O
Throughput Prioritization”, all of which are performed on the host system.
Procedure 4.1. I/O Throughput Prioritization
1. Make sure resource accounting is on for both services:
~]# systemctl set-property db1.service BlockIOAccounting=true
~]# systemctl set-property db2.service BlockIOAccounting=true
2. Set a ratio of 10:1 for the high and low priority services. Processes running in those service
units will use only the resources made available to them:
~]# systemctl set-property db1.service BlockIOWeight=1000
~]# systemctl set-property db2.service BlockIOWeight=100
Figure 4.2, “I/O throughput with resource allocation” illustrates the outcome of limiting the low priority
database and prioritizing the high priority database. As soon as the database servers are moved to
their appropriate cgroups (around time 75), I/O throughput is divided between both servers with the
ratio of 10:1.
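The 10:1 split follows directly from the proportional-share weights; a quick arithmetic check (the calculation below is illustrative, not output from the procedure):

```shell
# With BlockIOWeight values of 1000 and 100, each service's share of
# contended bandwidth is its weight divided by the sum of all weights.
awk 'BEGIN {
  w1 = 1000; w2 = 100
  printf "db1 %.0f%%  db2 %.0f%%\n", 100*w1/(w1+w2), 100*w2/(w1+w2)
}'
# -> db1 91%  db2 9%
```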
Figure 4.2. I/O throughput with resource allocation
Alternatively, block device I/O throttling can be used for the low priority database to limit its number of
read and write operations. For more information, refer to the description of the blkio controller in
Controller-Specific Kernel Documentation.
4.2. Prioritizing Network Traffic
When running multiple network-related services on a single server system, it is important to define
network priorities among these services. Defining the priorities ensures that packets originating from
certain services have a higher priority than packets originating from other services. For example,
such priorities are useful when a server system simultaneously functions as an NFS and Samba
server. The NFS traffic has to be of high priority as users expect high throughput. The Samba traffic
can be deprioritized to allow better performance of the NFS server.
The net_prio controller can be used to set network priorities for processes in cgroups. These
priorities are then translated into Type of Service (ToS) field bits and embedded into every packet.
Follow the steps in Procedure 4.2, “Setting Network Priorities for File Sharing Services” to configure
prioritization of two file sharing services (NFS and Samba).
Procedure 4.2. Setting Network Priorities for File Sharing Services
1. The net_prio controller is not compiled into the kernel; it is a module that has to be loaded
manually. To do so, type:
~]# modprobe netprio_cgroup
2. Attach the net_prio subsystem to the /sys/fs/cgroup/net_prio cgroup:
~]# mkdir /sys/fs/cgroup/net_prio
~]# mount -t cgroup -o net_prio none /sys/fs/cgroup/net_prio
3. Create two cgroups, one for each service:
29
Resource Management G uide
~]# mkd i r sys/fs/cg ro up/net_pri o /nfs_hi g h
~]# mkd i r sys/fs/cg ro up/net_pri o /samba_l o w
4. To automatically move the nfs services to the nfs_hi g h cgroup, add the following line to
the /etc/sysco nfi g /nfs file:
CGROUP_DAEMON="net_prio:nfs_high"
This configuration ensures that nfs service processes are moved to the nfs_high cgroup
when the nfs service is started or restarted.
5. The smbd service does not have a configuration file in the /etc/sysconfig directory. To
automatically move the smbd service to the samba_low cgroup, add the following line to the
/etc/cgrules.conf file:
*:smbd		net_prio		samba_low
Note that this rule moves every process named smbd, not only /usr/sbin/smbd, into the
samba_low cgroup.
You can define rules for the nmbd and winbindd services to be moved to the samba_low
cgroup in a similar way.
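Such rules for nmbd and winbindd could look like the following in /etc/cgrules.conf (a sketch following the pattern of the smbd rule above, not part of the original procedure):

```
*:nmbd			net_prio		samba_low
*:winbindd		net_prio		samba_low
```

As with the smbd rule, each line matches every process with that command name, regardless of its path.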
6. Start the cgred service to load the configuration from the previous step:
~]# systemctl start cgred
Starting CGroup Rules Engine Daemon:                       [  OK  ]
7. For the purposes of this example, let us assume both services use the eth1 network interface.
Define network priorities for each cgroup, where 1 denotes low priority and 10 denotes high
priority:
~]# echo "eth1 1" > /sys/fs/cgroup/net_prio/samba_low/net_prio.ifpriomap
~]# echo "eth1 10" > /sys/fs/cgroup/net_prio/nfs_high/net_prio.ifpriomap
8. Start the nfs and smb services and check whether their processes have been moved into the
correct cgroups:
~]# systemctl start smb
Starting SMB services:                                     [  OK  ]
~]# cat /sys/fs/cgroup/net_prio/samba_low/tasks
16122
16124
~]# systemctl start nfs
Starting NFS services:                                     [  OK  ]
Starting NFS quotas:                                       [  OK  ]
Starting NFS mountd:                                       [  OK  ]
Stopping RPC idmapd:                                       [  OK  ]
Starting RPC idmapd:                                       [  OK  ]
Starting NFS daemon:                                       [  OK  ]
~]# cat /sys/fs/cgroup/net_prio/nfs_high/tasks
16321
16325
16376
Network traffic originating from NFS now has higher priority than traffic originating from
Samba.
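The net_prio.ifpriomap file written in step 7 contains one interface/priority pair per line, one line per network device. A minimal Python sketch of parsing that format — the sample string stands in for what a read of the file might return and is an assumption for illustration:

```python
def parse_ifpriomap(text):
    """Parse net_prio.ifpriomap content into an {interface: priority} dict."""
    priorities = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        iface, prio = line.split()
        priorities[iface] = int(prio)
    return priorities

# Hypothetical contents of /sys/fs/cgroup/net_prio/nfs_high/net_prio.ifpriomap
# after step 7; interfaces not written to keep priority 0.
sample = "lo 0\neth0 0\neth1 10\n"
print(parse_ifpriomap(sample))  # → {'lo': 0, 'eth0': 0, 'eth1': 10}
```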
Similar to Procedure 4.2, “Setting Network Priorities for File Sharing Services”, the net_prio
subsystem can be used to set network priorities for client applications, for example, Firefox.
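For comparison, an application running on Linux can also request a per-socket priority through the SO_PRIORITY socket option; the priority configured through net_prio.ifpriomap overrides this per-socket value for processes placed in a cgroup. A minimal Python sketch, not taken from this guide (priorities 0–6 can be set without the CAP_NET_ADMIN capability):

```python
import socket

# Hypothetical client socket; a net_prio.ifpriomap entry for the sending
# interface would override this value for cgroup members.
s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
s.setsockopt(socket.SOL_SOCKET, socket.SO_PRIORITY, 6)
prio = s.getsockopt(socket.SOL_SOCKET, socket.SO_PRIORITY)
print(prio)  # → 6
s.close()
```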
Appendix A. Revision History
Revision 0.0-1.7	Mon Oct 17 2016	Marie Doleželová
Version for 7.3 GA publication.

Revision 0.0-1.6	Wed Nov 11 2015	Jana Heves
Version for 7.2 GA release.

Revision 0.0-1.4	Thu Feb 19 2015	Radek Bíba
Version for 7.1 GA release. Linux Containers moved to a separate book.

Revision 0.0-1.0	Mon Jul 21 2014	Peter Ondrejka

Revision 0.0-0.14	Mon May 13 2013	Peter Ondrejka
Version for 7.0 GA release.