Linux Administration: A Beginner’s Guide

Linux Administration
A Beginner’s Guide
About the Author
Wale Soyinka is a systems/network engineering consultant and has written a decent library of Linux
administration training materials. In addition to the fifth edition of Linux Administration: A
Beginner’s Guide, he is the author of Wireless Network Administration: A Beginner’s Guide and a
projects lab manual, Microsoft Windows 2000 Managing Network Environments (Prentice Hall).
Wale participates in several open source discussions and projects. His pet project is at caffenix
(www.caffenix.com), where he usually hangs out. caffenix is possibly the world’s first (or only
existing) brick-and-mortar store committed and dedicated to promoting and showcasing open source
technologies and culture.
About the Technical Editor
David Lane is an infrastructure architect and IT manager working and living in the Washington, DC,
area. He has been working with open source software since the early 1990s and was introduced to
Linux via the Slackware distribution early in its development. David soon discovered Red Hat 3 and
has never looked back. Unlike most Linux people, David does not have a programming background
and fell into IT as a career after discovering he was not cut out for sleeping on the street. He has
implemented Linux solutions for a variety of government and private companies with solutions
ranging from the simple to the complex. In his spare time, David writes about open source issues,
especially those related to business, for the Linux Journal as well as championing Linux to the next
generation. David is an amateur radio operator and Emergency Coordinator for amateur radio
responders for his local county. David speaks regularly to both Linux and amateur radio user groups
about the synergies between open source and amateur radio.
Linux Administration
A Beginner’s Guide,
Sixth Edition
WALE SOYINKA
New York Chicago San Francisco
Lisbon London Madrid Mexico City Milan
New Delhi San Juan Seoul Singapore Sydney Toronto
Cataloging-in-Publication Data is on file with the Library of Congress
Linux Administration: A Beginner’s Guide, Sixth Edition
Copyright © 2012 by The McGraw-Hill Companies. All rights reserved. Except as permitted under
the United States Copyright Act of 1976, no part of this publication may be reproduced or distributed
in any form or by any means, or stored in a database or retrieval system, without the prior written
permission of the publisher, with the exception that the program listings may be entered, stored, and
executed in a computer system, but they may not be reproduced for publication.
ISBN: 978-0-07-176759-0
MHID: 0-07-176759-2
The material in this eBook also appears in the print version of this title: ISBN 978-0-07-176758-3,
MHID 0-07-176758-4.
All trademarks are trademarks of their respective owners. Rather than put a trademark symbol after
every occurrence of a trademarked name, we use names in an editorial fashion only, and to the benefit
of the trademark owner, with no intention of infringement of the trademark. Where such designations
appear in this book, they have been printed with initial caps.
McGraw-Hill eBooks are available at special quantity discounts to use as premiums and sales
promotions, or for use in corporate training programs. To contact a representative please e-mail us at
[email protected]
Sponsoring Editor
Megg Morin
Editorial Supervisor
Janet Walden
Project Manager
Anupriya Tyagi,
Cenveo Publisher Services
Acquisitions Coordinator
Stephanie Evans
Technical Editor
David Lane
Copy Editor
Lisa Theobald
Proofreader
Claire Splan
Indexer
Claire Splan
Production Supervisor
Jean Bodeaux
Composition
Cenveo Publisher Services
Illustration
Cenveo Publisher Services
Art Director, Cover
Jeff Weeks
Cover Designer
Jeff Weeks
TERMS OF USE
This is a copyrighted work and The McGraw-Hill Companies, Inc. (“McGraw-Hill”) and its
licensors reserve all rights in and to the work. Use of this work is subject to these terms. Except as
permitted under the Copyright Act of 1976 and the right to store and retrieve one copy of the work,
you may not decompile, disassemble, reverse engineer, reproduce, modify, create derivative works
based upon, transmit, distribute, disseminate, sell, publish or sublicense the work or any part of it
without McGraw-Hill’s prior consent. You may use the work for your own noncommercial and
personal use; any other use of the work is strictly prohibited. Your right to use the work may be
terminated if you fail to comply with these terms.
THE WORK IS PROVIDED “AS IS.” McGRAW-HILL AND ITS LICENSORS MAKE NO
GUARANTEES OR WARRANTIES AS TO THE ACCURACY, ADEQUACY OR
COMPLETENESS OF OR RESULTS TO BE OBTAINED FROM USING THE WORK,
INCLUDING ANY INFORMATION THAT CAN BE ACCESSED THROUGH THE WORK VIA
HYPERLINK OR OTHERWISE, AND EXPRESSLY DISCLAIM ANY WARRANTY, EXPRESS
OR IMPLIED, INCLUDING BUT NOT LIMITED TO IMPLIED WARRANTIES OF
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. McGraw-Hill and its
licensors do not warrant or guarantee that the functions contained in the work will meet your
requirements or that its operation will be uninterrupted or error free. Neither McGraw-Hill nor its
licensors shall be liable to you or anyone else for any inaccuracy, error or omission, regardless of
cause, in the work or for any damages resulting therefrom. McGraw-Hill has no responsibility for the
content of any information accessed through the work. Under no circumstances shall McGraw-Hill
and/or its licensors be liable for any indirect, incidental, special, punitive, consequential or similar
damages that result from the use of or inability to use the work, even if any of them has been advised
of the possibility of such damages. This limitation of liability shall apply to any claim or cause
whatsoever whether such claim or cause arises in contract, tort or otherwise.
Dedicated to everyone who has contributed to open source technologies and ideals in one form
or another. Without you, I would have nothing to write about in this book.
At a Glance
PART I Introduction, Installation, and Software Management
1 Technical Summary of Linux Distributions
2 Installing Linux in a Server Configuration
3 Managing Software
PART II Single-Host Administration
4 Managing Users and Groups
5 The Command Line
6 Booting and Shutting Down
7 File Systems
8 Core System Services
9 The Linux Kernel
10 Knobs and Dials: Virtual File Systems
PART III Networking and Security
11 TCP/IP for System Administrators
12 Network Configuration
13 Linux Firewall (Netfilter)
14 Local Security
15 Network Security
PART IV Internet Services
16 DNS
17 FTP
18 Apache Web Server
19 SMTP
20 POP and IMAP
21 The Secure Shell (SSH)
PART V Intranet Services
22 Network File System (NFS)
23 Samba
24 Distributed File Systems
25 Network Information Service
26 LDAP
27 Printing
28 DHCP
29 Virtualization
30 Backups
PART VI Appendixes
A Creating a Linux Installer on Flash/USB Devices
B openSUSE Installation
Index
Contents
Acknowledgments
Introduction
Part I
Introduction, Installation, and Software Management
1 Technical Summary of Linux Distributions
Linux: The Operating System
What Is Open Source Software and GNU All About?
What Is the GNU Public License?
Upstream and Downstream
The Advantages of Open Source Software
Understanding the Differences Between Windows and Linux
Single Users vs. Multiple Users vs. Network Users
The Monolithic Kernel and the Micro-Kernel
Separation of the GUI and the Kernel
The Network Neighborhood
The Registry vs. Text Files
Domains and Active Directory
Summary
2 Installing Linux in a Server Configuration
Hardware and Environmental Considerations
Server Design
Uptime
Methods of Installation
Installing Fedora
Project Prerequisites
The Installation
Initial System Configuration
Installing Ubuntu Server
Summary
3 Managing Software
The Red Hat Package Manager
Managing Software Using RPM
GUI RPM Package Managers
The Debian Package Management System
APT
Software Management in Ubuntu
Querying for Information
Installing Software in Ubuntu
Removing Software in Ubuntu
Compile and Install GNU Software
Getting and Unpacking the Package
Looking for Documentation
Configuring the Package
Compiling the Package
Installing the Package
Testing the Software
Cleanup
Common Problems When Building from Source Code
Problems with Libraries
Missing Configure Script
Broken Source Code
Summary
Part II
Single-Host Administration
4 Managing Users and Groups
What Exactly Constitutes a User?
Where User Information Is Kept
The /etc/passwd File
The /etc/shadow File
The /etc/group File
User Management Tools
Command-Line User Management
GUI User Managers
Users and Access Permissions
Understanding SetUID and SetGID Programs
Pluggable Authentication Modules
How PAM Works
PAM’s Files and Their Locations
Configuring PAM
The “Other” File
D’oh! I Can’t Log In!
Debugging PAM
A Grand Tour
Creating Users with useradd
Creating Groups with groupadd
Modifying User Attributes with usermod
Modifying Group Attributes with groupmod
Deleting Users and Groups with userdel and groupdel
Summary
5 The Command Line
An Introduction to BASH
Job Control
Environment Variables
Pipes
Redirection
Command-Line Shortcuts
Filename Expansion
Environment Variables as Parameters
Multiple Commands
Backticks
Documentation Tools
The man Command
The texinfo System
Files, File Types, File Ownership, and File Permissions
Normal Files
Directories
Hard Links
Symbolic Links
Block Devices
Character Devices
Named Pipes
Listing Files: ls
Change Ownership: chown
Change Group: chgrp
Change Mode: chmod
File Management and Manipulation
Copy Files: cp
Move Files: mv
Link Files: ln
Find a File: find
File Compression: gzip
bzip2
Create a Directory: mkdir
Remove a Directory: rmdir
Show Present Working Directory: pwd
Tape Archive: tar
Concatenate Files: cat
Display a File One Screen at a Time: more
Disk Utilization: du
Show the Directory Location of a File: which
Locate a Command: whereis
Disk Free: df
Synchronize Disks: sync
Moving a User and Its Home Directory
List Processes: ps
Show an Interactive List of Processes: top
Send a Signal to a Process: kill
Miscellaneous Tools
Show System Name: uname
Who Is Logged In: who
A Variation on who: w
Switch User: su
Editors
vi
emacs
joe
pico
Summary
6 Booting and Shutting Down
Boot Loaders
GRUB Legacy
GRUB 2
LILO
Bootstrapping
The init Process
rc Scripts
Writing Your Own rc Script
Enabling and Disabling Services
Disabling a Service
Odds and Ends of Booting and Shutting Down
fsck!
Booting into Single-User (“Recovery”) Mode
Summary
7 File Systems
The Makeup of File Systems
i-Nodes
Block
Superblocks
ext3
ext4
Btrfs
Which File System Should You Use?
Managing File Systems
Mounting and Unmounting Local Disks
Using fsck
Adding a New Disk
Overview of Partitions
Traditional Disk and Partition Naming Conventions
Volume Management
Creating Partitions and Logical Volumes
Creating File Systems
Summary
8 Core System Services
The init Daemon
upstart: Die init. Die Now!
The /etc/inittab File
systemd
xinetd and inetd
The /etc/xinetd.conf File
Examples: A Simple Service Entry and Enabling/Disabling a Service
The Logging Daemon
Invoking rsyslogd
Configuring the Logging Daemon
Log Message Classifications
Format of /etc/rsyslog.conf
The cron Program
The crontab File
Editing the crontab File
Summary
9 The Linux Kernel
What Exactly Is a Kernel?
Finding the Kernel Source Code
Getting the Correct Kernel Version
Unpacking the Kernel Source Code
Building the Kernel
Preparing to Configure the Kernel
Kernel Configuration
Compiling the Kernel
Installing the Kernel
Booting the Kernel
The Author Lied—It Didn’t Work!
Patching the Kernel
Downloading and Applying Patches
Summary
10 Knobs and Dials: Virtual File Systems
What’s Inside the /proc Directory?
Tweaking Files Inside of /proc
Some Useful /proc Entries
Enumerated /proc Entries
Common proc Settings and Reports
SYN Flood Protection
Issues on High-Volume Servers
Debugging Hardware Conflicts
SysFS
cgroupfs
Summary
Part III
Networking and Security
11 TCP/IP for System Administrators
The Layers
Packets
TCP/IP Model and the OSI Model
Headers
Ethernet
IP (IPv4)
TCP
UDP
A Complete TCP Connection
Opening a Connection
Transferring Data
Closing the Connection
How ARP Works
The ARP Header: ARP Works with Other Protocols, Too!
Bringing IP Networks Together
Hosts and Networks
Subnetting
Netmasks
Static Routing
Dynamic Routing with RIP
Digging into tcpdump
A Few General Notes
Graphing Odds and Ends
IPv6
IPv6 Address Format
IPv6 Address Types
IPv6 Backward-Compatibility
Summary
12 Network Configuration
Modules and Network Interfaces
Network Device Configuration Utilities (ip and ifconfig)
Simple Usage
IP Aliasing
Setting up NICs at Boot Time
Managing Routes
Simple Usage
Displaying Routes
A Simple Linux Router
Routing with Static Routes
How Linux Chooses an IP Address
Summary
13 Linux Firewall (Netfilter)
How Netfilter Works
A NAT Primer
NAT-Friendly Protocols
Chains
Installing Netfilter
Enabling Netfilter in the Kernel
Configuring Netfilter
Saving Your Netfilter Configuration
The iptables Command
Cookbook Solutions
Rusty’s Three-Line NAT
Configuring a Simple Firewall
Summary
14 Local Security
Common Sources of Risk
SetUID Programs
Unnecessary Processes
Picking the Right Runlevel
Nonhuman User Accounts
Limited Resources
Mitigating Risk
Using chroot
SELinux
AppArmor
Monitoring Your System
Logging
Using ps and netstat
Using df
Automated Monitoring
Mailing Lists
Summary
15 Network Security
TCP/IP and Network Security
The Importance of Port Numbers
Tracking Services
Using the netstat Command
Security Implications of netstat’s Output
Binding to an Interface
Shutting Down Services
Shutting Down xinetd and inetd Services
Shutting Down Non-xinetd Services
Shutting Down Services in a Distribution-Independent Way
Monitoring Your System
Making the Best Use of syslog
Monitoring Bandwidth with MRTG
Handling Attacks
Trust Nothing (and No One)
Change Your Passwords
Pull the Plug
Network Security Tools
nmap
Snort
Nessus
Wireshark/tcpdump
Summary
Part IV
Internet Services
16 DNS
The Hosts File
How DNS Works
Domain and Host Naming Conventions
Subdomains
The in-addr.arpa Domain
Types of Servers
Installing a DNS Server
Understanding the BIND Configuration File
The Specifics
Configuring a DNS Server
Defining a Primary Zone in the named.conf File
Defining a Secondary Zone in the named.conf File
Defining a Caching Zone in the named.conf File
DNS Records Types
SOA: Start of Authority
NS: Name Server
A: Address Record
PTR: Pointer Record
MX: Mail Exchanger
CNAME: Canonical Name
RP and TXT: The Documentation Entries
Setting up BIND Database Files
Breaking out the Individual Steps
The DNS Toolbox
host
dig
nslookup
whois
nsupdate
The rndc Tool
Configuring DNS Clients
The Resolver
Configuring the Client
Summary
17 FTP
The Mechanics of FTP
Client/Server Interactions
Obtaining and Installing vsftpd
Configuring vsftpd
Starting and Testing the FTP Server
Customizing the FTP Server
Setting up an Anonymous-Only FTP Server
Setting up an FTP Server with Virtual Users
Summary
18 Apache Web Server
Understanding HTTP
Headers
Ports
Process Ownership and Security
Installing the Apache HTTP Server
Apache Modules
Starting up and Shutting Down Apache
Starting Apache at Boot Time
Testing Your Installation
Configuring Apache
Creating a Simple Root-Level Page
Apache Configuration Files
Common Configuration Options
Troubleshooting Apache
Summary
19 SMTP
Understanding SMTP
Rudimentary SMTP Details
Security Implications
Installing the Postfix Server
Installing Postfix via RPM in Fedora
Installing Postfix via APT in Ubuntu
Configuring the Postfix Server
The main.cf File
Checking Your Configuration
Running the Server
Checking the Mail Queue
Flushing the Mail Queue
The newaliases Command
Making Sure Everything Works
Summary
20 POP and IMAP
POP and IMAP Basics
Installing the UW-IMAP and POP3 Server
Running UW-IMAP
Other Issues with Mail Services
SSL Security
Testing IMAP and POP3 Connectivity over SSL
Availability
Log Files
Summary
21 The Secure Shell (SSH)
Understanding Public Key Cryptography
Key Characteristics
Cryptography References
Understanding SSH Versions
OpenSSH and OpenBSD
Alternative Vendors for SSH Clients
Installing OpenSSH via RPM in Fedora
Installing OpenSSH via APT in Ubuntu
Server Start-up and Shutdown
SSHD Configuration File
Using OpenSSH
Secure Shell (ssh) Client Program
Secure Copy (scp) Program
Secure FTP (sftp) Program
Files Used by the OpenSSH Client
Summary
Part V
Intranet Services
22 Network File System (NFS)
The Mechanics of NFS
Versions of NFS
Security Considerations for NFS
Mount and Access a Partition
Enabling NFS in Fedora
Enabling NFS in Ubuntu
The Components of NFS
Kernel Support for NFS
Configuring an NFS Server
The /etc/exports Configuration File
Configuring NFS Clients
The mount Command
Soft vs. Hard Mounts
Cross-Mounting Disks
The Importance of the intr Option
Performance Tuning
Troubleshooting Client-Side NFS Issues
Stale File Handles
Permission Denied
Sample NFS Client and NFS Server Configuration
Common Uses for NFS
Summary
23 Samba
The Mechanics of SMB
Usernames and Passwords
Encrypted Passwords
Samba Daemons
Installing Samba via RPM
Installing Samba via APT
Samba Administration
Starting and Stopping Samba
Using SWAT
Setting up SWAT
The SWAT Menus
Globals
Shares
Printers
Status
View
Password
Creating a Share
Using smbclient
Mounting Remote Samba Shares
Samba Users
Creating Samba Users
Allowing Null Passwords
Changing Passwords with smbpasswd
Using Samba to Authenticate Against a Windows Server
winbindd Daemon
Troubleshooting Samba
Summary
24 Distributed File Systems
DFS Overview
DFS Implementations
GlusterFS
Summary
25 Network Information Service
Inside NIS
The NIS Servers
Domains
Configuring the Master NIS Server
Establishing the Domain Name
Starting NIS
Editing the Makefile
Using ypinit
Configuring an NIS Client
Editing the /etc/yp.conf File
Enabling and Starting ypbind
Editing the /etc/nsswitch.conf File
NIS at Work
Testing Your NIS Client Configuration
Configuring a Secondary NIS Server
Setting the Domain Name
Setting up the NIS Master to Push to Slaves
Running ypinit
NIS Tools
Using NIS in Configuration Files
Implementing NIS in a Real Network
A Small Network
A Segmented Network
Networks Bigger than Buildings
Summary
26 LDAP
LDAP Basics
LDAP Directory
Client/Server Model
Uses of LDAP
LDAP Terminology
OpenLDAP
Server-Side Daemons
OpenLDAP Utilities
Installing OpenLDAP
Configuring OpenLDAP
Configuring slapd
Starting and Stopping slapd
Configuring OpenLDAP Clients
Creating Directory Entries
Searching, Querying, and Modifying the Directory
Using OpenLDAP for User Authentication
Configuring the Server
Configuring the Client
Summary
27 Printing
Printing Terminologies
The CUPS System
Running CUPS
Installing CUPS
Configuring CUPS
Adding Printers
Local Printers and Remote Printers
Routine CUPS Administration
Setting the Default Printer
Enabling, Disabling, and Deleting Printers
Accepting and Rejecting Print Jobs
Managing Printing Privileges
Managing Printers via the Web Interface
Using Client-Side Printing Tools
lpr
lpq
lprm
Summary
28 DHCP
The Mechanics of DHCP
The DHCP Server
Installing DHCP Software via RPM
Installing DHCP Software via APT in Ubuntu
Configuring the DHCP Server
A Sample dhcpd.conf File
The DHCP Client Daemon
Configuring the DHCP Client
Summary
29 Virtualization
Why Virtualize?
Virtualization Concepts
Virtualization Implementations
Hyper-V
KVM
QEMU
UML
VirtualBox
VMware
Xen
Kernel-Based Virtual Machines
KVM Example
Managing KVM Virtual Machines
Setting up KVM in Ubuntu/Debian
Summary
30 Backups
Evaluating Your Backup Needs
Amount of Data
Backup Hardware and Backup Medium
Network Throughput
Speed and Ease of Data Recovery
Data Deduplication
Tape Management
Command-Line Backup Tools
dump and restore
Miscellaneous Backup Solutions
Summary
Part VI
Appendixes
A Creating a Linux Installer on Flash/USB Devices
Creating a Linux Installer on Flash/USB Devices (via Linux OS)
Creating a Linux Installer on Flash/USB Devices (via Microsoft Windows OS)
Fedora Installer Using Live USB Creator on Windows OS
Ubuntu Installer Using UNetbootin on Windows OS
B openSUSE Installation
Index
Acknowledgments
My acknowledgment list is a very long and philosophical one. It includes everybody who has
ever believed in me and provided me with one opportunity or another to experience various
aspects of my life up to this point. It includes everybody I have ever had any kind of direct or indirect
contact with. It includes everyone I have ever had a conversation with. It includes everybody I have
ever looked at. It includes everyone who has ever given to or taken away from me. You have all
contributed to and enriched my life. I am me because of you. You know who you are, and I thank you.
Introduction
On October 5, 1991, Linus Torvalds posted this message to the newsgroup comp.os.minix:
Do you pine for the nice days of minix-1.1, when men were men and wrote their
own device drivers? Are you without a nice project and just dying to cut your
teeth on an OS you can try to modify for your needs? Are you finding it
frustrating when everything works on minix? No more all-nighters to get a
nifty program working? Then this post might be just for you :-)
Linus went on to introduce the first cut of Linux to the world. Unbeknown to him, he had
unleashed what was to become one of the world’s most popular and disruptive operating systems.
More than 20 years later, an entire industry has grown up around Linux. And, chances are, you’ve
probably already used it (or benefitted from it) in one form or another!
Who Should Read This Book
A part of the title of this book reads “A Beginner’s Guide”; this is mostly apt. But what the title
should say is “A Beginner’s to Linux Administration Guide,” because we do make a few assumptions
about you, the reader. (And we jolly well couldn’t use that title because it was such a mouthful and
not sexy enough.)
But seriously, we assume that you are already familiar with Microsoft Windows servers at a
“power user” level or better. We assume that you are familiar with the terms (and some concepts)
necessary to run a small- to medium-sized Windows network. Any experience with bigger networks
or advanced Windows technologies, such as Active Directory, will allow you to get more from the
book but is not required.
We make these assumptions because we did not want to write a guide for dummies. There are
already enough books on the market that tell you what to click without telling you why; this book is
not meant to be among those ranks. Furthermore, we did not want to waste time writing about
information that we believe is common knowledge for power users of Windows. Other people have
already done an excellent job of conveying that information, and there is no reason to repeat that work
here.
In addition to your Windows background, we assume that you’re interested in having more
information about the topics here than the material we have written alone. After all, we’ve spent only
30 to 35 pages on topics that have entire books devoted to them! For this reason, we have scattered
references to other resources throughout the chapters. We urge you to take advantage of these
recommendations. No matter how advanced you are, there is always something new to learn.
We believe that seasoned Linux system administrators can also benefit from this book because it
can serve as a quick how-to cookbook on various topics that might not be the seasoned reader’s
strong points. We understand that system administrators generally have aspects of system
administration that they like or loathe. For example, backups is not one of our favorite aspects of
system administration, and this is reflected in the half a page we’ve dedicated to backups. (Just
kidding, there’s an entire chapter on the topic.)
What’s in This Book?
Linux Administration: A Beginner’s Guide, Sixth Edition comprises six parts.
Part I: Introduction, Installation, and Software Management
Part I includes three chapters (Chapter 1, “Technical Summary of Linux Distributions”; Chapter 2,
“Installing Linux in a Server Configuration”; and Chapter 3, “Managing Software”) that give you a
firm handle on what Linux is, how it compares to Windows in several key areas, and how to install
server-grade Fedora and Ubuntu Linux distributions. Part I ends with a chapter on how to install
software from prepackaged binaries and source code, as well as how to perform standard software
management tasks.
Ideally, the information in Part I should be enough to get you started and help you
draw parallels to how Linux works based on your existing knowledge of Windows. Some of the
server installation and software installation tasks performed in Part I help serve as a reference point
for some other parts of the book.
Part II: Single-Host Administration
Part II covers the material necessary to manage a stand-alone system (a system not requiring or
providing any services to other systems on the network). Although this might seem useless at first, it
is the foundation on which many other concepts are built, and it will come in handy for understanding network-based services later on.
This part comprises seven chapters. Chapter 4, “Managing Users and Groups,” covers the
underlying basics of user and group concepts on Linux platforms, as well as day-to-day management
tasks of adding and removing users and groups. The chapter also introduces the basic concepts of
multiuser operation and the Linux permissions model. Chapter 5, “The Command Line,” begins
covering the basics of working with the Linux command line so that you can become comfortable
working without a GUI. Although it is possible to administer a system from within the graphical
desktop, your greatest power comes from being comfortable with both the command line interface
(CLI) and the GUI. (This is true for Windows, too. Don’t believe that? Open a command prompt, run
netsh, and try to do what netsh does in the GUI.)
Once you are comfortable with the CLI, you can read Chapter 6, “Booting and Shutting Down,”
which documents the entire booting and shutting down process. This includes details on how to start
up services properly and shut them down properly. You’ll learn how to add new services manually,
which will come in handy later on in the book.
Chapter 7, “File Systems,” continues with the basics of file systems—their organization, creation,
and, most important, their management.
The basics of operation continue in Chapter 8, “Core System Services,” with coverage of basic
tools such as xinetd, upstart, rsyslog, cron, systemd, and so on. xinetd is the Linux equivalent
of Windows’ svchost. rsyslog manages logging for all applications in a unified framework. You
might think of rsyslog as a more flexible version of the Event Viewer.
Chapter 9, “The Linux Kernel,” finishes this section and Chapter 10, “Knobs and Dials: Virtual
File Systems” covers the kernel and kernel-level tweaking through /proc and /sys. Kernel coverage
documents the process of configuring, compiling, and installing your own custom kernel in Linux.
This capability is one of the points that gives Linux administrators an extraordinary amount of fine-grained control over how their systems operate. The ability to view and modify certain kernel-level
configuration and runtime variables through the /proc and /sys file systems, as shown in Chapter 10,
gives administrators almost infinite kernel fine-tuning possibilities. When applied properly, this
ability amounts to an arguably better and easier way than in the Microsoft Windows world.
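As a small, hedged preview of the kind of tweaking Chapter 10 covers (the exact tunables available vary by kernel version and distribution), kernel runtime variables can be read and written much like ordinary files:

# Read a kernel runtime variable directly from /proc.
$ cat /proc/sys/net/ipv4/ip_forward

# Change it on the running kernel (root required); the change is not
# persistent across reboots.
# echo 1 > /proc/sys/net/ipv4/ip_forward

# The sysctl utility exposes the same variables in dotted notation.
$ sysctl net.ipv4.ip_forward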
Part III: Networking and Security
Part III begins our journey into the world of security and networking. With the ongoing importance of
security on the Internet, as well as compliance issues with Sarbanes-Oxley and the Health Insurance
Portability and Accountability Act (HIPAA), the use of Linux in scenarios that require high security
has risen dramatically. We deliberately decided to move coverage of security up before introducing
network-based services (Part IV), so that we could touch on some essential security best practices
that can help in protecting our network-based services from attacks.
This section kicks off with Chapter 11, “TCP/IP for System Administrators,” which provides a
detailed overview of TCP/IP in the context of what system administrators need to know. The chapter
provides a lot of detail on how to use troubleshooting tools such as tcpdump to capture packets and
read them back, as well as a step-by-step analysis of how TCP connections work. These tools should
enable you to troubleshoot network peculiarities effectively.
Chapter 12, “Network Configuration,” returns to administration issues by focusing on basic
network configuration (for both IPv4 and IPv6). This includes setting up IP addresses, routing entries,
and so on. We extend past the basics in Chapter 13, “Linux Firewall (Netfilter),” by delving into
advanced networking concepts and showing you how to build a Linux-based firewall and router.
Chapter 14, “Local Security,” and Chapter 15, “Network Security,” discuss aspects of system and
network security in detail. They include Linux-specific issues as well as general security tips and
tricks so that you can better configure your system and protect it against attacks.
Part IV: Internet Services
The remainder of the book is divided into two distinct parts: Internet and Intranet services. Although
they sound similar, they are different—InTER(net) and InTRA(net). We define Internet services as
those running on a Linux system exposed directly to the Internet. Examples of this include web and
Domain Name System (DNS) services.
This section starts off with Chapter 16, “DNS.” This chapter covers the information you need to
know to install, configure, and manage a DNS server. In addition to the actual details of running a
DNS server, we provide a detailed background on how DNS works and several troubleshooting tips,
tricks, and tools.
From DNS, we move on to Chapter 17, “FTP,” which covers the installation and care of File
Transfer Protocol (FTP) servers. Like the DNS chapter, this chapter also includes a background on
FTP itself and some notes on its evolution.
Chapter 18, “Apache Web Server,” moves on to what may be considered one of the most popular
uses of Linux today: running a web server with the popular Apache software. This chapter covers the
information necessary to install, configure, and manage the Apache web server.
Chapter 19, “SMTP,” and Chapter 20, “POP and IMAP,” dive into e-mail through the setup and
configuration of Simple Mail Transfer Protocol (SMTP), Post Office Protocol (POP), and Internet
Message Access Protocol (IMAP) servers. We cover the information needed to configure all three, as
well as show how they interact with one another. What you may find a little different about this book
from other books on Linux is that we have chosen to cover the Postfix SMTP server instead of the
classic Sendmail server, because Postfix provides a more flexible server with a better security
record.
Part IV ends with Chapter 21, “The Secure Shell (SSH).” Knowing how to set up and manage the
SSH service is useful in almost any server environment—regardless of the server’s primary function.
Part V: Intranet Services
We define intranet services as those that are typically run behind a firewall for internal users and
internal consumption only. Even in this environment, Linux has a lot to offer. Part V starts off with
Chapter 22, “Network File System (NFS).” NFS has been around for close to 20 years now and has
evolved and grown to fit the needs of its users quite well. This chapter covers Linux’s NFS server
capabilities, including how to set up both clients and servers, as well as troubleshooting.
Chapter 23, “Samba,” continues the idea of sharing disks and resources with coverage of the
Samba service. Using Samba, administrators can share disks and printing facilities and provide
authentication for Windows (and Linux) users without having to install any special client software.
Thus, Linux can become an effective server, able to support and share resources between
UNIX/Linux systems as well as Windows systems. The Distributed File Systems (DFS) section
(Chapter 24) is a bit of an odd-ball for Part V, because DFS can be used/deployed in both Internet- and intranet-facing scenarios. DFS solutions are especially important and relevant in today’s cloud-centric world. Among the many DFS implementations available, we have chosen to cover GlusterFS
because of its ease of configuration and cross-distribution support.
In Chapter 25, “Network Information Service,” we talk about NIS, which is typically deployed
alongside NFS servers to provide a central naming service for all users within a network. The
chapter pays special attention to scaling issues and how you can make NIS work in an environment
with a large user base.
We revisit directory services in Chapter 26, “LDAP,” with coverage of Lightweight Directory
Access Protocol (LDAP) and how administrators can use this standard service for providing a
centralized user database (directory) for use among heterogeneous operating systems and also for
managing tons of users.
Chapter 27, “Printing,” takes a tour of the Linux printing subsystem. The printing subsystem, when
combined with Samba, allows administrators to support seamless printing from Windows desktops.
The result is a powerful way of centralizing printing options for Linux, Windows, and even Mac OS
X users on a single server.
Chapter 28, “DHCP,” covers another common use of Linux systems: Dynamic Host Configuration
Protocol (DHCP) servers. This chapter discusses how to deploy the ISC DHCP server, which offers
a powerful array of features and access control options.
Moving right along is Chapter 29, “Virtualization.” Virtualization is everywhere and is definitely
here to stay. It allows companies to consolidate services and hardware that previously required
several dedicated bare-metal machines into much fewer bare-metal machines. We discuss the basic
virtualization concepts and briefly cover some of the popular virtualization technologies in Linux.
The chapter also covers the kernel-based virtual machine (KVM) implementation in detail, with
examples.
The last chapter is Chapter 30, “Backups.” Backups are arguably one of the most critical pieces
of administration. Linux-based systems support several methods of providing backups that are easy to
use and readily usable by tape drives and other media. The chapter discusses some of the methods
and explains how they can be used as part of a backup schedule. In addition to the mechanics of
backups, we discuss general backup design and how you can optimize your backup system.
Part VI: Appendixes
At the end of the book, we include some useful reference material. Appendix A, “Creating a Linux
Installer on Flash/USB Devices,” details alternate and generic methods for creating installation media on nonoptical devices, such as a USB flash drive, SD card, and so on. We make references to
the popular openSUSE Linux distro throughout this book and as such we conclude with Appendix B,
“openSUSE Installation,” which covers a quick run-through of installing openSUSE.
Updates and Feedback
Although we hope that we’ve published a book with no errors, we have set up an errata list for this
book at www.labmanual.org. If you find any errors, we welcome your submissions for errata updates.
We also welcome your feedback and comments. Unfortunately, our day jobs prevent us from
answering detailed questions, so if you’re looking for help on a specific issue, you may find one of
the many online communities a useful resource. However, if you have two cents to share about the
book, we welcome your thoughts. You can send us an e-mail to [email protected]
PART
I
Introduction, Installation, and Software
Management
CHAPTER
1
Technical Summary of Linux
Distributions
Linux has hit the mainstream. Hardly a day goes by without a mention of Linux (or open source
software) in widely read and viewed print or digital media. What was only a hacker’s toy
several years ago has grown up tremendously and is well known for its stability, performance,
and extensibility.
If you need more proof concerning Linux’s penetration, just pay attention to the frequency with
which “Linux” is listed as a desirable and must-have skill for technology-related job postings of
Fortune 500 companies, small to medium-sized businesses, tech start-ups, and government, research,
and entertainment industry jobs—to mention a few. The skills of good Linux system administrators
and engineers are highly desirable!
With the innovations that are taking place in different open source projects (such as K Desktop
Environment, GNOME, Unity, LibreOffice, Android, Apache, Samba, Mozilla, and so on), Linux has
made serious inroads into consumer desktop, laptop, tablet, and mobile markets. This chapter looks at
some of the core server-side technologies as they are implemented in the Linux (open source) world
and in the Microsoft Windows Server world (possibly the platform you are considering replacing
with Linux). But before delving into any technicalities, this chapter briefly discusses some important
underlying concepts and ideas that form the genetic makeup of Linux and Free and Open Source
Software (FOSS).
Linux: The Operating System
Usually, people (mis)understand Linux to be an entire software suite of developer tools, editors,
graphical user interfaces (GUIs), networking tools, and so forth. More formally and correctly, such
software collectively is called a distribution, or distro. The distro is the entire software suite that
makes Linux useful.
So if we consider a distribution everything you need for Linux, what then is Linux exactly? Linux
itself is the core of the operating system: the kernel. The kernel is the program acting as chief of
operations. It is responsible for starting and stopping other programs (such as editors), handling
requests for memory, accessing disks, and managing network connections. The complete list of kernel
activities could easily fill a chapter in itself, and, in fact, several books documenting the kernel’s
internal functions have been written.
The kernel is a nontrivial program. It is also what puts the Linux badge on all the numerous Linux
distributions. All distributions use essentially the same kernel, so the fundamental behavior of all
Linux distributions is the same.
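As a quick illustration of this separation (a sketch only; paths and output vary slightly by distribution), you can ask a running system about its kernel and its distribution independently:

# Show the release of the running kernel (the part that is Linux proper).
$ uname -r

# Show the name and version of the distribution wrapped around that kernel.
# Most current distros provide /etc/os-release; older ones may use files such
# as /etc/redhat-release or /etc/debian_version instead.
$ cat /etc/os-release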
You’ve most likely heard of the Linux distributions named Red Hat Enterprise Linux (RHEL),
Fedora, Debian, Mandrake, Ubuntu, Kubuntu, openSUSE, CentOS, Gentoo, and so on, which have
received a great deal of press.
Linux distributions can be broadly categorized into two groups. The first category includes the
purely commercial distros, and the second includes the noncommercial distros, or spins. The
commercial distros generally offer support for their distribution—at a cost. The commercial distros
also tend to have a longer release life cycle. Examples of commercial flavors of Linux-based distros
are RHEL and SUSE Linux Enterprise (SLE).
The noncommercial distros, on the other hand, are free. These distros try to adhere to the original
spirit of the open source software movement. They are mostly community supported and maintained—
the community consists of the users and developers. The community support and enthusiasm can
sometimes surpass that provided by the commercial offerings.
Several of the so-called noncommercial distros also have the backing and support of their
commercial counterparts. The companies that offer the purely commercial flavors have vested
interests in making sure that free distros exist. Some of the companies use the free distros as the
proofing and testing ground for software that ends up in the commercial spins. Examples of
noncommercial flavors of Linux-based distros are Fedora, openSUSE, Ubuntu, Linux Mint, Gentoo,
and Debian. Linux distros such as Gentoo might be less well known and have not reached the same
scale of popularity as Fedora, openSUSE, and others, but they are out there and in active use by their
respective (and dedicated) communities.
What’s interesting about the commercial Linux distributions is that most of the programs with
which they ship were not written by the companies themselves. Rather, other people have released
their programs with licenses, allowing their redistribution with source code. By and large, these
programs are also available on other variants of UNIX, and some of them are becoming available
under Windows as well. The makers of the distribution simply bundle them into one convenient and
cohesive package that’s easy to install. In addition to bundling existing software, several of the
distribution makers also develop value-added tools that make their distribution easier to administer
or compatible with more hardware, but the software that they ship is generally written by others. To
meet certain regulatory requirements, some commercial distros try to incorporate/implement more
specific security requirements that the FOSS community might not care about but that some
institutions/corporations do care about.
What Is Open Source Software and GNU All About?
In the early 1980s, Richard Matthew Stallman began a movement within the software industry. He
preached (and still does) that software should be free. Note that by free, he doesn’t mean in terms of
price, but rather free in the same sense as freedom or libre. This means shipping not just a product,
but the entire source code as well. To clarify the meaning of free software, Stallman was once
famously quoted as saying:
“Free software” is a matter of liberty, not price. To understand the concept, you should think
of “free” as in “free speech,” not as in “free beer.”
Stallman’s policy was, somewhat ironically, a return to classic computing, when software was
freely shared among hobbyists on small computers and provided as part of the hardware by
mainframe and minicomputer vendors. It was not until the late 1960s that IBM considered selling
application software. Through the 1950s and most of the 1960s, IBM considered software as merely a
tool for enabling the sale of hardware.
This return to openness was a wild departure from the early 1980s convention of selling
prepackaged software, but Stallman’s concept of open source software was in line with the initial
distributions of UNIX from Bell Labs. Early UNIX systems did contain full source code. Yet by the
late 1970s, source code was typically removed from UNIX distributions and could be acquired only
by paying large sums of money to AT&T (now SBC). The Berkeley Software Distribution (BSD)
maintained a free version, but its commercial counterpart, BSDi, had to deal with many lawsuits from
AT&T until it could be proved that nothing in the BSD kernel came from AT&T.
Kernel Differences
Each company that sells a Linux distribution of its own will be quick to tell you that its kernel is
better than others. How can a company make this claim? The answer comes from the fact that
each company maintains its own patch set. To make sure that the kernels largely stay in sync,
most companies do adopt patches that are posted on www.kernel.org, the “Linux Kernel
Archives.” Vendors, however, typically do not track the release of every single kernel version
that is released onto www.kernel.org. Instead, they take a foundation, apply their custom patches
to it, run the kernel through their quality assurance (QA) process, and then take it to production.
This helps organizations have confidence that their kernels have been sufficiently baked, thus
mitigating any perceived risk of running open source–based operating systems.
The only exception to this rule revolves around security issues. If a security issue is found
with a version of the Linux kernel, vendors are quick to adopt the necessary patches to fix the
problem immediately. A new release of the kernel with the fixes is often made within a short
time (commonly less than 24 hours) so that administrators who install it can be sure their
installations are secure. Thankfully, exploits against the kernel itself are rare.
So if each vendor maintains its own patch set, what exactly is it patching? This answer
varies from vendor to vendor, depending on each vendor’s target market. Red Hat, for instance,
is largely focused on providing enterprise-grade reliability and solid efficiency for application
servers. This might be different from the mission of the Fedora team, which is more interested in
trying new technologies quickly, and even more different from the approach of a vendor that is
trying to put together a desktop-oriented or multimedia-focused Linux system.
What separates one distribution from the next are the value-added tools that come with each
one. Asking, “Which distribution is better?” is much like asking, “Which is better, Coke or
Pepsi?” Almost all colas have the same basic ingredients—carbonated water, caffeine, and
high-fructose corn syrup—thereby giving the similar effect of quenching thirst and bringing on a
small caffeine-and-sugar buzz. In the end, it’s a question of requirements: Do you need
commercial support? Did your application vendor recommend one distribution over another?
Does the software (package) updating infrastructure suit your site’s administrative style better
than another distribution? When you review your requirements, you’ll find that there is likely a
distribution that is geared toward your exact needs.
The idea of giving away source code is a simple one: A user of the software should never be
forced to deal with a developer who might or might not support that user’s intentions for the software.
The user should never have to wait for bug fixes to be published. More important, code developed
under the scrutiny of other programmers is typically of higher quality than code written behind locked
doors. One of the great benefits of open source software comes from the users themselves: Should
they need a new feature, they can add it to the original program and then contribute it back to the
source so that everyone else can benefit from it.
This line of thinking gave rise to a desire to release a complete UNIX-like system to the public, free of
license restrictions. Of course, before you can build any operating system, you need to build tools.
And this is how the GNU project was born.
NOTE GNU stands for GNU’s Not UNIX—recursive acronyms are part of hacker humor. If you
don’t understand why it’s funny, don’t worry. You’re still in the majority.
What Is the GNU Public License?
An important thing to emerge from the GNU project is the GNU Public License (GPL). This license
explicitly states that the software being released is free and that no one can ever take away these
freedoms. It is acceptable to take the software and resell it, even for a profit; however, in this resale,
the seller must release the full source code, including any changes. Because the resold package
remains under the GPL, the package can be distributed for free and resold yet again by anyone else
for a profit. Of primary importance is the liability clause: The programmers are not liable for any
damages caused by their software.
It should be noted that the GPL is not the only license used by open source software developers
(although it is arguably the most popular). Other licenses, such as BSD and Apache, have similar
liability clauses but differ in terms of their redistribution. For instance, the BSD license allows
people to make changes to the code and ship those changes without having to disclose the added code, whereas the GPL requires that the added code be shipped. For more information about other open
source licenses, check out www.opensource.org.
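On an installed system, you can usually see which license a given package carries; the commands below are a hedged sketch, since the exact metadata and file locations depend on your distribution’s packaging format:

# RPM-based distros (Fedora, RHEL, openSUSE) record a License field
# in each package's metadata.
$ rpm -qi bash | grep ^License

# Debian-based distros (Debian, Ubuntu) ship the license details in each
# package's copyright file.
$ less /usr/share/doc/bash/copyright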
Historical Footnote
Many, many moons ago, Red Hat started a commercial offering of its erstwhile free product
(Red Hat Linux). The commercial release gained steam with the Red Hat Enterprise Linux
(RHEL) series. Because the foundation for RHEL is GPL, individuals interested in maintaining a
free version of Red Hat’s distribution have been able to do so. Furthermore, as an outreach to
the community, Red Hat created the Fedora Project, which is considered the testing grounds for
new software before it is adopted by the RHEL team. The Fedora Project is freely distributed
and can be downloaded from http://fedoraproject.org.
Upstream and Downstream
To help you understand the concept of upstream and downstream components, let’s start with an
analogy. Picture, if you will, a pizza with all your favorite toppings.
The pizza is put together and baked by a local pizza shop. Several things go into making a great
pizza—cheeses, vegetables, flour (dough), herbs, meats, to mention a few. The pizza shop will often
make some of these ingredients in-house and rely on other businesses to supply other ingredients. The
pizza shop will also be tasked with assembling the ingredients into a complete finished pizza.
Let’s consider one of the most common pizza ingredients—cheese. The cheese is made by a
cheesemaker who makes her cheese for many other industries or applications, including the pizza
shop. The cheesemaker is pretty set in her ways and has very strong opinions about how her product
should be paired with other foodstuffs (wine, crackers, bread, vegetables, and so on). The pizza shop
owners, on the other hand, do not care about other foodstuffs—they care only about making a great
pizza. Sometimes the cheesemaker and the pizza shop owners will bump heads because of differences
in opinion and objectives. And at other times they will be in agreement and cooperate beautifully.
Ultimately (and sometimes unbeknown to them), the pizza shop owners and cheesemaker care about
the same thing: producing the best product that they can.
The pizza shop in our analogy here represents the Linux distribution vendors/projects (Fedora,
Debian, RHEL, openSUSE, and so on). The cheesemaker represents the different software project
maintainers that provide the important programs and tools (such as the Bourne Again Shell [BASH],
GNU Image Manipulation Program [GIMP], GNOME, KDE, Nmap, and GNU Compiler Collection
[GCC]) that are packaged together to make a complete distribution (pizza). The Linux distribution
vendors are referred to as the downstream component of the open source food chain; the maintainers
of the accompanying different software projects are referred to as the upstream component.
Standards
One argument you hear regularly against Linux is that too many different distributions exist, and
that by having multiple distributions, fragmentation occurs. The argument opines that this
fragmentation will eventually lead to different versions of incompatible Linuxes.
This is, without a doubt, complete nonsense that plays on “FUD” (fear, uncertainty, and
doubt). These types of arguments usually stem from a misunderstanding of the kernel and
distributions.
Ever since Linux became so mainstream, the community has understood that it needs a formal
method and standardization process for how certain things should be done among the numerous
Linux spins. As a result, two major standards are actively being worked on.
The File Hierarchy Standard (FHS) is an attempt by many of the Linux distributions to
standardize on a directory layout so that developers have an easy time making sure their
applications work across multiple distributions without difficulty. As of this writing, several
major Linux distributions have become completely compliant with this standard.
The Linux Standard Base (LSB) is a specification, maintained by a standards group, that defines what a Linux distribution should provide in terms of libraries and tools.
A developer who assumes that a Linux machine complies only with LSB and FHS is almost
guaranteed to have an application that will work with all compliant Linux installations. All of
the major distributors have joined these standards groups. This should ensure that all desktop
distributions will have a certain amount of commonality on which a developer can rely.
From a system administrator’s point of view, these standards are interesting but not crucial
to administering a Linux environment. However, it never hurts to learn more about both. For
more information on the FHS, go to their web site at www.pathname.com/fhs. To find out more
about LSB, check out www.linuxbase.org.
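To get a feel for what the FHS standardizes, list the top-level directories on any compliant distribution; the subset below (illustrative, not exhaustive) should exist and serve the same purpose everywhere:

$ ls /
# Typical FHS directories and their roles:
#   /bin    essential user commands
#   /sbin   essential system administration binaries
#   /etc    host-specific configuration files
#   /lib    shared libraries needed by /bin and /sbin
#   /usr    sharable, mostly read-only programs and data
#   /var    variable data such as logs and spools
#   /home   user home directories
#   /tmp    temporary files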
The Advantages of Open Source Software
If the GPL seems like a bad idea from the standpoint of commercialism, consider the surge of
successful open source software projects—they are indicative of a system that does indeed work.
This success has evolved for two reasons. First, as mentioned earlier, errors in the code itself are far
more likely to be caught and quickly fixed under the watchful eyes of peers. Second, under the GPL
system, programmers can release code without the fear of being sued. Without that protection, people
might not feel as comfortable to release their code for public consumption.
NOTE The concept of free software, of course, often begs the question of why anyone would release
his or her work for free. As hard as it might be to believe, some people do it purely for altruistic
reasons and the love of it.
Most projects don’t start out as full-featured, polished pieces of work. They often begin life as a
quick hack to solve a specific problem bothering the programmer at the time. As a quick-and-dirty
hack, the code might not have a sales value. But when this code is shared and consequently improved
upon by others who have similar problems and needs, it becomes a useful tool. Other program users
begin to enhance it with features they need, and these additions travel back to the original program.
The project thus evolves as the result of a group effort and eventually reaches full refinement. This
polished program can contain contributions from possibly hundreds, if not thousands, of programmers
who have added little pieces here and there. In fact, the original author’s code is likely to be little in
evidence.
There’s another reason for the success of generously licensed software. Any project manager who
has worked on commercial software knows that the real cost of development software isn’t in the
development phase. It’s in the cost of selling, marketing, supporting, documenting, packaging, and
shipping that software. A programmer carrying out a weekend hack to fix a problem with a tiny,
kludged program might lack the interest, time, and money to turn that hack into a profitable product.
When Linus Torvalds released Linux in 1991, he released it under the GPL. As a result of its open
charter, Linux has had a notable number of contributors and analyzers. This participation has made
Linux strong and rich in features. It is estimated that since the v.2.2.0 kernel, Torvalds’s contributions
represent less than 2 percent of the total code base.
NOTE This might sound strange, but it is true. Contributors to the Linux kernel code include the
companies with competing operating system platforms. For example, Microsoft was one of the top
code contributors to the Linux version 3.0 kernel code base (as measured by the number of changes or
patches relative to the previous kernel version). Even though this might have been for self-promoting
reasons on Microsoft’s part, the fact remains that the open source licensing model that Linux adopts
permits this sort of thing to happen. Anyone who knows how can contribute code, subject to peer review, from which everyone can benefit!
Because Linux is free (as in speech), anyone can take the Linux kernel and other supporting
programs, repackage them, and resell them. A lot of people and corporations have made money with
Linux doing just this! As long as these individuals release the kernel’s full source code along with
their individual packages, and as long as the packages are protected under the GPL, everything is
legal. Of course, this also means that packages released under the GPL can be resold by other people
under other names for a profit.
In the end, what makes a package from one person more valuable than a package from another
person are the value-added features, support channels, and documentation. Even IBM can agree to
this; it’s how the company made most of its money from 1930 to 1970, and again in the late 1990s and
early 2000s with IBM Global Services. The money isn’t necessarily in the product alone; it can also
be in the services that go with it.
The Disadvantages of Open Source Software
This section was included to provide a detailed, balanced, and unbiased contrast to the previous
section, which discussed some of the advantages of open source software.
Unfortunately we couldn’t come up with any disadvantages at the time of this writing! Nothing to
see here.
Understanding the Differences Between Windows and Linux
As you might imagine, the differences between Microsoft Windows and the Linux operating system
cannot be completely discussed in the confines of this section. Throughout this book, topic by topic,
you’ll read about the specific contrasts between the two systems. In some chapters, you’ll find no
comparisons, because a major difference doesn’t really exist.
But before we attack the details, let’s take a moment to discuss the primary architectural
differences between the two operating systems.
Single Users vs. Multiple Users vs. Network Users
Windows was originally designed according to the “one computer, one desk, one user” vision of
Microsoft’s co-founder, Bill Gates. For the sake of discussion, we’ll call this philosophy “single-user.” In this arrangement, two people cannot work in parallel running (for example) Microsoft Word
on the same machine at the same time. You can buy Windows and run what is known as Terminal
Server, but this requires huge computing power and extra costs in licensing. Of course, with Linux,
you don’t run into the cost problem, and Linux will run fairly well on just about any hardware.
Linux borrows its philosophy from UNIX. When UNIX was originally developed at Bell Labs in
the early 1970s, it existed on a PDP-7 computer that needed to be shared by an entire department. It
required a design that allowed for multiple users to log into the central machine at the same time.
Various people could be editing documents, compiling programs, and doing other work at the exact
same time. The operating system on the central machine took care of the “sharing” details so that each
user seemed to have an individual system. This multiuser tradition continues through today on other
versions of UNIX as well. And since Linux’s birth in the early 1990s, it has supported the multiuser
arrangement.
NOTE Most people believe that the term “multitasking” was invented with the advent of Windows
95. But UNIX has had this capability since 1969! You can rest assured that the concepts included in
Linux have had many years to develop and prove themselves.
Today, the most common implementation of a multiuser setup is to support servers—systems
dedicated to running large programs for use by many clients. Each member of a department can have a
smaller workstation on the desktop, with enough power for day-to-day work. When someone needs to
do something requiring significantly more processing power or memory, he or she can run the
operation on the server.
“But, hey! Windows can allow people to offload computationally intensive work to a single
machine!” you may argue. “Just look at SQL Server!” Well, that position is only half correct. Both
Linux and Windows are indeed capable of providing services such as databases over the network.
We can call users of this arrangement network users, since they are never actually logged into the
server, but rather send requests to the server. The server does the work and then sends the results
back to the user via the network. The catch in this case is that an application must be specifically
written to perform such server/client duties. Under Linux, a user can run any program allowed by the
system administrator on the server without having to redesign that program. Most users find the ability
to run arbitrary programs on other machines to be of significant benefit.
The Monolithic Kernel and the Micro-Kernel
Two forms of kernels are used in operating systems. The first, a monolithic kernel, provides all the
services the user applications need. The second, a micro-kernel, is much more minimal in scope and
provides only the bare minimum core set of services needed to implement the operating system.
Linux, for the most part, adopts the monolithic kernel architecture; it handles everything dealing
with the hardware and system calls. Windows, on the other hand, works off a micro-kernel design.
The Windows kernel provides a small set of services and then interfaces with other executive
services that provide process management, input/output (I/O) management, and other services. It has
yet to be proved which methodology is truly the best way.
Separation of the GUI and the Kernel
Taking a cue from the Macintosh design concept, Windows developers integrated the GUI with the
core operating system. One simply does not exist without the other. The benefit with this tight
coupling of the operating system and user interface is consistency in the appearance of the system.
Although Microsoft does not impose rules as strict as Apple’s with respect to the appearance of
applications, most developers tend to stick with a basic look and feel among applications. One reason
this is dangerous, however, is that the video card driver is now allowed to run at what is known as
“Ring 0” on a typical x86 architecture. Ring 0 is a protection mechanism—only privileged processes
can run at this level, and typically user processes run at Ring 3. Because the video card is allowed to
run at Ring 0, it could misbehave (and it does!), and this can bring down the whole system.
On the other hand, Linux (like UNIX in general) has kept the two elements—user interface and
operating system—separate. The X Window System interface is run as a user-level application,
which makes it more stable. If the GUI (which is complex for both Windows and Linux) fails, Linux’s
core does not go down with it. The GUI process simply crashes, and you get a terminal window. The
X Window System also differs from the Windows GUI in that it isn’t a complete user interface. It
defines only how basic objects should be drawn and manipulated on the screen.
One of the most significant features of the X Window System is its ability to display windows
across a network and onto another workstation’s screen. This allows a user sitting on host A to log
into host B, run an application on host B, and have all of the output routed back to host A. It is
possible for two people to be logged into the same machine, running a Linux equivalent of Microsoft
Word (such as OpenOffice or LibreOffice) at the same time.
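For instance, the network display capability is commonly exercised through SSH’s X11 forwarding. The following is a minimal sketch, assuming OpenSSH is installed on both machines, X11 forwarding is permitted on host B’s SSH server, and the hostnames and application are placeholders:

    # Sitting at host A, log into host B with X11 forwarding enabled:
    ssh -X user@hostB
    # Any graphical program started in this session runs on host B,
    # but its window is displayed back on host A's screen:
    libreoffice &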
In addition to the X Window System core, a window manager is needed to create a useful
environment. Linux distributions come with several window managers, including the heavyweight and
popular GNOME and KDE environments (both of which are available on other variants of UNIX as
well). Both GNOME and KDE offer an environment that is friendly, even to the casual Windows
user. If you’re concerned with speed, you can look into the F Virtual Window Manager (FVWM),
Lightweight X11 Desktop Environment (LXDE), and Xfce window managers. They might not have all
the glitz of KDE or GNOME, but they are really fast and lightweight.
So which approach is better—Windows or Linux—and why? That depends on what you are trying
to do. The integrated environment provided by Windows is convenient and less complex than Linux,
but out of the box, Windows lacks the X Window System feature that allows applications to display
their windows across the network on another workstation. The Windows GUI is consistent, but it
cannot be easily turned off, whereas the X Window System doesn’t have to be running (and
consuming valuable hardware resources) on a server.
NOTE With its latest server family (Windows Server 8 and newer), Microsoft has somewhat
decoupled the GUI from the base operating system (OS). You can now install and run the server in a
so-called “Server Core” mode. Windows Server 8 Server Core can run without the usual Windows
GUI. Managing the server in this mode is done via the command line or remotely from a regular
system, with full GUI capabilities.
The Network Neighborhood
The native mechanism for Windows users to share disks on servers or with each other is through the
Network Neighborhood. In a typical scenario, users attach to a share and have the system assign it a
drive letter. As a result, the separation between client and server is clear. The only problem with this
method of sharing data is more people-oriented than technology-oriented: People have to know which
servers contain which data.
With Windows, a new feature borrowed from UNIX has also appeared: mounting. In Windows
terminology, it is called reparse points. This is the ability to mount a CD-ROM drive into a directory
on your C drive. The concept of mounting resources (optical media, network shares, and so on) in
Linux/UNIX might seem a little strange, but as you get used to Linux, you’ll understand and appreciate
the beauty in this design. To get anything close to this functionality in Windows, you have to map a
network share to a drive letter.
Right from inception, Linux was built with support for the concept of mounting, and as a result,
different types of file systems can be mounted using different protocols and methods. For example, the
popular Network File System (NFS) protocol can be used to mount remote shares/folders and make
them appear local. In fact, the Linux Automounter can dynamically mount and unmount different file
systems on an as-needed basis.
A common example of mounting partitions under Linux involves mounted home directories. The
user’s home directories can reside on a remote server, and the client systems can automatically mount
the directories at boot time. So the /home directory exists on the client, but the /home/username
directory (and its contents) can reside on the server.
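As a rough sketch of how such a setup looks on the client side (the server name fileserver and the exported path /export/home are placeholders, and distributions vary in the exact mount options used):

    # Mount the remote home directories by hand (run as root):
    mount -t nfs fileserver:/export/home /home

    # Or have the client mount them automatically at boot time with a line
    # like this in /etc/fstab:
    fileserver:/export/home   /home   nfs   defaults   0 0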
Under Linux NFS and other Network File Systems, users never have to know server names or
directory paths, and their ignorance is your bliss. No more questions about which server to connect
to. Even better, users need not know when the server configuration must change. Under Linux, you can
change the names of servers and adjust this information on client-side systems without making any
announcements or having to reeducate users. Anyone who has ever had to reorient users to new server
arrangements will appreciate the benefits and convenience of this.
Printing works in much the same way. Under Linux, printers receive names that are independent of
the printer’s actual host name. (This is especially important if the printer doesn’t speak Transmission
Control Protocol/Internet Protocol, or TCP/IP.) Clients point to a print server whose name cannot be
changed without administrative authorization. Settings don’t get changed without your knowledge. The
print server can then redirect all print requests as needed. The unified interface that Linux provides
will go a long way toward improving what might be a chaotic printer arrangement in your network
environment. This also means you don’t have to install print drivers in several locations.
The Registry vs. Text Files
Think of the Windows Registry as the ultimate configuration database—thousands upon thousands of
entries, only a few of which are completely documented.
“What? Did you say your Registry got corrupted?” <maniacal laughter> “Well, yes, we can try to
restore it from last night’s backups, but then Excel starts acting funny and the technician (who charges
$65 just to answer the phone) said to reinstall.…”
In other words, the Windows Registry system can be, at best, difficult to manage. Although it’s a
good idea in theory, most people who have serious dealings with it don’t emerge from battling it
without a scar or two.
Linux does not have a registry, and this is both a blessing and a curse. The blessing is that
configuration files are most often kept as a series of text files (think of the Windows .ini files). This
setup means you’re able to edit configuration files using the text editor of your choice rather than tools
such as regedit. In many cases, it also means you can liberally comment those configuration files so
that six months from now you won’t forget why you set up something in a particular way. Most
software programs that are used on Linux platforms store their configuration files under the /etc
directory or one of its subdirectories. This convention is widely understood and accepted in the
FOSS world.
The curse of a no-registry arrangement is that there is no standard way of writing configuration
files. Each application can have its own format. Many applications are now coming bundled with
GUI-based configuration tools to alleviate some of these problems. So you can do a basic setup
easily, and then manually edit the configuration file when you need to do more complex adjustments.
In reality, having text files hold configuration information usually turns out to be an efficient
method. Once set, they rarely need to be changed; even so, they are straight text files and thus easy to
view when needed. Even more helpful is that it’s easy to write scripts to read the same configuration
files and modify their behavior accordingly. This is especially helpful when automating server
maintenance operations, which is crucial in a large site with many servers.
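As a quick illustration, a few lines of shell can inspect or adjust a text-based configuration file across many servers. The file name and option below are purely hypothetical:

    #!/bin/sh
    # Report and change a setting in a hypothetical plain-text config file.
    CONFIG=/etc/example.conf

    # Show the current value of the (hypothetical) MaxClients option:
    grep '^MaxClients' "$CONFIG"

    # Raise the limit, keeping a backup copy of the original file:
    sed -i.bak 's/^MaxClients.*/MaxClients 256/' "$CONFIG"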
Domains and Active Directory
If you’ve been using Windows long enough, you might remember the Windows NT domain controller
model. If twinges of anxiety ran through you when reading the last sentence, you might still be
suffering from the shell shock of having to maintain Primary Domain Controllers (PDCs), Backup
Domain Controllers (BDCs), and their synchronization.
Microsoft, fearing revolt from administrators all around the world, gave up on the Windows NT
model and created Active Directory (AD). The idea behind AD was simple: Provide a repository for
any kind of administrative data, whether it is user logins, group information, or even just telephone
numbers. In addition, provide a central place to manage authentication and authorization for a domain.
The domain synchronization model was also changed to follow a Domain Name System (DNS)–style
hierarchy that has proved to be far more reliable. NT LAN Manager (NTLM) was also dropped in
favor of Kerberos. (Note that AD is still somewhat compatible with NTLM.)
While running dcpromo might not be anyone’s idea of a fun afternoon, it is easy to see that AD
works pretty well.
Out of the box, Linux does not use a tightly coupled authentication/authorization and data store
model the way that Windows does with AD. Instead, Linux uses an abstraction model that allows for
multiple types of stores and authentication schemes to work without any modification to other
applications. This is accomplished through the Pluggable Authentication Modules (PAM)
infrastructure and the name resolution libraries that provide a standard means of looking up user and
group information for applications. It also provides a flexible way of storing that user and group
information using a variety of schemes.
For administrators looking to Linux, this abstraction layer can seem peculiar at first. However,
consider that you can use anything from flat files, to Network Information Service (NIS), to
Lightweight Directory Access Protocol (LDAP) or Kerberos for authentication. This means you can
pick the system that works best for you. For example, if you have an existing UNIX infrastructure that
uses NIS, you can simply make your Linux systems plug into that. On the other hand, if you have an
existing AD infrastructure, you can use PAM with Samba or LDAP to authenticate against the domain.
Use Kerberos? No problem. And, of course, you can choose to make your Linux system not interact
with any external authentication system. In addition to being able to tie into multiple authentication
systems, Linux can easily use a variety of tools, such as OpenLDAP, to keep directory information
centrally available as well.
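To make this concrete, the following is a minimal sketch of what the relevant plumbing looks like on many Linux systems; the exact backends (files, ldap, nis, and so on) and the PAM service file you would edit depend entirely on your site:

    # /etc/nsswitch.conf -- tells the name resolution libraries where to look
    # up user and group information (order matters):
    passwd:   files ldap
    group:    files ldap
    shadow:   files ldap

    # A line from a PAM service file (for example, /etc/pam.d/sshd) that hands
    # authentication off to a stackable module:
    auth      required      pam_unix.so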
Summary
In this chapter, we offered an overview of what Linux is and what it isn’t. We discussed a few of the
guiding principles, ideas, and concepts that govern open source software and Linux by extension. We
ended the chapter by covering some of the similarities and differences between core technologies in
the Linux and Microsoft Windows Server worlds. Most of these technologies and their practical uses
are dealt with in greater detail in the rest of this book.
If you are so inclined and would like to get more detailed information on the internal workings of
Linux itself, you might want to start with the source code. The source code can be found at
www.kernel.org. It is, after all, open source!
CHAPTER
2
Installing Linux in a Server
Configuration
The remarkable improvement and polish in the installation tools (and procedure) are partly
responsible for the mass adoption of Linux-based distributions. What once was a mildly
frightening process many years ago has now become almost trivial. Even better, there are many
ways to install the software; optical media (CD/DVD-ROMs) are no longer the only choice (although
they are still the most common). Network installations are part of the default list of options as well,
and they can be a wonderful help when you’re installing a large number of hosts. Another popular
method of installing a Linux distribution is installing from what is known as a “live CD,” which
simply allows you to try the software before committing to installing it.
Most default configurations in which Linux is installed are already capable of becoming servers.
It is usually just a question of installing and configuring the proper software to perform the needed
task. Proper practice dictates that a so-called server be dedicated to performing only one or two
specific tasks. Any other installed and irrelevant services simply take up memory and create a drag
on performance and, as such, should be avoided. In this chapter, we discuss the installation process
as it pertains to servers and their dedicated functions.
Hardware and Environmental Considerations
As you would with any operating system, before you get started with the installation process, you
should determine what hardware configurations will work. Each commercial vendor publishes a
hardware compatibility list (HCL) and makes it available on its web site. For example, Red Hat’s
HCL is at http://hardware.redhat.com (Fedora’s HCL can be safely assumed to be similar to Red
Hat’s), openSUSE’s HCL database can be found at http://en.opensuse.org/Hardware, Ubuntu’s HCL
can be found at https://wiki.ubuntu.com/HardwareSupport, and a more generic HCL for most Linux
flavors can be found at www.tldp.org/HOWTO/Hardware-HOWTO.
These sites provide a good starting reference point when you are in doubt concerning a particular
piece of hardware. However, keep in mind that new Linux device drivers are being churned out on a
daily basis around the world, and no single site can keep up with the pace of development in the open
source community. In general, most popular Intel-based and AMD-based configurations work without
difficulty.
A general rule that applies to all operating systems is to avoid cutting-edge hardware and
software configurations. Although their specs might appear impressive, they haven’t had the maturing
process some of the slightly older hardware has undergone. For servers, this usually isn’t an issue,
since there is no need for a server to have the latest and greatest toys such as fancy video cards and
sound cards. Your main goal, after all, is to provide a stable and highly available server for your
users.
Server Design
By definition, server-grade systems exhibit three important characteristics: stability, availability, and
performance. These three factors are usually improved through the purchase of more and better
hardware, which is unfortunate. It’s a shame to pay thousands of dollars extra to get a system capable
of excelling in all three areas when you could have extracted the desired level of performance out of
existing hardware with a little tuning. With Linux, this is not difficult; even better, the gains are
outstanding.
One of the most significant design decisions you must make when managing a server may not even
be technical, but administrative. You must design a server to be unfriendly to casual users. This
means no cute multimedia tools, no sound card support, and no fancy web browsers (when at all
possible). In fact, casual use of a server should be strictly prohibited as a rule.
Another important aspect of designing a server is making sure that it resides in the most
appropriate environment. As a system administrator, you must ensure the physical safety of your
servers by keeping them in a separate room under lock and key (or the equivalent). The only access to
the servers for non-administrative personnel should be through the network. The server room itself
should be well ventilated and kept cool. The wrong environment is an accident waiting to happen.
Systems that overheat and nosy users who think they know how to fix problems can be as great a
danger to server stability as bad software (arguably even more so).
Once the system is in a safe place, installing battery backup is also crucial. Backup power serves
two key purposes:
It keeps the system running during a power failure so that it can gracefully shut down, thereby
avoiding data corruption or loss.
It ensures that voltage spikes, drops, and other electrical noises don’t interfere with the health
of your system.
Here are some specific things you can do to improve your server performance:
Take advantage of the fact that the graphical user interface (GUI) is uncoupled from the core
operating system, and avoid starting the X Window System (Linux’s GUI) unless someone
needs to sit at a console and run an application. After all, like any other application, the X
Window System uses memory and CPU time, both of which are better off going to the more
essential server processes instead.
Determine what functions the server is to perform, and disable all other unrelated functions.
Not only are unused functions a waste of memory and CPU time, but they are just another
issue you need to deal with on the security front. (A brief command sketch follows this list.)
Unlike some other operating systems, Linux allows you to pick and choose the features you
want in the kernel. (You’ll learn about this process in Chapter 10.) The default kernel will
already be reasonably well tuned, so you won’t have to worry about it. But if you do need to
change a feature or upgrade the kernel, be picky about what you add. Make sure you really
need a feature before adding it.
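The sketch below shows what this looks like on a newer systemd-based distribution; on older releases the equivalents are the chkconfig command and the default runlevel in /etc/inittab. The service name is only an example of something a dedicated server rarely needs:

    # Boot to a plain text console instead of starting the GUI:
    systemctl set-default multi-user.target

    # Turn off a service the server does not need (example: printing):
    systemctl disable cups.service

    # Review what is still set to start at boot:
    systemctl list-unit-files | grep enabled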
NOTE You might hear an old recommendation that you recompile your kernel only to make the most
effective use of your system resources. This is no longer entirely true—the other reasons to recompile
your kernel might be to upgrade or add support for a new device or even to remove support for
components you don’t need.
Uptime
All of this chatter about taking care of servers and making sure silly things don’t cause them to crash
stems from a longtime UNIX philosophy: Uptime is good. More uptime is better.
The UNIX (Linux) uptime command tells the user how long the system has been running since its
last boot, how many users are currently logged in, and how much load the system is experiencing. The
last two are useful measures that are necessary for day-to-day system health and long-term planning.
(For example, if the server load has been staying abnormally and consistently high, it might mean that
it’s time to buy a faster/bigger/better server.)
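A sample run looks like the following (the numbers shown here are purely illustrative):

    $ uptime
     11:23:14 up 187 days,  4:02,  3 users,  load average: 0.08, 0.12, 0.10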
But the all-important number is how long the server has been running since its last reboot. Long
uptime is regarded as a sign of proper care, maintenance, and, from a practical standpoint, system
stability. You’ll often find UNIX administrators boasting about their server’s uptime the way you hear
car buffs boast about horsepower. This is also why you’ll hear UNIX administrators cursing at system
changes (regardless of operating system) that require a reboot to take effect. You may deny caring
about it now, but in six months, you’ll probably scream at anyone who reboots the system
unnecessarily. Don’t bother trying to explain this phenomenon to a non-admin, because they’ll just
look at you oddly. You’ll just know in your heart that your uptime is better than theirs!
Methods of Installation
With the improved connectivity and speed of both local area networks (LANs) and Internet
connections, it is becoming an increasingly popular option to perform installations over the network
rather than using a local optical drive (CD-ROM, DVD-ROM, and so on).
Depending on the particular Linux distribution and the network infrastructure already in place,
you can design network-based installations around several protocols, including the following popular
ones:
FTP (File Transfer Protocol) This is one of the earliest methods for performing network
installations.
HTTP (Hypertext Transfer Protocol) The installation tree is served from a web server.
NFS (Network File System) The distribution tree is shared/exported on an NFS server.
SMB (Server Message Block) This method is relatively uncommon, and not all
distributions support it. The installation tree can be shared on a Samba server or shared from
a Windows box.
The other, more typical method of installation is through the use of optical media provided by the
vendor. All the commercial distributions of Linux have boxed sets of their brand of Linux that contain
the install media. They usually also make CD/DVD-ROM images (ISOs) of the OS available on their
FTP and/or HTTP sites. The distros (distributions) that don’t make their ISOs available will usually
have a stripped-down version of the OS available in a repository tree on their site.
Another variant of installing Linux that has become popular is installing via a live distro
environment. This environment can be a live USB or even a live CD/DVD. This method provides
several advantages: It allows the user to try out (test drive) the distribution first before actually
installing anything onto the drive. It also allows the user to have a rough idea of how hardware and
other peripherals on the target system will behave. Keep in mind, though, that live distros are usually
a stripped-down version of the full distribution, so no final conclusions should be drawn from them;
with a little tweaking here and there, you can usually get troublesome hardware working after the fact,
though your mileage will vary.
We will be performing a server class install in this chapter using an image that was burnt to a
DVD. Of course, once you have gone through the process of installing from an optical medium
(CD/DVD-ROM), you will find performing the network-based installations straightforward. A side
note regarding automated installations is that server-type installs aren’t well suited to automation,
because each server usually has a unique task; thus, each server will have a slightly different
configuration. For example, a server dedicated to handling logging information sent to it over the
network is going to have especially large partitions set up for the appropriate logging directories,
compared to a file server that performs no logging of its own. (The obvious exception is for server
farms with large numbers of replicated servers. But even those installations have nuances that require
attention to detail specific to the installation.)
Installing Fedora
In this section, we will install a 64-bit version of a Fedora 16 distribution on a standalone system.
We will take a liberal approach to the process, installing some tools and subsystems possibly
relevant to server operations. Later chapters explore each subsystem’s purpose (and introduce new
ones) and help you determine which ones you really need to keep.
NOTE Don’t worry if you choose to install a different version or architecture of Fedora, because the
installation steps involved between versions are similar. You’ll be just fine if you choose to install a
Linux distro other than Fedora; luckily, most of the concepts carry over among the various
distributions. Some installers are just prettier than others.
Project Prerequisites
First, you need to download the ISO for Fedora that we will be installing. Fedora’s project web page
has a listing of several mirrors located all over the world. You should, of course, choose the mirror
geographically closest to you. The list of official mirrors can be found at
http://mirrors.fedoraproject.org/publiclist.
The DVD image used for this installation was downloaded from
http://download.fedoraproject.org/pub/fedora/linux/releases/16/Fedora/x86_64/iso/Fedora-16-x86_64-DVD.iso.
You can alternatively download the image from this mirror:
http://mirrors.kernel.org/fedora/releases/16/Fedora/x86_64/iso/Fedora-16-x86_64-DVD.iso.
NOTE Linux distributions are often packaged by the architecture on which they were compiled to
run. You would often find ISO images (and other software) named to reflect an architecture type.
Examples of the architecture types are x86, x86_64, ppc, and so on. The x86 refers to the Pentium
class family and their equivalents (such as i386, i586, i686, AMD Athlon, AthlonXP, Duron,
AthlonMP, Sempron, and so on). The PPC family refers to the PowerPC family (such as G3, G4, G5,
IBM pSeries, and so on). And the x86_64 family refers to the 64-bit platforms (such as Athlon,
Opteron, Phenom, EM64T, Intel Core i3 / i5 / i7, i9, and so on).
The next step is to burn the ISO to a suitable medium. In this case, we’ll use a blank DVD. Use
your favorite CD/DVD burning program to burn the image. Remember that the file you downloaded is
already an exact image of a DVD medium and so should be burnt as such. Most CD/DVD burning
programs have an option to create a CD or DVD from an image. Note that if you burn the file you
downloaded as you would a regular data file, you will end up with a single file on the root of your
DVD-ROM. This is not what you want. For optical media–based installations, the system on which
you are installing should have a DVD-ROM drive. If you plan on performing the installation using an
external flash-based media (such as a USB stick, a Secure Digital (SD) card, an external hard disk,
and so on), the system needs to be able to boot off such hardware. Appendix B discusses how to
create a Linux installer on flash-based media.
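As a rough sketch, verifying the downloaded image and writing it to a flash drive from an existing Linux system might look like the following. The checksum command assumes you have the published checksum to compare against, /dev/sdX is a placeholder for your flash device (dd overwrites it completely), and whether a given release’s image boots directly this way varies; Appendix B covers the details:

    # Compare the image against the checksum published on the mirror:
    sha256sum Fedora-16-x86_64-DVD.iso

    # Write the image to a flash drive (destroys all data on /dev/sdX):
    dd if=Fedora-16-x86_64-DVD.iso of=/dev/sdX bs=4M
    sync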
NOTE Some Linux distribution install images may also be available as a set of CD-ROM images or
a single DVD image. If you have a choice, as well as the proper hardware, you should opt for the
DVD image—you avoid having to swap out CDs in the middle of the install because all the required
files are already on the DVD, as opposed to multiple CDs. Also, the chances of having a bad
installation medium are reduced (that is, there is a higher probability of having one bad CD out of
four than of having one bad DVD out of one).
The Installation
Let’s begin the installation process.
Boot off the DVD-ROM. The system Basic Input Output System (BIOS) should be preconfigured
for this already. In the case of newer hardware, the Unified Extensible Firmware Interface (UEFI)
should likewise be configured to boot from the correct medium. This will present a welcome splash
screen:
1. If you do not press any key, the prompt will begin a countdown, after which the installation
process will start by booting the highlighted Install Or Upgrade Fedora option. You can also
press ENTER to start the process immediately.
2. At the Disc Found screen, press ENTER to test/verify your install media. This media verification
step can save you the trouble of starting the installation only to find out halfway through that
the installer will abort because of bad installation media. Press ENTER again at the Media Check
screen to begin testing.
3. After the media check runs to completion, you should see the Success screen that reports that
the media was successfully verified. At this point, it is safe to use the keyboard to select OK
to continue with the installation.
4. Click Next at the next screen.
5. Select the language you want to use to perform the installation in this screen (see illustration).
Then click the Next button.
6. Select your keyboard layout type. For this example, click the U.S. English layout. Click Next
to continue.
Initialize the Disk
This portion of the installation is probably the part that most new Linux users find the most awkward,
because of the different naming conventions that Linux uses. This needn’t be a problem, however—all
it takes is a slight mind shift. You should also keep in mind that “a partition is a partition is a
partition” in Linux or Windows or Mac OS.
1. You will see a screen (shown in Figure 2-1) asking you to select the type of devices that the
installation will involve. We will be performing the installation on our sample system using
traditional storage devices, such as hard disks. Select Basic Storage Devices and click Next.
Figure 2-1. Select devices
2. If you are performing the installation on a brand new disk (or a disk with no readable
partitions), you will see a storage device warning message about existing data. Select Yes,
Discard Any Data.
NOTE If the installer detects the presence of more than one block or disk device (SATA, IDE, SCSI,
flash drive, memory card, and so on) attached to the system, you will be presented with a different
screen that allows you to include or exclude the available block devices from the installation process.
Configure the Network
The next phase of the installation procedure is for network configuration, where you can configure or
tweak network-related settings for the system (see Figure 2-2).
Figure 2-2. Network configuration
The network configuration phase will give you the option to configure the hostname of the system
(the name defaults to localhost.localdomain). Note that this name can be changed easily after the OS
has been installed. For now, accept the default value supplied for the hostname.
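For reference, changing the hostname later takes one command plus one file edit. The exact file varies by release: on Red Hat-derived systems of this vintage it is the HOSTNAME= line in /etc/sysconfig/network, while newer systemd-based releases read /etc/hostname. The name below is an example:

    # Set the hostname for the running system:
    hostname server1.example.org

    # Make it persistent across reboots by editing the appropriate file, e.g.:
    #   /etc/sysconfig/network  ->  HOSTNAME=server1.example.org
    #   /etc/hostname           ->  server1.example.org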
The next important configuration task is related to the network interfaces on the system.
1. While still on the current screen, click the Configure Network button. You’ll see a Network
Connections dialog similar to the following:
2. Open the Wired tab and verify that an Ethernet card is listed. The first Ethernet interface—
System eth0 (or System em1 or System p1p1)—will be automatically configured using the
Dynamic Host Configuration Protocol (DHCP). You do not need to make any changes here.
Click Close.
NOTE Different network connection types (wired, wireless, mobile broadband, VPN, and DSL) will
be listed in the Network Connections dialog. And under the different network connection types, all the
correctly detected network interface hardware (such as Ethernet network cards) will be listed under
the corresponding connection type.
Depending on the distribution and the specific hardware setup, Ethernet devices in Linux are
normally named eth0, eth1, em1, em2, p1p1, p1p2, and so on. For each interface, you can either
configure it using DHCP or manually set the Internet Protocol (IP) address. If you choose to configure
manually, be sure to have all the pertinent information ready, such as the IP address, netmask, and so
on.
Also, don’t worry if you know that you don’t have a DHCP server available on your network that
will provide your new system with IP configuration information. The Ethernet interface will simply
remain unconfigured. The hostname of the system can also be automatically set via DHCP—if you
have a reachable and capable DHCP server on the network.
3. Back at the main screen (Figure 2-2), click Next.
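If you later need to configure an interface manually rather than via DHCP, the settings on a Red Hat-style system live in an interface file such as /etc/sysconfig/network-scripts/ifcfg-eth0. The addresses below are examples only:

    DEVICE=eth0
    BOOTPROTO=none
    ONBOOT=yes
    IPADDR=192.168.1.50
    NETMASK=255.255.255.0
    GATEWAY=192.168.1.1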
Time Zone Configuration
The Time Zone Configuration section is the next stage in the installation. Here you select the time
zone in which the machine is located.
1. If your system’s hardware clock keeps time in Coordinated Universal Time (UTC), select the
System Clock Uses UTC check box so that Linux can display the correct local time.
2. Scroll through the list of locations, and select the nearest city to your time zone. You can also
use the interactive map to select a specific city (marked by a yellow dot) to set your time
zone.
3. Click Next when you’re done.
Set the Root Password
Now you’ll set a password for the root user, also called the superuser. This user is the most
privileged account on the system and typically has full control of the system. It is equivalent to the
administrator account in Windows operating systems. Thus, it is crucial that you protect this account
with a good password. Be sure not to choose dictionary words or names as passwords, because they
are easy to guess and crack.
1. Enter a strong password in the Root Password text box.
2. Enter the same password again in the Confirm text box.
3. Click Next.
Storage Configuration
Before we delve into the partitioning setup proper, we will provide a quick overview of the
partitioning scheme and file system layout you will be employing for this installation. Note that the
installer provides the option to lay out the disk partition automatically, but we will not accept the
default layout so that we can configure the server optimally. The equivalent partitions in the Windows
world are also included in the overview:
/ The root partition/volume is identified by a forward slash (/). All other directories are
attached (mounted) to this parent directory. It is equivalent to the system drive (C:\) in
Windows.
/boot This partition/volume contains almost everything required for the boot process. It
stores data that is used before the kernel begins executing user programs. The equivalent of
this in Windows is the system partition (not the boot partition).
/usr This is where all of the program files will reside (similar to C:\Program Files in
Windows).
/home This is where everyone’s home directory will be (assuming this server will house
them). This is useful for keeping users from consuming an entire disk and leaving other
critical components without space (such as log files). This directory is synonymous with
C:\Documents and Settings\ in Windows XP/200x or C:\Users\ in the newer Windows
operating systems.
/var This is where system/event logs are generally stored. Because log files tend to grow in
size quickly and can also be affected by outside users (for instance, individuals visiting a
web site), it is important to store the logs on a separate partition so that no one can perform a
denial-of-service attack by generating enough log entries to fill up the entire disk. Logs are
generally stored in the C:\WINDOWS\system32\config\ directory in Windows.
/tmp This is where temporary files are placed. Because this directory is designed so that it
is writable by any user (similar to the C:\Temp directory in Windows), you need to make
sure arbitrary users don’t abuse it and fill up the entire disk. You ensure this by keeping it on
a separate partition.
Swap This is where the virtual memory file is stored. This isn’t a user-accessible file
system. Although Linux (and other flavors of UNIX as well) can use a normal disk file to
hold virtual memory the way Windows does, you’ll find that putting your swap file on its
own partition improves performance. You will typically want to configure your swap file to
be double the physical memory that is in your system. This is referred to as the paging file in
Windows.
Each of these partitions is mounted at boot time. The mount process makes the contents of that
partition available as if it were just another directory on the system. For example, the root directory
(/) will be on the first (root) partition. A subdirectory called /usr will exist on the root directory but
will have nothing in it. A separate partition can then be mounted such that going into the /usr directory
will allow you to see the contents of the newly mounted partition. All the partitions, when mounted,
appear as a unified directory tree rather than as separate drives; the installation software does not
differentiate one partition from another. All it cares about is which directory each file goes into. As a
result, the installation process automatically distributes its files across all the mounted partitions, as
long as the mounted partitions represent different parts of the directory tree where files are usually
placed.
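Once the system is installed, the unified tree is easy to see with the df command. Output similar to the following (the sizes and the volume group name are illustrative) shows several separate partitions and logical volumes all hanging off a single directory hierarchy:

    $ df -h
    Filesystem                      Size  Used Avail Use% Mounted on
    /dev/mapper/vg_server-LogVol00   20G  3.1G   16G  17% /
    /dev/sda2                       500M   85M  390M  18% /boot
    /dev/mapper/vg_server-LogVol02   50G  180M   47G   1% /home
    /dev/mapper/vg_server-LogVol03  5.8G   33M  5.5G   1% /tmp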
The disk partitioning tool used during the operating system installation provides an easy way to
create partitions and associate them to the directories on which they will be mounted. Each partition
entry will typically show the following information:
Device Linux associates each partition with a separate device. For the purpose of this
installation, you need to know only that under Integrated Drive Electronics (IDE) disks, each
device begins with /dev/sdXY, where X is a for an IDE master on the first chain, b for an IDE
slave on the first chain, c for an IDE master on the second chain, or d for an IDE slave on the
second chain, and where Y is the partition number of the disk. For example, /dev/sda1 is the
first partition on the primary chain, primary disk. Native Small Computer System Interface
(SCSI) disks follow the same basic idea, and each partition starts with /dev/sdXY, where X
is a letter representing a unique physical drive (a is for SCSI ID 1, b is for SCSI ID 2, and so
on). The Y represents the partition number. Thus, for example, /dev/sdb4 is the fourth
partition on the SCSI disk with ID 2. The system is a little more complex than Windows, but
each partition’s location is explicit—no more guessing “What physical device does drive E:
correspond to?”
Mount point The location where the partition is mounted.
Type This field shows the partition’s type (for example, ext2, ext3, ext4, swap, or vfat).
Format This field indicates whether the partition will be formatted.
Size (MB) This field shows the partition’s size (in megabytes, or MB).
Start This field shows the cylinder on your hard drive where the partition begins.
End This field shows the cylinder on your hard drive where the partition ends.
For the sake of simplicity, you will use only some of the disk boundaries described earlier for
your installation. In addition, you will leave some free space (unpartitioned space) that we can play
with in a later chapter (Chapter 7). You will carve up your hard disk into the following:
NOTE The /boot partition cannot be created on a Logical Volume Management (LVM) partition type.
The Fedora boot loader cannot read LVM-type partitions. This is true at the time of this writing, but it
could change in the future. For more on LVM, see Chapter 7.
The sample system on which this installation is being performed has a 100-gigabyte (GB) hard
disk. You will use the following sizes as a guideline on how to allocate the various sizes for each
partition/volume. You should, of course, adjust the suggested sizes to suit the overall size of the disk
you are using.
Mount Point/Partition and Size:
BIOS Boot: 2MB
/boot: 500MB
/: ~20GB
SWAP: ~4GB
/home: ~50GB
/tmp: ~5952MB (~6GB)
Free Space: ~20GB
NOTE A time-tested traditional/conservative approach was used in partitioning the disk in this
chapter. The same approach is used in creating the file systems. If you prefer a more cutting-edge
approach that employs all the latest disk and file system technologies (such as B-tree file system
[Btrfs], GPT partition labels, and so on), please see Appendix A, which walks through the
installation and setup of an openSUSE Linux distribution.
Now that you have some background on partitioning under Linux, let’s go back to the installation
process itself:
1. The current screen will present you with different types of installation options. Select the
Create Custom Layout option; then click Next.
2. You’ll see the Disk Setup screen:
3. Click Create. The Create Storage dialog box appears. Select Standard Partition and click
Create.
4. The Add Partition dialog box appears. Complete it with the following information for the
corresponding fields, as shown in the next illustration.
Mount Point: Accept the default value
File System Type: BIOS Boot
Allowable Drives: Accept the default value
Size (MB): 2 (~2MB)
Additional Size Options: Fixed size
5. Click OK when you’re done.
6. Click Create. The Create Storage dialog box appears. Select Standard Partition and click
Create again.
7. The Add Partition dialog box appears. Complete it with the following information for the
corresponding fields, as shown in the illustration.
Mount Point: /boot
File System Type: ext4
Allowable Drives: Accept the default value
Size (MB): 500
Additional Size Options: Fixed size
8. Click OK when you’re done.
NOTE The Fedora installer supports the creation of encrypted file systems. We will not use any
encrypted file systems on our sample system.
9. You will create the / (root), /home, /tmp, and swap containers on an LVM-type partition. To
do this, you will first need to create the parent physical volume. Click Create to open the
Create Storage dialog box.
10. Select LVM Physical Volume and then click Create to open another Add Partition dialog box,
shown next. The physical volume will be created with the information that follows:
Mount Point: Leave this field blank
File System Type: physical volume (LVM)
Allowable Drives: Accept the default value
Size (MB): 80000 (approximately 80GB)
Additional Size Options: Fixed size
11. Click OK when you’re done.
12. Back at the main disk overview screen, click the Create button again to open the Create
Storage dialog. Select the LVM Volume Group option and click Create.
13. In the Make LVM Volume Group dialog, accept the default values already provided for the
various fields (Volume Group Name, Physical Extent, and so on). Click Add.
14. The Make Logical Volume dialog box will appear. Complete the fields in the dialog box with
the information that follows:
Mount Point: /
File System Type: ext4
Logical Volume Name: LogVol00
Size (MB): 20000 (approximately 20GB)
The completed dialog box should resemble the one shown here:
15. Click OK when you’re done.
16. Click Add again in the Make LVM Volume Group dialog box. The Make Logical Volume
dialog box will appear. Complete the fields in the dialog box with the information that
follows:
Mount Point: Leave blank
File System Type: Swap
Logical Volume Name: LogVol01
Size (MB): 4000 (approximately double the total amount of random access memory, or RAM, available)
The completed dialog box should resemble the one shown here:
17. Click OK when you’re done.
18. Click Add again in the Make LVM Volume Group dialog box. The Make Logical Volume
dialog box will appear. Complete the fields in the dialog box with the information that
follows:
Mount Point: /home
File System Type: ext4
Logical Volume Name: LogVol02
Size (MB): 50000 (approximately 50GB)
19. Click OK when you’re done.
20. Click Add again in the Make LVM Volume Group dialog box. The Make Logical Volume
dialog box will appear. Complete the fields in the dialog box with the information that
follows:
Mount Point: /tmp
File System Type: ext4
Logical Volume Name: LogVol03
Size (MB): 5952 (or use up all the remaining free space on the volume group)
21. Click OK when you’re done.
22. The final and completed Make LVM Volume Group dialog box should resemble the one
shown here:
23. Click OK to close the dialog box.
24. You will be returned to the main disk overview screen. The final screen should be similar to
the one shown here:
NOTE You will notice that some free unpartitioned space remains under the Device column. This
was done deliberately so that we can play with that space in a later chapter without necessarily
having to reinstall the entire operating system to create free space.
25. Click Next to complete the disk-partitioning portion of the installation.
26. You might see a Format Warnings screen about pre-existing devices that need to be formatted,
thereby destroying all data. If you do see this warning, it is okay to confirm the format.
27. You might see another confirmation dialog box warning about “Writing Partitioning Options
to Disk” before the changes are actually executed. If you do see this warning, it is okay to
confirm writing the changes to disk. Click Write Changes to Disk.
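For reference, the same layout the graphical tool just created can be built or inspected later with the command-line LVM tools. The device and volume group names below are examples; Chapter 7 covers LVM in depth:

    pvcreate /dev/sda3                          # mark the partition as a physical volume
    vgcreate vg_server /dev/sda3                # create the volume group
    lvcreate -L 20G -n LogVol00 vg_server       # logical volume for /
    lvcreate -L 4G  -n LogVol01 vg_server       # swap
    lvcreate -L 50G -n LogVol02 vg_server       # /home
    lvcreate -l 100%FREE -n LogVol03 vg_server  # /tmp, using what remains
    mkfs.ext4 /dev/vg_server/LogVol00           # create the file systems (and so on)
    mkswap    /dev/vg_server/LogVol01
    pvs; vgs; lvs                               # inspect physical volumes, volume groups, logical volumes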
Configure the Boot Loader
A boot manager handles the process of actually starting the load process of an operating system.
GRUB is one of the popular boot managers for Linux. If you’re familiar with Windows, you have
already dealt with the NT Loader (NTLDR), which presents the menu at boot time.
The Boot Loader Configuration screen presents you with some options (see Figure 2-3). The first
option allows you to install a boot loader and the accompanying Change Device button, which lets
you specify the device on which to install the boot loader. On our sample system, it is being installed
on the Master Boot Record (MBR) of /dev/sda. The MBR is the first thing the system will read off
the disk when booting the computer. It is essentially the point where the built-in hardware tests finish
and pass off control to the software.
Figure 2-3. Boot Loader Configuration screen
The second option allows you to specify a boot loader password. We will not enable this option.
Typically, unless you really know what you are doing, you will want to accept the defaults
provided here and click Next.
NOTE Various Linux distributions customize the boot loader menu in different ways. Some
distributions automatically add a rescue mode entry to the list of available options. Some
distributions also add a Memory Test utility option to the menu.
To reiterate, most of the default values provided in this stage of the installation usually work fine
for most purposes. So, accept the default values provided, and click Next.
Select the Package Group
In this part of the installation, you can select what packages (applications) get installed onto the
system. Fedora categorizes these packages into several high-level categories, such as Graphical
Desktop, Software Development, Office and Productivity, and so on. Each category houses the
individual software packages that complement that category. This organization allows you to make a
quick selection of what types of packages you want installed and safely ignore the details.
Fedora gives you a menu of top-level package groups. You can simply pick the group(s) that
interest you.
1. In the top half of the screen, make sure that the Graphical Desktop option is selected.
2. Select the Customize Now option, and click Next.
The next screen allows you to customize the software packages to be installed. Here you can
choose to install a bare-bones system or install all the packages available on the installation medium.
CAUTION A full/everything install is not a good idea for a server-grade system such as the one we
are trying to set up.
The popular GNOME Desktop Environment might already be selected for you.
In addition to the package groups that are selected by default, we will install the KDE (K Desktop
Environment) package group. This additional selection will allow you to sample another popular
desktop environment that is available to Linux. An age-old debate among open source aficionados
regards which of the desktop environments is the best, but you will have to play around with them to
decide for yourself.
1. Select the KDE Software Compilation package group in the right pane, and accept the other
defaults. The completed screen with KDE selected is shown here:
NOTE The installer will begin the actual installation (checking for software dependencies, writing the
operating system to the disk, and so on) after the next step. If you develop cold feet at this point, you
can still safely back out of the installation without any loss of data (or self-esteem). To quit the
installer, simply reset your system by pressing CTRL-ALT-DEL on the keyboard or by pushing the reset or
power switch for the system.
2. Click Next to begin the installation.
3. The installation will begin, and the installer will show the progress of the installation. This is
a good time to study any available version-specific release notes for the operating system you
are installing.
4. Click the Reboot button in the final screen after the installation has completed. The system
will reboot itself.
Initial System Configuration
After the boot process completes, you will have to click through a quick, one-time customization
process. It is here that you can view the software license, add users to the system, and configure other
options.
1. On the Welcome screen, click Forward.
2. You’ll see a license information screen. Unlike other proprietary software licenses, you might
actually be able to read and understand the Fedora license in just a few seconds! Click the
Forward button to continue.
Create a User
This section of the initial system configuration allows you to create a nonprivileged (non-administrative) user account on the system. Creating and using a nonprivileged account
for day-to-day tasks is good system administration practice. You’ll learn how to create
additional users manually in Chapter 4. But for now, we’ll create a nonprivileged user as required by
the initial configuration process.
Select the Add To Administrators Group box and complete the fields in the Create User screen
with the following information and then click Forward.
Full Name: master
Username: master
Password: 72erty7!2
Confirm Password: 72erty7!2
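For reference, the same kind of account can also be created later from the command line (Chapter 4 covers this in detail). A rough sketch, with wheel standing in for the administrators group on Fedora-type systems:

    useradd -c "master" -m master      # create the user with a home directory
    passwd master                      # set the password interactively
    usermod -aG wheel master           # add the user to the administrators (wheel) group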
Date and Time Configuration
This section allows you to fine-tune the date- and time-related settings for the system. The system can
be configured to synchronize its time with a Network Time Protocol (NTP) server.
1. In the Date and Time screen, make sure that the current date and time shown reflect the actual
current date and time. Accept the other default settings.
2. Click Forward when you’re done.
Hardware Profile
This section’s settings are optional. Here you can submit a profile of your current hardware setup to
the Fedora project maintainers. The information sent does not include any personal information, and
the submission is completely anonymous.
1. Accept the preselected default, and click Finish.
2. If you see a dialog box prompting you to reconsider sending the hardware profile, go with
your heart.
Log In
The system is now set up and ready for use. You will see a Fedora login screen similar to the one
shown here. To log on to the system, click the master username and enter master’s password:
72erty7!2.
Installing Ubuntu Server
Here we provide a quick overview of installing the Ubuntu Linux distribution in a server
configuration.
First you need to download the ISO image for Ubuntu Server (Version 12.04 LTS, 64 bit). The
ISO image that was used on our sample system was downloaded from www.ubuntu.com/start-download?distro=server&bits=64&release=latest.
We will be performing the installation using an optical CD-ROM media. The downloaded CD
image therefore needs to be burned to a CD. Please note that the image can also be written to and
used on an external flash-based media (such as a USB stick, a Secure Digital (SD) card, an external
hard disk, and so on). Appendix B discusses how to create a Linux installer on flash-based media.
The same cautions and rules that were stated earlier in the chapter during the burning of the
Fedora image also apply here. After burning the ISO image onto an optical media, you should have a
bootable Ubuntu Server distribution. Unlike the Fedora installer or the Ubuntu Desktop installer, the
Ubuntu Server installer is text-based and is not quite as pretty as the others. Complete the following
steps to start and complete the installation.
Start the Installation
1. Insert the Ubuntu Server install media into the system’s optical drive.
2. Make sure that the system is set to use the optical drive as its first boot device in the system
BIOS or the UEFI.
3. Reboot the system if it is currently powered on.
4. Once the system boots from the install media, you will see an initial language selection splash
screen. Press ENTER to accept the default English language. The installation boot menu
shown next will be displayed.
5. Using the arrow keys on your keyboard, select the Install Ubuntu Server option, and then press
ENTER.
6. Select English in the Select A Language screen.
7. Select a country in the next screen. The installer will automatically suggest a country based on
your earlier choice. If the country is correct, press ENTER to continue. If not, manually select the
correct country and press ENTER.
8. Next comes the Keyboard Layout section of the installer. On our sample system, we choose
No to pick the keyboard layout manually.
9. Select English (US) when prompted for the origin of the keyboard in the next screen, and then
press ENTER.
10. Select English (US) again when prompted for keyboard layout, and press ENTER.
Configure the Network
Next comes the Configure the Network section. In the Hostname field, type ubuntu-server and then
press ENTER.
Set up Users, Passwords
1. After the software installation, you will be presented with the Set Up Users and Passwords
screen. In the Full Name field, type in the full name master admin, and then press ENTER.
2. Type master in the Username For Your Account field. Press ENTER to continue.
3. Create a password for the user master. Enter the password 72erty7!2 for the user, and press
ENTER. Retype the same password at the next screen to verify the password. Press ENTER again
when you’re done.
4. When you are prompted to encrypt your home directory, select No and press ENTER.
NOTE You might be prompted for proxy server information at some point during this stage of the
install. You can safely ignore the prompt and continue.
Configure the Time Zone
The Ubuntu installer will attempt to guess your correct time zone in the Configure the Clock screen.
If the suggested time zone is correct, select Yes and press ENTER.
If it is incorrect, select No and press ENTER. In the list of time zones, select the appropriate
time zone and press ENTER.
NOTE On some platforms, the time zone configuration portion of the installer might come before or
after the user and password creation portion of the installer.
Set up the Disk Partition
Use the arrow keys on your keyboard to select the Guided – Use Entire Disk and Set Up LVM option,
as shown here, and then press ENTER.
1. Another screen will appear, prompting you to select the disk to partition. Accept the default
and press ENTER.
2. If prompted to write the changes to disk and configure LVM, select Yes and press ENTER. You
might get a different prompt if you are installing the operating system on a disk with existing
partitions or volumes. In that case, you will need to confirm that you want to remove any
existing partitions or volumes in order to continue.
NOTE This section of the installer allows you to customize the actual partition structure of the
system. It is here that you can elect to set up different file systems for different uses (such as /var,
/home, and so on). The same concept used in creating the different partitions during the Fedora
installation earlier transfers over for Ubuntu. For the sake of brevity, we won’t show this on our
sample system here. We will instead accept the default partition and LVM layout recommended by the
installer. As a result, we will end up with only three file systems: /boot, /, and swap.
3. The Ubuntu server installer will prompt you to specify the amount of volume group to use for
the guided partitioning. Accept the default value. Select Continue and press ENTER.
4. A summary of the disk partitions and LVM layout will be displayed in the next screen. You
will be prompted to write the changes to disk. Select Yes and press ENTER.
5. The base software installation begins.
Other Miscellaneous Tasks
1. You’ll see a screen asking you to select how to manage upgrades on the system. Select the No
Automatic Updates option and press ENTER.
2. The next screen will inform you that your system has only the core system software installed.
Since we are not ready to do any software customization at this time, ignore this section and
press ENTER to continue.
NOTE If at any point you are prompted for UTC settings for the system, select Yes to confirm that
the system clock is set to UTC, and then press ENTER.
3. When you are presented with a screen asking to install the GRUB boot loader to the master
boot record, select Yes, and then press ENTER.
4. You will be presented with the Installation Complete screen and prompted to remove the
installation media. Press ENTER to continue.
5. The installer will complete the installation process by rebooting the system. Once the system
reboots, you will be presented with a simple login prompt (see Figure 2-4). You can log in as
the user that was previously created during the installation. The username is master and the
password is 72erty7!2.
Figure 2-4. Classic Linux login prompt
NOTE Appendix A walks through the installation of an openSUSE Linux distro.
Summary
You have successfully completed the installation process. If you are having problems with the
installation, be sure to visit Fedora’s web site at http://fedoraproject.org and the Ubuntu web site at
www.ubuntu.com and take a look at the various manuals and tips available.
The version release notes are also a good resource for specific installation issues. Even though
the install process discussed in this chapter used Fedora as the operating system of choice (with a
quick overview of the Ubuntu Server install process), you can rest assured that the installation
concepts for other Linux distributions are virtually identical. The install steps also introduced you to
some Linux/UNIX-specific concepts that will be covered in more detail in later chapters (for
example, hard disk naming conventions, partitioning, volume management, network configuration,
software management, and so on).
CHAPTER 3
Managing Software
System administrators deal with software and application management in various ways. Some
system administrators like to play it safe and generally abide by the principle of “if it’s not
broken, don’t fix it.” This approach has its benefits as well as its drawbacks. One of the
benefits is that the system tends to be more stable and behave in a predictable manner. Because the
core system software hasn’t changed drastically, it should pretty much behave the same way it did
yesterday, last week, last month, and so on. The drawback to this approach is that the system will lose
the benefits of bug fixes and security fixes that are available for the various installed applications if
these fixes are not applied.
Other system administrators take the exact opposite approach: They like to install the latest and
greatest software available. This approach also has its benefits and drawbacks. One of its benefits is
that the system tends to stay current as security flaws in applications are discovered and fixed. The
obvious drawback is that some of the newer software might not have had time to benefit from the
maturing process and hence might behave in slightly unpredictable ways.
Regardless of your system administration style, you will find that a great deal of your time will be
spent interacting with the various software components of the system, whether in keeping them up to
date, maintaining what you already have installed, or installing new software.
Of the many approaches to installing software on a Linux system, the preferred approach can
depend on the Linux distribution, the administrator’s skill level, and philosophical considerations.
From a purely technical perspective, software management under the mainstream Linux distros is
done via the following:
RPM The Red Hat Package Manager is the common method for Red Hat–like systems such
as Fedora, Red Hat Enterprise Linux (RHEL), and CentOS.
DPMS The Debian Package Management System is the basis for software management on
Debian-based systems, such as Ubuntu, Kubuntu, and Debian.
Source code The more traditional approach for the Linux die-hards and purists involves
compiling and installing the software by hand using the standard GNU compilation method or
the specific software directives.
The Red Hat Package Manager
RPM is a software management system that allows the easy installation and removal of software
packages—typically, precompiled software. An RPM file is a package that contains files needed for
the software to function correctly. A package consists of an archive of files and other metadata,
including configuration files, binaries, and even pre- and post-scripts to run while installing the
software. RPM is wonderfully easy to use, and several graphical interfaces have been built around it
to make it even easier. Several Linux distros and various third parties use this tool to distribute and
package their software. In fact, almost all of the software mentioned in this book is available in RPM
form. The reason you’ll go through the process of compiling software yourself in other chapters is so
that you can customize the software to suit your system, which might not be possible with an RPM.
NOTE In the present context, we are assuming that the RPM files contain precompiled
binaries. However, adhering to the open source principle, the various commercial and
noncommercial Linux distros are obliged to make the source code for most GNU binaries available.
(Those who don’t make it available by default are obliged to give it to you if you ask for it.) Some
Linux vendors stick to this principle more than others. Several Linux vendors, therefore, make the
source code for their binaries available in RPM form. For instance, Fedora and openSUSE make
source code available as an RPM, and it is becoming increasingly common to download and compile
source code in this fashion.
The RPM tool performs the installation and uninstallation of RPMs. The tool also maintains a
central database of what RPMs you have installed, where they are installed, when they were
installed, and other information about the package.
In general, software that comes in the form of an RPM is less work to install and maintain than
software that needs to be compiled. The trade-off is that by using an RPM, you accept the default
parameters supplied by the RPM maintainer. In most cases, these defaults are acceptable. However,
if you need to be more intimately aware of what is going on with a piece of software, or you require
functionality that is unusual or different from what is available in the RPM, you might find that by
compiling the source yourself, you will learn more about what software components and options exist
and how they work together.
Assuming that all you want to do is install a simple package, RPM is perfect. You can find
several great resources for RPM packages, beyond the base distribution repositories, at web sites
such as the following:
http://rpm.pbone.net
http://mirrors.kernel.org
http://freshrpms.net
Of course, if you are interested in more details about RPM itself, you can visit the RPM web site
at www.rpm.org. RPM comes with Fedora, CentOS, openSUSE, Mandrake, and countless other Red
Hat derivatives, including, most surprising of all, the Red Hat version of Linux! If you aren’t sure if
RPM comes with your distribution, check with your vendor.
NOTE Although the name of the package says “Red Hat,” the software can be used with other
distributions as well. In fact, RPM has even been ported to other operating systems, such as Solaris,
AIX, and IRIX. The source code to RPM is open source software, so anyone can take the initiative to
make the system work for them.
Following are the primary functions of the RPM:
Querying, verifying, updating, installing, and uninstalling software
Maintaining a database that stores various items of information about the packages
Packaging other software into an RPM form
Table 3-1, which includes frequently used RPM options, is provided for reference purposes only.
Command-Line Option    Description
--install      Installs a new package.
--upgrade      Upgrades or installs the package currently installed to a newer version.
--erase        Removes or erases an installed package.
--query        Used for querying or retrieving information about various attributes concerning installed (or uninstalled) packages.
--force        Tells RPM to forego any sanity checks and just do it, even if it thinks you’re trying to fit a square peg into a round hole. Be careful with this option; it’s the sledgehammer of installation. Typically, you use it when you’re knowingly installing an odd or unusual configuration and RPM’s safeguards are trying to keep you from doing so.
-h             Prints hash marks to indicate progress during an installation. Use with the -v option for a pretty display.
--percent      Prints the percentage completed to indicate progress. It’s handy if you’re running RPM from another program, such as a Perl script, and you want to know the status of the install.
--nodeps       Causes RPM to not perform any dependency checks if RPM is complaining about missing dependency files, but you want the installation to happen anyway.
-q             Queries the RPM system for information.
--test         Checks to see whether an installation would succeed without performing an actual installation. If it anticipates problems, it displays what they’ll be.
-V             Verifies RPMs or files on the system.
-v             Tells RPM to be verbose about its actions.
Table 3-1. Common RPM Options
Getting Down to Business
Chapter 2 walked you through the operating system installation process. Now that you have a
working system, you will need to log into the system to carry out the exercises in this and other
chapters of the book. Most of the exercises will implicitly ask you to type a command. Although
it might seem obvious, whenever you are asked to type a command, you will have to type it into
a console at the shell prompt. This is akin to the command or DOS prompt in Microsoft
Windows but is much more powerful.
You can type a command at the shell in several ways. One way is to use a nice, windowed
(GUI) terminal; another is to use the system console. The windowed consoles are known as
“terminal emulators” (or “pseudo-terminals”), and there are tons of them.
After logging into your chosen desktop (GNOME, KDE, Xfce, and so on), you can usually
launch a pseudo-terminal by right-clicking the desktop and selecting Launch Terminal from the
context menu. If you don’t have that particular option, look for an option in the applications menu
that says Run Command (or press ALT-F2 to launch the Run Application dialog box). After the
Run dialog box appears, you can then type the name of a terminal emulator into the Run text box.
A popular terminal emulator that is almost guaranteed (or your money back!) to exist on all
Linux systems is the venerable xterm. If you are in a GNOME desktop, gnome-terminal is
the default. If you are using KDE, the default is konsole.
NOTE Installing software and uninstalling software on a system is considered an administrative or
privileged function. This is why you will notice that most of the commands in the following sections
are performed with elevated privileges. The method of achieving this privileged elevation status
depends on the distro, but the ideas and tools used are common across almost all the distros. On the
other hand, querying the software database is not considered a privileged function.
Managing Software Using RPM
The following sections cover details of querying, installing, uninstalling, and verifying software on
Red Hat–type Linux distributions such as Fedora, RHEL, CentOS, and openSUSE. We will use actual
examples to clarify the details.
Querying for Information the RPM Way (Getting to Know One Another)
One of the best ways to begin any relationship is by getting to know the other party. Some of the
relevant information could include the person’s name, what she does for a living, her birthday, and
her likes and dislikes. The same rules apply to RPM packages. In a similar way, after you obtain a
piece of software (from the Internet, from the distribution’s CD/DVD, from a third party), you should
get to know the software before making it a part of your life—that is, your system. As you continue
working with Linux/UNIX, you will find that software names are somewhat intuitive, and you can
usually tell what a package is and does just by looking at its name. For example, to the uninitiated, it
might not be immediately obvious that a file named gcc-5.1.1.rpm is a package for the GNU Compiler
Collection (GCC). But once you get used to the system and you know what to look for, these types of
things will become more intuitive. You can also use RPM to query for other types of information,
such as the package’s build date, its weight (or its size), its likes and dislikes (or its dependencies),
and so on.
Let’s start working with RPM. Begin by logging into the system and launching a terminal.
Querying for All Packages Use the rpm command to list all the packages that are currently installed
on your system. At the shell prompt, type
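Presumably this is the query-all form, shown here in its short spelling (the note that follows covers the long and short option forms):

    rpm -qa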
This will give you a long listing of software installed.
NOTE Like most Linux commands, the rpm command also has its own long forms and short (or
abbreviated) forms of options or arguments. For example, the short form of the --query option is -q,
and the short form for --all is -a. We will mostly use short forms in this book, but we’ll
occasionally use the long forms just so you can see their relationship.
Querying Details for a Specific Package Let’s zero in on one of the packages listed in the output of
the preceding command, the bash application. Use rpm to see if you indeed have the bash application
installed on your system.
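A simple query does the trick, for example:

    rpm -q bash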
The output should be something similar to the second line, which shows that the bash package is
installed. It also shows the version number 4.2 appended to the package name.
NOTE When dealing with software packages in Linux distros, the exact software version number on
your system might be different from the version number on our sample system. Factors such as
updating the system and Linux distro version affect and determine exact package versions.
This is one of the reasons why we might sometimes truncate the version number in package names
in some exercises. For example, instead of writing bash-9.8.4.2.rpm, we might cheat and write bash-9.8.*.
One thing you can be assured of is that the main package name will almost always be the same—
that is, bash is bash in openSUSE, Fedora, Mandrake, CentOS, RHEL, Ubuntu, and so on.
This brings us to the next question. What is bash and what does it do? To find out, type the query
shown here:
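That would be an information query along these lines:

    rpm -qi bash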
This output gives us a lot of information. It shows the version number, the release, the description, the
packager, and more.
The bash package looks rather impressive. Let’s see what else comes with it. This command lists
all the files that come along with the bash package:
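For example:

    rpm -ql bash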
To list the configuration files (if any) that come with the bash package, type this:
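That is, something like:

    rpm -qc bash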
The querying capabilities of rpm are extensive. RPM packages have a lot of information stored in
tags, which make up the metadata of the package. You can query the RPM database for specific
information using these tags. For example, to find out the date that the bash package was installed on
your system, you can type the following command:
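One way to do this is with the --queryformat option and the INSTALLTIME tag (the exact format string here is our own illustration):

    rpm -q --queryformat '%{INSTALLTIME:date}\n' bash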
NOTE Because bash is a standard part of most Linux distros and would have been installed
automatically when you initially installed the OS, you will find that its install date will be close to the
day you installed the OS.
To find out what package group the bash application comes under, type the following:
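Again using a query format tag, something like:

    rpm -q --queryformat '%{GROUP}\n' bash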
You can, of course, always query for more than one package at the same time and also query for
multiple tag information. For example, to display the names and package groups for the bash and
xterm packages, type this:
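For instance (the format string is illustrative):

    rpm -q --queryformat '%{NAME}: %{GROUP}\n' bash xterm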
To determine what other packages on the system depend on the bash package, type
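A likely form is:

    rpm -q --whatrequires bash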
TIP The RPM queries noted here were issued on software that is currently installed on the system.
You can perform similar queries on software that you get from other sources as well—such as
software you are planning to install and that you have obtained from the Internet or from the
distribution CD/DVD. Similar queries can also be performed on packages that have not yet been
installed. To do this, you simply add the -p option to the end of the query command. For example, say
you’ve just downloaded a package named joe-9.1.6.noarch.rpm into your current working directory
and you want to query the uninstalled package to get more information about it. You would type this:
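Something like this (note the added -p):

    rpm -qip joe-9.1.6.noarch.rpm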
Installing Software with RPM (Moving in Together)
Okay, you are now both ready to take the relationship to the next stage. You have decided to move in
together. This can be a good thing, because it allows both of you to see and test how truly compatible
you are. This stage of relationships is akin to installing the software package on your system—that is,
moving the software into your system.
In the following procedures, you will install a simple text-based web browser application called
“lynx” onto your system. First, you will need to get a copy of the RPM package for lynx. You can get
this program from several places (the install CDs/DVD, the Internet, and so on). The example that
follows uses a copy of the program that came with the DVD used during the installation.
The CD/DVD needs to be mounted before the system can access its contents. To mount the DVD,
insert it into the drive and launch a console. An icon for the DVD should appear on the desktop after a
brief delay.
The RPM files are stored under the Packages directory under the mount point of your DVD/CD
device. For example, if your Fedora DVD is mounted under the /media/dvd directory, the path to the
Packages folder will be /media/dvd/Packages/.
NOTE If you don’t have a Fedora CD or DVD, you can download the RPM we’ll be using in the next
section from
http://download.fedora.redhat.com/pub/fedora/linux/releases/16/Everything/x86_64/os/Packages/lynx-2.*.x86_64.rpm.
Let’s step through the process of installing an RPM.
1. Launch a virtual terminal.
2. Assuming your distribution install media disc is mounted at the /media/dvd mount point,
change to the directory that usually contains the RPM packages on the DVD. Type the
following:
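Using the mount point mentioned above:

    cd /media/dvd/Packages/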
3. You can first make sure that the file you want is indeed in that directory. Use the ls
command to list all the files that start with the letters lyn in the directory:
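For instance:

    ls lyn*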
4. Now that you have confirmed that the file is there, perform a test install of the package. This
will run through the motions of installing the package without actually installing anything on
the system. This is useful in making sure that all the needs (dependencies) of a package are
met. Type the following:
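A test install looks something like this (the wildcard stands in for the exact file name on your media):

    rpm -ivh --test lynx-*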
Everything looks okay. If you get a warning message about the signature, you can safely ignore
it for now.
5. Now perform the actual installation:
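That is, the same command without the --test switch:

    rpm -ivh lynx-*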
6. Run a simple query to confirm that the application is installed on your system:
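For example:

    rpm -q lynx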
The output shows that lynx is now available on the system.
In case you were wondering, lynx is a text-based web browser. You can launch it by simply
typing lynx at the shell prompt. To quit lynx, simply press Q. You will get a prompt at the lower-right
corner of your terminal to confirm that you want to quit lynx. Press ENTER to confirm.
As you can see, installing packages via RPM can be easy. But sometimes installing packages can
be trickier, usually due to failed or missing dependencies. For example, the lynx package might
require that the bash package be installed on the system before lynx can be successfully installed.
TIP You can easily make the contents of the operating system image that you downloaded in Chapter
1 accessible by mounting the ISO file. For example, to mount the Fedora DVD image named Fedora-16-x86_64-DVD.iso at the directory /media/iso, you can type
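A command along these lines will do it (run with root privileges; the mount point must already exist):

    mkdir -p /media/iso
    mount -o loop Fedora-16-x86_64-DVD.iso /media/iso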
Let’s step through installing a more complex package to see how dependencies are handled with
RPM. Assuming you are still in the Packages directory of the DVD media, do the following:
1. Install the gcc package by typing the following:
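Something along these lines (the wildcard stands in for the exact version on your media):

    rpm -ivh gcc-4.*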
The output does not look good. It tells us that gcc-4.* depends on (needs) some other packages
—binutils, cloog-ppl, cpp, glibc-devel, and libmpc.
2. Fortunately, because we have access to the DVD media that contains most of the packages for
this distro in a single directory, we can easily include the additional package to our install list
like so:
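For illustration, the expanded install list might look like this (the wildcards are ours; adjust them to the actual file names on your media):

    rpm -ivh gcc-4.* binutils-* cloog-ppl-* cpp-* glibc-devel-* libmpc-*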
Uh-oh. It looks like this particular partner is not going to be easy to move in. The output tells
us that the glibc-devel* package depends on another package, called glibc-headers*. And the
cloog-ppl-* package needs something else called libppl*, and so on.
3. Add the newest dependency to the install list:
After all we have given to this relationship, all we get is more complaining. The last
requirement is the kernel-headers* package. We need to satisfy this requirement, too.
4. Looks like we are getting close to the end. We add the final required package to the list:
It was tough, but you managed to get the software installed.
TIP When you perform multiple RPM installations in one shot, as you did in this example, it is
called an RPM transaction.
A popular option used in installing packages via RPM is the -U (for upgrade) option. It is
especially useful when you want to install a newer version of a package that already exists. It will
simply upgrade the already installed package to the newer version. This option also does a good job
of keeping intact your custom configuration for an application.
For example, if you had lynx-9-7.rpm installed and you wanted to upgrade to lynx-9-9.rpm, you
would type rpm -Uvh lynx-9-9.rpm. Note that you can use the -U option to perform a regular
installation of a package even when you are not upgrading.
Uninstalling Software with RPM (Ending the Relationship)
Things didn’t quite work out the way you both had anticipated. Now it is time to end the relationship.
The other partner was never any good anyhow, so we’ll simply clean them out of your system.
Cleaning up after itself is one of the areas in which RPM truly excels. And this is one of its key
selling points as a software manager in Linux systems. Because a database of various pieces of
information is stored and maintained along with each installed package, it is easy for RPM to refer
back to its database to collect information about what was installed and where.
NOTE A slight caveat applies here. As with Windows install/uninstall tools, all the wonderful things
that RPM can do are also dependent on the software packager. For example, if a software application
was badly packaged and its removal scripts were not properly formatted, you might still end up with
bits of the package on your system, even after uninstalling. This is one of the reasons why you should
always get software only from trusted sources.
Removing software with RPM is quite easy and can be done in a single step. For example, to
remove the lynx package that we installed earlier, we simply need to use the -e option, like so:
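That is:

    rpm -e lynx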
This command will usually not give you any feedback if everything went well. To get a more
verbose output for the uninstallation process, add the -vvv option to the command.
A handy feature of RPM is that it will also protect you from removing packages that are needed
by other packages. For example, if we try to remove the kernel-headers package (recall that the
gcc package depended on it), we’d see the following:
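The attempted removal would look like this (the resulting dependency error output is not reproduced here):

    rpm -e kernel-headers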
NOTE Remember that the glibc-headers* package required this package, too. And so RPM will do
its best in helping you maintain a stable software environment. But if you are adamant about shooting
yourself in the foot, RPM will also allow you to do that (perhaps because you know what you are
doing). If, for example, you wanted to forcefully perform the uninstallation of the kernel-headers
package, you would add the --nodeps option to the uninstallation command, like this:
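For example:

    rpm -e --nodeps kernel-headers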
Other Things RPM Can Do
In addition to basic installation and uninstallation of packages with RPM, you can do numerous other
things with it. In this section, we walk through some of these other functions.
Verifying Packages One of the many useful functionalities provided by the RPM tool is the ability
to verify a package. What happens is that RPM looks at the package information in its database,
which is assumed to be good. It then compares that information with the binaries and files that are on
your system.
In today’s Internet world, where being hacked is a real possibility, this kind of test should tell you
instantly if anyone has tampered with the software installed on your system. For example, to verify
that the bash package is as it should be, type the following:
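That would be:

    rpm -V bash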
The absence of any output is a good sign.
You can also verify specific files on the file system that a particular package installed. For
example, to verify that the /bin/ls command is valid, you would type this:
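Something like:

    rpm -Vf /bin/ls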
Again, the lack of output is a good thing.
If something was amiss—for example, if the /bin/ls command had been replaced by a dud version
—the verify output might be similar to this:
If something is wrong, as in this example, RPM will inform you of what test failed. Some example
tests are the MD5 checksum test, file size, and modification times. The moral of the story is that RPM
is an ally in finding out what is wrong with your system.
Table 3-2 provides a summary of the various error codes and their meanings. You can use the
following command to verify all the packages installed on your system:
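Presumably the verify-all form:

    rpm -Va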
Code    Meaning
S       File size differs
M       Mode differs (includes permissions and file type)
5       MD5 sum differs
D       Device major/minor number mismatch
L       readLink-path mismatch
U       User ownership differs
G       Group ownership differs
T       Modification time (mtime) differs
Table 3-2. RPM Verification Error Attributes
This command verifies all the packages installed on your system. That’s a lot of files, so you
might have to give it some time to complete.
Package Validation Another feature of RPM is that it allows the packages to be digitally signed.
This provides a type of built-in authentication mechanism that allows a user to ascertain that the
package in his or her possession was truly packaged by the expected (trusted) party and also that the
package has not been tampered with along the line somewhere.
You sometimes need to tell your system manually whose digital signature to trust. This explains
the reason why you might see some warnings in the earlier procedures when you were trying to install
a package (such as this message: “warning: lynx-2.*.rpm: Header V3 RSA/SHA256 Signature, key ID
069c8460: NOKEY”). To prevent this warning message, you should import Fedora’s digital key into
your system’s key ring, like so:
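The exact key file name varies by release; on a Fedora system the distribution keys usually live under /etc/pki/rpm-gpg/, so the import would look something like this (the path and glob here are assumptions):

    rpm --import /etc/pki/rpm-gpg/RPM-GPG-KEY-fedora*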
You might also have to import other vendors’ keys into the key ring. To be extra certain that even
the local key you have is not a dud, you can import the key directly from the vendor’s web site. For
instance, to import a key from Fedora’s project site, you would type (replace <version> in the
command with your Fedora version—for example, 16, 17, 18, and so on):
Yum Yum is one of the more popular packaging/updating tools for managing software on Linux
systems. It is basically a wrapper program for RPM, with great enhancements. It has been around for
a while, but it has become more widely used and more prominent because major Linux vendors
decided to concentrate on their (more profitable) commercial product offerings. Yum has changed and
enhanced the traditional approach to package management on RPM-based systems. Popular large sites
that serve as repositories for open source software have had to retool slightly to accommodate
“Yumified” repositories.
According to the Yum project’s web page:
“Yum is an automatic updater and package installer/remover for RPM systems. It automatically
computes dependencies and figures out what things should occur to install packages. It makes it
easier to maintain groups of machines without having to manually update each one using RPM.”
This summary is an understatement. Yum can do a lot beyond that. Several Linux distributions rely
heavily on the capabilities provided by Yum.
Using Yum is simple on supported systems. You mostly need a single configuration file
(/etc/yum.conf). Other configuration files can be stored under the /etc/yum.repos.d/ directory that
points to the Yum-enabled (Yumified) software repository. Fortunately, several Linux distributions
now ship with Yum already installed and preconfigured. Fedora is one such distro.
To use Yum on a Fedora system (or any other Red Hat–like distro), to install a package called
gcc, for example, you would type the following at the command line:
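For example:

    yum install gcc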
Yum will automatically take care of any dependencies that the package might need and install the
package for you. (The first time it is run, it will build up its local cache.) Yum will even do your
dishes for you (your mileage may vary). Yum also has extensive search capabilities that will help you
find a package, even if you don’t know its correct name. All you need to know is part of the name. For
example, if you wanted to search for all packages that have the word “headers” in the name, you can
try a Yum option like this:
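A likely form is:

    yum search headers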
This will return a long list of matches. You can then look through the list and pick the package you
want.
NOTE By default, Yum tries to access repositories that are located somewhere on the Internet.
Therefore, your system needs to be able to access the Internet to use Yum in its default state. You can
also create your own local software repository on the local file system or on your local area network
(LAN) and Yumify it. Simply copy the entire contents of the distribution media (DVD/CD)
somewhere and run the createrepo command against the directory location.
GUI RPM Package Managers
For those who like a good GUI tool to help simplify their lives, several package managers with GUI
front-ends are available. Doing all the dirty work behind these pretty GUI front-ends on Red Hat–
based distros is RPM. The GUI tools allow you to do quite a few things without forcing you to
remember command-line parameters. Some of the more popular tools that ship with various
distributions or desktop environments are listed in the sections that follow.
Fedora
You can launch the GUI package management tool (see Figure 3-1) in Fedora by choosing
Applications | System Tools | Add/Remove Software. You can also launch the Fedora package
manager from the command line by typing:
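The launcher name shown here is an assumption on our part (it is the gnome-packagekit front-end used on recent Fedora releases):

    gpk-application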
Figure 3-1. Fedora GUI package manager
openSUSE and SLE
In openSUSE and SUSE Linux Enterprise (SLE), most of the system administration is done via a tool
called YaST, which stands for Yet Another Setup Tool. YaST is made up of different modules. For
adding and removing packages graphically on the system, the relevant module is called sw_single. So
to launch this module from the command line of a system running the SUSE distribution, you would
type
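A likely invocation, assuming the graphical YaST front-end is installed, is:

    yast2 sw_single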
The Debian Package Management System
The Debian Package Management System (DPMS) is the foundation for managing software on Debian
and Debian-like systems. As is expected of any software management system, DPMS provides for
easy installation and removal of software packages. Debian package names end with the .deb
extension.
At the core of the DPMS is the dpkg (Debian Package) application. dpkg works in the back-end of
the system, and several other command-line tools and GUI tools have been written to interact with it.
Packages in Debian are fondly called “.deb files” (pronounced dot deb). dpkg can directly manipulate
.deb files. Various other wrapper tools have been developed to interact with dpkg, either directly or
indirectly.
APT
APT is a highly regarded and sophisticated toolset. It is an example of a wrapper tool that interacts
directly with dpkg. APT is actually a library of programming functions that are used by other middle-
ground tools, such as apt-get and apt-cache, to manipulate software on Debian-like systems.
Several user-land applications have been developed that rely on APT. (User-land refers to nonkernel
programs and tools.) Examples of such applications are synaptic, aptitude, and dselect. The user-land
tools are generally more user-friendly than their command-line counterparts. APT has also been
successfully ported to other operating systems.
One fine difference between APT and dpkg is that APT does not directly deal with .deb packages;
instead, it manages software via the locations (repositories) specified in a configuration file. This file
is the sources.list file. APT utilities use the sources.list file to locate archives (or repositories) of
the package distribution system in use on the system.
It should be noted that any of the components of the DPMS (dpkg, APT, or the GUI tools) can be
used to manage software directly on Debian-like systems. The tool of choice depends on the user’s
level of comfort and familiarity with the tool in question.
Figure 3-2 shows what can be described as the “DPMS triangle.” The tool at the apex of the
triangle (dpkg) is the most difficult to use and the most powerful, followed by the next easiest to use
(APT), and then the user-friendly user-land tools.
Figure 3-2. DPMS triangle
Software Management in Ubuntu
As mentioned earlier, software management in the Debian-like distros such as Ubuntu is done using
DPMS and all the attendant applications built around it, such as APT and dpkg. In this section, we
will look at how to perform basic software management tasks on Debian-like distros.
Querying for Information
On your Ubuntu server, the equivalent command to list all currently installed software is
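That would be:

    dpkg -l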
The command to get basic information about the installed bash package is
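For instance:

    dpkg -l bash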
The command to get more detailed information about the bash package is
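Something like:

    dpkg -s bash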
To view the list of files that comes with the bash package, type
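For example:

    dpkg -L bash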
The querying capabilities of dpkg are extensive. You can use DPMS to query for specific
information about a package. For example, to find out the size of the installed bash package, you can
type
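One way is with dpkg-query and a custom output format (the format string here is just one possibility):

    dpkg-query -W --showformat='${Package} ${Installed-Size}\n' bash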
Installing Software in Ubuntu
You can install software on Ubuntu systems in several ways. You can use dpkg to install a .deb file
directly, or you can use apt-get to install any software available in the Ubuntu repositories on the
Internet or locally (CD/DVD ROM, file system, and so on).
NOTE Installing and uninstalling software on a system is considered an administrative or privileged
function. This is why you will notice that any commands that require superuser privileges are
preceded by the sudo command. The sudo command can be used to execute commands in the context
of a privileged user (or another user). On the other hand, querying the software database is not
considered a privileged function.
To use dpkg to install a .deb package named lynx_2.9.8-2ubuntu4_amd64.deb, type
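With superuser privileges, that is:

    sudo dpkg -i lynx_2.9.8-2ubuntu4_amd64.deb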
Using apt-get to install software is a little easier, because APT will usually take care of any
dependency issues for you. The only caveat is that the repositories configured in the sources.list file
(/etc/apt/sources.list) have to be reachable either over the Internet or locally. The other advantage to
using APT to install software is that you need to know only a part of the name of the software; you
don’t need to know the exact version number. You also don’t need to manually download the software
before installing. Some common apt-get options are listed in Table 3-3.
Command        Meaning
update         Retrieve new lists of packages
upgrade        Perform an upgrade
install        Install new packages
remove         Remove packages
autoremove     Remove automatically all unused packages
purge          Remove packages and config files
dist-upgrade   Distribution upgrade
check          Verify that there are no broken dependencies
Table 3-3. Common apt-get Options
To use apt-get to install a package called lynx, type
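That is:

    sudo apt-get install lynx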
Removing Software in Ubuntu
Uninstalling software (such as lynx) in Ubuntu using dpkg is as easy as typing this:
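For example:

    sudo dpkg -r lynx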
You can also use apt-get to remove software by using the remove option. To remove the lynx
package using apt-get, type
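That would be:

    sudo apt-get remove lynx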
A less commonly used method for uninstalling software with APT is to use the install switch,
appending a minus sign to the package name to be removed. This can be useful when you want to
install and remove another package in one shot.
To remove the already installed lynx package and simultaneously install another package called
curl using this method, type
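For instance (note the trailing minus sign on the package to be removed):

    sudo apt-get install curl lynx-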
APT makes it easy to remove software and any attendant configuration file(s) completely from a
system. This allows you truly to start from scratch by getting rid of any customized configuration files.
Assuming you want to remove the lynx application completely from the system, you would type
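Using the purge option listed in Table 3-3:

    sudo apt-get purge lynx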
GUI Package Managers for Debian-Based Systems (Ubuntu)
Several GUI software management tools are available on Debian-based distros such as Ubuntu. For
desktop-class systems, GUI tools are installed by default. Some of the more popular GUI tools in
Ubuntu are synaptic (see Figure 3-3) and adept. Ubuntu also has a couple of tools that are not exactly
GUIs, but offer a similar ease of use as their fat GUI counterparts. These tools are console-based or
text-based and menu-driven. Examples of such tools are aptitude (see Figure 3-4) and dselect.
Figure 3-3. Synaptic Package Manager
Figure 3-4. Aptitude Package Manager
Compile and Install GNU Software
One of the key benefits of open source software is that it gives you access to the source code. If the
developer chooses to stop working on it, you can continue (if you know how). If you find a problem,
you can fix it. In other words, you are in control of the situation and not at the mercy of a commercial
developer you can’t control. But having the source code means you need to be able to compile it, too.
Otherwise, all you have is a bunch of text files that can’t do much.
Although almost every piece of software in this book is available in RPM or .deb format, we will
step through the process of compiling and building software from source code. Being able to do this
has the benefit of allowing you to pick and choose compile-time options, which is something you
can’t do with prebuilt RPMs. Also, an RPM might be compiled for a specific architecture, such as the
Intel 686, but that same code might run better if you compile it natively on, for example, your GigaexTeraCore-class CPU.
In this section, we will step through the process of compiling the hello package, a GNU software
package that might seem useless at first but exists for good reasons. Most GNU software conforms to
a standard method of compiling and installing; the hello package tries to conform to this standard, so
it makes an excellent example.
Getting and Unpacking the Package
The other relationship left a bad taste in your mouth, but you are ready to try again. Perhaps things
didn’t quite work out because there were so many other factors to deal with—RPM with its endless
options and seemingly convoluted syntax. And so, out with the old, in with the new. Maybe you’ll be
luckier this time around if you have more control over the flow of things. Although a little more
involved, working directly with source code will give you more control over the software and how
things take form.
Software that comes in source form is generally made available as a tarball—that is, it is
archived into a single large file and then compressed. The tools commonly used to do this are tar and
gzip. tar handles the process of combining many files into a single large file, and gzip is
responsible for the compression.
NOTE Typically, a single directory is selected in which to build and store tarballs. This allows the
system administrator to keep the tarball of each package in a safe place in the event he or she needs to
pull something out of it later. It also lets all the administrators know which packages are installed on
the system in addition to the base system. A good directory choice for this is /usr/local/src, since
software local to a site is generally installed in /usr/local.
Let’s try installing the hello package, one step at a time. We’ll begin by first obtaining a copy of
the source code.
Pull down the latest copy of the hello program used in this example from
www.gnu.org/software/hello or directly from http://ftp.gnu.org/gnu/hello/hello-2.7.tar.gz. We use
hello version 2.7 (hello-2.7.tar.gz) in this example. Save the file to the /usr/local/src/ directory.
TIP A quick way to download a file from the Internet (via FTP or HTTP) is using the command-line
utility called wget. For example, to pull down the hello program while at a shell prompt, you’d
simply type
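Using the URL given above:

    wget http://ftp.gnu.org/gnu/hello/hello-2.7.tar.gz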
The file will be automatically saved into your /usr/local/src/ working directory.
After downloading the file, you will need to unpack (or untar) it. When unpacked, a tarball will
generally create a new directory for all of its files. The hello tarball (hello-2.7.tar.gz), for example,
creates the subdirectory hello-2.7. Most packages follow this standard. If you find a package that
does not follow it, it is a good idea to create a subdirectory with a reasonable name and place all the
unpacked source files there. This allows multiple builds to occur at the same time without the risk of
the two builds conflicting.
First change your current working directory to the /usr/local/src directory where the hello tarball
was downloaded and saved to:
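That is:

    cd /usr/local/src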
Next use the tar command to unpack and decompress the hello archive:
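Using the z and v options described next, along with x (extract) and f (file):

    tar -xvzf hello-2.7.tar.gz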
The z parameter in this tar command invokes gzip to decompress the file before the untar
process occurs. The v parameter tells tar to show the name of the file it is untarring as it goes
through the process. This way, you’ll know the name of the directory where all the sources are being
unpacked.
NOTE You might encounter files that end with the .tar.bz2 extension. Bzip2 is a compression
algorithm that is gaining popularity, and GNU tar does support decompressing it on the command
line with the y or j option, instead of the z parameter.
A new directory, called hello-2.7, should have been created for you during the untarring. Change
to the new directory and list its contents:
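For example:

    cd hello-2.7
    ls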
NOTE Do not confuse the Linux gzip program with the Microsoft Windows WinZip program. They
are two different programs that use two different (but comparable) methods of compression. The
Linux gzip program can handle files that are compressed by WinZip, and the WinZip program knows
how to deal with tarballs.
Looking for Documentation
You have both now downloaded (found each other). Now is probably a good time to look around and
see if either of you comes with any special documentation (needs).
A good place to look for software documentation will be in the root of its directory tree. Once
you are inside the directory with all of the source code, begin looking for documentation.
NOTE Always read the documentation that comes with the source code! If there are any special
compile directions, notes, or warnings, they will most likely be mentioned here. You will save
yourself a great deal of agony by reading the relevant files first.
So, then, what are the relevant files? These files typically have names like README and
INSTALL. The developer might also have put any available documentation in a directory aptly
named docs.
The README file generally includes a description of the package, references to additional
documentation (including the installation documentation), and references to the author of the package.
The INSTALL file typically has directions for compiling and installing the package. These are not, of
course, absolutes. Every package has its quirks. The best way to find out is simply to list the directory
contents and look for obvious signs of additional documentation. Some packages use different
capitalization: readme, README, ReadMe, and so on. (Remember that Linux is case-sensitive.)
Some introduce variations on a theme, such as README.1ST or README.NOW, and so on.
While you’re in the /usr/local/src/hello-2.7 directory, use a pager to view the INSTALL file that
comes with the hello program:
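Using less, for example:

    less INSTALL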
Exit the pager by typing q when you are done reading the file.
TIP Another popular pager you can use instead of less is called more! (Historical note: more came
way before less.)
Configuring the Package
You both want this relationship to work and possibly last longer than the previous ones. So this is a
good time to establish guidelines and expectations.
Most packages ship with an auto-configuration script; it is safe to assume they do, unless their
documentation says otherwise. These scripts are typically named configure (or config), and they can
accept parameters. A handful of stock parameters are available across all configure scripts, but the
interesting stuff occurs on a program-by-program basis. Each package will have a few features that
can be enabled or disabled, or that have special values set at compile time, and they must be set up
via configure.
To see what configure options come with a package, simply run
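That is:

    ./configure --help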
Yes, those are two hyphens (--) before the word “help.”
NOTE One commonly available option is --prefix. This option allows you to set the base
directory where the package gets installed. By default, most packages use /usr/local. Each component
in the package will install into the appropriate directory in /usr/local.
If you are happy with the default options that the configure script offers, type the following:
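That is, with no extra arguments:

    ./configure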
With all of the options you want set up, a run of the configure script will create a special type of
file called a makefile. Makefiles are the foundation of the compilation phase. Generally, if configure
fails, you will not get a makefile. Make sure that the configure command did indeed complete without
any errors.
Compiling the Package
This stage does not quite fit anywhere in our dating model! But you might consider it as being similar
to that period when you are so blindly in love and everything just flies by and a lot of things are just
inexplicable.
All you need to do is run make, like so:
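From within the source directory:

    make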
The make tool reads all of the makefiles that were created by the configure script. These files tell
make which files to compile and the order in which to compile them—which is crucial, since there
could be hundreds of source files. Depending on the speed of your system, the available memory, and
how busy it is doing other things, the compilation process could take a while to complete, so don’t be
surprised.
As make is working, it will display each command it is running and all the parameters associated
with it. This output is usually the invocation of the compiler and all the parameters passed to the
compiler—it’s pretty tedious stuff that even the programmers were inclined to automate!
If the compile goes through smoothly, you won’t see any error messages. Most compiler error
messages are clear and distinct, so don’t worry about possibly missing an error. If you do see an
error, don’t panic. Most error messages don’t reflect a problem with the program itself, but usually
with the system in some way or another. Typically, these messages are the result of inappropriate file
permissions or files/libraries that cannot be found.
In general, slow down and read the error message. Even if the format is a little odd, it might
explain what is wrong in plain English, thereby allowing you to fix it quickly. If the error is still
confusing, look at the documentation that came with the package to see if there is a mailing list or email address you can contact for help. Most developers are more than happy to provide help, but you
need to remember to be nice and to the point. (In other words, don’t start an e-mail with a rant about
why the software is terrible.)
Installing the Package
You’ve done almost everything else. You’ve found your partner, you’ve studied them, you’ve even
compiled them—now it’s time to move them in with you—again!
Unlike the compile stage, the installation stage typically goes smoothly. In most cases, once the
compile completes successfully, all that you need to do is run the following:
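From the same directory, with the appropriate privileges (see the Tip that follows):

    make install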
This will install the package into the location specified by the default prefix (or the --prefix)
argument that was used with the configure script earlier.
It will start the installation script (which is usually embedded in the makefile). Because make
displays each command as it is executing it, you will see a lot of text fly by. Don’t worry about it—
it’s perfectly normal. Unless you see an error message, the package will be safely installed.
If you do see an error message, it is most likely because of permissions problems. Look at the last
file it was trying to install before failure, and then go check on all the permissions required to place a
file there. You might need to use the chmod, chown, and chgrp commands for this step.
TIP If the software being installed is meant to be used and available system-wide, then the make
install stage is almost always the stage that needs to be performed by the superuser (the root user).
Accordingly, most install instructions will require you to become root before performing this step. If,
on the other hand, a regular user is compiling and installing a software package for his or her own
personal use into a directory for which that user has full permissions (for example, by specifying --prefix=/home/user_name), then there is no need to become root to run the make install stage.
Testing the Software
A common mistake administrators make is to go through the process of configuring and compiling,
and then, when they install, they do not test the software to make sure that it runs as it should. Testing
the software also needs to be done as a regular user, if the software is to be used by non-root users.
In our example, you’ll run the hello command to verify that the permissions are correct and that
users won’t have problems running the program. You can quickly switch users (using the su
command) to make sure the software is usable by everyone.
Assuming that you accepted the default installation prefix for the hello program (the relevant files
will be under the /usr/local directory), use the full path to the program binary to execute it:
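For example:

    /usr/local/bin/hello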
Finally, try running the newly installed hello program as a regular nonprivileged user (yyang, for
example):
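One way to do this in a single step is with su (assuming the yyang account exists on your system):

    su - yyang -c /usr/local/bin/hello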
That’s it—you’re done.
Cleanup
Once the package is installed, you can do some cleanup to get rid of all the temporary files created
during the installation. Since you have the original source-code tarball, it is okay to get rid of the
entire directory from which you compiled the source code. In the case of the hello program, you
would get rid of /usr/local/src/hello-2.7.
Begin by going one directory level above the directory you want to remove. In this case, that
would be /usr/local/src:
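That is:

    cd /usr/local/src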
Now use the rm command to remove the actual directory, like so:
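For example:

    rm -rf hello-2.7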
The rm command, especially with the -rf parameter, is dangerous. It recursively removes an
entire directory without stopping to verify any of the files. It is especially potent when run by the root
user—it will shoot first and leave you asking questions later.
CAUTION Be careful, and make sure you are erasing what you mean to erase. There is no easy way
to undelete a file in Linux when working from the command line.
Common Problems When Building from Source Code
The GNU hello program might not seem like a useful tool, but one valuable thing it provides is the
ability to test the compiler on your system. If you’ve just finished the task of upgrading your compiler,
compiling this simple program will provide a sanity check that indeed the compiler is working.
Following are some other problems (and their solutions) you might encounter when building from
source.
Problems with Libraries
One problem you might run into is when the program can’t find a file of the type libsomething.so and
terminates for that reason. This file is a library. Linux libraries are synonymous with Dynamic Link
Libraries (DLLs) in Windows. They are stored in several locations on the Linux system and typically
reside in /usr/lib/, /usr/lib64/, and /usr/local/lib/. If you have installed a software package in a
location other than /usr/local, you will have to configure your system or shell to know where to look
for those new libraries.
NOTE Linux libraries can be located anywhere on your file system. You’ll appreciate the usefulness
of this when, for example, you have to use the Network File System (NFS) to share a directory (or, in
our case, software) among network clients. You’ll find that this design makes it easy for other
networked users or clients to run and use software or system libraries residing on the network shares
—as long as they can mount or access the share.
There are two methods for configuring libraries on a Linux system. One is to modify
/etc/ld.so.conf, by adding the path of your new libraries, or place a custom configuration file for your
application under the /etc/ld.so.conf.d/ directory. Once this is done, use the ldconfig -m command
to load in the new configuration.
You can also use the LD_LIBRARY_PATH environment variable to hold a list of library
directories to look for library files. Read the man pages for ldconfig and ld.so for more information.
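As a brief sketch of both approaches, suppose a package installed its libraries into a hypothetical directory /opt/myapp/lib (the path and file name here are ours, purely for illustration; run the first two commands as root):

    # System-wide: register the new directory with the dynamic linker
    echo "/opt/myapp/lib" > /etc/ld.so.conf.d/myapp.conf
    ldconfig

    # Per-shell alternative: extend the library search path for this session only
    export LD_LIBRARY_PATH=/opt/myapp/lib:$LD_LIBRARY_PATH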
Missing Configure Script
Sometimes, you will download a package and instantly type cd into its directory and run
./configure. And you will probably be shocked when you see the message “No such file or
directory.” As stated earlier in the chapter, read the README and INSTALL files that are packaged
with the software. Typically, the authors of the software are courteous enough to provide at least
these two files.
TIP It is common for many of us to want to jump right in and begin compiling something without first
looking at these documents, and then to come back hours later to find that a step was missed. The first
step you take when installing software is to read the documentation. It will probably point out the fact
that you need to run something exotic like imake first and then make. You get the idea: Always read
the documentation first, and then proceed to compiling the software.
Broken Source Code
No matter what you do, it is possible that the source code that you have is simply broken and the only
person who can get it to work or make any sense of it is its original author or some other software
developers. You might have already spent countless hours trying to get the application to compile and
build before coming to this conclusion and throwing in the towel. It is also possible that the author of
the program has not documented some valuable or relevant information. In cases like this, you should
try to see if precompiled binaries for the application already exist for your specific Linux distro.
Summary
You’ve explored the common functions of the popular RPM and DPMS. You used various options to
manipulate RPM and .deb packages by querying, installing, and uninstalling sample packages.
You learned and explored various software management techniques using purely command line
tools. We also discussed a few GUI tools that are used on popular Linux distributions. The GUI tools
are similar to the Windows Add/Remove Programs Control Panel applet. Just point and click. The
chapter also briefly touched on a popular software management system in Linux called Yum.
Using an available open source program as an example, you went through the steps involved in
configuring, compiling, and building software from the raw source code.
As a bonus, you also learned a thing or two about the mechanics of human relationships!
PART II
Single-Host Administration
CHAPTER 4
Managing Users and Groups
Linux/UNIX was designed from the ground up to be a multiuser operating system. But a multiuser
operating system would not be much good without users! And this brings us to the topic of
managing users in Linux.
On computer systems, a system administrator sets up user accounts, which determine who has
access to what. The ability of a person to access a system is determined by whether that user exists
and has the proper permissions to use the system.
Associated with each user is some “baggage,” which can include files, processes, resources, and
other information. When dealing with a multiuser system, a system administrator needs to understand
what constitutes a user (and the user’s baggage) and a group, how they interact together, and how they
affect available system resources.
In this chapter, we will examine the technique of managing users on a single host. We’ll begin by
exploring the actual database files that contain information about users. From there, we’ll examine the
system tools available to manage the files automatically.
What Exactly Constitutes a User?
Under Linux, every file and program must be owned by a user. Each user has a unique identifier
called a user ID (UID). Each user must also belong to at least one group, a collection of users
established by the system administrator. Users may belong to multiple groups. Like users, groups also
have unique identifiers, called group IDs (GIDs).
The accessibility of a file or program is based on its UIDs and GIDs. A running program inherits
the rights and permissions of the user who invokes it. (An exception to this rule is SetUID and SetGID
programs, discussed in “Understanding SetUID and SetGID Programs” later in this chapter.)
Each user’s rights can be defined in one of two ways: as those of a normal user or the root user.
Normal users can access only what they own or have been given permission to run; permission is
granted because the user either belongs to the file’s group or because the file is accessible to all
users. The root user is allowed to access all files and programs in the system, whether or not root
owns them. The root user is often called a superuser.
If you are accustomed to Windows, you can draw parallels between that system’s user
management and Linux’s user management. Linux UIDs are comparable to Windows SIDs (security
identifiers), for example. In contrast to Microsoft Windows, you might find the Linux security model
maddeningly simplistic: Either you’re root or you’re not. Normal users do not easily have root
privileges in the same way normal users can be granted administrator access under Windows.
Although this approach is a little less common, you can also implement finer grained access control
through the use of access control lists (ACLs) in Linux, as you can with Windows. Which system is
better? Depends on what you want and whom you ask.
Where User Information Is Kept
If you’re already accustomed to user management in Microsoft Windows Server environments, you’re
definitely familiar with Active Directory (AD), which takes care of the nitty-gritty details of the user
and group database. Among other things, this nitty-gritty includes the SIDs for users, groups, and other
objects. AD is convenient, but it makes developing your own administrative tools trickier, since the
only other way to read or manipulate user information is through a series of Lightweight Directory
Access Protocol (LDAP), Kerberos, or programmatic system calls.
In contrast, Linux takes the path of traditional UNIX and keeps all user information in straight text
files. This is beneficial for the simple reason that it allows you to make changes to user information
without the need of any other tool but a text editor. In many instances, larger sites take advantage of
these text files by developing their own user administration tools so that they can not only create new
accounts, but also automatically make additions to the corporate phone book, web pages, and so on.
However, administrators and users working with the UNIX style of user management for the first time may prefer to stick with the
basic user management tools that come with the Linux distribution. We’ll discuss those tools in “User
Management Tools” later in this chapter. For now, let’s examine the text files that store user and
group information in Linux.
NOTE This chapter covers the traditional Linux/UNIX methods for storing and managing user
information. Chapters 25 and 26 of this book discuss some other mechanisms (such as NIS and
LDAP) for storing and managing users and groups in Linux-based operating systems.
The /etc/passwd File
The /etc/passwd file stores the user’s login, encrypted password entry, UID, default GID, name
(sometimes called GECOS), home directory, and login shell. Each line in the file represents
information about a user. The lines are made up of various standard fields, with each field delimited
by a colon. A sample entry from a passwd file with its various fields is illustrated in Figure 4-1.
Figure 4-1. Fields of the /etc/passwd file
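A representative entry, using the sample user yyang that is created later in this chapter, looks like this; the seven colon-delimited fields are the username, password placeholder, UID, GID, GECOS, home directory, and login shell:

yyang:x:1000:1000:Ying Yang:/home/yyang:/bin/bash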
The fields of the /etc/passwd file are discussed in detail in the sections that follow.
Username Field
This field is also referred to as the login field or the account field. It stores the name of the user on the
system. The username must be a unique string and uniquely identifies a user to the system. Different
sites use different methods for generating user login names. A common method is to use the first letter
of the user’s first name and append the user’s last name. This usually works, because the chances are
relatively slim that an organization would have several users with the same first and last names. For
example, for a user whose first name is “Ying” and whose last name is “Yang,” a username of
“yyang” can be assigned. Of course, several variations of this method are also used.
Password Field
This field contains the user’s encrypted password. On most modern Linux systems, this field contains
a letter x to indicate that shadow passwords are being used on the system (discussed later in this
chapter). Every regular user or human account on the system should have a password. This is crucial
to the security of the system—weak passwords make compromising a system just that much simpler.
How Encryption Works
The original philosophy behind passwords is actually quite interesting, especially since we still
rely on a significant part of it today. The idea is simple: Instead of relying on protected files to
keep passwords a secret, the system encrypts the password using an AT&T-developed (and
National Security Agency–approved) algorithm called Data Encryption Standard (DES) and
leaves the encrypted value publicly viewable. What originally made this secure was that the
encryption algorithm was computationally difficult to break. The best most folks could do was a
brute-force dictionary attack, where automated systems would iterate through a large dictionary
and rely on the natural tendency of users to choose English words for their passwords. Many
people tried to break DES itself, but since it was an open algorithm that anyone could study, it
was made much more bulletproof before it was actually deployed.
When users entered their passwords at a login prompt, the password they entered would be
encrypted. The encrypted value would then be compared against the user’s password entry. If the
two encrypted values matched, the user was allowed to enter the system. The actual algorithm
for performing the encryption was computationally cheap enough that a single encryption
wouldn’t take too long. However, the tens of thousands of encryptions that would be needed for
a dictionary attack would take prohibitively long.
But then a problem occurred: Moore’s Law on processor speed doubling every 18 months
held true, and home computers were becoming powerful and fast enough that programs were
able to perform a brute-force dictionary attack within days rather than weeks or months.
Dictionaries got bigger, and the software got smarter. The nature of passwords thus needed to be
reevaluated. One solution has been to improve the algorithm used to perform the encryption of
passwords. Some distributions of Linux have followed the path of the FreeBSD operating system
and used the Message Digest 5 (MD5) scheme. This has increased the complexity involved in
cracking passwords, which, when used in conjunction with shadow passwords, works quite
well. (Of course, this is assuming you make your users choose good passwords!)
TIP Choosing good passwords is always a chore. Your users will inevitably ask, “What then, O
Almighty System Administrator, makes a good password?” Here’s your answer: a non-language word
(not English, not Spanish, not German, in fact not a human-language word), preferably with mixed
case, numbers, and punctuation—in other words, a string that looks like line noise. Well, this is all
nice and wonderful, but if a password is too hard to remember, most people will quickly defeat its
purpose by writing it down and keeping it in an easily viewed place. So better make it memorable! A
good technique might be to choose a phrase and then pick the first letter of every word in the phrase.
Thus, the phrase “coffee is VERY GOOD for you and me” becomes ciVG4yam. The phrase is
memorable, even if the resulting password isn’t.
User ID Field (UID)
This field stores a unique number that the operating system and other applications use to identify the
user and determine access privileges. It is the numerical equivalent of the Username field. The UID
must be unique for every user, with the exception of the UID 0 (zero). Any user who has a UID of 0
has root (administrative) access and thus has the full run of the system. Usually, the only user who has
this specific UID has the login root. It is considered bad practice to allow any other users or
usernames to have a UID of 0. This is notably different from the Microsoft (MS) Windows model, in
which any number of users can have administrative privileges.
Different Linux distributions sometimes adopt different UID numbering schemes. For example,
Fedora and Red Hat Enterprise Linux (RHEL) reserve the UID 99 for the user “nobody,” while
openSUSE and Ubuntu Linux use the UID 65534 for the user “nobody.”
Group ID Field (GID)
The next field in the /etc/passwd file is the group ID entry. It is the numerical equivalent of the
primary group to which the user belongs. This field also plays an important role in determining user
access privileges. It should be noted that in addition to a primary group, a user can belong to other
groups as well (more on this later in the section “The /etc/group File”).
GECOS
This field can store various pieces of information for a user. It can act as a placeholder for the user
description, full name (first and last name), telephone number, and so on. This field is optional and as
a result can be left blank. It is also possible to store multiple entries in this field by simply separating
the different entries with a comma.
NOTE GECOS is an acronym for General Electric Comprehensive Operating System (now referred
to as GCOS) and is a carryover from the early days of computing.
Directory
This is usually the user’s home directory, but it can also be any arbitrary location on the system. Each
user who actually logs into the system needs a place for configuration files that are unique to that user.
Along with configuration files, the directory (often referred to as the home directory) also stores the
users’ personal data such as documents, music, pictures, and so on.
The home directory allows each user to work in an environment that he or she has specifically
customized—without having to worry about the personal preferences and customizations of other
users. This applies even if multiple users are logged into the same system at the same time.
Startup Scripts
Startup scripts are not quite a part of the information stored in the users’ database in Linux, but
they nonetheless play an important role in determining and controlling a user’s environment. In
particular, the startup scripts in Linux are usually stored under the user’s home directory, and
hence the need to mention them while we’re still on the subject of the directory (home directory)
field in the /etc/passwd file.
Linux/UNIX was built from the get-go for multiuser environments and tasks. Each user is
allowed to have his or her own configuration files; thus, the system appears to be customized for
each particular user (even if other people are logged in at the same time). The customization of
each individual user environment is done through the use of shell scripts, run control files, and
the like. These files can contain a series of commands to be executed by the shell that starts
when a user logs in. In the case of the bash shell, for example, one of its startup files is the
.bashrc file. (Yes, there is a period in front of the filename—filenames preceded by periods,
also called dot files, are hidden from normal directory listings.) You can think of shell scripts in
the same light as MS Windows batch files, except shell scripts can be much more capable. The
.bashrc script in particular is similar in nature to autoexec.bat in the Windows world.
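As a small illustration, a user's .bashrc might contain a few lines like the following (hypothetical contents; every user's file will differ):

# Hypothetical .bashrc fragment
export EDITOR=vim                 # preferred text editor
alias ll='ls -l --color=auto'     # convenient long-listing alias
PS1='[\u@\h \W]\$ '               # customize the shell prompt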
Various Linux software packages use application-specific and customizable options in
directories or files that begin with a dot (.) in each user’s home directory. Some examples are
.mozilla and .kde. Here are some common dot files that are present in each user’s home
directory:
.bashrc/.profile Configuration files for the bash shell.
.tcshrc/.login Configuration files for tcsh.
.xinitrc This script overrides the default script that gets called when you log into the X
Window System.
.Xdefaults This file contains defaults that you can specify for X Window System
applications.
When you create a user’s account, a set of default dot files are also created for the user; this
is mostly for convenience, to help get the user started. The user management tools discussed
later in this chapter help you do this automatically. The default files are stored under the
/etc/skel directory.
For the sake of consistency, most sites place home directories at /home and name each user’s
directory by that user’s login name. Thus, for example, if your login name were “yyang,” your home
directory would be /home/yyang. The exception to this is for some special system accounts, such as
a root user’s account or a system service. The superuser’s (root’s) home directory in Linux is usually
set to /root (but for most variants of UNIX, such as Solaris, the home directory is traditionally /). An
example of a special system service that might need a specific working directory could be a web
server whose web pages are served from the /var/www/ directory.
In Linux, the decision to place home directories under /home is strictly arbitrary, but it does make
organizational sense. The system really doesn’t care where we place home directories, so long as the
location for each user is specified in the password file.
Shell
When users log into the system, they expect an environment that can help them be productive. This
first program that users encounter is called a shell. If you’re used to the Windows side of the world,
you might equate this with command.com, Program Manager, or Windows Explorer (not to be
confused with Internet Explorer, which is a web browser).
Under UNIX/Linux, most shells are text-based. A popular default user shell in Linux is the Bourne
Again Shell, or BASH for short. Linux comes with several shells from which to choose—you can see
most of them listed in the /etc/shells file. Deciding which shell is right for you is kind of like
choosing a favorite beer—what’s right for you isn’t right for everyone, but still, everyone tends to get
defensive about their choice!
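You can list the shells recognized on your own system; on a Fedora-type distribution the output might look roughly like this (exact contents vary with the packages installed):

cat /etc/shells
/bin/sh
/bin/bash
/sbin/nologin
/bin/tcsh
/bin/csh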
What makes Linux so interesting is that you do not have to stick with the list of shells provided in
/etc/shells. In the strictest of definitions, the password entry for each user doesn’t list what shell to
run so much as it lists what program to run first for the user. Of course, most users prefer that the first
program run be a shell, such as BASH.
The /etc/shadow File
This is the shadow password file, which stores the encrypted password information for user accounts.
In addition to storing the encrypted password, the /etc/shadow file stores optional password aging or
expiration information. The introduction of the shadow file came about because of the need to
separate encrypted passwords from the /etc/passwd file. This was necessary because the ease with
which the encrypted passwords could be cracked was growing with the increase in the processing
power of commodity computers (home PCs). The idea was to keep the /etc/passwd file readable by
all users without storing the encrypted passwords in it and then make the /etc/shadow file readable
only by root or other privileged programs that require access to that information. An example of such
a program would be the login program.
You might wonder, “Why not just make the regular /etc/passwd file readable by root only or
other privileged programs?” Well, it isn’t that simple. By having the password file open for so many
years, the rest of the system software that grew up around it relied on the fact that the password file
was always readable by all users. Changing this could cause some software to fail.
Just as in the /etc/passwd file, each line in the /etc/shadow file represents information about a
user. The lines are made up of various standard fields, shown next, with each field delimited by a
colon:
Login name
Encrypted password
Days since January 1, 1970, that password was last changed
Days before password may be changed
Days after which password must be changed
Days before password is to expire that user is warned
Days after password expires that account is disabled
Days since January 1, 1970, that account is disabled
A reserved field
A sample entry from the /etc/shadow file is shown here for the user account mmel:
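A reconstruction of what such an entry might look like is shown below; the hash string and the day counts are illustrative placeholders only:

mmel:$6$saltsalt$placeholderhashedpasswordvalue:15482:0:99999:7:::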
UNIX Epoch: January 1, 1970
January 1, 1970 00:00:00 UTC was chosen as the starting point or origin for keeping time on
UNIX systems. That specific instant in time is also known as the UNIX epoch. Time
measurements in various computing fields are counted and incremented in seconds from the
UNIX epoch. Put simply, it is a count of the seconds that have gone past since January 1, 1970
00:00:00.
An interesting UNIX time—1000000000—fell on September 9, 2001, at 1:46:40 A.M.
(UTC).
Another interesting UNIX time—1234567890—fell on February 13, 2009, at 11:31:30 P.M.
(UTC).
Numerous web sites are dedicated to calculating and displaying the UNIX epoch, but you
can quickly obtain the current value by running this command at the shell prompt:
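The standard date utility prints it directly:

date +%s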
The /etc/group File
The /etc/group file contains a list of groups, with one group per line. Each group entry in the file has
four standard fields, each colon-delimited, as in the /etc/passwd and /etc/shadow files. Each user on
the system belongs to at least one group, that being the user’s default group. Users can then be
assigned to additional groups if needed. You will recall that the /etc/passwd file contains each user’s
default group ID (GID). This GID is mapped to the group’s name and other members of the group in
the /etc/group file. The GID should be unique for each group.
Also, like the /etc/passwd file, the group file must be world-readable so that applications can test
for associations between users and groups. Following are the fields of each line in the /etc/group:
Group name The name of the group
Group password Optional, but if set, allows users who are not part of the group to join
Group ID (GID) The numerical equivalent of the group name
Group members A comma-separated list
A sample group entry in the /etc/group file is shown here:
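Matching the description that follows, the entry reads:

bin:x:1:root,bin,daemon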
This entry is for the bin group. The GID for the group is 1, and its members are root, bin, and daemon.
User Management Tools
One of the many benefits of having password database files that have a well-defined format in straight
text is that it is easy for anyone to write custom management tools. Indeed, many site administrators
have already done this to integrate their tools along with the rest of their organization’s infrastructure.
They can, for example, start the process of creating a new user from the same form that lets them
update the corporate phone and e-mail directory, LDAP servers, web pages, and so on. Of course, not
everyone wants to write custom tools, which is why Linux comes with several existing tools that do
the job for you.
In this section, we discuss user management tools that can be launched from the command-line
interface, as well as graphical user interface (GUI) tools. Of course, learning how to use both is the
preferred route, since they both have advantages.
Command-Line User Management
You can choose from among several command-line tools to perform the same actions performed by
the GUI tools. Some of the most popular command-line tools are useradd, userdel, usermod,
groupadd, groupdel, and groupmod. The compelling advantage of using command-line tools for
user management, besides speed, is the fact that the tools can usually be incorporated into other
automated functions (such as scripts).
NOTE Linux distributions other than Fedora and RHEL may have slightly different parameters from
the tools used here. To see how your particular installation is different, read the built-in
documentation (also known as man page) for the particular program in question.
useradd
As the name implies, useradd allows you to add a single user to the system. Unlike the GUI tools,
this tool has no interactive prompts. Instead, all parameters must be specified on the command line.
Here’s the syntax for using this tool:
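The general form, as documented in the shadow-utils man page, is:

useradd [options] LOGIN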
Note that most of the parameters are optional; the useradd tool assumes sensible preconfigured
defaults for anything you leave out. The only required parameter is LOGIN, the desired username. Also,
don't be intimidated by this long list of options! They are all quite easy to use and are described in
Table 4-1.
Option
Description
-c, --comment
Allows you to set the user’s name in the GECOS field. As with any
command-line parameter, if the value includes a space, you will need
to add quotes around the text. For example, to set the user’s name to
Ying Yang, you would have to specify -c “Ying Yang”.
-d, --home-dir
By default, the user’s home directory is /home/user_name. When a
new user is created, the user’s home directory is created along with
the user account, so if you want to change the default to another place,
you can specify the new location with this parameter.
-e, --expiredate
It is possible for an account to expire after a certain date. By default,
accounts never expire. To specify a date, use the YYYY-MM-DD
format. For example, -e 2017-10-28 means the account will expire
on October 28, 2017.
-f, --inactive
This option specifies the number of days after a password expires that
the account is still usable. A value of 0 (zero) indicates that the
account is disabled immediately. A value of -1 will never allow the
account to be disabled, even if the password has expired. (For
example, -f 3 will allow an account to exist for three days after a
password has expired.) The default value is -1.
-g, --gid
Using this option, you can specify the user’s default group in the
password file. You can use a number or name of the group; however,
if you use a name of a group, the group must exist in the /etc/group
file.
-G, --groups
This option allows you to specify additional groups to which the new
user will belong. If you use the -G option, you must specify at least
one additional group. You can, however, specify additional groups by
separating the elements in the list with commas. For example, to add a
user to the project and admin groups, you would specify -G
project,admin.
-m, --create-home [-k skel-dir]
By default, the system automatically creates the user’s home directory.
This option is the explicit command to create the user’s home
directory. Part of creating the directory is copying default
configuration files into it. These files come from the /etc/skel
directory by default. You can change this by using the secondary
option -k skel-dir. (You must specify -m in order to use -k.) For
example, to specify the /etc/adminskel directory, you would use -m -k /etc/adminskel.
-M
If you used the -m option, you cannot use -M, and vice versa. This
option tells the command not to create the user’s home directory.
-n
Red Hat Linux creates a new group with the same name as the new
user’s login as part of the process of adding a user. You can disable
this behavior by using this option.
-s shell
A user’s login shell is the first program that runs when a user logs into
a system. This is usually a command-line environment, unless you are
logging in from the X Window System login screen. By default, this is
the Bourne Again Shell (/bin/bash), though some folks like to use
other shells, such as the TENEX C Shell (/bin/tcsh).
-u, --uid
By default, the program will automatically find the next available UID
and use it. If, for some reason, you need to force a new user’s UID to
be a particular value, you can use this option. Remember that UIDs
must be unique for all users.
LOGIN or username
Finally, the only parameter that isn’t optional! You must specify the
new user’s login name.
Table 4-1. Options for the useradd Command
usermod
The usermod command allows you to modify an existing user in the system. It works in much the
same way as useradd. Its usage is summarized here:
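The general form mirrors that of useradd:

usermod [options] LOGIN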
Every option you specify when using this command results in that particular parameter being
modified for the user. All but one of the parameters listed here are identical to the parameters
documented for the useradd command. The one exception is -l.
The -l option allows you to change the user’s login name. This and the -u option are the only
options that require special care. Before changing the user’s login or UID, you must make sure the
user is not logged into the system or running any processes. Changing this information if the user is
logged in or running processes will cause unpredictable results.
userdel
The userdel command does the exact opposite of useradd—it removes existing users. This
straightforward command has only two optional parameters and one required parameter:
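A sketch of the general form; the optional flags are typically -f (force) and -r (remove the user's home directory as well):

userdel [-f] [-r] LOGIN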
groupadd
The group-related commands are similar to the user commands; however, instead of working on
individual users, they work on groups listed in the /etc/group file. Note that changing group
information does not cause user information to be automatically changed. For example, if you remove
a group whose GID is 100 and a user’s default group is specified as 100, the user’s default group
would not be updated to reflect the fact that the group no longer exists.
The groupadd command adds groups to the /etc/group file. The command-line options for this
program are as follows:
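The general form is:

groupadd [options] GROUP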
Table 4-2 describes some common groupadd command options.
Option
Description
-g gid
Specifies the GID for the new group as gid. This value must be
unique, unless the -o option is used. By default, this value is
automatically chosen by finding the first available value greater than
or equal to 1000.
-r, --system
By default, Fedora, RHEL, and CentOS distros search for the first
GID that is higher than 999. The -r option tells groupadd that the
group being added is a system group and should have the first
available GID under 999.
-f, --force
This is the force flag. This will cause groupadd to exit without an
error when the group about to be added already exists on the system.
If that is the case, the group won’t be altered (or added again). It is a
Fedora- and RHEL-specific option.
GROUP
This parameter is required. It specifies the name of the group you want
to add.
Table 4-2. Options for the groupadd Command
groupdel
Even more straightforward than userdel, the groupdel command removes existing groups specified
in the /etc/group file. The only usage information needed for this command is
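shown here (the standard shadow-utils form, which takes no required options):

groupdel group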
where group is the name of the group to remove.
groupmod
The groupmod command allows you to modify the parameters of an existing group. The syntax and
options for this command are shown here:
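In sketch form:

groupmod [-g new_gid] [-n new_name] group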
The -g option allows you to change the GID of the group, and the -n option allows you to specify a
new name of a group. In addition, of course, you need to specify the name of the existing group as the
last parameter.
GUI User Managers
The obvious advantage to using the GUI tool is ease of use. It is usually just a point-and-click affair.
Many of the Linux distributions come with their own GUI user managers. Fedora, CentOS, and RHEL
come with a utility called system-config-users, and openSUSE/SLE has a YaST module
that can be invoked with yast2 users. Ubuntu uses a tool called User Accounts, which is bundled
with the gnome-control-center system applet. All these tools allow you to add, edit, and maintain
the users on your system. These GUI interfaces work just fine—but you should be prepared to have to
change user settings manually in case you don’t have access to the pretty GUI front-ends. Most of
these interfaces can be found in the System | Administration menu within the GNOME or KDE
desktop environment. They can also be launched directly from the command line. To launch Fedora’s
GUI user manager, you’d type this:
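The tool name doubles as the command (you will be prompted to authenticate if you are not root):

system-config-users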
A window similar to the one in Figure 4-2 will open.
Figure 4-2. Fedora User Manager tool
In openSUSE or SLE, to launch the user management YaST module (see Figure 4-3), you’d type
this:
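As with Fedora, the command matches the module name given earlier:

yast2 users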
Figure 4-3. openSUSE User and Group Administration tool
In Ubuntu, to launch the user management tool (see Figure 4-4), you’d type this:
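One way to do this from a terminal is via the GNOME control center (the panel name user-accounts is an assumption and may differ between Ubuntu releases):

gnome-control-center user-accounts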
Figure 4-4. Ubuntu Users Settings tool
Users and Access Permissions
Linux determines whether a user or group has access to files, programs, or other resources on a
system by checking the overall effective permissions on the resource. The traditional permissions
model in Linux is simple—it is based on four access types, or rules. The following access types are
possible:
(r) Read permission
(w) Write permission
(x) Execute permission
(-) No permission or no access
In addition, these permissions can be applied to three classes of users:
Owner The owner of the file or application
Group The group that owns the file or application
Everyone All users
The elements of this model can be combined in various ways to permit or deny a user (or group)
access to any resource on the system. There is, however, a need for an additional type of permission-granting mechanism in Linux. This need arises because every application in Linux must run in the
context of a user. This is explained in the next section, which covers SetUID and SetGID programs.
Understanding SetUID and SetGID Programs
Normally, when a program is run by a user, it inherits all of the user’s rights (or lack thereof). For
example, if a user can’t read the /var/log/messages file, neither can the program/application that is
needed to view the file. Note that this permission can be different from the permissions of the user
who owns the program file (usually called the binary). Consider the ls program (which is used to
generate directory listings), for example. It is owned by the root user. Its permissions are set so that
all users of the system can run the program. Thus, if the user yyang runs ls, that instance of ls is
bound by the permissions granted to the user yyang, not root.
However, there is an exception. Programs can be tagged with what’s called a SetUID bit,
which allows a program to be run with the permissions of the program’s owner,
not the user who is running it. Using ls as an example, setting the SetUID bit on it and having the file
owned by root means that if the user yyang runs ls, that instance of ls will run with root permissions,
not with yyang’s permissions. The SetGID bit works the same way, except instead of applying to the
file’s owner, it is applied to the file’s group setting.
To enable the SetUID bit or the SetGID bit, you need to use the chmod command. To make a
program SetUID, prefix whatever permission value you are about to assign it with a 4. To make a
program SetGID, prefix whatever permission you are about to assign it with a 2.
For example, to make /bin/ls a SetUID program (which is a bad idea, by the way), you would use
this command:
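Assuming /bin/ls normally carries 755 permissions, the octal form would be:

chmod 4755 /bin/ls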
You can also use the following symbolic variation of the chmod command to set the SetUID bit:
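The symbolic form sets the bit without restating the other permission values:

chmod u+s /bin/ls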
To undo the effect of the previous command, type this:
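Restoring the ordinary permission value clears the bit:

chmod 755 /bin/ls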
You can also use the following symbolic variation of the chmod command to remove the SetUID bit:
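Or, symbolically:

chmod u-s /bin/ls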
To make /bin/ls a SetGID program (which is also bad idea, by the way), you would use this
command:
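Again assuming a 755 base, the SetGID prefix is 2:

chmod 2755 /bin/ls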
To remove the SetGID attribute from the /bin/ls program, you would use this command:
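In symbolic form:

chmod g-s /bin/ls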
Pluggable Authentication Modules
Pluggable Authentication Modules (PAM) allow the use of a centralized authentication mechanism on
Linux/UNIX systems. Besides providing a common authentication scheme on a system, the use of
PAM allows for a lot of flexibility and control over authentication for application developers, as
well as for system administrators.
Traditionally, programs that grant users access to system resources have performed the user
authentication through some built-in mechanism. Although this worked great for a long time, the
approach was not very scalable and more sophisticated methods were required. This led to a number
of ugly hacks to abstract the authentication mechanism. Taking a cue from Solaris, Linux folks created
their own implementation of PAM.
The idea behind PAM is that instead of applications reading the password file, they would simply
ask PAM to perform the authentication. PAM could then use whatever authentication mechanism the
system administrator wanted. For many sites, the mechanism of choice is still a simple password file.
And why not? It does what we want. Most users understand the need for it, and it’s a well-tested
method to get the job done.
In this section, we discuss the use of PAM under the Fedora distribution. Note that although the
placement of files may not be exactly the same in other distributions, the underlying configuration
files and concepts still apply.
How PAM Works
PAM is to other Linux programs what a Dynamic Link Library (DLL) is to a Windows application—
it is just a library. When programs need to perform authentication on some user, they call a function
that exists in the PAM library. PAM provides a library of functions that an application can use to
request that a user be authenticated.
When invoked, PAM checks the configuration file for that application. If it finds no application-specific configuration file, it falls back to a default configuration file. This configuration file tells the
library what types of checks need to be done to authenticate the user. Based on this, the appropriate
module is called upon. Fedora, RHEL, and CentOS folks can see these modules in the /lib64/security
directory (or /lib/security directory on 32-bit platforms).
This module can check any number of things. It can simply check the /etc/passwd file or the
/etc/shadow file, or it can perform a more complex check, such as calling on an LDAP server.
Once the module has made the determination, an “authenticated/not authenticated” message is
passed back to the calling application.
If this seems like a lot of steps for what should be a simple check, you’re almost correct. Each
module here is small and does its work quickly. From a user’s point of view, there should be no
noticeable difference in performance between an application that uses PAM and one that does not.
From a system administrator’s and developer’s point of view, the flexibility this scheme offers is
incredible and a welcome addition.
PAM’s Files and Their Locations
On a Fedora-type system, PAM puts configuration files in certain places. These file locations and
their definitions are listed in Table 4-3.
File Location
Definition
/lib64/security or /lib/security (32-bit)
Dynamically loaded authentication modules called by the actual PAM library.
/etc/security
Configuration files for the modules located in /lib64/security.
/etc/pam.d
Configuration files for each application that uses PAM. If an application that
uses PAM does not have a specific configuration file, the default is automatically
used.
Table 4-3. Important PAM Directories
Looking at the list of file locations in Table 4-3, you might ask why PAM needs so many different
configuration files. “One configuration file per application? That seems crazy!” Well, maybe not. The
reason PAM allows this is that not all applications are created equal. For instance, a Post Office
Protocol (POP) mail server such as Dovecot might want to allow all of a site’s users
to fetch mail, but the login program might want to allow only certain users to log into the console. To
accommodate this, PAM needs a configuration file for POP mail that is different from the
configuration file for the login program.
Configuring PAM
The configuration files that we will be discussing here are located in the /etc/pam.d directory. If you
want to change the configuration files that apply to specific modules in the /etc/security directory,
you should consult the documentation that came with the module. (Remember that PAM is just a
framework. Specific modules can be written by anyone.)
The nature of a PAM configuration file is interesting, because of its “stackable” nature. That is,
every line of a configuration file is evaluated during the authentication process (with the exceptions
shown next). Each line specifies a module that performs some authentication task and returns either a
success or failure flag. A summary of the results is returned to the application program calling PAM.
NOTE By “failure,” we do not mean the program did not work. Rather, we mean that when some
process was undertaken to verify whether a user could do something, the return value was “NO.”
PAM uses the terms “success” and “failure” to represent this information that is passed back to the
calling application.
Each PAM configuration file in the /etc/pam.d/ directory consists of lines that have the following
syntax/format,
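with whitespace separating the fields (this is the standard Linux-PAM line layout):

module_type    control_flag    module_path    arguments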
where module_type represents one of four types of modules: auth, account, session, or password.
Comments must begin with the hash (#) character. Table 4-4 lists these module types and their
functions.
Module Type
Function
auth
Instructs the application program to prompt the user for a password
and then grants both user and group privileges.
account
Performs no authentication, but determines access from other factors,
such as time of day or location of the user. For example, the root login
can be given only console access this way.
session
Specifies what, if any, actions need to be performed before or after a
user is logged in (for example, logging the connection).
password
Specifies the module that allows users to change their password (if
appropriate).
Table 4-4. PAM Module Types
The control_flag allows you to specify how you want to deal with the success or failure of a
particular authentication module. Some common control flags are described in Table 4-5.
Control Flag
Description
required
If this flag is specified, the module must succeed in authenticating the
individual. If it fails, the returned summary value must be failure.
requisite
This flag is similar to required; however, if requisite fails
authentication, modules listed after it in the configuration file are not
called, and a failure is immediately returned to the application. This
allows you to require that certain conditions hold true before even a
login attempt is accepted (for example, the user must be on the local
area network and cannot be attempting to log in over the Internet).
sufficient
If a sufficient module returns a success and there are no more
required or sufficient control flags in the configuration file, PAM
returns a success to the calling application.
optional
This flag allows PAM to continue checking other modules, even if this
one has failed. In other words, the result of this module is ignored.
For example, you might use this flag when a user is allowed to log in
even if a particular module has failed.
include
This flag is used for including all lines or directives from another
configuration file specified as an argument. It is used as a way of
chaining or stacking together the directives in different PAM
configuration files.
Table 4-5. PAM Control Flags
The module_path specifies the actual directory path of the module that performs the
authentication task. The modules are usually stored under the /lib64/security (or /lib/security)
directory.
The final entry in a PAM configuration line is arguments. These are the parameters passed to the
authentication module. Although the parameters are specific to each module, some generic options
can be applied to all modules. These arguments are described in Table 4-6.
Argument
Description
debug
Sends debugging information to the system logs.
no_warn
Does not give warning messages to the calling application.
use_first_pass
Does not prompt the user for a password a second time. Instead, the
password that was entered in the preceding auth module should be
reused for the user authentication. (This option is for the auth and
password modules only.)
try_first_pass
This option is similar to use_first_pass, because the user is not
prompted for a password the second time. However, if the existing
password causes the module to return a failure, the user is then
prompted for a password again.
use_mapped_pass
This argument instructs the module to take the cleartext authentication
token entered by a previous module and use it to generate an
encryption/decryption key with which to safely store or retrieve the
authentication token required for this module.
expose_account
This argument allows a module to be less discreet about account
information—as deemed fit by the system administrator.
nullok
This argument allows the called PAM module to allow blank (null)
passwords.
Table 4-6. PAM Configuration Arguments
An Example PAM Configuration File
Let’s examine a sample PAM configuration file, /etc/pam.d/login:
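The listing below is a reconstruction based on the line-by-line discussion that follows; it mirrors the older Fedora/RHEL layout built around pam_stack.so, and the file on your own system will differ in its details:

#%PAM-1.0
auth       required     pam_securetty.so
auth       required     pam_stack.so service=system-auth
auth       required     pam_nologin.so
account    required     pam_stack.so service=system-auth
password   required     pam_stack.so service=system-auth
session    required     pam_stack.so service=system-auth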
We can see that the first line begins with a hash symbol and is therefore a comment. Thus, we can
ignore it. Let’s go on to line 2:
Because the module_type is auth, PAM will want a password. The control_flag is set to
required, so this module must return a success or the login will fail. The module itself,
pam_securetty.so, verifies that logins on the root account can happen only on the terminals
mentioned in the /etc/securetty file. There are no arguments on this line.
Similar to the first auth line, line 3 wants a password for authentication, and if the password
fails, the authentication process will return a failure flag to the calling application. The pam_stack.so
module lets you call from inside the stack for a particular service or the stack defined for another
service. The service=system-auth argument in this case tells pam_stack.so to execute the stack
defined for the service system-auth (system-auth is also another PAM configuration under the
/etc/pam.d directory).
In line 4, the pam_nologin.so module checks for the /etc/nologin file. If it is present, only root is
allowed to log in; others are turned away with an error message. If the file does not exist, it always
returns a success.
In line 5, since the module_type is account, the pam_stack.so module acts differently. It
silently checks that the user is allowed to log in (for example, “Has their password expired?”). If all
the parameters check out OK, it will return a success.
The same concepts apply to the rest of the lines in the /etc/pam.d/login file (as well as other
configuration files under the /etc/pam.d directory).
If you need more information about what a particular PAM module does or about the arguments it
accepts, you can consult the man page for the module. For example, to find out more about the
pam_selinux.so module, you would issue this command:
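On a system where the module and its documentation are installed, that would be:

man pam_selinux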
The “Other” File
As mentioned earlier, if PAM cannot find a configuration file that is specific to an application, it will
use a generic configuration file instead. This generic configuration file is called /etc/pam.d/other. By
default, the “other” configuration file is set to a paranoid setting so that all authentication attempts are
logged and then promptly denied. It is recommended you keep it that way.
D’oh! I Can’t Log In!
Don’t worry—screwing up a setting in a PAM configuration file happens to everyone. Consider it
part of learning the ropes. First thing to do: Don’t panic. Like most configuration errors under Linux,
you can fix things by booting into single-user mode (see Chapter 7) and fixing the errant file.
If you’ve screwed up your login configuration file and need to bring it back to a sane state, here is
a safe setting you can put in:
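A minimal sketch of such a fallback, built entirely on the standard pam_unix.so module described in the note that follows:

auth       required     pam_unix.so
account    required     pam_unix.so
password   required     pam_unix.so
session    required     pam_unix.so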
This setting will give Linux the default behavior of simply looking into the /etc/passwd or
/etc/shadow file for a password. This should be good enough to get you back in, where you can make
the changes you meant to make!
NOTE The pam_unix.so module is what facilitates this behavior. It is the standard Linux/UNIX
authentication module. According to the module’s man page, it uses standard calls from the system’s
libraries to retrieve and set account information as well as authentication. Usually, this is obtained
from the /etc/passwd file and from the /etc/shadow file as well if shadow is enabled.
Debugging PAM
Like many other Linux services, PAM makes excellent use of the system log files (you can read more
about them in Chapter 8). If things are not working the way you want them to, begin by looking at the
tail end of the log files and see if PAM is spelling out what happened. More than likely, it is. You
should then be able to use this information to change your settings and fix the problem. The main
system log file to monitor is the /var/log/messages file.
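For example, you can watch new entries arrive while you retry the failing operation (on some distributions, authentication messages land in /var/log/secure or the systemd journal instead):

tail -f /var/log/messages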
A Grand Tour
The best way to see many of the utilities discussed in this chapter interact with one another is to show
them at work. In this section, we take a step-by-step approach to creating, modifying, and removing
users and groups. Some new commands that were not mentioned but that are also useful and relevant
in managing users on a system are also introduced and used.
Creating Users with useradd
On our sample Fedora server, we will add new user accounts and assign passwords with the
useradd and passwd commands.
1. Create a new user whose full name is “Ying Yang,” with the login name (account name) of
yyang. Type the following:
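A sketch of the command, run as root, using the -c option from Table 4-1 to set the full name:

useradd -c "Ying Yang" yyang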
This command will create a new user account called yyang. The user will be created with the
usual Fedora default attributes. The entry in the /etc/passwd file will be
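something like this (matching the default values summarized next):

yyang:x:1000:1000:Ying Yang:/home/yyang:/bin/bash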
From this entry, you can tell these things about the Fedora (and RHEL) default new user
values:
The UID number is the same as the GID number. The value is 1000 in this example.
The default shell for new users is the bash shell (/bin/bash).
A home directory is automatically created for all new users (for example,
/home/yyang).
2. Use the passwd command to create a new password for the username yyang. Set the
password to be 19ang19, and repeat the same password when prompted. Type the following:
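Run this as root:

passwd yyang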
3. Create another user account called mmellow for the user, with a full name of “Mel Mellow,”
but this time, change the default Fedora behavior of creating a group with the same name as
the username (that is, this user will instead belong to the general users group). Type this:
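A sketch using the Fedora/RHEL-specific -n option listed in Table 4-1 (newer shadow-utils releases spell this option -N):

useradd -c "Mel Mellow" -n mmellow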
4. Use the id command to examine the properties of the user mmellow:
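The check itself is simply:

id mmellow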
5. Again, use the passwd command to create a new password for the account mmellow. Set the
password to be 2owl!78, and repeat the same password when prompted:
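Again as root:

passwd mmellow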
6. Create the final user account, called bogususer. But this time, specify the user’s shell to be the
tcsh shell, and let the user’s default primary group be the system “games” group:
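A sketch combining the -g and -s options from Table 4-1:

useradd -g games -s /bin/tcsh bogususer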
7. Examine the /etc/passwd file for the entry for the bogususer user:
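One quick way, with output along these lines:

grep bogususer /etc/passwd
bogususer:x:1003:20::/home/bogususer:/bin/tcsh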
From this entry, you can tell the following:
The UID is 1003.
The GID is 20.
A home directory is also created for the user under the /home directory.
The user’s shell is /bin/tcsh.
Creating Groups with groupadd
Next, create a few groups: two regular (nonsystem) groups and one system group.
1. Create a new group called research:
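The command is simply:

groupadd research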
2. Examine the entry for the research group in the /etc/group file:
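For example:

grep research /etc/group
research:x:1002: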
This output shows that the group ID for the research group is 1002.
3. Create another group called sales:
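Similarly:

groupadd sales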
4. Create the final group called bogus, and force this group to be a system group (that is, the GID
will be lower than 999). Type the following:
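Using the -r option from Table 4-2:

groupadd -r bogus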
5. Examine the entry for the bogus group in the /etc/group file:
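For example:

grep bogus /etc/group
bogus:x:989: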
The output shows that the group ID for the bogus group is 989.
Modifying User Attributes with usermod
Now try using usermod to change the user and group IDs for a couple of accounts.
1. Use the usermod command to change the user ID (UID) of the bogususer to 1600:
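The -u option does this:

usermod -u 1600 bogususer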
2. Use the id command to view your changes:
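That is:

id bogususer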
The output shows the new UID (1600) for the user.
3. Use the usermod command to change the primary GID of the bogususer account to that of the
bogus group (GID = 989) and also to set an expiry date of 12-12-2017 for the account:
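A sketch combining the -g and -e options (the expiry date uses the YYYY-MM-DD format):

usermod -g 989 -e 2017-12-12 bogususer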
4. View your changes with the id command:
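Again:

id bogususer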
5. Use the chage command to view the new account expiration information for the user:
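The -l option lists the aging information:

chage -l bogususer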
Modifying Group Attributes with groupmod
Now try using the groupmod command.
1. Use the groupmod command to rename the bogus group as bogusgroup:
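The -n option performs the rename:

groupmod -n bogusgroup bogus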
2. Again use the groupmod command to change the GID of the newly renamed bogusgroup to
1600:
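That is:

groupmod -g 1600 bogusgroup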
3. View your changes to the bogusgroup in the /etc/group file:
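For example:

grep bogusgroup /etc/group
bogusgroup:x:1600: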
Deleting Users and Groups with userdel and groupdel
Try using the userdel and groupdel commands to delete users and groups, respectively.
1. Use the userdel command to delete the user bogususer that you created previously. At the
shell prompt, type this:
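Either form described in the note at the end of this section will work; to remove the account along with its home directory, use -r:

userdel -r bogususer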
2. Use the groupdel command to delete the bogusgroup group:
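That is:

groupdel bogusgroup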
Notice that the bogusgroup entry in the /etc/group file is removed.
NOTE When you run the userdel command with only the user’s login specified on the command
line (for example, userdel bogususer), all of the entries in the /etc/passwd and /etc/shadow files,
as well as references in the /etc/group file, are automatically removed. But if you use the optional -r
parameter (for example, userdel -r bogususer), all of the files owned by the user in that user’s
home directory are removed as well.
Summary
This chapter documented the ins and outs of user and group management under Linux. Much of what
you read here also applies to other variants of UNIX, which makes administering users in
heterogeneous environments much easier with the different *NIX varieties.
The following main points were covered in this chapter:
Each user gets a unique UID.
Each group gets a unique GID.
The /etc/passwd file maps UIDs to usernames.
Linux handles encrypted passwords in multiple ways.
Linux includes tools that help you administer users.
Should you decide to write your own tools to manage the user databases, you’ll now
understand the format for doing so.
PAM, the Pluggable Authentication Modules, is Linux’s generic way of handling multiple
authentication mechanisms.
These changes are pretty significant for an administrator coming from a Microsoft Windows
environment and can be a little tricky at first. Not to worry, though—the Linux/UNIX security model
is quite straightforward, so you should quickly get comfortable with how it all works.
If the idea of getting to build your own tools to administer users appeals to you, definitely look
into books on the Perl scripting language. It is remarkably well suited for manipulating tabular data
(such as the /etc/passwd file). Take some time and page through a few Perl programming books at
your local bookstore if this is something that interests you.
CHAPTER 5
The Command Line
The level of power, control, and flexibility that the command line offers Linux/UNIX users has
been one of its most endearing and enduring qualities. There is also a flip side to this,
however: for the uninitiated, the command line can also produce extremes of emotions,
including awe, frustration, and annoyance. Casual observers of Linux/UNIX gurus are often astounded
at the results of a few carefully crafted and executed commands. Unfortunately, this power comes at a
cost—it can make using Linux/UNIX appear less intuitive to the average user. For this reason,
graphical user interface (GUI) front-ends for various UNIX/Linux tools, functions, and utilities have
been written.
More experienced users, however, may find that it is difficult for a GUI to present all of the
available options. Typically, doing so would make the interface just as complicated as the command-line equivalent. The GUI design is often oversimplified, and experienced users ultimately return to the
comprehensive capabilities of the command line. After all is said and done, the fact remains that it
just looks plain cool to do things at the command line.
Before we begin our study of the command-line interface under Linux, understand that this chapter
is far from an exhaustive resource. Rather than trying to cover all the tools without any depth, this
chapter thoroughly describes a handful of tools that are most critical for the day-to-day work of a
system administrator.
NOTE This chapter assumes that you are logged into the system as a regular/nonprivileged user and
that the X Window System is up and running. If you are using the GNOME desktop environment, for
example, you can start a virtual terminal in which to issue commands. To launch a virtual terminal
application, simultaneously press the ALT-F2 key combination on your keyboard to bring up the Run
Application dialog box. After the Run Application dialog box appears, you can type the name of a
terminal emulator (for example, xterm, gnome-terminal, or konsole) into the Run text box and then
press ENTER. You can alternatively look under the Applications menu for any of the installed terminal
emulator applications. All the commands you enter in this chapter should be typed into the virtual
terminal window.
An Introduction to BASH
In Chapter 4, you learned that one of the fields in a user’s password entry is that user’s login shell,
which is the first program that runs when a user logs into a workstation. The shell is comparable to
the Windows Program Manager, except that in the Linux case, the system administrator has a say in
the choice of shell program used.
The formal definition of a shell is “a command language interpreter that executes commands.” A
less formal definition might be simply “a program that provides an interface to the system.” The
Bourne Again Shell (BASH), in particular, is a command-line–only interface containing a handful of
built-in commands; it has the ability to launch other programs and to control programs that have been
launched from it (job control). This might seem simple at first, but you will begin to realize that the
shell is a powerful tool.
A variety of shells exist, most with similar features but different means of implementing them.
Again for the purpose of comparison, you can think of the various shells as being like web browsers;
among several different browsers, the basic functionality is the same—displaying content from the
Web. In any situation like this, everyone proclaims that his or her shell is better than the others, but it
all really comes down to personal preference.
In this section, we’ll examine some of BASH’s built-in commands. A complete reference on
BASH could easily fill a large book in itself, so we’ll stick with the commands that a system
administrator (or regular user) might use frequently. However, it is highly recommended that you
eventually study BASH’s other functions and operations. As you get accustomed to BASH, you’ll find
it easy to pick up and adapt to the slight nuances of other shells as well—in other words, the
differences between them are subtle. If you are managing a large site with lots of users, it will be
advantageous for you to be familiar with as many shells as possible.
Job Control
When working in the BASH environment, you can start multiple programs from the same prompt.
Each program is considered a job. Whenever a job is started, it takes over the terminal. On today’s
machines, the terminal is either the straight text interface you see when you boot the machine or the
window created by the X Window System on which BASH runs. (The terminal interfaces in X
Window System are called a pseudo-tty, or pty for short.) If a job has control of the terminal, it can
issue control codes so that text-only interfaces (the Pine e-mail reader, for instance) can be made
more attractive. Once the program is done, it gives full control back to BASH, and a prompt is
redisplayed for the user.
Not all programs require this kind of terminal control, however. Some, including programs that
interface with the user through the X Window System, can be instructed to give up terminal control
and allow BASH to present a user prompt, even though the invoked program is still running.
In the following example, with the user yyang logged into the system, the user launches the Firefox
web browser from the command line or shell, with the additional condition that the program (Firefox)
gives up control of the terminal (this condition is specified by appending the ampersand symbol to the
program name):
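firefox &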
Immediately after you press ENTER, BASH will present its prompt again. This is called
backgrounding the task.
If a program is already running and has control of the terminal, you can make the program give up
control by pressing CTRL-Z in the terminal window. This will stop the running job (or program) and
return control to BASH so that you can enter new commands. At any given time, you can find out how
many jobs BASH is tracking by typing this command:
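jobs
[1]+  Running                 firefox &
(Sample output; the exact formatting may differ slightly between shell versions.)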
The running programs that are listed will be in one of two states: running or stopped. The
preceding sample output shows that the Firefox program is in a running state. The output also shows
the job number in the first column—[1].
To bring a job back to the foreground—that is, to give it back control of the terminal—you would
use the fg (foreground) command, like this:
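fg number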
Here, number is the job number you want in the foreground. For example, to place the Firefox
program (with job number 1) launched earlier in the foreground, type this:
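fg 1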
If a job is stopped (that is, in a stopped state), you can start it running again in the background,
thereby allowing you to keep control of the terminal and resume running the job. Or a stopped job can
run in the foreground, which gives control of the terminal back to that program.
To place a running job in the background, type this:
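bg number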
Here, number is the job number you want to background.
NOTE You can background any process. Applications that require terminal input or output will be
put into a stopped state if you background them. You can, for example, try running the top utility in
the background by typing top &. Then check the state of that job with the jobs command.
Environment Variables
Every instance of a shell, and every process that is running, has its own “environment”—these are
settings that give it a particular look, feel, and, in some cases, behavior. These settings are typically
controlled by environment variables. Some environment variables have special meanings to the shell,
but there is nothing stopping you from defining your own and using them for your own needs. It is
through the use of environment variables that most shell scripts are able to do interesting things and
remember results from user inputs as well as program outputs. If you are already familiar with the
concept of environment variables in Windows 200x/XP/Vista/7, you’ll find that many of the things
that you know about them will apply to Linux as well; the only difference is how they are set, viewed,
and removed.
Printing Environment Variables
To list all of your environment variables, use the printenv command. Here’s an example:
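printenv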
To show a specific environment variable, specify the variable as a parameter to printenv. For
example, here is the command to see the environment variable TERM:
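printenv TERM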
Setting Environment Variables
To set an environment variable, use the following format:
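variable=value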
Here, variable is the variable name and value is the value you want to assign the variable. For
example, to set the environment variable FOO to the value BAR, type this:
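FOO=BAR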
Whenever you set environment variables in this way, they stay local to the running shell. If you
want that value to be passed to other processes that you launch, use the export built-in command.
The format of the export command is as follows:
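export variable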
Here, variable is the name of the variable. In the example of setting the variable FOO, you would
enter this command:
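export FOO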
TIP You can combine the steps for setting an environment variable with the export command, like
so:
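export FOO=BAR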
If the value of the environment variable you want to set has spaces in it, surround the value
with quotation marks. Using the preceding example, to set FOO to “Welcome to the BAR of FOO.”,
you would enter this:
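FOO="Welcome to the BAR of FOO."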
You can then use the printenv command to see the value of the FOO variable you just set by
typing this:
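printenv FOO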
Unsetting Environment Variables
To remove an environment variable, use the unset command. Here’s the syntax for the unset
command:
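unset variable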
Here, variable is the name of the variable you want to remove. For example, here’s the command to
remove the environment variable FOO:
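unset FOO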
NOTE This section assumes that you are using BASH. You can choose to use many other shells; the
most popular alternatives are the C shell (csh) and its brother, the Tenex/Turbo/Trusted C shell
(tcsh), which use different mechanisms for getting and setting environment variables. BASH is
documented here because it is the default shell of all new Linux user accounts in most Linux
distributions.
Pipes
Pipes are a mechanism by which the output of one program can be sent as the input to another
program. Individual programs can be chained together to become extremely powerful tools.
Let’s use the grep program to provide a simple example of how pipes can be used. When given a
stream of input, the grep utility will try to match the line with the parameter supplied to it and display
only matching lines. You will recall from the preceding section that the printenv command prints all
the environment variables. The list it prints can be lengthy, so, for example, if you were looking for
all environment variables containing the string “TERM”, you could enter this command:
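printenv | grep TERM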
The vertical bar ( | ) character represents the pipe between printenv and grep.
The command shell under Windows also utilizes the pipe function. The primary difference is that
all commands in a Linux pipe are executed concurrently, whereas Windows runs each program in
order, using temporary files to hold intermediate results.
Redirection
Through redirection, you can take the output of a program and have it automatically sent to a file.
(Remember that everything in Linux/UNIX is regarded as a file!) The shell rather than the program
itself handles this process, thereby providing a standard mechanism for performing the task. Having
the shell handle redirection is therefore much cleaner and easier than having individual programs
handle redirection themselves.
Redirection comes in three classes: output to a file, append to a file, and send a file as input.
To collect the output of a program into a file, end the command line with the greater-than symbol
(>) and the name of the file to which you want the output redirected. If you are redirecting to an
existing file and you want to append additional data to it, use two symbols (>>) followed by the
filename. For example, here is the command to collect the output of a directory listing into a file
called /tmp/directory_listing:
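ls > /tmp/directory_listing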
Continuing this example with the directory listing, you could append the string “Directory Listing”
to the end of the /tmp/directory_listing file by typing this command:
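echo "Directory Listing" >> /tmp/directory_listing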
The third class of redirection, using a file as input, is done by using the less-than sign (<)
followed by the name of the file. For example, here is the command to feed the /etc/passwd file into
the grep program:
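grep yyang < /etc/passwd    # "yyang" is just an example search string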
Command-Line Shortcuts
Most of the popular Linux/UNIX shells have a tremendous number of shortcuts. Learning and getting
used to the shortcuts can be a huge cultural shock for users coming from the Windows world. This
section explains the most common of the BASH shortcuts and their behaviors.
Filename Expansion
Under UNIX-based shells such as BASH, wildcards on the command line are expanded before being
passed as a parameter to the application. This is in sharp contrast to the default mode of operation for
DOS-based tools, which often have to perform their own wildcard expansion. The UNIX method also
means that you must be careful where you use the wildcard characters. The wildcard characters
themselves in BASH are identical to those in command.com or cmd.exe in the Windows world.
The asterisk (*) matches against all filenames, and the question mark (?) matches against single
characters. If you need to use these characters as part of another parameter for whatever reason, you
can escape them by preceding them with a backslash (\) character. This causes the shell to interpret
the asterisk and question mark as regular characters instead of wildcards.
NOTE Most Linux documentation refers to wildcards as regular expressions. The distinction is
important, since regular expressions are substantially more powerful than just wildcards alone. All of
the shells that come with Linux support regular expressions. You can read more about them in the
shell’s manual page (for example, man bash, man csh, and man tcsh).
Environment Variables as Parameters
Under BASH, you can use environment variables as parameters on the command line. (Although the
Windows command prompt can do this as well, it’s not a common practice and thus is an often-forgotten convention.) For example, issuing the parameter $FOO will cause the value of the FOO
environment variable to be passed rather than the string “$FOO”.
Multiple Commands
Under BASH, multiple commands can be executed on the same line by separating the commands with
semicolons (;). For example, here’s how to execute this sequence of commands (cat and ls) on two
separate lines:
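cat /etc/passwd    # the filename used here is only an example
ls -l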
You could instead type the following:
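cat /etc/passwd ; ls -l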
Because the shell is also a programming language, you can choose to run a second command only if the first
command succeeds. This is achieved by using the double-ampersand (&&) symbol. For example, you
can use the ls command to try to list a file that does not exist in your home directory, and then
execute the date command right after that on the same line:
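ls nonexistent-file.txt && date    # nonexistent-file.txt is an arbitrary name for a file that does not exist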
This command will run the ls command, but that command will fail because the file it is trying to
list does not exist, and, therefore, the date command will not be executed either. But if you switch the
order of commands around, you will notice that the date command will succeed, while the ls
command will fail:
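date && ls nonexistent-file.txt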
Backticks
How’s this for wild? You can take the output of one program and make it the parameter of another
program. Sound bizarre? Well, time to get used to it—this is one of the many useful and innovative
features available in all UNIX shells.
Any text enclosed within backticks (`) is treated as a command to be executed. This allows you to
embed commands within backticks and pass the result as parameters to other commands, for example.
You’ll see this technique used often in this book and in various system scripts. For example, you can
pass the value of a number (a process ID number) stored in a file and then pass that number as a
parameter to the kill command. A typical use of this is for killing (stopping) the Domain Name
System (DNS) server named. When named starts, it writes its process identification (PID) number
into the file /var/run/named/named.pid. Thus, the generic and dirty way of killing the named process
is to look at the number stored in /var/run/named/named.pid using the cat command, and then issue
the kill command with that value. Here’s an example:
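cat /var/run/named/named.pid
kill 253    # 253 stands in for whatever PID the previous command displayed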
One problem with killing the named process in this way is that it cannot be easily automated—we
are counting on the fact that a human will read the value in /var/run/named/named.pid in order to
pass the kill utility the number. Another issue isn’t so much a problem as it is a nuisance: It takes two
steps to stop the DNS server.
Using backticks, however, you can combine the steps into one and do it in a way that can be
automated. The backticks version would look like this:
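kill `cat /var/run/named/named.pid`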
When BASH sees this command, it will first run cat /var/run/named/named.pid and store the
result. It will then run kill and pass the stored result to it. From our point of view, this happens in
one graceful step.
NOTE So far in this chapter, we have looked at features that are internal to BASH (or “BASH built-ins” as they are sometimes called). The remainder of the chapter explores several common commands
accessible outside of BASH.
Documentation Tools
Linux comes with two superbly useful tools for making documentation accessible: man and info.
Currently, a great deal of overlap exists between these two documentation systems, because many
applications are moving their documentation to the info format. This format is considered superior to
man because it allows the documentation to be hyperlinked together in a web-like way, but without
actually having to be written in Hypertext Markup Language (HTML) format.
The man format, on the other hand, has been around for decades. For thousands of Linux
utilities/programs, their man (short for manual) pages are their only source of documentation.
Furthermore, many applications continue to utilize the man format because many other UNIX-like
operating systems (such as Sun Solaris) use it.
Both the man and info documentation systems will be around for a long while to come. It is
highly recommended that you get comfortable with them both.
TIP Many Linux distributions also include a great deal of documentation in the /usr/doc or
/usr/share/doc directory.
The man Command
I mentioned quite early in this book that man pages are documents found online (on the local system)
that cover the use of tools and their corresponding configuration files. The syntax of the man command
is as follows:
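man program_name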
Here, program_name identifies the program in which you’re interested. For example, to view the man
page for the ls utility that we’ve been using, type this:
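man ls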
While reading about Linux and Linux-related information sources (newsgroups and so forth), you
may encounter references to commands followed by numbers in parentheses—for example, ls (1).
The number represents the section of the manual pages (see Table 5-1). Each section covers various
subject areas to accommodate the fact that some tools (such as printf) are commands/functions in
the C programming language as well as command-line commands.
Manual Section   Subject
1                Standard commands, executable programs, or shell commands
1p               POSIX versions of standard commands; the lowercase “p” stands for POSIX
2                Linux kernel system calls (functions provided by the kernel)
3                C library calls
4                Device driver information
5                Configuration files
6                Games
7                Packages
8                System tools

Table 5-1. Man Page Sections
To refer to a specific man page section, simply specify the section number as the first parameter
and then the command as the second parameter. For example, to get the C programmers’ information
on printf, you’d enter this:
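man 3 printf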
To get the plain command-line information (user tools), you’d enter this:
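man 1 printf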
If you don’t specify a section number with the man command, the default behavior is that the
lowest section number gets printed first. Unfortunately, this organization can sometimes be difficult to
use, and as a result, several other alternatives are available.
TIP A handy option to the man command is an -f preceding the command parameter. With this
option, man will search the summary information of all the man pages and list pages matching your
specified command, along with their section number. Here’s an example:
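man -f printf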
The texinfo System
Another common form of documentation is texinfo. Established as the GNU standard, texinfo is a
documentation system similar to the hyperlinked World Wide Web format. Because documents can be
hyperlinked together, texinfo is often easier to read, use, and search in comparison to man pages.
To read the texinfo documents on a specific tool or application, invoke info with the parameter
specifying the tool’s name. For example, to read about the wget program, you’d type this:
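info wget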
In general, you will want to verify whether a man page exists before using info (because there is
still a great deal more information available in man format than in texinfo). On the other hand, some
man pages will explicitly state that the texinfo pages are more authoritative and should be read
instead.
Files, File Types, File Ownership, and File Permissions
Managing files under Linux is different from managing files under Windows 200x/XP/Vista/7, and
radically different from managing files under Windows 95/98. This section covers basic file
management tools and concepts under Linux. We’ll start with specifics on some useful general-purpose commands, and then we’ll step back and look at some background information.
Under Linux (and UNIX in general), almost everything is abstracted to a file. Originally, this was
done to simplify the programmer’s job. Instead of having to communicate directly with device
drivers, special files (which look like ordinary files to the application) are used as a bridge. We
discuss the different types of file categories in the following sections.
Normal Files
Normal files are just that—normal. They contain data and can also be executables. The operating
system makes no assumptions about their contents.
Directories
Directory files are a special instance of normal files. Directory files list the locations of other files,
some of which may be other directories. (This is similar to folders in Windows.) In general, the
contents of directory files won’t be of importance to your daily operations, unless you need to open
and read the file yourself rather than using existing applications to navigate directories. (This would
be similar to trying to read the DOS file allocation table directly rather than using cmd.exe to
navigate directories or using the findfirst/findnext system calls.)
Hard Links
Each file in the Linux file system gets its own i-node. An i-node keeps track of a file’s attributes and
its location on the disk. If you need to be able to refer to a single file using two separate filenames,
you can create a hard link. The hard link will have the same i-node as the original file and will,
therefore, look and behave just like the original. With every hard link that is created, a reference
count is incremented. When a hard link is removed, the reference count is decremented. Until the
reference count reaches zero, the file will remain on disk.
NOTE A hard link cannot exist between two files on separate file systems (or partitions). This is
because the hard link refers to the original file by i-node, and a file’s i-node is only unique on the file
system on which it was created.
Symbolic Links
Unlike hard links, which point to a file by its i-node, a symbolic link points to another file by its
name. This allows symbolic links (often abbreviated symlinks) to point to files located on other file
systems, even other network drives.
Block Devices
Since all device drivers are accessed through the file system, files of type block device are used to
interface with devices such as disks. A block device file has three identifying traits:
It has a major number.
It has a minor number.
When viewed using the ls -l command, it shows b as the first character of the permissions field.
Here’s an example:
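(The listing shown here is only a sketch—the owner, date, and device name will vary from system to system.)
ls -l /dev/sda
brw-rw----. 1 root disk 8, 0 Aug 10 10:30 /dev/sda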
Note the b at the beginning of the file’s permissions; the 8 is the major number, and the 0 is the
minor number.
A block device file’s major number identifies the represented device driver. When this file is
accessed, the minor number is passed to the device driver as a parameter, telling it which device it is
accessing. For example, if there are two serial ports, they will share the same device driver and thus
the same major number, but each serial port will have a unique minor number.
Character Devices
Similar to block devices, character devices are special files that allow you to access devices through
the file system. The obvious difference between block and character devices is that block devices
communicate with the actual devices in large blocks, whereas character devices work one character
at a time. (A hard disk is a block device; a modem is a character device.) Character device
permissions start with a c, and the file has a major number and a minor number. Here’s an example:
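(Again, a sketch only—the details will differ on your system.)
ls -l /dev/ttyS0
crw-rw----. 1 root dialout 4, 64 Aug 10 10:30 /dev/ttyS0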
Named Pipes
Named pipes are a special type of file that allows for interprocess communication. Using the mknod
command, you can create a named pipe file that one process can open for reading and another process
can open for writing, thus allowing the two to communicate with one another. This works especially
well when a program refuses to take input from a command-line pipe, but another program needs to
feed the other some data and you don’t have the disk space for a temporary file.
For a named pipe file, the first character of its file permissions is a p. For example, if a named
pipe called mypipe exists in your present working directory (PWD), a long listing of the named pipe
file would show this:
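(Sketch output; the owner and timestamp will differ.)
ls -l mypipe
prw-r--r--. 1 yyang yyang 0 Aug 10 10:30 mypipe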
Listing Files: ls
Out of necessity, we have been using the ls command in previous sections and chapters of this book,
without properly explaining it. We will look at the ls command and its options in more detail here.
The ls command is used to list all the files in a directory. Of more than 50 available options,
those listed in Table 5-2 are the most commonly used. The options can be used in any combination.
Option for ls   Description
-l              Long listing. In addition to the filename, shows the file size, date/time, permissions, ownership, and group information.
-a              All files. Shows all files in the directory, including hidden files. Names of hidden files begin with a period.
-t              Lists in order of last modified time.
-r              Reverses the listing.
-1              Single-column listing.
-R              Recursively lists all files and subdirectories.

Table 5-2. Common ls Options
To list all files in a directory with a long listing, type this command:
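ls -l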
To list a directory’s nonhidden files that start with the letter A, type this:
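ls A*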
If no such file exists in your working directory, ls prints out a message telling you so.
TIP Linux/UNIX is case-sensitive. For example, a file named thefile.txt is different from a file
named Thefile.txt.
Change Ownership: chown
The chown command allows you to change the ownership of a file to another user. Only the root user
can do this. (Normal users may not assign file ownership or steal ownership from another user.) The
syntax of the command is as follows:
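chown [-R] username filename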
Here, username is the login of the user to whom you want to assign ownership, and filename is
the name of the file in question. The filename may be a directory as well.
The -R option applies when the specified filename is a directory name. This option tells the
command to descend recursively through the directory tree and apply the new ownership, not only to
the directory itself, but also to all of the files and directories within it.
NOTE The chown command supports a special syntax that allows you also to specify a group name
to assign to a file. The format of the command becomes
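chown [-R] username:groupname filename    # a period (.) is also accepted as the separator on some systems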
Change Group: chgrp
The chgrp command-line utility lets you change the group settings of a file. It works much like chown.
Here is the format:
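chgrp [-R] groupname filename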
Here, groupname is the name of the group to which you want to assign filename ownership. The
filename can be a directory as well.
The -R option applies when the specified filename is a directory name. As with chown, the -R
option tells the command to descend recursively through the directory tree and apply the new
ownership, not only to the directory itself, but also to all of the files and directories within it.
Change Mode: chmod
Directories and files within the Linux file system have permissions associated with them. By default,
permissions are set for the owner of the file, the group associated with the file, and everyone else
who can access the file (also known as owner, group, and other, respectively).
When you list files or directories, you see the permissions in the first column of the output.
Permissions are divided into four parts. The first part is represented by the first character of the
permission. Normal files have no special value and are represented with a hyphen (-) character. If the
file has a special attribute, it is represented by a letter. The two special attributes we are most
interested in here are directories (d) and symbolic links (l).
The second, third, and fourth parts of a permission are represented in three-character chunks. The
first part indicates the file owner’s permission. The second part indicates the group permission. The
last part indicates the world permission. In the context of UNIX, “world” means all users in the
system, regardless of their group settings.
Following are the letters used to represent permissions and their corresponding values. When you
combine attributes, you add their values. The chmod command is used to set permission values.
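r (read) = 4
w (write) = 2
x (execute) = 1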
Using the numeric command mode is typically known as the octal permissions, since the value
can range from 0 to 7. To change permissions on a file, you simply add or subtract these values for
each permission you want to apply.
For example, if you want to make it so that only the user (owner) can have full access (RWX) to a
file called foo, you would type this:
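chmod 700 foo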
What is important to note is that using the octal mode replaces any permissions that were
previously set. So if a file in the /usr/local directory was tagged with a SetUID bit, and you ran the
command chmod -R 700 /usr/local, that file will no longer be a SetUID program.
If you want to change certain bits, you should use the symbolic mode of chmod. This mode turns
out to be much easier to remember, and you can add, subtract, or overwrite permissions.
The symbolic form of chmod allows you to set the bits of the owner, the group, or others. You can
also set the bits for all. For example, if you want to change a file called foobar.sh so that it is
executable for the owner, you can run the following command:
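chmod u+x foobar.sh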
If you want to change the group’s bit to execute also, use the following:
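chmod ug+x foobar.sh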
If you need to specify different permissions for others, just add a comma and its permission
symbols. For example, to make the foobar.sh file executable for the user and the group, but also
remove read, write, and executable permissions for all others, you could try this:
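chmod ug+x,o-rwx foobar.sh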
If you do not want to add or subtract a permission bit, you can use the equal (=) sign instead of a
plus (+) sign or minus (–) sign. This will write the specific bits to the file and erase any other bit for
that permission. The preceding examples used + to add the execute bit to the User and Group fields. If
you want only the execute bit, you would replace the + with =. You can also use a fourth character: a.
This will apply the permission bits to all the fields.
The following list shows the most common combinations of the three permissions. Other
combinations, such as -wx, also exist, but they are rarely used.
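--- = 0 (no permissions)
--x = 1 (execute only)
r-- = 4 (read only)
r-x = 5 (read and execute)
rw- = 6 (read and write)
rwx = 7 (read, write, and execute)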
For each file, three of these three-letter chunks are grouped together. The first chunk represents
the permissions for the owner of the file, the second chunk represents the permissions for the file’s
group, and the last chunk represents the permissions for all users on the system. Table 5-3 shows
some permission combinations, their numeric equivalents, and their descriptions.
Table 5-3. File Permissions
File Management and Manipulation
This section covers the basic command-line tools for managing files and directories. Most of this will
be familiar to anyone who has used a command-line interface—same old functions, but new
commands to execute.
Copy Files: cp
The cp command is used to copy files. It has a substantial number of options. See its man page for
additional details. By default, this command works silently, displaying status information only if an
error condition occurs. Following are the most common options for cp:
Option   Description
-f       Forces copy; does not ask for verification
-i       Interactive copy; before each file is copied, verifies with user
-R, -r   Copies directories recursively
First, let’s use the touch command to create an empty file called foo.txt in the user yyang’s home
directory:
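touch /home/yyang/foo.txt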
Then use the cp (copy) command to copy foo.txt to foo.txt.html:
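cp /home/yyang/foo.txt /home/yyang/foo.txt.html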
To copy all files in the current directory ending in .html to the /tmp directory, type this:
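cp *.html /tmp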
To interactively recopy all files in the current directory ending in .html to the /tmp directory, type
this command:
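cp -i *.html /tmp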
You will notice that using the interactive (-i) option with cp forces it to prompt or warn you
before overwriting existing files with the same name in the destination. To continue the copy and
overwrite the existing file at the destination, type yes or y at the prompt like this:
Move Files: mv
The mv command is used to move files from one location to another. Files can be moved across
partitions/file systems as well. Moving files across partitions involves a copy operation, and as a
result, the move command can take longer. But you will find that moving files within the same file
system is almost instantaneous. Following are the most common options for mv:
Option   Description
-f       Forces move
-i       Interactive move
To move a file named foo.txt.html from /tmp to your present working directory, for example,
you’d use this command:
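mv /tmp/foo.txt.html .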
NOTE That last dot (.) is not a typo—it literally means “this directory.”
Besides being used for moving files and folders around the system, mv can also be used simply as
a renaming tool.
To rename the file foo.txt.html to foo.txt.htm, type the following:
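mv foo.txt.html foo.txt.htm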
Link Files: ln
The ln command lets you establish hard links and soft links (see “Files, File Types, File Ownership,
and File Permissions” earlier in this chapter). The general format of ln is as follows:
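ln [options] original_file new_name    # a simplified sketch; new_name becomes a link to original_file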
Although ln has many options, you’ll rarely need to use most of them. The most common option, -s, creates a symbolic link instead of a hard link.
To create a symbolic link called link-to-foo.txt that points to the original file called foo.txt,
issue this command:
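ln -s foo.txt link-to-foo.txt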
Find a File: find
The find command lets you search for files using various search criteria. Like the tools we have
already discussed, find has a large number of options that you can read about in its man page. Here
is the general format of find:
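find start_directory [options]    # a simplified sketch of the syntax; see the man page for the full grammar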
Here, start_directory is the directory from which the search should start.
To find all files in the current directory (that is, the “.” directory) that have not been accessed in
at least seven days, you’d use the following command:
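find . -atime +7    # +7 matches files last accessed more than seven days ago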
Type this command to find all files in your present working directory whose names are core and
then delete them (that is, automatically run the rm command):
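find . -name core -exec rm {} \;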
TIP The syntax for the -exec option with the find command as used here can be difficult to
remember, so you can also use the xargs method instead of the exec option used in this example.
Using xargs, the command would then be written like so:
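find . -name core | xargs rm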
To find all files in your PWD whose names end in .txt (that is, files that have the .txt extension)
and are also less than 100 kilobytes (K) in size, issue this command:
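find . -name "*.txt" -size -100k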
To find all files in your PWD whose names end in .txt (that is, files that have the .txt extension)
and are also greater than 100K in size, issue this command:
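find . -name "*.txt" -size +100k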
File Compression: gzip
In the original distributions of UNIX, the tool to compress files was appropriately called compress.
Unfortunately, the algorithm was patented by someone hoping to make a great deal of money. Instead
of paying out, most sites sought out and found another compression tool with a patent-free algorithm:
gzip. Even better, gzip consistently achieves better compression ratios than compress does.
Another bonus: Recent changes have allowed gzip to uncompress files that were compressed using
the legacy compress command.
NOTE The filename extension or suffix usually identifies a file compressed with gzip. These files
typically end in .gz (files compressed with compress end in .Z).
Note that gzip compresses the file in place, meaning that after the compression process, the
original file is removed, and the only thing left is the compressed file.
To compress a file named foo.txt.htm in your PWD, type this:
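gzip foo.txt.htm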
And then to decompress it, use gzip again with the -d option:
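gzip -d foo.txt.htm.gz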
Issue this command to compress all files ending in .htm in your PWD using the best compression
possible:
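gzip -9 *.htm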
bzip2
The bzip2 tool uses a different compression algorithm that usually turns out smaller files than those
compressed with the gzip utility, but it uses semantics that are similar to gzip. In other words, bzip2
offers better compression ratios in comparison to gzip.
File archives compressed using the bzip2 utility usually have the .bz2 extension or suffix.
For more information, read the man page on bzip2.
Create a Directory: mkdir
The mkdir command in Linux is identical to the same command in other flavors of UNIX, as well as
those in MS-DOS. An often-used option of the mkdir command is the -p option. This option will
force mkdir to create parent directories if they don’t exist already. For example, if you need to create
/tmp/bigdir/subdir/mydir and the only directory that exists is /tmp, using -p will cause bigdir and
subdir to be automatically created along with mydir.
To create a single directory called mydir, use this command:
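mkdir mydir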
Create a directory tree like bigdir/subdir/finaldir in your PWD:
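mkdir -p bigdir/subdir/finaldir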
Remove a Directory: rmdir
The rmdir command offers no surprises for those familiar with the DOS version of the command; it
simply removes an existing directory. This command also accepts the -p parameter, which removes
parent directories as well.
For example, to remove a directory called mydir, you’d type this:
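rmdir mydir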
If you want to get rid of all the directories from bigdir to finaldir that were created earlier, you’d
issue this command:
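rmdir -p bigdir/subdir/finaldir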
TIP You can also use the rm command with the -r option to delete directories.
Show Present Working Directory: pwd
Inevitably, you will find yourself at the terminal or shell prompt of an already logged-in workstation
and you won’t know where you are in the file system hierarchy or directory tree. To get this
information, you need the pwd command. Its only task is to print the current working directory. To
display your current working directory, use this command:
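pwd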
Tape Archive: tar
If you are familiar with the PKZip program, you are accustomed to the fact that the compression tool
not only reduces file size but also consolidates files into compressed archives. Under Linux, this
process is separated into two tools: gzip and tar.
The tar command combines multiple files into a single large file. It is separate from the
compression tool, so it allows you to select which compression tool to use or whether you even want
compression. In addition, tar is able to read and write to devices, thus making it a good tool for
backing up to tape devices.
NOTE Although the tape archive, or tar, program includes the word “tape,” it isn’t necessary to
read or write to a tape drive when you’re creating archives. In fact, you’ll rarely use tar with a tape
drive in day-to-day situations (traditional backups aside). When the program was originally created,
limited disk space meant that tape was the most logical place to put archives. Typically, the -f option
in tar would be used to specify the tape device file, rather than a traditional UNIX file. You should
be aware, however, that you can still tar straight to a device.
Here’s the syntax for the tar command:
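tar [options] archive_file file_or_directory [...]    # a simplified sketch; see the man page for the full syntax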
Some of the options for the tar command are shown here:
Option   Description
-c       Creates a new archive
-t       Views the contents of an archive
-x       Extracts the contents of an archive
-f       Specifies the name of the file (or device) in which the archive is located
-v       Provides verbose descriptions during operations
-j       Filters the archive through the bzip2 compression utility
-z       Filters the archive through the gzip compression utility
In order to see sample usage of the tar utility, first create a folder called junk in the PWD that
contains some empty files named 1, 2, 3, 4:
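mkdir junk
touch junk/1 junk/2 junk/3 junk/4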
Now create an archive called junk.tar containing all the files in the folder called junk that you
just created by typing this:
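tar -cf junk.tar junk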
Create another archive called 2junk.tar containing all the files in the junk folder, but this time,
add the -v (verbose) option to show what is happening as it happens:
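tar -cvf 2junk.tar junk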
NOTE The archives created in these examples are not compressed in any way. The files and
directory have only been combined into a single file.
To create a gzip-compressed archive called 3junk.tar.gz containing all of the files in the junk
folder and to show what is happening as it happens, issue this command:
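tar -cvzf 3junk.tar.gz junk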
To extract the contents of the gzipped tar archive created here and be verbose about what is
being done, issue this command:
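tar -xvzf 3junk.tar.gz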
TIP The tar command is one of the few Linux/UNIX utilities that care about the order with which
you specify its options. If you issued the preceding tar command as tar -xvfz 3junk.tar.gz, the
command would fail, because the -f option was not immediately followed by a filename.
If you like, you can also specify a physical device to tar to and from. This is handy when you
need to transfer a set of files from one system to another and for some reason you cannot create a file
system on the device. (Or, sometimes, it’s just more entertaining to do it this way.)
Assuming you have a floppy disk drive attached to your system, you can try creating an archive on
the first floppy device (/dev/fd0), by typing this:
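tar -cvzf /dev/fd0 junk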
NOTE The command tar -cvzf /dev/fd0 will treat the disk as a raw device and erase anything
that is already on it.
To pull that archive off of a disk, you would type this:
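tar -xvzf /dev/fd0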
Concatenate Files: cat
The cat program fills an extremely simple role: it displays files. More creative things can be done
with it, but nearly all of its usage will be in the form of simply displaying the contents of text files—
much like the type command under DOS. Because multiple filenames can be specified on the
command line, it’s possible to concatenate files into a single, large, continuous file. This is different
from tar in that the resulting file has no control information to show the boundaries of different files.
To display the /etc/passwd file, use this command:
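cat /etc/passwd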
To display the /etc/passwd file and the /etc/group file, issue this command:
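cat /etc/passwd /etc/group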
Type this command to concatenate /etc/passwd with /etc/group and send the output into the file
users-and-groups.txt:
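cat /etc/passwd /etc/group > users-and-groups.txt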
To append the contents of the file /etc/hosts to the users-and-groups.txt file you just created,
type this:
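cat /etc/hosts >> users-and-groups.txt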
TIP If you want to cat a file in reverse, you can use the tac command.
Display a File One Screen at a Time: more
The more command works in much the same way the DOS version of the program does. It takes an
input file and displays it one screen at a time. The input file can come either from its stdin or from a
command-line parameter. Additional command-line parameters, though rarely used, can be found in
the man page.
To view the /etc/passwd file one screen at a time, use this command:
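more /etc/passwd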
To view the directory listing generated by the ls command one screen at a time, enter this:
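ls | more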
Disk Utilization: du
You will often need to determine where and by whom disk space is being consumed, especially when
you’re running low on it! The du command allows you to determine the disk utilization on a directory-by-directory basis.
Following are some of the options available.
Option   Description
-c       Produces a grand total at the end of the run
-h       Prints sizes in human-readable format
-k       Prints sizes in kilobytes rather than block sizes (note that under Linux, one block is equal to 1K, but this is not true for all forms of UNIX)
-s       Summarizes; prints only a total for each argument
To display the total amount of space being used by all the files and directories in your PWD in
human-readable format, use this command:
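du -sh .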
NOTE You can use the pipe feature of the shell, discussed in the previous section of this chapter, to
combine the du command with some other utilities (such as sort and head) to gather some interesting
statistics about the system. The sort command is used for sorting lines of text in alphanumeric or
numeric order, and the head command is used for displaying a specified number of lines of text on
the standard output (screen).
So, for example, to combine du, sort, and head together to list the 12 largest files and directories
taking up space, under the /home/yyang directory, you could run this:
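du -sk /home/yyang/* | sort -rn | head -n 12    # one possible combination of the three commands; the exact options can vary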
Show the Directory Location of a File: which
The which command searches your entire path to find the name of an executable specified on the
command line. If the file is found, the command output includes the actual path to the file.
Use the following command to find out in which directory the binary for the rm command is
located:
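which rm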
You might find this similar to the find command. The difference here is that since which
searches only the path, it is much faster. Of course, it is also much more limited in features than find,
but if all you’re looking for is a program path, you’ll find which to be a better/faster choice.
Locate a Command: whereis
The whereis tool searches your path and displays the name of the program and its absolute directory,
the source file (if available), and the man page for the command (again, if available).
To find the location of the program, source, and manual page for the grep command, use this:
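whereis grep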
Disk Free: df
The df program displays the amount of free space partition by partition (or volume by volume). The
drives/partitions must be mounted in order to get this information. Network File System (NFS)
information can be gathered this way as well. Some parameters for df are listed here; additional
(rarely used) options are listed in the df manual page.
Option   Description
-h       Generates free-space amount in human-readable numbers rather than free blocks
-l       Lists only the locally mounted file systems; does not display any information about network-mounted file systems
To show the free space for all locally mounted drives, use this command:
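df -l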
To show the free space in a human-readable format for the file system in which your current
working directory is located, enter this:
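df -h .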
To show the free space in a human-readable format for the file system on which /tmp is located,
type this command:
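df -h /tmp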
Synchronize Disks: sync
Like most other modern operating systems, Linux maintains a disk cache to improve efficiency. The
drawback, of course, is that not everything you want written to disk may have been written at any
given moment.
To force the disk cache to be written out to disk, you use the sync command. If sync detects that
writing the cache out to disk has already been scheduled, the kernel is instructed to flush the cache
immediately. This command takes no command-line parameters. Type this command to ensure the
disk cache has been flushed:
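sync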
NOTE Manually issuing the sync command is rarely necessary anymore, since the Linux kernel and
other subcomponents do a good job of it on their own. Furthermore, if you properly shut down or
reboot the system, all file systems will be properly unmounted and data synced to disk.
Moving a User and Its Home Directory
This section will demonstrate how to put together some of the topics and utilities covered so far in
this chapter (as well as some new ones). The elegant design of Linux allows you to combine simple
commands to perform advanced operations.
Sometimes in the course of administration you might have to move a user and the user’s files
around on the system. This section will cover the process of moving a user’s home directory. In this
section, you are going to move the user named project5 from his default home directory
/home/project5 to /export/home/project5. You will also have to set the proper permissions and
ownership of the user’s files and directories so that the user can access them.
Unlike the previous exercises, which were performed as a regular user (the user yyang), you will
need superuser privileges to perform the steps in this exercise. A consolidated sketch of the full command sequence appears after the numbered steps.
1. Use the su command to change your identity temporarily from the currently logged-in user to the
superuser (root). You will need to provide root’s password when prompted. At the virtual
terminal prompt, type:
2. Create the user to be used for this project. The username is project5. Type the following:
3. Use the grep command to view the entry for the user you created in the /etc/passwd file:
4. Use the ls command to display a listing of the user’s home directory:
5. Check the total disk space being used by the user:
6. Use the su command to change your identity temporarily from the root user to the newly
created project5 user:
7. As user project5, view your present working directory:
8. As user project5, create some empty files:
9. Go back to being the root user by exiting out of project5’s profile:
10. Create the /export directory that will house the user’s new home:
11. Now use the tar command to archive and compress project5’s current home directory
(/home/project5) and untar and decompress it into its new location:
TIP The dashes (-) you used here with the tar command force it to send its output to standard output
(stdout) first and then receive its input from standard input (stdin).
12. Use the ls command to ensure that the new home directory was properly created under the
/export directory:
13. Make sure that the project5 user account has complete ownership of all the files and
directories in his new home:
14. Now delete project5’s current home directory:
15. We are almost done. Try to assume the identity of project5 again, temporarily:
One more thing left to do. We have deleted the user’s home directory (/home/project5). The
path to the user’s home directory is specified in the /etc/passwd file (see Chapter 4), and
since we already deleted that directory, the su command helpfully complained.
16. Exit out of project5’s profile using the exit command:
17. Now we’ll use the usermod command to update the /etc/passwd file automatically with the
user’s new home directory:
NOTE On a system with SELinux enabled, you might get a warning about not being able to relabel
the home directory. You can ignore this warning for now.
18. Use the su command again to become project5 temporarily:
19. While logged in as project5, use the pwd command to view your present working directory:
The output shows that our migration worked out well.
20. Exit out of project5’s profile to become the root user, and then delete the user called project5
from the system:
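Here is a consolidated sketch of the command sequence for steps 1 through 20. It assumes the paths given in the steps; the tar pipeline in step 11 and the useradd and chown options are shown in one common form and may differ slightly on your distribution:
su -
useradd -m project5                                                   # step 2; -m ensures a home directory is created
grep project5 /etc/passwd                                             # step 3
ls -l /home/project5                                                  # step 4
du -sh /home/project5                                                 # step 5
su - project5                                                         # step 6
pwd                                                                   # step 7
touch file1 file2 file3                                               # step 8; arbitrary empty files
exit                                                                  # step 9
mkdir -p /export/home                                                 # step 10
(cd /home && tar -czf - project5) | (cd /export/home && tar -xzf -)   # step 11
ls -l /export/home/project5                                           # step 12
chown -R project5 /export/home/project5                               # step 13
rm -rf /home/project5                                                 # step 14
su - project5                                                         # step 15; su will warn that the home directory is missing
exit                                                                  # step 16
usermod -d /export/home/project5 project5                             # step 17
su - project5                                                         # step 18
pwd                                                                   # step 19; should now show /export/home/project5
exit                                                                  # step 20
userdel -r project5                                                   # step 20; remove the user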
List Processes: ps
The ps command lists all the processes in a system, their state, size, name, owner, CPU time, wall
clock time, and much more. Many command-line parameters are available; those most often used are
described in Table 5-4.
Option   Description
-a       Shows all processes with a controlling terminal, not just the current user’s processes
-r       Shows only running processes (see the description of process states later in this section)
-x       Shows processes that do not have a controlling terminal
-u       Shows the process owners
-f       Displays parent/child relationships among processes
-l       Produces a list in long format
-w       Shows a process’s command-line parameters (up to half a line)
-ww      Shows a process’s command-line parameters (unlimited width fashion)
Table 5-4. Common ps Options
The most common set of parameters used with the ps command is auxww. These parameters show
all the processes (regardless of whether they have a controlling terminal), each process’s owners,
and all the processes’ command-line parameters. Let’s examine some sample output of an invocation
of ps auxww:
The first line of the output provides column headers for the listing. The column headers are
described in Table 5-5.
ps Column   Description
USER        The owner of the process.
PID         Process identification number.
%CPU        Percentage of the CPU taken up by a process. Note: For a system with multiple processors, this column will add up to more than 100 percent.
%MEM        Percentage of memory taken up by a process.
VSZ         Amount of virtual memory a process is taking.
RSS         Amount of actual (resident) memory a process is taking.
TTY         Controlling terminal for a process. A question mark in this column means the process is no longer connected to a controlling terminal.
STAT        State of the process. These are the possible states:
            S  Process is sleeping. All processes that are ready to run (that is, being multitasked, and the CPU is currently focused elsewhere) will be asleep.
            R  Process is actually on the CPU.
            D  Uninterruptible sleep (usually I/O related).
            T  Process is being traced by a debugger or has been stopped.
            Z  Process has gone zombie. This means either the parent process has not acknowledged the death of its child using the wait system call, or the parent was improperly killed, and until the parent is completely killed, the init process (see Chapter 8) cannot kill the child itself. A zombied process usually indicates poorly written software.
            In addition, the STAT entry for each process can take one of the following modifiers:
            W  No resident pages in memory (it has been completely swapped out).
            <  High-priority process.
            N  Low-priority task.
            L  Pages in memory are locked there (usually signifying the need for real-time functionality).
START       Date the process was started.
TIME        Amount of time the process has spent on the CPU.
COMMAND     Name of the process and its command-line parameters.
Table 5-5. ps Output Fields
Show an Interactive List of Processes: top
The top command is an interactive version of ps. Instead of giving a static view of what is going on,
top refreshes the screen with a list of processes every 2–3 seconds (user-adjustable). From this list,
you can reprioritize processes or kill them. Figure 5-1 shows a top screen.
Figure 5-1. top output
The top program’s main disadvantage is that it’s a CPU hog. On a congested system, this program
tends to complicate system management issues. Users start running top to see what’s going on, only to
find several other people running the program as well, slowing down the overall system even more.
By default, top is shipped so that everyone can use it. You might find it prudent, depending on
your environment, to restrict top’s use to root only. To do this, as root, change the program’s
permissions with the following command:
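chmod 0700 /usr/bin/top    # assuming top is installed at /usr/bin/top on your system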
After running the command, regular users will get an error output similar to the next one if they try
running the top utility:
If you change your mind and decide to be a benevolent System Administrator and allow your
users to run the top utility, you can restore the original permissions by running:
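chmod 0755 /usr/bin/top    # 0755 is the typical default; adjust if your system shipped with different permissions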
Send a Signal to a Process: kill
This program’s name is misleading: It doesn’t really kill processes. What it does is send signals to
running processes. The operating system, by default, supplies each process with a standard set of
signal handlers to deal with incoming signals. From a system administrator’s standpoint, the most
common handlers are for signals number 9 and 15, kill process and terminate process, respectively.
When kill is invoked, it requires at least one parameter: the process identification number (PID)
as derived from the ps command. When passed only the PID, kill sends signal 15. Some programs
intercept this signal and perform a number of actions so that they can shut down cleanly. Others just
stop running in their tracks. Either way, kill isn’t a guaranteed method for making a process stop.
Signals
An optional parameter available for kill is -n, where the n represents a signal number. As system
administrators, we are most interested in the signals 9 (kill) and 1 (hang up).
The kill signal, 9, is the impolite way of stopping a process. Rather than asking a process to stop,
the operating system simply kills the process. The only time this will fail is when the process is in the
middle of a system call (such as a request to open a file), in which case the process will die once it
returns from the system call.
The hang-up signal, 1, is a bit of a throwback to the VT100 terminal days of UNIX. When a user’s
terminal connection dropped in the middle of a session, all of that terminal’s running processes would
receive a hang-up signal (often called a SIGHUP or HUP). This gave the processes an opportunity to
perform a clean shutdown or, in the case of background processes, to ignore the signal. These days, a
HUP is used to tell certain server applications to go and reread their configuration files (you’ll see
this in action in several of the later chapters).
Security Issues
The ability to terminate a process is obviously a powerful one, thereby making security precautions
important. Users may kill only processes they have permission to kill. If non-root users attempt to
send signals to processes other than their own, error messages are returned. The root user is the
exception to this limitation; root may send signals to all processes in the system. Of course, this
means root needs to exercise great care when using the kill command.
Examples Using the kill Command
NOTE The following examples are arbitrary; the PIDs used are completely fictitious and will be
different on your system.
Use this command to terminate a process with PID number 205989:
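kill 205989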
For an almost guaranteed kill of process number 593999, issue this command:
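kill -9 593999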
Type the following to send the HUP signal to the init program (which is always PID 1):
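kill -1 1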
This command does the same thing:
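kill -HUP 1    # the signal can also be given by name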
TIP To get a listing of all the possible signals available, along with their numeric equivalents, issue
the kill -l command.
Miscellaneous Tools
The following tools don’t fall into any specific category covered in this chapter, but they all make
important contributions to daily system administration chores.
Show System Name: uname
The uname program produces some system details that can be helpful in several situations. Perhaps
you’ve managed to log into a dozen different computers remotely and have lost track of where you
are! This tool is also helpful for script writers, because it allows them to change the path of a script
according to the system information.
Here are the command-line parameters for uname:
Option   Description
-m       Prints the machine hardware type (such as i686 for Pentium Pro and better architectures)
-n       Prints the machine’s hostname
-r       Prints the operating system’s release
-s       Prints the operating system’s name
-v       Prints the operating system’s version
-a       Prints all of the above
To get the operating system’s name and release, enter the following command:
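uname -s -r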
The -s option might seem wasted here (after all, we know this is Linux), but this parameter
proves quite useful on almost all UNIX-like operating systems as well. For example, on a Silicon
Graphics, Inc. (SGI) workstation terminal, uname -s will return IRIX, and it will return SunOS at a
Sun workstation. Folks who work in heterogeneous environments often write scripts that will behave
differently, depending on the OS, and uname with -s is a consistent way to determine that
information.
TIP Another command that offers distribution-specific information is the lsb_release command.
Specifically, it can show Linux Standard Base (LSB)–related information, such as the distribution
name, distribution code name, release or version information, etc. A common option used with the
lsb_release command is -a. For example, lsb_release -a.
Who Is Logged In: who
On multiuser systems that have many user accounts that can be simultaneously logged in locally or
remotely, the system administrator may need to know who is logged on.
A report showing all logged on users as well as other useful statistics can be generated by using
the who command:
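who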
A Variation on who: w
The w command displays the same information that who displays, plus a whole lot more. The details
of the report include who is logged in, what their terminal is, from where they are logged in, how long
they’ve been logged in, how long they’ve been idle, and their CPU utilization. The top of the report
also gives you the same output as the uptime command.
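To see it in action, simply type:
w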
Switch User: su
This command was used earlier on, when we moved a user and its home directory. Once you have
logged into the system as one user, you need not log out and back in again in order to assume another
identity (root user, for instance). Instead, use the su command to switch. This command has few
command-line parameters.
Running su without any parameters will automatically try to make you the root user. You’ll be
prompted for the root password, and, if you enter it correctly, you will drop down to a root shell. If
you are already the root user and want to switch to another ID, you don’t need to enter the new
password when you use this command.
For example, if you’re logged in as the user yyang and want to switch to the root user, type this
command:
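su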
You will be prompted for root’s password.
If you’re logged in as root and want to switch to, say, user yyang, enter this command:
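su yyang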
You will not be prompted for yyang’s password.
The optional hyphen (-) parameter tells su to switch identities and run the login scripts for that
user. For example, if you’re logged in as root and want to switch over to user yyang with all of his
login and shell configurations, type this command:
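su - yyang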
TIP The sudo command is used extensively (instead of su) on Debian-based distributions such as
Ubuntu to execute commands as another user. When configured properly, sudo offers finer grained
controls than su does.
Editors
Editors are easily among the bulkiest of common tools, but they are also the most useful. Without
them, making any kind of change to a text file would be a tremendous undertaking. Regardless of your
Linux distribution, a few editors will have been installed for you. You should take a few moments to get
comfortable with them.
NOTE Different Linux distributions favor some editors over others. As a result, you might have to
find and install your preferred editor if it doesn’t come installed with your distribution by default.
vi
The vi editor has been around on UNIX-based systems since the 1970s, and its interface shows it. It is
arguably one of the last editors to use a separate command mode and data entry mode; as a result,
most newcomers find it unpleasant to use. But before you give vi the cold shoulder, take a moment to
get comfortable with it. In difficult situations, you might not have the luxury of a pretty graphical
editor at your disposal, but you will find that vi is ubiquitous across all Linux/UNIX systems.
The version of vi that ships with most Linux distributions is vim (VI iMproved). It has a lot of
what made vi popular in the first place and many features that make it useful in today’s typical
environments (including a graphical interface if the X Window System is running).
To start vi, simply type this:
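vi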
The vim editor has an online tutor that can help you get started with it quickly. To launch the tutor,
type this:
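vimtutor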
Another easy way to learn more about vi is to start it and enter :help. If you ever find yourself
stuck in vi, press the ESC key several times and then type :q! to force an exit without saving. If you
instead want to save the file, type :wq.
emacs
It has been argued that emacs can easily be an entire operating system all by itself! It’s big, feature-rich, expandable, programmable, and all-around amazing. If you’re coming from a GUI background,
you’ll probably find emacs a pleasant environment to work with at first. On its face, it works like
Notepad in terms of its interface. Yet underneath is a complete interface to the GNU development
environment, a mail reader, a news reader, a web browser, and, believe it or not, it even has a cute
built-in help system that’s disguised as your very own personal psychotherapist! You can have some
“interesting” conversations with this automated/robotic psychotherapist.
To start emacs, simply type the following:
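emacs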
Once emacs has started, you can visit the therapist by pressing ESC-X and then typing doctor. To
get help using emacs, press CTRL-H.
joe
The joe editor is a simple text editor. It works much like Notepad and offers onscreen help. Anyone who
remembers the original WordStar command set will be pleasantly surprised to see that all those brain
cells hanging on to CTRL-K commands can be put back to use with joe.
To start joe, simply type the following:
joe
pico
The pico program is another editor inspired by simplicity. Typically used in conjunction with the
Pine e-mail reading system, pico can also be used as a stand-alone editor. Like joe, it can work in a
manner similar to Notepad, but pico uses its own set of key combinations. Thankfully, all available
key combinations are always shown at the bottom of the screen.
To start pico, simply type this:
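pico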
TIP The pico program will perform automatic word wraps. If you’re using it to edit configuration
files, for example, be careful that it doesn’t word-wrap a line into two lines if it should really be
parsed as a single line.
Summary
This chapter discussed Linux’s command-line interface, the Bourne Again Shell (BASH), many
command-line tools, and a few editors. As you continue through this book, you’ll find many
references to the information in this chapter, so be sure that you get comfortable with working at the
command line. You might find it a bit annoying at first, especially if you are accustomed to using a
GUI for performing many of the basic tasks mentioned here—but stick with it. You might even find
yourself eventually working faster at the command line than with the GUI!
Obviously, this chapter can’t cover all the command-line tools available as part of your default
Linux installation. It is highly recommended that you take some time to look into some of the reference
books available. In addition, there is a wealth of texts on shell scripting/programming at various
levels and from various points of view. Get whatever suits you; shell scripting/programming is a skill
well worth learning, even if you don’t do system administration.
And above all else, R.T.F.M., that is, Read The Fine Manual (documentation).
CHAPTER 6
Booting and Shutting Down
As the complexity in modern-day operating systems has grown, so has the complexity of the
startup and shutdown process. Anyone who has undergone the transition from a straight
DOS-based system to a Microsoft Windows–based system has experienced this firsthand. Not only is
the core operating system brought up and shut down, but an impressive list of
services and processes must also be started and stopped. Like Windows, Linux comprises an
impressive list of services (some critical and others less so) that can be turned on as part of the boot
procedure.
In this chapter, we discuss the bootstrapping of the Linux operating system with GRUB and Linux
Loader (LILO). We then step through the processes of starting up and shutting down the Linux
environment. We discuss the scripts that automate parts of this process, as well as modifications that
may sometimes be desirable in the scripts. We finish up with coverage of a few odds and ends that
pertain to booting up and shutting down.
NOTE Apply a liberal dose of common sense in following the practical exercises in this chapter on
a real/production system. As you experiment with modifying startup and shutdown scripts, bear in
mind that it is possible to bring your system to a nonfunctional state that cannot be recovered by mere
rebooting. Don’t mess with a production system; if you must, first make sure that you back up all the
files you want to change, and most importantly, have a boot disk ready (or some other boot medium)
that can help you recover.
Boot Loaders
For any operating system to boot on standard PC hardware, you need what is called a boot loader. If
you have only dealt with Windows on a PC, you have probably never needed to interact directly with
a boot loader. The boot loader is the first software program that runs when a computer starts. It is
responsible for handing over control of the system to the operating system.
Typically, the boot loader will reside in the Master Boot Record (MBR) of the disk, and it knows
how to get the operating system up and running. The main choices that come with Linux distributions
are GRUB and, much less commonly, LILO. This chapter focuses on GRUB, because it is the most
common boot loader that ships with the newer distributions of Linux and because it offers a lot more
features than LILO. GRUB currently comes in two versions—GRUB Legacy and GRUB version 2
(GRUB 2).
A brief mention of LILO is made for historical reasons only. Both LILO and GRUB can be
configured to boot other non-native operating systems.
NOTE You might notice that GRUB is pre-1.0 or pre-2.0 release software (version 0.98, 0.99, 1.98,
1.99, and so on)—also known as alpha software. Don’t be frightened by this. Considering the fact that
major commercial Linux vendors use it in their distributions, it is deemed quality “alpha” code. The
older stable version of GRUB is known as GRUB Legacy. And the next-generation, bleeding-edge
version that will replace GRUB Legacy is simply known as GRUB 2.
GRUB Legacy
Most modern Linux distributions use GRUB as the default boot loader during installation, including
Fedora, Red Hat Enterprise Linux (RHEL), openSUSE, Debian, Mandrake, CentOS, Ubuntu, and a
host of other Linux distributions. GRUB aims to be compliant with the Multiboot Specification and
offers many features.
The GRUB boot process happens in stages. Each stage is taken care of by special GRUB image
files, with each preceding stage helping the next stage along. Two of the stages are essential, and the
other stages are optional and dependent on the particular system setup.
CAUTION Please remember that most of the information and all of the sample exercises in this
entire GRUB Legacy section apply to systems running legacy versions of GRUB. This includes
GRUB version 0.99 and earlier.
On RPM-based distros, you can check your GRUB version by running:
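rpm -qa | grep grub      # one way; lists the installed GRUB package(s) and version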
On Debian-based systems, you can check your version of GRUB by running:
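dpkg -l | grep grub      # one way; lists the installed GRUB package(s) and version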
If the output shows you are running GRUB 2 or newer (version 1.98, 1.99, and so on), you will not be
able to follow along with the exercises without first downgrading to GRUB Legacy.
Stage 1
The image file used in this stage is essential and is used for booting up GRUB in the first place. It is
usually embedded in the MBR of a disk or in the boot sector of a partition. The file used in this stage
is appropriately named stage1. A Stage 1 image can next either load Stage 1.5 or load Stage 2
directly.
Stage 2
The Stage 2 images actually consist of two types of images: the intermediate (optional) image and the
actual stage2 image file. To blur things further, the optional images are called Stage 1.5. The Stage
1.5 images serve as a bridge between Stage 1 and Stage 2. The Stage 1.5 images are file system–
specific; that is, they understand the semantics of one file system or the other.
The Stage 1.5 images have names of the form x_stage1_5, where x can be a file system of
type e2fs, ReiserFS, FAT, JFS, MINIX, XFS, and so on. For example, the Stage 1.5 image that will
be required to load an OS that resides on a File Allocation Table (FAT) file system will have a name
similar to fat_stage1_5. The Stage 1.5 images allow GRUB to access several file systems. When
used, the Stage 1.5 image helps to locate the Stage 2 image as a file within the file system.
Next comes the actual stage2 image. It is the core of GRUB. It contains the actual code to load the
kernel that boots the OS, it displays the boot menu, and it also contains the GRUB shell from which
GRUB commands can be entered. The GRUB shell is interactive and helps to make GRUB flexible.
For example, the shell can be used to boot items that are not currently listed in GRUB’s boot menu or
to bootstrap the OS from an alternative supported medium.
Other types of Stage 2 images are the stage2_eltorito image, the nbgrub image, and the pxegrub
image. The stage2_eltorito image is a boot image for CD-ROMs. The nbgrub and pxegrub images
are both network-type boot images that can be used to bootstrap a system over the network (using
Bootstrap Protocol [BOOTP], Dynamic Host Configuration Protocol [DHCP], Preboot Execution
Environment [PXE], Etherboot, or the like). A quick listing of the contents of the /boot/grub directory
of most Linux distributions will show some of the GRUB images.
Conventions Used in GRUB
GRUB has its own special way of referring to devices (CD-ROM drives, floppy drives, hard disk
drives, and so on). The device name has to be enclosed in parentheses: “( )”. GRUB starts numbering
its devices and partitions from 0, not from 1. Therefore, GRUB would refer to the master Integrated
Drive Electronics (IDE) hard drive on the primary IDE controller as (hd0), where “hd” means “hard
disk” drive and the number 0 means it is the primary IDE master.
NOTE GRUB does not distinguish between IDE devices, Serial Advanced Technology Attachment
(SATA) devices, or Small Computer System Interface (SCSI) devices.
In the same vein, GRUB will refer to the fourth partition on the fourth hard disk (that is, the slave
on the secondary IDE controller) as “(hd3,3).” To refer to the whole floppy disk in GRUB would
mean “(fd0)”—where “fd” means “floppy disk.”
Installing GRUB
Most Linux distributions will give you a choice to install and configure the boot loader during the
initial operating system installation. Thus, you wouldn’t normally need to install GRUB manually
during normal system use.
However, there are times, either by accident or by design, that you don’t have a boot loader. It
could be by accident if you, for example, accidentally overwrite your boot sector or if another OS
accidentally wipes out GRUB. It could be by design if, for example, you want to set up your system to
dual-boot with another OS (Windows or another Linux distribution).
This section will walk you through getting GRUB installed (or reinstalled) on your system. This
can be achieved in several ways. You can do it the easy way from within the running OS using the
grub-install utility or using GRUB’s native command-line interface. You can get to this interface
using what is called a GRUB boot floppy, using a GRUB boot CD, or from a system that has the
GRUB software installed.
NOTE GRUB is installed only once. Any modifications are stored in a text file, and any changes
don’t need to be written to the MBR or partition boot sector every time.
Backing Up the MBR
Before you proceed with the exercises that follow, it is a good idea to make a backup of your current
“known good” MBR. It is easy to do this using the dd command. Since the MBR of a PC’s hard disk
resides in the first 512 bytes of the disk, you can easily copy the first 512 bytes to a file (or to a
floppy disk) by typing the following:
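(Assuming the first hard disk is /dev/sda:)
dd if=/dev/sda of=/tmp/COPY_OF_MBR bs=512 count=1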
This command will save the MBR into a file called COPY_OF_MBR under the /tmp directory.
Installing GRUB Legacy from the GRUB Shell
Now that we have dealt with the safety measures, we can proceed to exploring GRUB in full. In this
section, you will learn how to install GRUB natively using GRUB’s command shell from inside the
running Linux operating system. You will normally go this route if, for example, you currently have
another type of boot loader (such as LILO or the NT Loader, NTLDR) but you want to replace or
overwrite that boot loader with GRUB.
1. Launch GRUB’s shell by issuing the grub command:
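grub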
2. Display GRUB’s current root device:
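grub> root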
The output shows that GRUB will, by default, use the first floppy disk drive (fd0) as its root
device, unless you tell it otherwise.
3. Set GRUB’s root device to the partition that contains the boot directory on the local hard disk:
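grub> root (hd0,0)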
NOTE The boot directory may or may not be on the same partition that houses the root (/) directory.
During the OS installation on our sample system, the /boot directory was stored on the /dev/sda1
partition, and hence, we use the GRUB (hd0,0) device.
4. Make sure that the stage1 image can be found on the root device:
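(The path is relative to the partition that holds /boot; if /boot were not a separate partition, the path would be /boot/grub/stage1 instead.)
grub> find /grub/stage1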
The output means that the stage1 image file was located on the (hd0,0) device.
5. Finally, (re)install the GRUB boot loader directly on the MBR of the hard disk:
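grub> setup (hd0)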
6. Quit the GRUB shell:
grub> quit
You are done. But you should note that you really didn’t make any serious changes to the system,
because you simply reinstalled GRUB to the MBR (where it used to be). You would normally reboot
at this point to make sure that everything is working as it should.
TIP A simple-to-use script that can help you perform all the steps detailed in the preceding exercise
with a single command is the grub-install script (see man grub-install). This method is not always
perfect, and the authors of the GRUB software admit that it is a less safe route to take. But still—it
almost always works just fine.
USB GRUB Legacy Boot Disk
Let’s create a bootable USB GRUB disk the manual way. This will allow you to boot the system
using the USB (or flash) disk and then use GRUB to write (or install) itself to the MBR. This is
especially useful if your system does not currently have a boot loader installed but you have access to
another system that has GRUB Legacy installed.
The general idea behind using a USB GRUB boot disk is that it is assumed that you currently have
a system with an unbootable, corrupt, or unwanted boot loader—and since the system cannot be
booted by itself from the hard disk, you need another medium with which to bootstrap the system. For
this, you can use a GRUB USB disk, a GRUB CD, or even a GRUB floppy disk. You want any means
by which you can gain access to the GRUB shell so that you can install GRUB into the MBR and then
boot the OS.
You first need to locate the GRUB Legacy images, located by default in /usr/share/
grub/x86_64-redhat/ directory on a Fedora/RHEL/CentOS distribution. In the 32-bit architectures in
the same distros, the images are stored under a different path—/usr/share/grub/i386-redhat/.
We will be performing the following exercises on our sample server running openSUSE.
openSUSE stores the GRUB Legacy image files in the /usr/lib/grub/ directory. And the images are
stored under /usr/lib/grub/i386-pc/ on Ubuntu-based systems.
Use the dd command to write the stage1 and stage2 images to a USB flash drive that is plugged
into the system.
Assuming we are ready to lose the entire contents of the USB drive and that the current block
device for the USB drive is /dev/sdb, we can carry out the following procedures:
1. Change to the directory that contains the GRUB images on your system:
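cd /usr/lib/grub/      # path on the sample openSUSE system; adjust for your distro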
2. Write the file stage1 to the first 512 bytes of the USB drive:
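dd if=stage1 of=/dev/sdb bs=512 count=1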
3. Write the stage2 image right after the first image:
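dd if=stage2 of=/dev/sdb bs=512 seek=1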
TIP You can also use the cat command to do the same thing in steps 2 and 3 in one shot. Here’s the
command to do this:
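cat stage1 stage2 > /dev/sdb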
Your USB GRUB drive is now ready. You can boot off of this disk on any system that permits
booting off USB devices. Once booted, you can then install a fresh copy of the GRUB boot loader, as
demonstrated in the next section.
Installing GRUB Legacy on the MBR Using a USB GRUB Legacy Disk
Make sure that the GRUB disk you created is inserted into an appropriate port on the system. Reboot
the system if necessary and elect to use the USB drive boot medium (adjust the BIOS settings if
necessary).
After the system has booted off the USB GRUB disk, you will be presented with a simple grub>
prompt.
Set the root device for GRUB to your boot partition (or the partition that contains the /boot
directory). On our sample system, the /boot directory resides on the /dev/sda1 (hd0,0) partition. To
do this, type the following command:
grub> root (hd0,0)
Now you can write GRUB to the MBR by using the setup command:
grub> setup (hd0)
That’s it. You can now reboot the system without the GRUB drive.
The procedure outlined here is a good way to let GRUB reclaim management of the MBR, if, for
example, it had previously been overwritten by another boot manager.
Configuring GRUB Legacy
Since you have to install GRUB only once on the MBR or partition of your choice, you have the
luxury of simply editing a text file (/boot/grub/menu.lst) to make changes to your boot loader. When
you are done editing this file, you can reboot and select any new kernel that you added to the
configuration. The configuration file looks like the following (note that line numbers 1–16 have been
added to the output to aid readability):
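(The listing below is a representative reconstruction; the comment block, splash image path, and kernel/initrd file names are placeholders, while the device names follow the sample system used throughout this chapter.)
1-8   # comment lines (ignored by GRUB)
9     default 0
10    timeout 5
11    splashimage=(hd0,0)/grub/splash.xpm.gz
12    hiddenmenu
13    title openSUSE 12.1 (3.6.*.x86_64)
14    root (hd0,0)
15    kernel /vmlinuz-3.6.*.x86_64 root=/dev/sda3 quiet
16    initrd /initrd-3.6.*.x86_64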
The entries in the preceding sample configuration file for GRUB are discussed here:
Lines 1–8 All lines that begin with the pound sign (#) are comments and are ignored.
Line 9, default This directive tells GRUB which entry to boot automatically. The numbering
starts from zero. The preceding sample file contains only one entry, openSUSE 12.1
(3.6.*.x86_64).
Line 10, timeout This means that GRUB will automatically boot the default entry after 5
seconds. This can be interrupted by pressing any key on the keyboard before the counter runs
out.
Line 11, splashimage This line specifies the name and location of an image file to be
displayed at the boot menu. This is optional and can be any custom image that fits GRUB’s
specifications. The splashimage directive is similar to the gfxmenu directive, which also
affects the looks of the boot menu.
Line 12, hiddenmenu This entry hides the usual GRUB menu. It is an optional entry.
Line 13, title This is used to display a short title or description for the entry it defines. The
title field marks the beginning of a new boot entry in GRUB.
Line 14, root You should notice from the preceding listing that GRUB still maintains its
device-naming convention—for example, (hd0,0) instead of the usual Linux /dev/sda1.
Line 15, kernel Used for specifying the path to a kernel image. The first argument is the path
to the kernel image in a volume or partition (/dev/sda3 in this example). Any other arguments
are passed to the kernel as boot parameters. An example boot parameter is the rd.lvm.lv
parameter, which activates the specified logical volumes (LV). Another example is the quiet
parameter, which disables most of the verbose log messages as the system boots.
NOTE The path names are relative to the /boot directory, so, for example, instead of specifying the
path to the kernel to be /boot/vmlinuz-3.6.*.x86_64, GRUB’s configuration file references this path
as /vmlinuz-3.6.*.x86_64.
Line 16, initrd The initrd option allows you to load kernel modules from an image, not the
modules from /lib/modules. See the GRUB info pages, available through the info command,
for more information on the configuration options.
Initial RAM Disk (initrd)
You might be wondering about the initrd option. It is used for preloading modules or drivers.
The initial random access memory (RAM) disk is a special device or an abstraction of RAM. It
is initialized by the boot loader before the actual kernel kicks in.
One sample problem solved by initrd happens when a file system module is needed to
allow access to the file system in order to load the other necessary modules. For example, your
boot partition might be formatted with some exotic file system (such as the B-tree file system
[Btrfs], ReiserFS, and so on) for which the kernel has no built-in drivers and whose
modules/drivers reside on the disk.
This is a classic chicken-and-egg problem—that is, which came first? You can’t access the
file system because you don’t have the file system modules.
The solution in GRUB legacy is to provide the kernel with a RAM-based structure (image)
that contains necessary loadable modules to get to the rest of the modules. This image is
executed and resides in RAM, and as a result it does not need immediate access to the on-disk
file system.
Adding a New Kernel to Boot with GRUB Legacy
In this section, you will learn how to add a new boot entry manually to GRUB’s configuration file on
a server running openSUSE Linux distro (GRUB Legacy). If you are compiling and installing a new
kernel by hand, you will need to do this so that you can boot into the new kernel to test it or use it. If,
on the other hand, you are installing or upgrading the Linux kernel using a prepackaged Red Hat
Package Manager (RPM), this is usually automatically done for you.
Because you don’t have any new Linux kernel to install on the system, you will add only a dummy
entry to GRUB’s configuration file in this exercise. The new entry will not do anything useful—we’re
adding this for illustration purposes.
Here’s a summary of what you’ll be doing: You will make a copy of the current default kernel that
your system uses and name the copy duplicate-kernel. You will also make a copy of the
corresponding initrd image for the kernel and name the copy duplicate-initrd. Both files should be
saved into the /boot directory. You will then create an entry for the supposedly new kernel and give it
a descriptive title, such as The Duplicate Kernel.
In addition to the preceding boot entry, you will create another entry that does nothing more than
change the foreground and background colors of GRUB’s boot menu.
Let’s begin.
1. Change your current working directory to the /boot directory:
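cd /boot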
2. Make a copy of your current kernel, and name the copy duplicate-kernel:
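cp vmlinuz-$(uname -r) duplicate-kernel      # substitute your actual kernel file name if it differs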
3. Make a copy of the corresponding initrd image, and name the copy duplicate-initrd:
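cp initrd-$(uname -r) duplicate-initrd       # substitute your actual initrd file name if it differs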
4. Create an entry for the new pseudo-kernel in the /boot/grub/menu.lst configuration file,
using any text editor you are comfortable with (the vim editor is used in this example). Type
the following text at the end of the file:
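(The root device and kernel arguments below mirror the existing openSUSE entry on the sample system; yours may differ.)
title The Duplicate Kernel
root (hd0,0)
kernel /duplicate-kernel root=/dev/sda3 quiet
initrd /duplicate-initrd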
NOTE The value of root (line 3) used above was obtained from the existing entry in the menu.lst
file that we are duplicating. The exact partition or volume on which the root file system (/) resides
was specified in this example. Some distros also identify the root device by its Universally Unique
Identifiers (UUIDs). So, for example, we could have the kernel entry in the menu.lst file identified as
follows:
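(The UUID value here is only a placeholder:)
kernel /duplicate-kernel root=UUID=<UUID-of-the-root-file-system> quiet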
5. Create another entry that will change the foreground and background colors of the menu when
it is selected. The menu colors will be changed to yellow and black when this entry is
selected. Enter the following text at the end of the file (beneath the entry you created in the
preceding step):
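title The Change Color Entry
color yellow/black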
6. Increase the value of the timeout variable if necessary by editing the menu.lst file. If the
current value is 0, change it to 5. The new entry should look like this:
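timeout 5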
7. If it’s present, you should comment out the gfxmenu or splashimage entry at the top of the
file. The presence of the splash image will prevent your new custom foreground and
background colors from displaying properly. The commented-out entry for the splash image
will look like this:
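(the gfxmenu path varies from system to system)
#gfxmenu (hd0,0)/message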
or
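(again, the splash image path varies)
#splashimage=(hd0,0)/grub/splash.xpm.gz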
8. Finally, also comment out the hiddenmenu entry (if present) from the file so that the boot
menu will appear, showing your new entries instead of being hidden. The commented-out
entry should look like this:
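#hiddenmenu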
9. Save the changes you made to the file, and reboot the system.
The final /boot/grub/menu.lst file (with some of the comment fields removed) will resemble the
one shown here:
When the system reboots, you can test your changes by following the next steps while at the initial
grub screen:
1. After the GRUB menu appears, select The Change Color Entry, and press ENTER. The color of
the menu should change to the color you specified in the menu.lst file using the color
directive.
2. Finally, verify that you are able to boot the new kernel entry that you created—that is, The
Duplicate Kernel entry. Select The Duplicate Kernel entry and press ENTER.
GRUB 2
GRUB 2 is the successor to the GRUB Legacy boot loader. Some Linux distros still use and standardize
on GRUB Legacy, but many mainstream distros have adopted GRUB 2. Debian and Debian-based
distros such as Ubuntu, Kubuntu, and others use GRUB 2, while some RPM-based systems still use
GRUB Legacy. It is reasonable to assume that everybody will eventually move to GRUB 2 or to
something else, if something better comes along.
The main features of GRUB 2, as well as some differences when compared with GRUB Legacy,
are listed in Table 6-1.
Feature: Configuration files
Description: The primary configuration file for GRUB 2 is now named grub.cfg
(/boot/grub/grub.cfg). This is different from GRUB Legacy's configuration file, which is named
menu.lst. grub.cfg is not meant to be edited directly; its content is automatically generated.
Multiple files (scripts) are used for configuring GRUB's menu, and some of these files are stored
under the /etc/grub.d/ directory, such as the following:
00_header Sets the default values for some general GRUB variables, such as graphics mode,
default selection, timeouts, and so on.
10_linux Helps to find all the kernels on the root device of the current operating system, and
automatically creates associated GRUB entries for all the kernels it finds.
30_os-prober Automatically probes for other operating systems that might be installed on the
system. Especially useful in dual-boot systems (Windows running with Linux, for example).
40_custom Where users can edit and store custom menu entries and directives.

Feature: Partition numbers
Description: Partition numbers in GRUB 2 device names start at 1, not 0. The drive numbering
remains the same; drives are still numbered from 0. So, for example, a GRUB 2 directive that reads
(hd0,1) refers to the first partition on the first drive.

Feature: File systems
Description: GRUB 2 natively supports many more file systems than GRUB Legacy.

Feature: Image files
Description: GRUB 2 no longer uses the stage1, Stage 1.5, and stage2 image files. Most of the
functions served by the Stage* files have been replaced by the core.img file, which is generated
dynamically from the kernel image and some other modules.

Table 6-1. GRUB 2 Features
TIP If you don’t want a specific menu entry to be automatically created in a system using GRUB 2 as
the boot loader, you have to delete or disable the corresponding script in the /etc/grub.d/ directory
that creates the entry. For example, if you don’t want to see entries for other non-native operating
systems such as Microsoft Windows in your boot menu, you need to delete /etc/grub.d/30_os-prober
or alternatively make it non-executable by using this command:
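sudo chmod -x /etc/grub.d/30_os-prober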
LILO
LILO is a boot manager that allows you to boot multiple operating systems, provided each system
exists on its own partition. (Under PC-based systems, the entire boot partition must also exist beneath
the 1024-cylinder boundary.) In addition to booting multiple operating systems with LILO, you can
choose various kernel configurations or versions to boot. This is especially handy when you’re trying
kernel upgrades before adopting them.
Configuring LILO is straightforward: A configuration file (/etc/lilo.conf) specifies which
partitions are bootable and, if the partition is Linux, which kernel to load. When the /sbin/lilo program
runs, it takes this partition information and rewrites the boot sector with the necessary code to present
the options as specified in the configuration file. At boot time, a prompt (usually lilo:) is displayed,
and you have the option of specifying the operating system. (Usually, a default can be selected after a
timeout period.) LILO loads the necessary code, the kernel, from the selected partition and passes full
control over to it.
LILO is what is known as a “two-stage boot loader.” The first stage loads LILO itself into
memory and prompts you for booting instructions with the lilo: prompt or a colorized boot menu.
Once you select the OS to boot and press ENTER, LILO enters the second stage, booting the Linux OS.
As was stated earlier in the chapter, LILO has somewhat fallen out of favor with most of the
newer Linux distributions. Some of the distributions do not even give you the option of selecting or
choosing LILO as your boot manager!
TIP If you are familiar with the Microsoft Windows boot process, you can think of LILO as
comparable to the OS loader (NTLDR). Similarly, the LILO configuration file, /etc/lilo.conf, is
comparable to BOOT.INI (which is typically hidden from view).
Bootstrapping
In this section, I’ll assume you are already familiar with the boot processes of other operating systems
and thus already know the boot cycle of your hardware. This section will cover the process of
bootstrapping the operating system. We’ll begin with the Linux boot loader (usually GRUB for PCs).
Kernel Loading
Once GRUB has started and you have selected Linux as the operating system to boot, the first thing to
get loaded is the kernel. Keep in mind that no operating system exists in memory at this point, and PCs
(by their unfortunate design) have no easy way to access all of their memory. Thus, the kernel must
load completely into the first megabyte of available RAM. To accomplish this, the kernel is
compressed. The head of the file contains the code necessary to bring the CPU into protected mode
(thereby removing the memory restriction) and decompress the remainder of the kernel.
Kernel Execution
With the kernel in memory, it can begin executing. One subtle point to remember is that the kernel is
nothing but a program (albeit a very sophisticated and smart one) that needs to be executed.
The kernel knows only whatever functionality is built into it, which means any parts of the kernel
compiled as modules are useless at this point. At the very minimum, the kernel must have enough code
to set up its virtual memory subsystem and root file system (usually the ext3, ext4, or Btrfs file
system). Once the kernel has started, a hardware probe determines what device drivers should be
initialized. From here, the kernel can mount the root file system. (You could draw a parallel of this
process to that of Windows being able to recognize and access its C drive.) The kernel mounts the
root file system and starts a program called init, which is discussed in the next section.
The init Process
On traditional System V (SysV)–style Linux distros, the init process is the first non-kernel process
that is started; therefore, it always gets the process ID number of 1. init reads its configuration file,
/etc/inittab, and determines the runlevel where it should start. Essentially, a runlevel dictates the
system’s behavior. Each level (designated by an integer between 0 and 6) serves a specific purpose.
A runlevel of initdefault is selected if it exists; otherwise, you are prompted to supply a runlevel
value.
Some newer Linux distros have substituted the functionality previously provided by SysV init
with a new startup manager called systemd. The notion of runlevels is slightly different in systemd;
they are instead referred to as targets. Chapter 8 discusses systemd in greater detail. The listing in
Table 6-2 shows the different runlevels in the traditional SysV world as well as their equivalents in
the systemd world.
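As a rough guide (exact assignments vary between distributions), the conventional SysV runlevels and their approximate systemd equivalents are:
0  Halt/power off the system              poweroff.target
1  Single-user (rescue) mode              rescue.target
2  Multiuser, often without some network services   multi-user.target
3  Full multiuser, text-mode login        multi-user.target
4  Unused/custom                          multi-user.target
5  Full multiuser with graphical login    graphical.target
6  Reboot                                 reboot.target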
When it is told to enter a runlevel, init executes a script, as dictated by the /etc/inittab file. The
default runlevel that the system boots into is determined by the initdefault entry in the
/etc/inittab file. If, for example, the entry in the file is
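id:3:initdefault: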
this means that the system will boot into runlevel 3. But if, on the other hand, the entry in the file is
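id:5:initdefault: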
this means the system will boot into runlevel 5, with the X Window subsystem running with a
graphical login screen.
NOTE On Debian-like systems, such as Ubuntu, the functionality provided by the /etc/inittab file has
been replaced by the /etc/init/rc-sysinit.conf file. The rc-sysinit.conf file is used to specify the
default runlevel the system should boot into. This is done by setting the value of the
DEFAULT_RUNLEVEL variable to the desired runlevel. The default value in Ubuntu distros is env
DEFAULT_RUNLEVEL=2.
rc Scripts
In the preceding section, we mentioned that on SysV-based distros, the /etc/inittab file specifies
which scripts to run when runlevels change. These scripts are responsible for either starting or
stopping the services that are particular to the runlevel.
Because of the large number of services that might need to be managed, resource control (rc)
scripts are used. On SysV-based distros, the main script—/etc/rc.d/rc—is responsible for calling the
appropriate scripts in the correct order for each runlevel. As you can imagine, such a script could
easily become extremely large and unmanageable! To keep this from happening, a slightly more elaborate
system is used.
For each runlevel, a subdirectory exists in the /etc/rc.d directory. These runlevel subdirectories
follow the naming scheme of rcX.d, where X is the runlevel. For example, all the scripts for runlevel
3 are in /etc/rc.d/rc3.d.
In the runlevel directories, symbolic links are made to scripts in the /etc/rc.d/init.d directory.
Instead of using the name of the script as it exists in the /etc/rc.d/init.d directory, however, the
symbolic links are prefixed with an S if the script is to start a service or with a K if the script is to
stop (or kill) a service. Note that these two letters are case-sensitive. You must use uppercase letters
or the startup scripts will not recognize them.
In many cases, the order in which these scripts are run makes a difference. (For example, you
can’t start services that rely on a configured network interface without first enabling and configuring
the network interface!) To enforce order, a two-digit number is suffixed to the S or K. Lower numbers
execute before higher numbers: for example, /etc/rc.d/rc3.d/S10network runs before
/etc/rc.d/rc3.d/S55sshd (S10network configures the network settings, and S55sshd starts the Secure
Shell [SSH] server).
The scripts pointed to in the /etc/rc.d/init.d directory are the workhorses; they perform the actual
process of starting and stopping services. When /etc/rc.d/rc runs through a specific runlevel’s
directory, it invokes each script in numerical order. It first runs the scripts that begin with a K and
then the scripts that begin with an S. For scripts starting with K, a parameter of stop is passed.
Likewise, for scripts starting with S, the parameter start is passed.
Let’s peer into the /etc/rc.d/rc3.d directory of our sample openSUSE server and see what’s there:
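ls -l /etc/rc.d/rc3.d/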
From the sample output, you can see that K01cron is one of the many files in the /etc/rc.d/rc3.d
directory (the first line in the output). Thus, when the file K01cron is executed or invoked, this
command is actually being executed instead:
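/etc/rc.d/init.d/cron stop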
By the same token, if S08sshd is invoked, the following command is what really gets run:
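/etc/rc.d/init.d/sshd start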
Writing Your Own rc Script
In the course of administering a Linux system and keeping it running, at some point you will need to
modify the startup or shutdown script. You can take two roads to do this.
If your change is to take effect at boot time only and the change is small, you may simply edit the
/etc/rc.d/rc.local or /etc/rc.local script. This script is run at the tail end of the boot process—after
all the other startup scripts.
On the other hand, if your addition is more elaborate and/or needs to be stopped explicitly during
shutdown, you should add a script to the /etc/rc.d/ or /etc/rc* directory. This script should take
the parameters start and stop, and should act accordingly.
Of course, the first option, editing the rc.local script, is the easier of the two. To make additions
to this script, simply open it in your editor of choice and append the commands you want run at the
end. This is good for simple one- or two-line changes.
As mentioned, if your situation needs a more elaborate or elegant solution, you will need to create
a separate script and thus use the second option. The process of writing an rc script is not as difficult
as it might seem. Let’s step through the process using an example to see how it works. You can use
this example as a skeleton script, by the way, changing it to add anything you need.
Let’s assume you are running a server that uses SysV-style startup scripts and you want to start a
special program that pops up a message every hour and reminds you that you need to take a break
from the keyboard (a good idea if you don’t want to get carpal tunnel syndrome!). The script to start
this program will include the following:
A description of the script’s purpose (so that you don’t forget it a year later)
Verification that the program really exists before trying to start it
Acceptance of the start and stop parameters and performance of the required actions
NOTE Lines starting with a pound sign (#) are comments and are not part of the script’s actions,
except for the first line.
Given these parameters, let’s begin creating the script.
Creating the carpald.sh Script
First we’ll create the script that will perform the actual function that we want. The script is
unsophisticated, but it will serve our purpose here. A description of what the script does is embedded
in its comment fields.
1. Launch any text editor of your choice, and type the following text:
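(What follows is a minimal sketch of such a script; it assumes the local mail command is available for delivering the hourly reminder.)
#!/bin/sh
# carpald.sh - A simple program that reminds you, once an hour, to take
#              a break from the keyboard.
while true
do
    sleep 3600
    echo "Time to take a break from the keyboard!" | mail -s "Hourly break reminder" root
done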
2. Save the text of the script into a file called carpald.sh.
3. You next need to make the script executable. Type the following:
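chmod +x carpald.sh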
4. Copy or move the script over to a directory where our startup script will find it; we’ll use
the /usr/local/sbin/ directory:
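mv carpald.sh /usr/local/sbin/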
Creating the Startup Script
Here you will create the actual startup script that will be executed during system startup and
shutdown. The file you create here will be called carpald. The file will be chkconfig-enabled. This
means that if you want, you can use the chkconfig utility to control the runlevels at which the
program starts and stops. This is a useful and time-saving functionality.
1. Launch any text editor of your choice, and type the following text:
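(The following is a representative chkconfig-style wrapper; it assumes carpald.sh was installed in /usr/local/sbin/ as described earlier, and the status handling shown is an illustrative choice.)
#!/bin/sh
# carpald      Start and stop the carpald.sh break-reminder program.
# chkconfig: 35 99 01
# description: Hourly reminder to take a break from the keyboard.

case "$1" in
  start)
        echo -n "Starting carpald: "
        /usr/local/sbin/carpald.sh &
        echo "done."
        ;;
  stop)
        echo -n "Stopping carpald: "
        pkill -f /usr/local/sbin/carpald.sh
        echo "done."
        ;;
  status)
        if pgrep -f /usr/local/sbin/carpald.sh > /dev/null ; then
            echo "carpald is running."
        else
            echo "carpald is stopped."
        fi
        ;;
  *)
        echo "Usage: carpald {start|stop|status}"
        exit 1
        ;;
esac
exit 0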
A few comments about the preceding startup script:
Even though the first line of the script begins with #!/bin/sh, note that /bin/sh is a
symbolic link to /bin/bash. This is not the case on other UNIX systems.
The line chkconfig: 35 99 01 is actually quite important to the chkconfig utility that
we want to use. The number 35 means that chkconfig should create startup and stop
entries for programs in runlevels 3 and 5 by default—that is, entries will be created in the
/etc/rc.d/rc3.d and /etc/rc.d/rc5.d directories.
The fields 99 and 01 mean that chkconfig should set the startup priority of our program
to be 99 and the stop priority to be 01—that is, start up late and end early.
2. Save the text of the script into a file called carpald.
3. Now make the file executable:
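chmod +x carpald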
4. Copy or move the script over to the directory where startup scripts are stored—the /etc/rc.d/
directory:
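mv carpald /etc/rc.d/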
5. Now you need to tell chkconfig about the existence of this new start/stop script and what
you want it to do with it:
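chkconfig --add carpald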
This will automatically create the symbolic links listed here:
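(With the chkconfig: 35 99 01 directive above, you should see links resembling these:)
/etc/rc.d/rc3.d/S99carpald
/etc/rc.d/rc5.d/S99carpald
(plus K01carpald links in the other runlevel directories)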
(The meaning and significance of the K (kill) and S (start) prefixes in this listing were
explained earlier.)
This might all appear rather elaborate, but the good news is that because you’ve set up this rc
script, you won’t ever need to do it again. More important, the script will automatically run during
startup and shutdown and is able to manage itself. The overhead up front is well worth the long-term
benefits of avoiding carpal tunnel syndrome!
1. Use the service command to find out the status of the carpald.sh program:
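service carpald status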
2. Manually start the carpald program to make sure that it will indeed start up correctly upon
system startup:
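service carpald start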
TIP As long as the e-mail sub-system of the server is running, you should see a mail message from
the carpald.sh script after about an hour. You can use the mail program from the command line by
typing the following:
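mail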
Type q at the ampersand (&) prompt to quit the mail program.
3. Now stop the program:
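service carpald stop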
4. We are done.
Enabling and Disabling Services
At times, you might find that you simply don’t need a particular service to be started at boot time.
This is especially important if you are configuring the system as a server and need only specific
services and nothing more.
As described in the preceding sections, you can cause a service not to be started by simply
renaming the symbolic link in a particular runlevel directory; rename it to start with a K instead of an
S. Once you are comfortable working with the command line, you’ll find that it is easy to enable or
disable a service.
The startup runlevels of the service/program can also be managed using the chkconfig utility. To
view all the runlevels in which the carpald.sh program is configured to start up, type the following:
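chkconfig --list carpald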
To make the carpald.sh program start up automatically in runlevel 2, type this:
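chkconfig --level 2 carpald on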
If you check the list of runlevels for the carpald.sh program again, you will see that the field for
runlevel 2 has been changed from 2:off to 2:on. Type the following to do this:
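chkconfig --list carpald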
GUI tools are available that will help you manage which services start up at any given runlevel.
In Fedora and other Red Hat–type systems (including RHEL and CentOS), one such tool is the
system-config-services utility (see Figure 6-1). To launch the program, type the following:
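system-config-services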
Figure 6-1. Fedora’s Service Configuration tool
On a system running openSUSE Linux, the equivalent GUI program (see Figure 6-2) can be
launched by typing this:
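yast2 runlevel      # launches YaST's runlevel module; the module name is an assumption here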
Figure 6-2. openSUSE’s System Services (Runlevel) editor
On an Ubuntu system, a popular tool for managing services with a GUI front-end is the bum
application (Boot-Up Manager). See Figure 6-3. It can be launched by typing the following:
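sudo bum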
Figure 6-3. Ubuntu’s Boot-Up Manager
NOTE If you don’t have the bum application installed by default on your Ubuntu server, you can
quickly install it by typing sudo apt-get install bum.
Although a GUI tool is a nice way to perform this task, you might find yourself in a situation
where it is just not convenient or available, such as when you are connected remotely to the server
you are managing over a low-bandwidth or high-latency connection.
Disabling a Service
To disable a service completely, you must, at a minimum, know the name of the service. You can then
use the chkconfig tool to turn it off permanently, thereby preventing it from starting in all runlevels.
For example, to disable our “life-saving” carpald.sh program, you could type this:
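chkconfig carpald off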
If you check the list of runlevels for the carpald.sh program again, you will see that it has been
turned off for all runlevels:
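chkconfig --list carpald
(The output should resemble: carpald  0:off 1:off 2:off 3:off 4:off 5:off 6:off)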
To remove the carpald.sh program permanently from under the chkconfig utility’s control, you
will use chkconfig’s delete option:
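chkconfig --del carpald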
We are done with our sample carpald.sh script, and to prevent it from flooding us with e-mail
notifications in the future (in case we accidentally turn it back on), we can delete it from the system
for good:
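rm -f /etc/rc.d/carpald /usr/local/sbin/carpald.sh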
And that’s how services start up and shut down automatically in Linux. Now go out and take a
break.
Odds and Ends of Booting and Shutting Down
Most Linux administrators do not like to shut down their Linux servers. It spoils their uptime (the
“uptime” is a thing of pride for Linux system admins). Thus, when a Linux box has to be rebooted, it
is usually for unavoidable reasons. Perhaps something bad has happened or the kernel has been
upgraded.
Thankfully, Linux does an excellent job of self-recovery, even during reboots. It is rare to have to
deal with a system that will not boot correctly, but that is not to say that it will never happen—and
that’s what this section is all about.
fsck!
Making sure that data on a system’s hard disk is in a consistent state is an important function. This
function is partly controlled by a runlevel script and by the /etc/fstab file. The File
System Check (fsck) tool is automatically run as necessary on every boot, as specified by the
presence or absence of a file named /.autofsck, and also as specified by the /etc/fstab file. The
purpose of the fsck program is similar to that of Windows ScanDisk: to check and repair any damage
on the file system before continuing the boot process. Because of its critical nature, fsck is
traditionally scheduled to run very early in the boot sequence.
If you were able to do a clean shutdown, the /.autofsck file will be deleted and fsck will run
without incident, as directed by the sixth field of each entry in the /etc/fstab file (see the fstab
manual page via man fstab). However, if for some reason you had to perform a hard shutdown (such
as having to press the reset button), fsck will need to run through all of the local disks listed in the
/etc/fstab file and check them. (And it isn’t uncommon for the system administrator to be cursing
through the process.)
If fsck does need to run, don’t panic. It is unlikely you’ll have any problems. However, if
something does arise, fsck will prompt you with information about the problem and ask whether you
want to perform a repair. In general, you’ll find that answering “yes” is the right thing to do.
Virtually all modern Linux distributions use what is called a “journaling file system,” and this
makes it easier and quicker to recover from any file system inconsistencies that might arise from
unclean shutdowns and other minor software errors. Examples of file systems with this journaling
capability are ext4, Btrfs, ext3, ReiserFS, JFS, and XFS.
If your storage partitions or volumes are formatted with any of the journaling capable file systems
(such as ext4, ext3, Btrfs, or ReiserFS), you will notice that recovering from unclean system resets
will be much quicker and easier. The only tradeoff with running a journaled file system is the
overhead involved in keeping the journal, and even this depends on the method by which the file
system implements its journaling.
Booting into Single-User (“Recovery”) Mode
Under Windows, the concept of “Recovery Mode” was borrowed from a long-time UNIX feature of
booting into single-user mode. What this means for you in the Linux world is that if something gets
broken in the startup scripts that affect the booting process of a host, it is possible for you to boot into
this mode, make the fix, and then allow the system to boot into complete multiuser mode (normal
behavior).
If you are using the GRUB Legacy boot loader, these are the steps:
1. Select the GRUB entry that you want to boot from the GRUB menu. The entry for the default
or most recently installed kernel version will be highlighted by default in the GRUB menu.
Press the E key.
2. You will next be presented with a submenu with various directives (directives from the
/boot/grub/menu.lst file).
3. Select the entry labeled kernel, and press E again. Leave a space and then add the keyword
single (or the letter s) to the end of the line.
4. Press ENTER to go back to the GRUB boot menu, and then press B to boot the kernel into
single-user mode.
5. When you boot into single-user mode, the Linux kernel will boot as normal, except when it
gets to the point where it starts the init program, it will only go through runlevel 1 and then
stop. (See previous sections in this chapter for a description of all the runlevels.) Depending
on the system configuration, you will either be prompted for the root password or simply
given a shell prompt. If prompted for a password, type the root password and press ENTER,
and you will get the shell prompt.
6. In this mode, you’ll find that almost all the services that are normally started are not running.
This includes network configuration. So if you need to change the IP address, gateway,
netmask, or any network-related configuration file, you can. This is also a good time to run
fsck manually on any partitions that could not be automatically checked and recovered. (The fsck
program will tell you which partitions are misbehaving, if any.)
TIP In the single-user mode of many Linux distributions, only the root partition will be automatically
mounted for you. If you need to access any other partitions, you will need to mount them yourself
using the mount command. You can see all of the partitions that you can mount in the /etc/fstab file.
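For example, to mount a hypothetical /home partition by hand:
mount /dev/sda2 /home      # the device name here is illustrative; check /etc/fstab for the real one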
7. Once you have made any changes you need to make, simply press CTRL-D. This will exit
single-user mode and continue with the booting process, or you can just issue the reboot
command to reboot the system.
Summary
This chapter looked at the various aspects involved with starting up and shutting down a typical Linux
system. We started our exploration with the almighty boot loader. We looked at GRUB in particular
as a sample boot loader/manager, because it is the boot loader of choice among the popular Linux
distributions. Next we explored how things (or services) typically get started and stopped in Linux,
and how Linux decides what to start and stop, and at which runlevel it is supposed to do this. We
even wrote a little shell program, as a demonstration, that helps us to avoid carpal tunnel syndrome.
We then went ahead and configured the system to start up the program automatically at specific
runlevels.
CHAPTER 7
File Systems
File systems provide a means of organizing data on a storage medium. They provide all of the
abstraction layers above sectors and cylinders of disks. This chapter discusses the composition
and management of these abstraction layers supported by Linux. We’ll pay particular attention
to the native Linux file systems—the extended file system family.
This chapter will also cover the many aspects of managing disks. This includes creating partitions
and volumes, establishing file systems, automating the process by which they are mounted at boot
time, and dealing with them after a system crash. It will also touch on Logical Volume Management
(LVM) concepts.
NOTE Before beginning your study of this chapter, you should be familiar with files, directories,
permissions, and ownership in the Linux environment. If you haven’t yet read Chapter 5, you should
read that chapter before continuing.
The Makeup of File Systems
Let’s begin by going over the structure of file systems under Linux to clarify your understanding of the
concept and let you see more easily how to take advantage of the architecture.
i-Nodes
The most fundamental building block of many Linux/UNIX file systems is the i-node. An i-node is a
control structure that points either to other i-nodes or to data blocks.
The control information in the i-node includes the file’s owner, permissions, size, time of last
access, creation time, group ID, and other information. The i-node does not provide the file’s name,
however.
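You can see a file's i-node number and the metadata stored in its i-node with standard tools, for example:
ls -i /etc/hosts       # prints the i-node number of the file
stat /etc/hosts        # prints the owner, permissions, size, timestamps, and i-node number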
As mentioned in Chapter 5, directories themselves are special instances of files. This means each
directory gets an i-node, and the i-node points to data blocks containing information (filenames and i-nodes) about the files in the directory. Figure 7-1 illustrates the organization of i-nodes and data
blocks in the older ext2 file system.
Figure 7-1. The i-nodes and data blocks in the ext2 file system
As you can see in Figure 7-1, the i-nodes are used to provide indirection so that more data blocks
can be pointed to—which is why each i-node does not contain the filename. (Only one i-node works
as a representative for the entire file; thus, it would be a waste of space if every i-node contained
filename information.) Take, for example, a 6-gigabyte (GB) disk that contains 1,079,304 i-nodes. If
every i-node consumed 256 bytes to store the filename, a total of more than 260 megabytes (MB) would be
wasted in storing filenames, even if they weren’t being used!
Each indirect block, in turn, can point to other indirect blocks if necessary. With up to three layers
of indirection, it is possible to store very large files on a Linux file system.
Block
Data on an ext2 file system is organized into blocks. A block is a sequence of bits or bytes, and it is
the smallest addressable unit in a storage device. Depending on the block size, a block might contain
only a part of a single file or an entire file. Blocks are in turn grouped into block groups. Among other
things, the block group contains a copy of the superblock, the block group descriptor table, the block
bitmap, an i-node table, and of course the actual data blocks. The relationship among the different
structures in an ext2 file system is shown in Figure 7-2.
Figure 7-2. Data structure on ext2 file systems
Superblocks
The first piece of information read from a disk is its superblock. This small data structure reveals
several key pieces of information, including the disk’s geometry, the amount of available space, and,
most importantly, the location of the first i-node. Without a superblock, an on-disk file system is
useless.
Something as important as the superblock is not left to chance. Multiple copies of this data
structure are scattered all over the disk to provide backup in case the first one is damaged. Under
Linux's ext2 file system, a copy of the superblock is placed after every group of blocks, and each group
contains i-nodes and data. One group consists of 8192 blocks; thus, the first redundant superblock is at 8193, the
second at 16,385, and so on. The designers of most Linux file systems intelligently included this
superblock redundancy into the file system design.
ext3
The third extended file system (ext3) is another popular Linux file system used by the major Linux
distributions. The second extended file system (ext2) forms the base of ext3. The ext3 file system is
an enhanced extension of the ext2 file system.
As of this writing, the ext2 file system on which ext3 is based is more than 18 years old. This
means two things for us as system administrators: First and foremost, ext3 is rock-solid. It is a well-tested subsystem of Linux and has had the time to become well optimized. Second, other file systems
that were considered experimental when ext2 was created have matured and become available to
Linux.
In addition to ext3, the other file systems that are popular replacements for ext2 are ReiserFS and
XFS. They offer significant improvements in performance and stability, but their most important advance is that they have moved to a new method of getting the data to the disk. This new method is
called journaling. Traditional file systems (such as ext2) must search through the directory structure,
find the right place on disk to lay out the data, and then lay out the data. (Linux can also cache the
whole process, including the directory updates, thereby making the process appear faster to the user.)
Almost all new versions of Linux distributions now make use of one journaling file system or the
other by default, including Fedora (and other Red Hat Enterprise Linux [RHEL] derivatives),
openSUSE, and Ubuntu.
The problem with not having a journaling file system is that in the event of an unexpected crash,
the file system checker, or file system consistency checker (fsck), program has to examine all of the files on the disk to make sure they don’t contain any dangling references (for example, i-nodes that
point to other, invalid i-nodes or data blocks). As disks expand in size and shrink in price, the
availability of these large-capacity disks means more of us will have to deal with the aftermath of
having to fsck a large disk. And as anyone who has had to do that can tell you, it isn’t fun. The
process can take a long time to complete, and that means downtime for your users.
Journaling file systems work by first creating an entry of sorts in a log (or journal) of changes that
are about to be made before actually committing the changes to disk. Once this transaction has been
committed to disk, the file system goes ahead and modifies the actual data or metadata. This results in
an all-or-nothing situation—that is, either all or none of the file system changes get done.
One of the benefits of using a journaling-type file system is the greater assurance that data
integrity will be preserved, and in the unavoidable situations where problems arise, speed, ease of
recovery, and likelihood of success are vastly increased. One such unavoidable situation is a system
crash. In this case, you might not need to run fsck. Think how much faster you could recover a system
if you didn’t have to run fsck on a 1TB disk! (Haven’t had to run fsck on a big disk before? Think
about how long it takes to run chkdsk or ScanDisk under Windows on large disks.) Other benefits of
using journaling-type file systems are that system reboots are simplified, disk fragmentation is
reduced, and I/O operations can be accelerated (depending on the journaling method used).
ext4
As we already hinted, the fourth extended file system (ext4) is the successor of ext3 and is an
enhanced extension of ext3. It is the default file system found in most of the newer Linux distributions.
It offers backward compatibility with ext3 and as such migrating or upgrading to ext4 is easy.
The ext4 file system offers several improvements/features over ext3, as discussed next.
Extents
Unlike ext3, the ext4 file system does not use the indirect block mapping approach. Instead it uses the
concept of extents. An extent is a way of representing contiguous physical blocks of storage on a file
system. An extent provides information about the range or magnitude over which a data file extends
on the physical storage.
So instead of each block carrying a marker to indicate the data file to which it belongs, a single
(or a few) extents can be used to state that the next X number of blocks belong to a specific data file.
Online Defragmentation
As data grows, shrinks, and is moved around, it can become fragmented over time. Fragmentation can cause the mechanical components of a physical storage device to work harder than necessary, which in turn leads to increased wear and tear on the device.
Traditionally, the process of undoing file fragmentation is to defragment the file system offline.
“Offline” in this instance means to run the defragmenting when no possibility exists that the files are
being accessed or used. ext4 supports online defragmentation of individual files or an entire file
system.
Larger File System and File Size
The older ext3 file system supports a maximum file system size of 16TB (terabytes) and a maximum individual file size of up to 2TB. The ext4 file system, on the other hand, supports a maximum file system size of 1EB (exabyte) and a maximum individual file size of up to 16TB.
Btrfs
The B-tree file system (Btrfs) is a next-generation Linux file system aimed at solving any enterprise
scalability issues that the current Linux file systems may have. (Btrfs is fondly pronounced “Butter
FS.”) It is expected to be the de-facto file system that will replace ext4.
As of this writing, Btrfs is already available for use and testing in different Linux distributions. In
addition to all the advanced features supported by ext4, Btrfs supports several additional features,
including the following:
Dynamic i-node allocation
Online file system checking (fsck-ing)
Built-in RAID functions such as mirroring and striping
Online defragmentation
Support for snapshots
Support for sub-volumes
Support for online addition and removal of block devices
Transparent compression
Improved storage utilization via support for data deduplication
Which File System Should You Use?
You might be asking by now, “Which file system should I use?” As of this writing, the current trend is
to standardize on file systems with journaling capabilities. Keep in mind, however, that journaling
brings with it some overhead.
Another important decision is whether to go with a file system that inherently offers performance
benefits for specific workloads or use cases.
You might need to do your own research, perform your own benchmarks, and listen and learn
from the experiences of other people who use the file system in scenarios similar to yours.
As with all things Linux, the choice is yours. Your best bet is to try many file systems and
determine how they perform with the application/workload that’s present on your system.
Finally, in a vast majority of server installations, you will find that the default file system
supplied by the distribution vendor will suffice, so you can go about your merry business without
giving it another thought.
Managing File Systems
The process of managing file systems is trivial—that is, management becomes trivial after you have
memorized all aspects of your networked servers, disks, backups, and size requirements, with the
condition that they will never again have to change. In other words, managing file systems isn’t trivial
at all!
Once the file systems have been created, deployed, and added to the backup cycle, they do tend to
take care of themselves for the most part. What makes them tricky to manage are the administrative
issues, such as users who refuse to do housekeeping on their disks and cumbersome management
policies dictating who can share what disk and under what conditions—depending, of course, on the
account under which the storage/disk was purchased and other completely nontechnical issues.
Unfortunately, there’s no cookbook solution available for dealing with office politics, so in this
section, we’ll stick to the technical issues involved in managing file systems—that is, the process of
mounting and unmounting partitions, dealing with the /etc/fstab file, and performing file-system
recovery with the fsck tool.
Mounting and Unmounting Local Disks
Linux’s strong points include its flexibility and the way it lends itself to seamless management of file
locations. Partitions need to be mounted so that their contents can be accessed. In actuality, the file
system on a partition or volume is mounted, so that it appears as just another subdirectory on the
system. This helps to promote the illusion of one large directory tree structure, even though several
different file systems might be in use. This characteristic is especially helpful to the administrator,
who can relocate data stored on a physical partition to a new location (possibly a different partition)
under the directory tree, with the system users being none the wiser.
The file system management process begins with the root directory. This partition is also fondly
called slash and likewise symbolized by a forward slash character (/). The partition containing the
kernel and core directory structure is mounted at boot time. It is both possible and common for the Linux kernel to reside on a separate file system, such as /boot, housed on its own physical partition. It is also
possible for the root file system (/) to house both the kernel and other required utilities and
configuration files to bring the system up to single-user mode.
As the boot scripts run, additional file systems are mounted, adding to the structure of the root file
system. The mount process overlays a single subdirectory with the directory tree of the partition it is
trying to mount. For example, let’s say that /dev/sda2 is the root partition. It includes the directory
/usr, which contains no files. The partition /dev/sda3 contains all the files that you want in /usr, so
you mount /dev/sda3 to the directory /usr. Users can now simply change directories to /usr to see all
the files from that partition. The user doesn’t need to know that /usr is actually a separate partition.
NOTE In this and other chapters, we might inadvertently say that a partition is being mounted at such
and such a directory. Please note that it is actually the file system on the partition that is being
mounted. For the sake of simplicity, and in keeping with everyday verbiage, we might interchange
these two meanings.
Keep in mind that when a new directory is mounted, the mount process hides all the contents of
the previously mounted directory. So in our /usr example, if the root partition did have files in /usr
before mounting /dev/sda3, those /usr files would no longer be visible. (They’re not erased, of
course, because once /dev/sda3 is unmounted, the /usr files would become visible again.)
Using the mount Command
Like many command-line tools, the mount command has a plethora of options, most of which you
won’t be using in daily work. You can get full details on these options from the mount man page. In
this section, we’ll explore the most common uses of the command.
The structure of the mount command is as follows:
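In outline, it looks like the following (a sketch of the general form; the mount man page has the full synopsis):

    mount [-t fstype] [-o options] <device> <directory>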
The mount options can be any of those shown in Table 7-1.
Option
Description
-a
Mounts all the file systems listed in /etc/fstab (this file is examined
later in this section).
-t fstype
Specifies the type of file system being mounted. Linux can mount file
systems other than the ext2/ext3/ext4/Btrfs standard—File Allocation
Table (FAT), Virtual File Allocation Table (VFAT), NTFS,
ReiserFS, and so on. The mount command usually senses this
information on its own.
remount
The remount option is used for remounting already-mounted file
systems. It is commonly used for changing the mount flags for a file
system. For example, it can be used for changing a file system that is
mounted as read-only into a writable file system, without unmounting
it.
-o options
Specifies options applying to this mount process, which are specific
to the file system type (options for mounting network file systems may
not apply to mounting local file systems). Some of the more often used
options are listed in Table 7-2.
Table 7-1. Options Available for the mount Command
Option (for Local
Partitions)
Description
ro
Mounts the partition as read-only.
rw
Mounts the partition as read/write (default).
exec
Permits the execution of binaries (default).
noatime
Disables update of the access time on i-nodes. For partitions where
the access time doesn’t matter, enabling this improves performance.
noauto
Disables automatic mount of this partition when the -a option is
specified (applies only to the /etc/fstab file).
nosuid
Disallows application of SetUID program bits to the mounted
partition.
sb=n
Tells mount to use block n as the superblock. This is useful when the
file system might be damaged.
Table 7-2. Options Available for Use with the mount -o Parameter
The options available for use with the mount -o flag are shown in Table 7-2.
Issuing the mount command without any options will list all the currently mounted file systems:
Assuming that a directory named /bogus-directory exists, the following mount command will
mount the /dev/sda3 partition onto the /bogus-directory directory with read-only privileges:
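For example, something along these lines would do it (assuming /dev/sda3 carries a mountable file system):

    mount -o ro /dev/sda3 /bogus-directory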
Unmounting File Systems
To unmount a file system, use the umount command (note that the command is not unmount). Here’s
the command format:
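A minimal sketch of the syntax:

    umount [options] <directory>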
Here, directory is the directory to be unmounted. Here’s an example:
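Continuing with the directory mounted earlier:

    umount /bogus-directory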
This unmounts the partition mounted on the /bogus-directory directory.
When the File System Is in Use
There’s a catch to using umount: If the file system is in use (that is, someone is currently accessing
the contents of the file system or has a file open on the file system), you won’t be able to unmount that
file system. To get around this, you can do any of the following:
You can use the lsof or fuser program to determine which processes are keeping the files
open, and then kill them off or ask the process owners to stop what they’re doing. (Read
about the kill parameter in fuser in the fuser man page.) If you choose to kill the
processes, make sure you understand the repercussions of doing so—in other words, be
extra careful before killing unfamiliar processes, because if you kill an important process,
your job security might just be on the line.
You can use the -f option with umount to force the unmount process. It is especially useful
for Network File System (NFS)–type file systems that are no longer available.
Use the lazy unmount, specified with the -l option. This option almost always works even
when others fail. It detaches the file system from the file-system hierarchy immediately, and
it cleans up all references to the file system as soon as the file system stops being busy.
The safest and most proper alternative is to bring the system down to single-user mode and
then unmount the file system. In reality, of course, you don’t always have the luxury of being
able to do this on production systems.
The /etc/fstab File
As mentioned earlier, /etc/fstab is a configuration file that mount can use. This file contains a list of
all partitions known to the system. During the boot process, this list is read and the items in it are
automatically mounted with the options specified therein.
Here’s the format of entries in the /etc/fstab file:
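Each non-comment line consists of six whitespace-separated fields, roughly as follows:

    <device>   <mount_point>   <fs_type>   <options>   <dump>   <fsck_order>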
Following is a sample /etc/fstab file:
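A representative /etc/fstab of the kind discussed below might look like this (the UUID is a placeholder, and the choice of LogVol01 for swap and the devpts entry on line 7 are assumptions):

    1   /dev/mapper/VolGroup-LogVol00   /              ext4    defaults          1 1
    2   UUID=<uuid-of-boot-partition>   /boot          ext4    defaults          1 2
    3   /dev/mapper/VolGroup-LogVol02   /home          ext4    defaults          1 2
    4   /dev/mapper/VolGroup-LogVol03   /tmp           ext4    defaults          1 2
    5   /dev/mapper/VolGroup-LogVol01   swap           swap    defaults          0 0
    6   tmpfs                           /dev/shm       tmpfs   defaults          0 0
    7   devpts                          /dev/pts       devpts  gid=5,mode=620    0 0
    8   sysfs                           /sys           sysfs   defaults          0 0
    9   proc                            /proc          proc    defaults          0 0
    10  /dev/sr0                        /media/cdrom   auto    noauto,owner,ro   0 0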
Let’s take a look at some of the entries in the /etc/fstab file that haven’t yet been discussed. Note
that line numbers have been added to the output to aid readability.
Line 1 The first entry in our sample /etc/fstab file is the entry for the root volume. The first column
shows the device that houses the file system—the /dev/mapper/VolGroup-LogVol00 logical volume
(more on volumes later in the section “Volume Management”). The second column shows the mount
point—the / directory. The third column shows the file system type—ext4 in this case. The fourth
column shows the options with which the file system should be mounted—only the default options are
required in this case. The fifth field is used by the dump utility (a simple backup tool) to determine
which file systems need to be backed up. And the sixth and final field is used by the fsck program to
determine whether the file system needs to be checked and also to determine the order in which the
checks are done.
Line 2 The next entry in our sample file is the /boot mount point. The first field of this entry shows
the device—in this case, it points to the device identified by its Universally Unique Identifier
(UUID). The current practice is to use the UUID of devices or partitions.
The other fields mean basically the same thing as the field for the root mount point discussed
previously. In the case of the /boot mount point, you might notice that the field for the device looks a
little different from the usual /dev/<path-to-device> convention. The use of a UUID to identify devices/partitions helps to ensure that they are correctly and uniquely identified under any circumstances, such as when new disks are added, existing disks are removed or replaced, or the drive controller or bus to which a drive is attached changes.
Some Linux distributions may instead opt to use labels to identify the physical device in the first
field of the /etc/fstab file. The use of labels helps to hide the actual device (partition) from which the
file system is being mounted. When using labels, the device is replaced with a token that looks like
the following: LABEL=/boot. During the initial installation, the partitioning program of the installer automatically sets the label on the partition. Upon bootup, the system scans the partition tables and looks for these labels. This is especially useful when Small Computer System Interface (SCSI) disks are being used. Typically, each SCSI disk has a fixed SCSI ID. Using labels allows you to move the disk around
and change the SCSI ID, and the system will still know how to mount the file system even though the
device might have changed, for example, from /dev/sda10 to /dev/sdb10 (see the section “Traditional
Disk- and Partition-Naming Conventions” a bit later in the chapter).
Labels are useful for transient external media such as flash drives, USB hard drives, and so on.
TIP The command-line utility blkid can be used to display different attributes of the storage devices
attached to a system. One such attribute is the UUID of each volume. For example, running blkid
without any options will print a variety of information including the UUID of each block device on the
system:
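A sketch of the invocation and the general shape of its output (UUID values are placeholders, and the device list will differ on your system):

    blkid
    /dev/sda2: UUID="<uuid>" TYPE="ext4"
    /dev/sda3: UUID="<uuid>" TYPE="LVM2_member"
    /dev/mapper/VolGroup-LogVol00: UUID="<uuid>" TYPE="ext4"
    ...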
Lines 3 and 4 The next two entries are for the /home and /tmp mount points. They each refer to an
actual physical entity or device on the system. Specifically, they refer to the VolGroup-LogVol02 and
VolGroup-LogVol03 logical volumes, respectively. The remaining fields of lines 3 and 4 can be
interpreted in the same manner as the fields of the preceding root (/) and /boot mount points.
Line 5 This is the entry for the system swap partition, where virtual memory resides. In Linux, the
virtual memory can be kept on a separate partition from the root partition. (Note that a regular file can
also be used for swap purposes in Linux.) Keeping the swap space on a separate partition helps to
improve performance, since the swap partition can obey rules in ways that differ from a normal file
system. Also, because the partition doesn’t need to be backed up or checked with fsck at boot time,
the last two parameters on it are zeroed out. (Note that a swap partition can be kept in a normal disk
file as well. See the man page on mkswap for additional information.)
Line 6 Next comes the tmpfs file system, also known as a virtual memory (VM) file system. It uses
both the system random access memory (RAM) and swap area. It is not a typical block device
because it does not exist on top of an underlying block device; it sits directly on top of VM. It is used
to request pages from the VM subsystem to store files. The first field—tmpfs—shows that this entry
deals with a VM and, as such, is not associated with any regular UNIX/Linux device file. The second
entry shows the mount point, /dev/shm. The third field shows the file system type, tmpfs. The fourth
field shows that this file system should be mounted with the default options. The fifth and sixth fields
have the same meanings as the fields in previous entries. Note especially that the values are 0 in this
case, which makes perfect sense, because there is no reason to run a dump on a temporary file system
at bootup, and there is also no reason to run fsck on it, since it does not contain an ext2/3-type file
system.
Line 8 Next comes the entry for the sysfs file system. This is new and necessary in the Linux 2.6
kernels. Again, it is temporary and special, just like the tmpfs and proc file systems. It serves as an
in-memory repository for system and device status information. It provides a structured view of a
system’s device tree. This is akin to viewing the devices in Windows Device Manager as a series of
files and directories instead of through a Control Panel view.
Line 9 The next notable entry is for the proc-type file system. Information concerning the system
processes (hence the abbreviation proc) is dynamically maintained in this file system. The proc in
the first field of the proc entry in the /etc/fstab file has the same implication as that of the tmpfs file
system entry. The proc file system is a special file system that provides an interface to kernel
parameters through what looks like any other file system—that is, it provides an almost human-readable look into the kernel. Although it appears to exist on disk, it really doesn’t—all the files represent something that is in the kernel. Most notable is /proc/kcore, which is the system memory
abstracted as a file. People new to the proc file system often mistake this for a large, unnecessary file
and remove it, which will cause the system to malfunction in many glorious ways. Unless you are sure
you know what you are doing, it’s a safe bet to leave all the files in the /proc directory alone (more
details on /proc appear in Chapter 10).
Line 10 The last entry in the fstab file that is worthy of mentioning is the entry for the removable
media. In this example, the device field points to the device file that represents the optical device
(CD/DVD ROM drive)—/dev/sr0. The mount point is /media/cdrom, and so when an optical
medium (CD/DVD) is inserted and mounted on the system, the contents can be accessed from the
/media/cdrom directory. The auto in the third field means that the system will automatically try to
probe/detect the correct file system type for the device. For CD/DVD-ROMs, this is usually the
iso9660 or the Universal Disk Format (UDF) file system. The fourth field lists the mount options.
NOTE When mounting partitions with the /etc/fstab file configured, you can run the mount command
with only one parameter: the directory to which you want to mount. The mount command checks
/etc/fstab for that directory; if found, mount will use all parameters that have already been
established there. For example, here’s a short command to mount a CD-ROM given the /etc/fstab file
shown earlier:
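Assuming an fstab entry like the one on line 10 of the sample file, the command reduces to:

    mount /media/cdrom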
Using fsck
The fsck tool (short for File System Check) is used to diagnose and repair file systems that might
have become damaged in the course of daily operations. Such repairs may be necessary after a system
crash in which the system did not get a chance to fully flush all of its internal buffers to disk. (The fact
that this tool’s name—fsck—bears a striking resemblance to one of the expressions often uttered by
system administrators after a system crash, coupled with the fact that this tool can be used as a part of the recovery process, is strictly coincidental.)
Usually, the system runs the fsck tool automatically during the boot process as it deems necessary
(much in the same way Windows runs ScanDisk). If it detects a file system that was not cleanly
unmounted, it runs the utility. A file system check will also be run once the system detects that a check
has not been performed after a predetermined threshold (such as a number of mounts or time passed
between mounts). Linux makes an impressive effort to repair any problems it runs across
automatically, and, in most instances, it does take care of itself. The robust nature of the Linux file
system helps in such situations. Nevertheless, you might get this message:
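The exact wording depends on the distribution and the checker, but it tends to look something like this (the device name here is only illustrative):

    /dev/sda3: UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY.
            (i.e., without -a or -p options)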
At this point, you need to run fsck by hand and answer its prompts yourself.
If you do find that a file system is not behaving as it should (log messages are an excellent hint of
this type of anomaly), you may want to run fsck yourself on a running system. The only downside is
that the file system in question must be unmounted in order for this to work. If you choose to take this
path, be sure to remount the file system when you are done.
The name fsck isn’t the actual title for the ext3 repair tool; it’s actually just a wrapper. The fsck
wrapper tries to determine what kind of file system needs to be repaired and then runs the appropriate
repair tool, passing any parameters that were passed to fsck. In ext2, the actual tool is called
fsck.ext2. For the ext3 file system, the actual tool is fsck.ext3; for the ext4 file system, the actual
tool is fsck.ext4; for the VFAT file system, the tool is fsck.vfat; and for a ReiserFS file system,
the utility is called fsck.reiserfs. So, for example, when a system crash occurs on an ext4-formatted partition, you might need to call fsck.ext4 directly rather than relying on other
applications to call it for you automatically.
To run fsck on the /dev/mapper/VolGroup-LogVol02 file system mounted at the /home
directory, you would carry out the following steps.
First, unmount the file system:
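For example:

    umount /home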
Note that this step assumes that the /home file system is not currently being used or accessed by any
process.
Since we know that this particular file system type is ext4, we can call the correct utility
(fsck.ext4) directly or simply use the fsck utility:
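Either form should work; on a healthy volume the checker simply reports it as clean (output abridged):

    fsck /dev/mapper/VolGroup-LogVol02
    /dev/mapper/VolGroup-LogVol02: clean, .../... files, .../... blocks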
This output shows that the file system is marked clean.
To forcefully check the file system and answer yes to all questions in spite of what your OS
thinks, type this:
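One way to do this is with the -f flag (force the check even if the file system appears clean) and the -y flag (assume yes to all prompts):

    fsck.ext4 -f -y /dev/mapper/VolGroup-LogVol02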
What If I Still Get Errors?
First, relax. The fsck utility rarely finds problems that it cannot correct by itself. When it does ask
for human intervention, telling fsck to execute its default suggestion is often enough. Very rarely does
a single pass of e2fsck not clear up all problems.
On the rare occasions when a second run is needed, it should not turn up any more errors. If it
does, you are most likely facing a hardware failure. Remember to start with the obvious: Check for
reliable power and well-connected cables. Anyone running SCSI systems should verify that the
correct type of terminator is being used, that cables aren’t too long, that SCSI IDs aren’t conflicting,
and that cable quality is adequate. (SCSI is especially fussy about the quality of the cables.)
And when all else fails, and fsck doesn’t want to fix the issue, it will often give you a hint as to
what’s wrong. You can then use this hint to perform a search on the Internet and see what other people
have done to resolve the same issue.
The lost+found Directory
Another rare situation occurs when fsck finds file segments that it cannot rejoin with the original file.
In those cases, it will place the fragment in the partition’s lost+found directory. This directory is
located where the partition is mounted, so if /dev/mapper/VolGroup-LogVol02 is mounted on
/home, for example, then /home/lost+found correlates to the lost+found directory for that particular
file system.
Anything can go into a lost+found directory—file fragments, directories, and even special files.
When normal files wind up there, a file owner should be attached, and you can contact the owner and
see if they need the data (typically, they won’t). If you encounter a directory in lost+found, you’ll
likely want to try to restore it from the most recent backups rather than trying to reconstruct it from
lost+found. At the very least, lost+found tells you whether anything became dislocated. Again, such
errors are extraordinarily rare.
Adding a New Disk
On systems sporting PC hardware architecture, the process of adding a disk under Linux is relatively
easy. Assuming you are adding a disk that is of similar type to your existing disks (for example,
adding a SATA disk to a system that already has SATA drives or adding a SCSI disk to a system that
already has SCSI drives), the system should automatically detect the new disk at boot time. All that
remains is partitioning it and creating a file system on it.
If you are adding a new type of disk (such as a SCSI disk on a system that has only IDE
[Integrated Drive Electronics] drives), you may need to ensure that your kernel supports the new
hardware. This support can either be built directly into the kernel or be available as a loadable
module (driver). Note that the kernels of most Linux distributions come with support for many
popular disk/storage controllers, but you may occasionally come across troublesome kernel and
hardware combinations, especially with the new motherboards that have exotic chipsets.
Once the disk is in place, simply boot the system, and you’re ready to go. If you aren’t sure about
whether the system can see the new disk, run the dmesg command and see whether the driver loaded
and was able to find your disk. Here’s an example:
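One approach is to filter the kernel ring buffer for disk-related messages (the output depends entirely on your hardware; look for the new device, such as sdb, being detected):

    dmesg | grep -i sd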
Overview of Partitions
For the sake of clarity, and in case you need to know what a partition is and how it works, let’s
briefly review this subject. Disks typically need to be partitioned before use. Partitions divide the
disk into segments, and each segment acts as a complete disk by itself. Once a partition is filled with
data, the data cannot automatically overflow onto another partition.
Various things can be done with a partitioned disk, such as installing an OS into a single partition
that spans the entire disk, installing several different OSs into their own separate partitions in what is
commonly called a “dual-boot” configuration, and using the different partitions to separate and
restrict certain system functions into their own work areas.
This last example is especially relevant on a multiuser system, where the content of users’ home
directories should not be allowed to overgrow and disrupt important OS functions.
Traditional Disk and Partition Naming Conventions
Modern Linux distributions use the libATA library to provide support within the Linux kernel for
various storage devices as well as host controllers. Under Linux, each disk is given its own device
name. The device files are stored under the /dev directory.
Hard disks start with the name sdX, where X can range from a through z, with each letter
representing a physical block device. For example, in a system with two hard disks, the first hard
disk would be /dev/sda and the second hard disk would be /dev/sdb.
When partitions are created, corresponding device files are created. They take the form of
/dev/sdXY, where X is the device letter (as described in the preceding paragraph) and Y is the
partition number.
Thus, the first partition on the /dev/sda disk is /dev/sda1, the second partition would be
/dev/sda2, the second partition on the third disk would be /dev/sdc2, and so on. SCSI disks follow
the same basic scheme.
Some standard devices are automatically created during system installation, and others are
created as they are connected to the system.
NOTE In the old days before the advent of the current libATA subsystem, the IDE device naming
conventions for hard disks were different. IDE drive device names were distinct from those of other hard
drive interfaces such as SCSI or SATA. The device naming convention for IDE drives used to be
something like /dev/hdX. This means that device names started with hd instead of sd. For example, in
an IDE-only system with one hard disk and one CD-ROM, both on the same IDE chain, the hard disk
would be /dev/hda and the CD-ROM would be /dev/hdb. This information is provided for legacy
systems.
Volume Management
You may have noticed earlier that we use the terms “partition” and “volume” interchangeably in parts
of the text. Although they are not exactly the same, they are similar in a conceptual way. Volume
management is a new approach to dealing with disks and partitions: Instead of viewing a disk or
storage entity along partition boundaries, the boundaries are no longer present and everything is now
seen as volumes.
(That made perfect sense, didn’t it? Don’t worry if it didn’t; this is a tricky concept. Let’s try this
again with more detail.)
This new approach to dealing with partitions is called Logical Volume Management (LVM) in
Linux. It lends itself to several benefits and removes the restrictions, constraints, and limitations that
the concept of partitions imposes. Following are some of the benefits:
Greater flexibility for disk partitioning
Easier online resizing of volumes
Easier to increase storage space by simply adding new disks to the storage pool
Use of snapshots
Following are some important volume management terms:
Physical volume (PV) This typically refers to the physical hard disk(s) or another physical
storage entity, such as a hardware Redundant Array of Inexpensive Disks (RAID) array or
software RAID device(s). Only a single storage entity (for example, one partition) can exist
in a PV.
Volume group (VG) Volume groups are used to house one or more physical volumes and
logical volumes in a single administrative unit. A volume group is created out of physical
volumes. VGs are simply a collection of PVs; however, VGs are not mountable. They are
more like virtual raw disks.
Logical volume (LV) This is perhaps the trickiest LVM concept to grasp, because logical
volumes (LVs) are the equivalent of disk partitions in a non-LVM world. The LV appears as
a standard block device. We put file systems on the LV, and the LV gets mounted. The LV
gets fsck-ed if necessary.
LVs are created out of the space available in VGs. To the administrator, an LV appears as
one contiguous partition independent of the actual PVs that make it up.
Extents Two kinds of extents can be used: physical extents and logical extents. Physical
volumes (PVs) are said to be divided into chunks, or units of data, called “physical extents.”
And logical volumes (LVs) are said to be divided into chunks, or units of data, called
“logical extents.”
Creating Partitions and Logical Volumes
During the installation process, you probably used a “pretty” tool with a nice GUI front-end to create
partitions. The GUI tools available across the various Linux distributions vary greatly in looks and
usability. Two tools that can be used to perform most partitioning tasks, and that have a unified look
and feel regardless of the Linux flavor, are the venerable parted and fdisk utilities. Although fdisk
is small and somewhat awkward, it’s a reliable command-line partitioning tool. parted, on the other
hand, is much more user-friendly and has a lot more built-in functionality than other tools have. In fact, many of the GUI partitioning tools call the parted program on the back end. Furthermore, in the
event you need to troubleshoot a system that has gone really wrong, you should be familiar with basic
tools such as parted or fdisk. Other powerful command-line utilities for managing partitions are
sfdisk and cfdisk.
During the installation of the OS, as covered in Chapter 2, you were asked to leave some free
unpartitioned space. We will now use that free space to demonstrate some LVM concepts by walking
through the steps required to create a logical volume.
In particular, we will create a logical volume that will house the contents of our current /var
directory. Because a separate /var volume was not created during the OS installation, the contents of
the /var directory are currently stored under the volume that holds the root (/) tree. The general idea
is that because the /var directory is typically used to hold frequently changing and growing data (such
as log files), it is prudent to put its content on its own separate file system.
The steps involved with creating a logical volume can be summarized this way:
1. Initialize a regular partition for use by the LVM system, or simply create a partition of the
type lvm.
2. Create physical volumes from the hard disk partition.
3. Assign the physical volume(s) to volume group(s).
4. Finally, create logical volumes within the volume groups, and assign mount points to the
logical volumes after formatting.
The following illustration shows the relationship between disks, physical volumes (PVs), volume
groups (VGs), and logical volumes (LVs) in LVM:
CAUTION The process of creating partitions is irrevocably destructive to the data already on the
disk. Before creating, changing, or removing partitions on any disk, you must be sure of what you are
doing and its consequences.
The following section comprises several parts:
Creating a partition
Creating a physical volume
Assigning a physical volume to a volume group
Creating a logical volume
The entire process from start to finish may appear a bit lengthy. It is actually a simple process in
itself, but we intersperse the steps with some extra steps, along with some notes and explanations.
Some LVM utilities that we’ll be using during the process are listed in Table 7-3.
LVM Command
Description
lvcreate
Creates a new logical volume in a volume group by allocating logical
extents from the free physical extent pool of that volume group
lvdisplay
Displays the attributes of a logical volume, such as read/ write status,
size, and snapshot information
pvcreate
Initializes a physical volume for use with the LVM system
pvdisplay
Displays the attributes of physical volumes, such as size and PE size
vgcreate
Creates new volume groups from block devices created using the
pvcreate command
vgextend
Adds one or more physical volumes to an existing volume group to
extend its size
vgdisplay
Displays the attributes of volume groups
Table 7-3. LVM Utilities
Creating a Partition
We will be using the free unpartitioned space on the main system disk, /dev/sda.
1. Begin by running the parted utility with the device name as a parameter to the command:
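For the disk used in this example, that would be:

    parted /dev/sda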
You will be presented with a simple parted prompt (parted).
2. Print the partition table again while at the parted shell. Type print at the prompt to print the
current partition table:
A few facts are worthy of note regarding this output:
The total disk size is approximately 105GB.
The partition table type is the GUID Partition Table (gpt). Three partitions are currently
defined on our sample system: 1, 2, and 3 (/dev/sda1, /dev/sda2, and /dev/sda3,
respectively).
Partition 2 (/dev/sda2) is marked with a boot flag (*). This means that it is a bootable
partition.
From the partitioning scheme we chose during the OS installation, we can deduce that
partition 2 (/dev/sda2) houses the /boot file system and partition 3 (/dev/sda3) houses
everything else (see the output of the df command for reference).
Partition 1 (/dev/sda1) is of the type bios_grub, and partition 3 (/dev/sda3) is of the type
lvm.
The last partition (3, or /dev/sda3) ends at the 84.4GB boundary. Therefore, there is room to create a partition that will occupy the space from the 84.4GB mark to the end of the disk (that is, 105GB).
3. Type mkpart at the prompt to create a new partition:
NOTE If you are curious about the other things you can do at the parted prompt, type help to display
a help menu.
4. We will not assign a name to the new partition, so press ENTER at the Partition name prompt
to leave the name blank:
5. Press ENTER again at the File system type prompt to accept the default value:
6. Now we specify the partition size. The sizes can be specified in units of kilobytes, megabytes,
gigabytes, and so on. First we choose the starting value, or lower limit. Since the last partition (3) ends at the 84.4GB disk boundary, we’ll set the starting value of the new partition to be right after that, that is, 84.5GB. Type the value (84.5GB) and press ENTER when done:
7. Next we’ll be prompted for the ending value or the upper limit of the new partition. Since the
total size of the disk is 105GB, we’ll set the upper limit of the new partition as 105GB,
thereby using up all the remaining disk space. This effectively means that our new partition
size will be approximately 105GB minus 84.5GB in size (~20GB). Type the value (105GB)
and press ENTER when done:
8. We want to set the “type” or “flag” of the newly created partition to be lvm. To do this, we
use the set command within parted. Type the command below at the parted prompt to enable
the lvm flag on partition 4. Press ENTER when done:
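At the (parted) prompt, the command takes the form set <partition_number> <flag> <state>; for partition 4, that is:

    set 4 lvm on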
TIP You can learn more about the proper syntax and the meaning of the various options and
subcommands of parted while at the parted shell by typing help <sub-command>. For example, to
learn more about the usage and options for the set command, you would type:
9. Check the changes you’ve made by viewing the partition table. Type print:
10. Once you are satisfied with your changes type quit at the parted prompt and press ENTER:
You will be returned back to your regular command shell (bash in our case).
NOTE In some very rare cases, you may need to reboot the entire system or unplug and re-insert the
newly partitioned block device in order to allow the Linux kernel to recognize or use newly created
partitions.
Creating a Physical Volume
In the following set of procedures, we will walk through creating a physical volume.
1. Make sure you are still logged into the system as the superuser.
2. Let’s view the current physical volumes defined on the system. Type the following:
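For example:

    pvdisplay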
Take note of the physical volume name field (PV Name).
3. Use the pvcreate command to initialize the partition we created in the previous section
(“Creating a Partition”) as a physical volume:
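Assuming the new partition shows up as /dev/sda4:

    pvcreate /dev/sda4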
4. Use the pvdisplay command to view your changes again:
Assigning a Physical Volume to a Volume Group
Here we will assign the physical volume created earlier to a volume group (VG).
1. First use the vgdisplay command to view the current volume groups that might exist on your
system:
From the preceding output, we can see the following:
The volume group name (VG Name) is VolGroup.
The current size of the VG is 78.09 GiB (this should increase by the time we are done).
The physical extent size is 32 MiB, and there are a total of 2499 PEs.
There are zero physical extents free in the VG.
2. Assign the new PV to the volume group using the vgextend command. The syntax for the
command is as follows:
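In general terms:

    vgextend <volume_group_name> <physical_volume>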
Substituting the correct values in this command, type this:
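With the volume group and physical volume from this example:

    vgextend VolGroup /dev/sda4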
3. View your changes with the vgdisplay command:
Note that the VG Size, Total PE, and Free PE values have dramatically increased. We now have a
total of 609 free PEs (or 19.03 GiB).
Creating a Logical Volume (LV)
Now that we have some room in the VG, we can go ahead and create the logical volume (LV).
1. First view the current LVs on the system:
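The lvdisplay command (or a similar LVM reporting tool) will show them; for example:

    lvdisplay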
The preceding output shows the current LVs:
/dev/VolGroup/LogVol01
/dev/VolGroup/LogVol00
/dev/VolGroup/LogVol03
/dev/VolGroup/LogVol02
2. With the background information that we now have, we will create an LV using the same
naming convention that is currently used on the system. We will create a fifth LV called
“LogVol04.” The full path to the LV will be /dev/VolGroup/LogVol04. Type the
following:
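A sketch of the command, using all 609 free physical extents reported earlier:

    lvcreate -l 609 --name LogVol04 VolGroup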
NOTE You can actually name your LV any way you want. We named ours LogVol04 for consistency
only. We could have replaced LogVol04 with another name, such as “my-volume,” if we wanted to.
The value for the --name (-n) option determines the name of the LV. The -l option specifies the
size in physical extents units (see Step 1 under “Assigning a Physical Volume to a Volume Group”).
We could have also specified the size in gigabytes or megabytes by using an option such as -L
19.03G or -L 19030M.
3. View the LV you created:
Fedora, RHEL, and CentOS Linux distributions have a GUI tool that can greatly simplify the
entire management of an LVM system. The command system-config-lvm will launch the tool shown
here:
The openSUSE Linux distribution also includes a capable GUI tool for managing disks, partitions,
and the LVM. Issue the command yast2 storage to launch the utility shown here:
Creating File Systems
With the volumes created, you need to put file systems on them. (If you’re accustomed to Microsoft
Windows, this is akin to formatting the disk once you’ve partitioned it.)
The type of file system that you want to create will determine the particular utility that you should
use. In this project, we want to create a Btrfs-type file system; therefore, we’ll use the mkfs.btrfs
utility. As indicated earlier in this chapter, Btrfs is considered a next-generation file system and as
such you should tread softly and carefully when deploying it in production environments—in other
words, “Your Mileage May Vary.” Many command-line parameters are available for the
mkfs.btrfs tool, but we’ll use it in its simplest form here.
Following are the steps for creating a file system:
1. The only command-line parameter you’ll usually have to specify is the partition (or volume)
name onto which the file system should go. To create a file system on
/dev/VolGroup/LogVol04, issue the following command:
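For example:

    mkfs.btrfs /dev/VolGroup/LogVol04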
Once the preceding command runs to completion, you are done with creating the file system.
We will next begin the process of trying to relocate the contents of the current /var directory
to its own separate file system.
2. Create a temporary folder that will be used as the mount point for the new file system. Create
it under the root folder:
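For example:

    mkdir /new_var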
3. Mount the LogVol04 logical volume at the /new_var directory:
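For instance:

    mount /dev/VolGroup/LogVol04 /new_var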
4. Copy the content of the current /var directory to the /new_var directory:
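One way to do this while preserving permissions, ownership, and symbolic links (other archive-style copy commands will work just as well):

    cp -a /var/. /new_var/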
5. Now you can rename the current /var to /old_var:
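That is:

    mv /var /old_var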
6. Create a new and empty /var directory:
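And then:

    mkdir /var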
7. To avoid taking down the system into single-user mode to perform the following sensitive
steps, we will resort to some old “military tricks.” Type the following:
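A common way to do this is with a bind mount:

    mount --bind /new_var /var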
This step will temporarily remount the /new_var directory to the /var directory where the
system actually expects it to be using the bind option with the mount utility. This is useful
until we are good and ready to reboot the system.
TIP The bind option can also be useful on systems running the NFS service. This is because the
rpc_pipefs pseudo-file system is often automatically mounted under a subfolder in the /var directory
(/var/lib/nfs/rpc_pipefs). So to get around this, you can use the mount utility with the bind option to
mount the rpc_pipefs pseudo-file system temporarily in a new location so that the NFS service can
continue working uninterrupted. The command to do this in our sample scenario would be as follows:
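A sketch of such a command, assuming the pseudo-file system should be re-homed under the new copy of /var (the sunrpc device name and the path are the usual conventions, but verify them against your own distribution):

    mount -t rpc_pipefs sunrpc /new_var/lib/nfs/rpc_pipefs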
8. It may be necessary in certain Linux distros (such as Fedora, RHEL, and CentOS) that have
SELinux enabled to restore the security contexts for the new /var folder so that the daemons
that need it can use it:
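For example:

    restorecon -R /var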
We are almost done now. We need to create an entry for the new file system in the /etc/fstab
file. To do so, we must edit the /etc/fstab file so that our changes can take effect the next time
the system is rebooted. Open up the file for editing with any text editor of your choice, and
add the following entry into the file:
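An entry along these lines should work (field values follow the conventions of the sample file shown earlier):

    /dev/VolGroup/LogVol04    /var    btrfs    defaults    1 2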
TIP You can also use the echo command to append the preceding text to the end of the file. The
command is
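something like this (note that the >> operator appends; a single > would overwrite the file):

    echo "/dev/VolGroup/LogVol04 /var btrfs defaults 1 2" >> /etc/fstab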
9. This will be a good time to reboot the system:
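For example:

    reboot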
10. Hopefully the system came back up fine. After the system boots, delete the /old_var and
/new_var folders using the rm command.
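For example (double-check that nothing is still mounted under either directory before removing them):

    rm -rf /old_var /new_var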
NOTE If, during system bootup, the boot process was especially slow starting the system logger
service, don’t worry too much—it will time out eventually and continue with the boot process. But
you will need to set the proper security contexts for the files now under the /var folder by running the
restorecon -R /var command again, with the actual files now in the directory. Then reboot the
system one more time.
Summary
In this chapter, we discussed some de-facto Linux file systems such as the extended file system family
(ext2, ext3, ext4) and the new Btrfs. We covered the process of administering your file systems, and
we touched on various storage administrative tasks from creating partitions to creating physical
volumes, to extending an existing volume group, and then creating the final logical volume.
We also went through the process of moving a sensitive system directory onto its own separate
file system (Btrfs). The exercise detailed what you might need to do while managing a Linux server in
the real world. With this information, you’re armed with what you need to manage basic file system
issues on a production-grade Linux-based server in a variety of environments.
Like any operating system, Linux undergoes changes from time to time. Although the designers and
maintainers of the file systems go to great lengths to keep the interface the same, you’ll find some
alterations cropping up occasionally. Sometimes they’ll be interface simplifications. Others will be
dramatic improvements in the file system itself. Keep your eyes open for these changes. Linux
provides and supports superb file systems that are robust, responsive, and in general a pleasure to
use. Take the tools we have discussed in this chapter and find out for yourself.
CHAPTER 8
Core System Services
Regardless of distribution, network configuration, and overall system design, every Linux-based system ships with some core services. These services include init, the logging daemon,
cron, and others. The functions performed by these services might be simple, but they are also
fundamental. Without their presence, a great deal of Linux’s power would be missed.
This chapter will discuss each of the core services, in addition to another useful system service
called xinetd. It also discusses each service’s corresponding configuration file and the suggested
method of deployment (if appropriate). You’ll find that the sections covering these simple services
are not terribly long, but don’t neglect this material. Take some time to get familiar with them. Many
creative solutions have been realized through the use of these services. Hopefully, this chapter will
inspire a few more.
The init Daemon
The init process is the patron of all processes. It is always the first process that gets started in any
Linux/UNIX-based system. init is executed by the kernel and is responsible for starting all other
processes initially on a system. The process ID for init is always 1. Should init ever fail, the rest of
the system will likely follow suit.
The init daemon as it was traditionally known has been largely replaced on most new Linux
distributions by different solutions. One solution is an upstart named upstart (pun intended). Another
more recent solution is known as systemd and is discussed in its own section later in this chapter.
The init process serves two roles: First, it serves as the ultimate parent process. Because init
never dies, the system can always be sure of its presence and, if necessary, make reference to it. The
need to refer to init usually happens when a process dies before all of its spawned child processes
have completed. This causes the children to inherit init as their parent process. A quick execution of
the ps -ef command will show a number of processes that will have a parent process ID (PPID) of
1. init also handles the various runlevels by executing the appropriate programs when a particular
runlevel is reached. This behavior is defined by the /etc/inittab file or its equivalent in other distros.
NOTE If you want to be strictly technically correct, init is not actually the very first process that is
run. But to remain politically correct, we’ll assume that it is! You should also keep in mind that some
so-called “security-hardened Linux systems” deliberately randomize the process identification (PID)
of init, so don’t be surprised if you find yourself working on such a system and notice that the PID of
init is not 1.
upstart: Die init. Die Now!
According to its documentation, “upstart is an event-based replacement for the init daemon which
handles starting of tasks and services during boot, stopping them during shutdown, and supervising
them while the system is running.” This same description of upstart pretty much describes the
function of the init daemon, except that upstart tries to achieve its stated objectives in a more elegant
and robust manner.
Another stated objective of upstart is to achieve complete backward compatibility with init
(System V init). Because upstart handles this backward compatibility with init so well and
transparently, the rest of this section will focus mostly on the traditional init way of doing things.
As previously mentioned, upstart is a replacement for the init daemon. upstart works using the
notion of jobs (or tasks) and events.
On Debian-based distros such as Ubuntu that are using upstart, jobs are created and placed under
the /etc/event.d/ or /etc/init/ directory. The name of the job is the filename under this directory. To
transparently handle the services that were hitherto handled by init, jobs have been defined to handle
the services and daemons that need to be started and stopped at the various runlevels (0, 1, 2, 3, 4, 5,
6, S, and so on).
For example, the job definition that automatically handles the services that are to be started at
runlevel 3 might be defined in a file named /etc/event.d/rc3 or /etc/init/rc3. The contents of the file
look like this:
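The exact contents vary between distributions and upstart versions, but a simplified sketch of such a job definition might resemble the following (the path to the rc helper script is an assumption):

    # rc3 - runlevel 3 compatibility
    start on runlevel 3
    stop on runlevel [!3]
    console output
    script
        exec /etc/init.d/rc 3
    end script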
Without going into too much detail, this job definition can be explained as follows:
The start stanza specifies that the job be run during the occurrence of an event. The event in
this case is the system entering runlevel 3.
The stop stanza specifies that the job be stopped during the occurrence of an event.
The script stanza specifies the shell script code that will be executed using /bin/sh.
The exec stanza specifies the path to a binary on the file system and optional arguments to
pass to it.
You can query the status of any job by using the status command. Here’s an example that queries
the status of our example rc3 job:
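For example:

    status rc3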
The initctl command can be used to display a listing of all jobs and their states. This example
lists all jobs and their states:
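That is:

    initctl list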
The /etc/inittab File
On distributions that still use it, the /etc/inittab file contains all the information init needs for starting
runlevels. The format of each line in this file is as follows:
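In outline, each entry looks like this:

    id:runlevels:action:process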
TIP Lines beginning with the pound symbol (#) are comments. Take a peek at your own /etc/inittab,
and you’ll find that it’s already liberally commented. If you ever need to make a change to
/etc/inittab, you’ll do yourself a favor by including liberal comments that explain what you’ve done.
Table 8-1 explains the significance of each of the four fields of an entry in the /etc/inittab file.
Table 8-2 defines some common options available for the action field in this file.
Entry
Description
Id
A unique sequence of one to four characters that identifies this entry in
the /etc/inittab file.
Runlevels
The runlevels at which the process should be invoked. Some events
are special enough that they can be trapped at all runlevels (for
instance, the CTRL-ALT-DEL key combination to reboot). To indicate that
an event is applicable to all runlevels, leave this field blank. If you
want something to occur at multiple runlevels, simply list all of them
in this field. For example, the entry 123 would specify something that
runs at runlevels 1, 2, and 3.
Action
Describes what action should be taken. Options for this field are
explained in Table 8-2.
Process
Names the process (or program) to execute when the runlevel is
entered.
Table 8-1. /etc/inittab Entries
Values
Description
Respawn
The process will be restarted whenever it terminates.
Wait
The process will be started once when the runlevel is entered, and
init will wait for its completion.
Once
The process will be started once when the runlevel is entered;
however, init won’t wait for termination of the process before
possibly executing additional programs to be run at that particular
runlevel.
Boot
The process will be executed at system boot. The runlevels field is
ignored in this case.
Bootwait
The process will be executed at system boot, and init will wait for
completion of the boot before advancing to the next process to be run.
Ondemand
The process will be executed when a specific runlevel request occurs.
(These runlevels are a, b, and c.) No change in runlevel occurs.
Initdefault
Specifies the default runlevel for init on startup. If no default is
specified, the user is prompted for a runlevel on console.
Sysinit
The process will be executed during system boot, before any of the
Boot or Bootwait entries.
Powerwait
If init receives a signal from another process that there are problems
with the power, this process will be run. Before continuing, init will
wait for this process to finish.
Powerfail
Same as Powerwait, except that init will not wait for the process to
finish.
Powerokwait
This process will be executed as soon as init is informed that the
power has been restored.
Ctrlaltdel
The process is executed when init receives a signal indicating that
the user has pressed the CTRL-ALT-DEL key combination. Keep in mind
that most X Window System servers capture this key combination, so
init may not receive this signal if the X Window System is active.
Table 8-2. Available Options for the action Field in /etc/inittab
Now let’s look at a sample entry from a /etc/inittab file:
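A representative entry of the kind described next might read as follows (the exact arguments to the shutdown command are assumed for illustration):
# Trap power restoration at runlevels 1 through 5
pr:12345:powerokwait:/sbin/shutdown -c "Power restored; shutdown cancelled"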
The first line, which begins with the pound sign (#), is a comment entry and is ignored.
The pr is the unique identifier.
1, 2, 3, 4, and 5 are the runlevels at which this process can be activated.
powerokwait
is the condition under which the process is run.
The /sbin/shutdown… command is the process.
The telinit Command
It’s time to ’fess up: The mysterious force that tells init when to change runlevels is actually the
telinit command. This command takes two command-line parameters. One is the desired runlevel
that init needs to know about, and the other is -t sec, where sec is the number of seconds to wait
before telling init.
NOTE Whether init actually changes runlevels is its decision. Obviously, it usually does, or this
command wouldn’t be terribly useful.
It is extremely rare that you’ll ever have to run the telinit command yourself. Usually, this is all
handled for you by the startup and shutdown scripts.
NOTE Under most UNIX implementations (including Linux), the telinit command is really just a
symbolic link to the init program. Because of this, some folks prefer running init with the runlevel
they want rather than using telinit.
systemd
Alas, it turns out that upstart was just that—an upstart! The latest thing in the open source world of
startup managers is called systemd. It is being aggressively adopted and incorporated into the
mainstream RPM-based distributions such as Fedora, openSUSE, RHEL, CentOS, and so on.
systemd is an incredibly ambitious project that aims to reengineer the way services and other
boot-up procedures have traditionally worked. At the time of this writing, systemd is the de facto
system and service manager in several mainstream Linux distributions. And it is likely that other
distros that are not currently standardizing around systemd will adopt it in the very near future.
The systemd project’s web page (www.freedesktop.org/wiki/Software/systemd) offers the
following:
systemd is a system and service manager for Linux, compatible with SysV and LSB init scripts.
systemd provides aggressive parallelization capabilities, uses socket and D-Bus activation for
starting services, offers on-demand starting of daemons, keeps track of processes using Linux
cgroups, supports snapshotting and restoring of the system state, maintains mount and automount
points and implements an elaborate transactional dependency-based service control logic. It can
work as a drop-in replacement for sysvinit.
The following sections take apart the official description for systemd and try to explain each part.
systemd’s Role
As we’ve seen in the other start-up managers discussed so far, such as upstart and init (System V
init), systemd manages various system startup and shutdown functions; it also manages the startup and
shutdown of services on Linux-based operating systems. systemd goes a little farther with its ability
to play the role of a babysitter/nanny of sorts to services of which it is aware. This means that in
addition to starting up system services, systemd can monitor the services throughout their lifetime and
automatically restart, gather statistics, or report on them if necessary.
Because for the longest time, the traditional way of managing system services has been through the
use of startup shell scripts (System V init), systemd provides compatibility and support for numerous
System V and Linux Standard Base Specification (LSB) init scripts that are in existence.
The systemd Edge
Old school Linux administrators might scoff at the thought of having to learn or relearn the inner
workings of yet another startup manager. But the edge and benefits that systemd provides will make
the effort worthwhile and are also nothing to scoff at.
New Linux administrators, on the other hand, have the benefit of starting with a clean slate and as
such have no preconceived notions of how services are managed on Linux systems.
One of the advantages that systemd brings to service/system management in Linux is its so-called
“aggressive parallelization” capabilities. Put simply, this means that systemd can start several system
services in parallel or concurrently. systemd does away with the traditional approach of starting
sequentially based on the numbering of the corresponding run control (rc) script. This parallelization
simply equates to quicker startup times for Linux systems.
systemd also no longer uses traditional shell scripts to store the configuration information for
services. The often tedious-to-read shell scripts have been replaced by simpler configuration files.
In addition, systemd records the start and exit time, the process ID (PID), and the exit status of
every process it spawns and supervises. This is useful for troubleshooting daemons or other services.
How systemd Works
systemd uses various Linux concepts and entities. Some of these are described next.
Control Groups (cgroups) cgroups is a kernel-provided facility that allows processes to be
arranged hierarchically and labeled individually. systemd places every process that it starts in a
control group named after its service, and this allows it to keep track of processes and allows
systemd to have a more intimate knowledge and control of a service throughout its life span. For
example, systemd can safely end or kill a process as well as any child processes it might have
spawned.
Socket Activation systemd’s benefits/edge come from its proper and inherent understanding of the
interdependence among system services—that is, it knows what various system services require from
each other. As it turns out, most startup services or daemons actually need only the socket(s) provided
by certain services and not the high level services themselves. Because systemd knows this, it
ensures that any needed sockets are made available very early on in the system startup. It thus avoids
having to fully start one service first simply because another service depends on the socket it provides. (If this is still a little
confusing, please see the sidebar “Human Digestive System vs. systemd” for an analogy.)
TIP The two main types of Linux sockets are the file system–related AF_UNIX or AF_LOCAL
sockets and networking-related AF_INET sockets.
The AF_UNIX or AF_LOCAL socket family is used for communicating between processes on
the same machine efficiently. The AF_INET sockets, on the other hand, provide interprocess
communication (IPC) between processes that run on the same machine as well as processes that run
on different machines.
Human Digestive System vs. systemd
The following steps summarize how the normal human digestive system works, to keep us alive.
Among other things, a person needs the nutrients obtained from food to survive.
1. A person obtains the food, opens the mouth, and puts food into the mouth.
2. She chews the food.
3. The food will travel through various digestive tracts—esophagus, stomach, small/large
intestine—and then get mixed with digestive juices, and so on, and be digested.
4. Through the digestive process, the food is broken down into chemicals that are useful to
the body. These chemicals are the nutrients.
5. The nutrients are then absorbed by the body and transported throughout the body for use.
systemd would distill the previous tedious five-step procedure to get essential nutrients for
the body into two steps:
1. Get the food and extract the raw nutrients from it.
2. Inject the raw nutrients directly into human blood stream intravenously.
Units The things, or objects, that systemd manages are called units, and they form the building
blocks of systemd. These objects can include services or daemons, devices, file system entities such
as mount points, and so on. Units are named after their configuration files, and the configuration files
are normally stored under the /etc/systemd/system/ directory. Standard unit configuration files are
stored under the /lib/systemd/system directory. Any needed files must be copied over to the working
/etc/systemd/system/ folder for actual use.
The following types of units exist:
service units These unit types include traditional system daemons or services. These
daemons can be started, stopped, restarted, and reloaded. Here’s an example service unit:
socket units These units consist of local and network sockets that are used for interprocess
communication in a system. They play a very important role in the socket-based activation
feature that helps reduce the interservice dependencies. Here’s an example socket unit:
device units These allow systemd to see and use kernel devices. Here’s an example device
unit:
mount units These are used for mounting and unmounting file systems:
target units systemd uses targets instead of runlevels. Target units are used for logical
grouping of units. They don’t actually do anything by themselves, but instead reference other
units, thereby allowing the control of groups of units together. Here’s an example target unit:
timer units These units are used for triggering activation of other units based on timers.
Here’s an example:
snapshot units These are used to save the state of the set of systemd units temporarily:
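To give a feel for the unit file format, here is a minimal sketch of a service unit; the unit name, description, and binary path are invented for illustration:
# /etc/systemd/system/example.service (hypothetical unit)
[Unit]
Description=Example daemon
After=network.target

[Service]
ExecStart=/usr/sbin/example-daemon
Restart=on-failure

[Install]
WantedBy=multi-user.target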
TIP You can use the systemctl command to view and list the units of specific types. For example,
to view all the active target units, type:
To view all the active and inactive mount units, type:
To view all the active and inactive units of every type, enter:
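The corresponding systemctl invocations would be along these lines (output omitted, since it varies from system to system):
systemctl list-units --type=target
systemctl list-units --type=mount --all
systemctl list-units --all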
xinetd and inetd
The xinetd and inetd programs are popular services on Linux systems; xinetd is the more modern
incarnation of the older inetd. Strictly speaking, a Linux system can run effectively without the
presence of either of them, but some daemons rely solely on the functionality they provide. If you need
either xinetd or inetd, you need it—no two ways about it.
The inetd and xinetd programs are daemon processes. You probably know that daemons are
special programs that, after starting, voluntarily release control of the terminal from which they
started. The main mechanism by which daemons can interface with the rest of the system is via IPC
channels, by sending messages to the system-wide log file or by appending to a file on disk.
inetd functions as a “super-server” to other network server–related processes, such as Telnet,
FTP, TFTP, and so on. It’s a simple philosophy: Not all server processes (including those that accept
new connections) are called upon so often that they require a program to be running in memory all the
time. The main reason for the existence of a super-server is to conserve system resources. So instead
of needing to maintain potentially dozens of services loaded in memory waiting to be used, they are
all listed in inetd’s configuration file, /etc/inetd.conf. On their behalf, inetd listens for incoming
connections. Thus, only a single process needs to be in memory.
A secondary benefit of inetd falls to those processes needing network connectivity but whose
programmers do not want to have to write it into the system. The inetd program will handle the
network code and pass incoming network streams into the process as its standard input (stdin). Any
of the process’s output (stdout) is sent back to the host that has connected to the process.
NOTE Unless you are programming, you don’t have to be concerned with inetd’s stdin/stdout
feature. On the other hand, if you want to write a simple script and make it available through the
network, it’s worth exploring this powerful tool.
As a rule of thumb, low-volume services (such as TFTP) are usually best run through inetd,
whereas higher-volume services (such as web servers) are better run as standalone processes that are
always in memory, ready to handle requests.
Current versions of Fedora, Red Hat Enterprise Linux (RHEL), openSUSE, Mandrake, and even
Mac OS X ship with a newer incarnation of inetd called xinetd—the name is an acronym for
“extended Internet services daemon.” The xinetd program accomplishes the same task as the regular
inetd program: It helps to start programs that provide Internet services. Instead of having such
programs automatically start up during system initialization and remain unused until a connection
request arrives, xinetd instead stands in the gap for those programs and listens on their normal
service ports. As a result, when xinetd hears a service request meant for one of the services it
manages, it spawns the appropriate service.
Inasmuch as xinetd is similar to inetd in function, you should realize that xinetd includes a new
configuration file format and a lot of additional features. The xinetd daemon uses a configuration file
format that is quite different from the classic inetd configuration file format. (Most other variants of
UNIX, including Solaris, AIX, and FreeBSD, use the classic inetd format.) This means that if an
application relies on inetd, you may need to provide some manual adjustments to make it work. Of
course, you should definitely contact the developers of the application and let them know of the
change so that they can release a newer version that works with the new xinetd configuration format
as well.
In this section, we will cover the newer xinetd daemon. If your system uses inetd, you should be
able to view the /etc/inetd.conf file and see the similarities between inetd and xinetd.
NOTE Your Linux distribution might not have the xinetd software installed out of the box. The
xinetd package can be installed with yum on a Fedora distro (or RHEL or CentOS) by running the
following:
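For example (assuming the package is available in the default repositories):
# yum install xinetd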
On a Debian-based distro such as Ubuntu, xinetd can be installed using APT by running the
following:
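That is, something like:
# apt-get install xinetd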
The /etc/xinetd.conf File
The /etc/xinetd.conf file consists of a series of blocks that take the following format:
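In outline:
blockname
{
    variable = value
    variable = value
}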
where blockname is the name of the block that is being defined, variable is the name of a variable
being defined within the context of the block, and value is the value assigned to the variable. Every
block can have multiple variables defined within.
One special block is called defaults. Whatever variables are defined within this block are
applied to all other blocks that are defined in the file.
An exception to the block format is the includedir directive, which tells xinetd to read all the
files in a directory and consider them part of the /etc/xinetd.conf file. Any line that begins with a
pound sign (#) is the start of a comment. The stock /etc/xinetd.conf file that ships with Fedora looks
like this:
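A reconstruction consistent with the description that follows is shown here; apart from instances, cps, and the includedir line, the specific variables and their values are assumptions and will vary between releases:
# This is the master xinetd configuration file.
defaults
{
    instances       = 50
    log_type        = SYSLOG daemon info
    log_on_failure  = HOST
    log_on_success  = PID HOST DURATION EXIT
    cps             = 50 10
}

includedir /etc/xinetd.d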
NOTE Don’t worry if all of the variables and values aren’t familiar to you yet; we will go over
some of them in a moment. Let’s first make sure you understand the format of the file.
In this example, the first line of the file is a comment explaining what the file is and what it does.
After the comments, you see the first block: defaults. The first variable that is defined in this block
is instances, which is set to the value of 50. Five variables in total are defined in this block, the last
one being cps. Since this block is titled defaults, the variables that are set within it will apply to all
future blocks that are defined. Finally, the last line of the file specifies that the /etc/xinetd.d directory
must be examined for other files that contain more configuration information. This will cause xinetd
to read all the files in that directory and parse them as if they were part of the /etc/xinetd.conf file.
Variables and Their Meanings
Table 8-3 lists some of the variable names that are supported in the /etc/xinetd.conf file.
Variable
Description
id
This attribute is used to identify a service uniquely. This is useful,
because services exist that can use different protocols and that need to
be described with different entries in the configuration file. By
default, the service ID is the same as the service name.
type
Any combination of the following values may be used: RPC if this is
a Remote Procedure Call (RPC) service, INTERNAL if this service
is provided by xinetd, or UNLISTED if this is a service not listed in
the /etc/services file.
disable
This is either the value yes or no. A yes value means that although the
service is defined, it is not available for use.
socket_type
Valid values for this variable are stream, which indicates that this
service is a stream-based service; dgram, which indicates that this
service is a datagram; or raw, which indicates that this service uses
raw IP datagrams. The stream value refers to connection-oriented
TCP data streams (for example, Telnet and FTP). The dgram value
refers to datagram (User Datagram Protocol [UDP]) streams (for
example, the Trivial File Transfer Protocol [TFTP] service is a
datagram-based protocol). Other protocols outside the scope of
TCP/IP do exist, but you’ll rarely encounter them.
protocol
Determines the type of protocol (either TCP or UDP) for the
connection type.
wait
If this is set to yes, only one connection will be processed at a time. If
this is set to no, multiple connections will be allowed by running the
appropriate service daemon multiple times.
user
Specifies the username under which this service will run. The
username must exist in the /etc/passwd file.
group
Specifies the group name under which this service will run. The group
must exist in the /etc/group file.
instances
Specifies the maximum number of concurrent connections this service
is allowed to handle. The default is no limit if the wait variable is set
to nowait.
server
The name of the program to run when this service is connected.
server_args
The arguments passed to the server. In contrast to inetd, the name of
the server should not be included in server_args.
only_from
Specifies the networks from which a valid connection may arrive.
(This is the built-in TCP Wrapper functionality.) You can specify this
in one of three ways: as a numeric address, a host-name, or a network
address with netmask. The numeric address can take the form of a
complete IP address to indicate a specific host (such as 192.168.1.1).
However, if any of the ending octets are zeros, the address will be
treated like a network where all of the octets that are zero are
wildcards (for instance, 192.168.1.0 means any host that starts with
the numbers 192.168.1). Alternatively, you can specify the number of
bits in the netmask after a slash (for example, 192.168.1.0/24 means a
network address of 192.168.1.0 with a netmask of 255.255.255.0).
no_access
The opposite of only_from in that instead of specifying the addresses
from which a connection is valid, this variable specifies the addresses
from which a connection is invalid. It can take the same type of
parameters as only_from.
log_type
Determines where logging information for that service will go. There
are two valid values: SYSLOG and FILE. If SYSLOG is specified,
you must specify to which syslog facility to log as well (see “The
Logging Daemon” later in this chapter, for more information on
facilities). For example, you can specify this:
log_type = SYSLOG local0
Optionally, you can include the log level as well:
log_type = SYSLOG local0 info
If FILE is specified, you must specify the filename to log to.
Optionally, you can also specify the soft limit on the file size—where
an extra log message indicating that the file has gotten too large will
be generated. If the soft limit is specified, a hard limit can also be
specified. At the hard limit, no additional logging will be done. If the
hard limit is not explicitly defined, it is set to be 1 percent higher than
the soft limit. Here’s an example of the FILE option:
log_type = FILE /var/log/mylog
log_on_success
Specifies which information is logged on a connection success. The
options include PID to log the process ID of the service that
processed the request, HOST to specify the remote host connecting to
the service, USERID to log the remote username (if available), EXIT
to log the exit status or termination signal of the process, or
DURATION to log the length of the connection.
port
Specifies the network port under which the service will run. If the
service is listed in /etc/services, this port number must equal the
value specified there.
interface
Allows a service to bind to a specific interface and be available only
there. The value is the IP address of the interface to which you want
this service to be bound. An example of this is binding less secure
services (such as Telnet) to an internal and physically secure interface
on a firewall and not allowing the external, more vulnerable interface
outside the firewall.
cps
The first argument specifies the maximum number of connections per
second this service is allowed to handle. If the rate exceeds this
value, the service is temporarily disabled for the second argument
number of seconds. For example:
cps = 50 10
This will disable a service for 10 seconds if the connection rate ever
exceeds 50 connections per second.
Table 8-3. xinetd Configuration File Variables
You do not need to specify all of the variables when defining a service. The only required ones
are the following:
socket_type
user
server
wait
Examples: A Simple Service Entry and Enabling/Disabling a Service
Using the finger service (provided by the finger-server package) as an example, let’s take a look at
one of the simplest entries possible with xinetd:
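Reconstructed from the discussion that follows, such an entry would look roughly like this:
service finger
{
    socket_type = stream
    wait        = no
    user        = nobody
    server      = /usr/sbin/in.fingerd
}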
As you can see, the entry is self-explanatory. The service name is finger, and because of the
socket_type, we know this is a TCP service. The wait variable tells us that multiple finger
processes can be running concurrently. The user variable tells us that “nobody” will be the process
owner. Finally, the name of the process being run is /usr/sbin/in.fingerd.
TIP You can install the finger-server package on a Fedora distro by issuing the command:
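That is:
# yum install finger-server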
With our understanding of a sample xinetd service entry, let’s try to enable and disable another
service.
Enabling/Disabling the Echo Service
If you want a secure system, chances are you will run with only a few services—some people don’t
even run xinetd at all! It takes just a few steps to enable or disable a service. For example, to enable
a service, you would first enable it in the xinetd configuration file (or inetd.conf if you are using
inetd instead), restart the xinetd service, and finally test things out to make sure you have the
behavior you expect. To disable a service, just do the opposite.
NOTE The service we will be exploring is the echo service. This service is internal to xinetd—that
is, it is not provided by any external daemon.
Let’s step through the enable process.
1. Use any plain-text editor to edit the file /etc/xinetd.d/echo-stream and change the variable
disable to no:
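After the edit, the relevant portion of the file would look something like the following; only the disable line matters here, and the surrounding variables are shown for illustration:
service echo
{
    id          = echo-stream
    type        = INTERNAL
    socket_type = stream
    protocol    = tcp
    user        = root
    wait        = no
    disable     = no
}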
TIP On an Ubuntu-based system, the configuration file for the echo service is /etc/xinetd.d/echo.
The Ubuntu distro goes further to combine the UDP and TCP versions of the echo service in one file.
Fedora, on the other hand, sorts the UDP and TCP versions of the echo service into two separate
files, /etc/xinetd.d/echo-dgram and /etc/xinetd.d/echo-stream.
2. Save your changes to the file, and exit the editor.
3. Restart the xinetd service. Under Fedora or RHEL, type the following:
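For example:
# service xinetd restart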
On a systemd-enabled distro such as Fedora, CentOS, and RHEL, you can alternatively
restart the xinetd service using the systemctl utility like this:
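That would be:
# systemctl restart xinetd.service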
TIP Note that for other distributions that don’t have the service command available, you can send a
HUP signal to xinetd instead. First, find xinetd’s process ID (PID) using the ps command. Then use
the kill command to send the HUP signal to xinetd’s PID. You can verify that the restart worked by
using the tail command to view the last few messages of the /var/log/messages file. The commands
to find xinetd’s PID and to send it the HUP signal are as follows:
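A sketch of that sequence (the PID shown is made up):
# ps -C xinetd -o pid=
 2341
# kill -HUP 2341
# tail /var/log/messages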
4. Telnet to the port (port 7) of the echo service, and see if the service is indeed running:
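For example (the host and prompt shown are illustrative):
[root@server ~]# telnet localhost 7
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.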
Your output should be similar to the preceding, if the echo service has been enabled. You can
type any character on your keyboard at the Telnet prompt and watch the character get echoed
(repeated) back to you.
(As you can see, the echo service is one of those terribly useful and life-saving services that
users and system administrators cannot do without!)
This exercise walked you through enabling a service by directly editing its xinetd configuration
file. It is a simple process to enable or disable a service. But you should actually go back and make
sure that the service is indeed disabled (if that is what you want) by testing it, because it is always
better to be safe than sorry. An example of being sorry is “thinking” that you have disabled the
unsecure Telnet service when it is in fact still running!
TIP You can quickly enable or disable a service that runs under xinetd by using the chkconfig
utility, which is available in Fedora, RHEL, openSUSE, and most other flavors of Linux. For
example, to disable the echo-stream service that you manually enabled, just issue the command
chkconfig echo-stream off.
The Logging Daemon
With so much going on at any one time, especially with services that are disconnected from a terminal
window, a standard mechanism by which special events and messages can be logged is required.
Linux distributions have traditionally used the syslogd (sysklogd) daemon to provide this service.
However, more recently, the newer Linux distros are standardizing on other software besides syslogd
for the logging function. All the popular Linux distros appear to have somewhat standardized on the
rsyslog package.
Regardless of the software used, the idea remains the same, and the end results (get system logs)
are mostly the same; the main differences between the new approaches are in the additional feature
sets offered. In this section, we will be concentrating on the logging daemon that ships with
Fedora/CentOS/openSUSE/Ubuntu (rsyslog), with references to syslogd when appropriate.
Managing and configuring rsyslog is similar to the way it is done in syslogd. The new rsyslog
daemon maintains backward-compatibility with the traditional syslog daemon but offers a plethora of
new features as well.
The rsyslog daemon provides a standardized means of performing logging. Many other UNIX
systems employ a compatible daemon, thus providing a means for cross-platform logging over the
network. This is especially valuable in a large heterogeneous environment where it’s necessary to
centralize the collection of log entries to gain an accurate picture of what’s going on. You could
equate this system of logging facilities to the Event Viewer functionality in Windows.
rsyslogd can send its output to various destinations: straight text files (usually stored in the
/var/log directory), Structured Query Language (SQL) databases, other hosts, and more. Each log
entry consists of a single line containing the date, time, host name, process name, PID, and the
message from that process. A system-wide function in the standard C library provides an easy
mechanism for generating log messages. If you don’t feel like writing code but want to generate
entries in the logs, you have the option of using the logger command.
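For instance, a quick (and entirely made-up) example of logger in action:
logger -p local0.info "Backup job finished"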
Invoking rsyslogd
If you do find a need to either start rsyslogd manually or modify the script that starts it up at boot,
you’ll need to be aware of rsyslogd’s command-line parameters, shown in Table 8-4.
Parameter
Description
-d
Debug mode. Normally, at startup, rsyslogd detaches itself from the
current terminal and starts running in the background. With the -d
option, rsyslogd retains control of the terminal and prints debugging
information as messages are logged. It’s extremely unlikely that you’ll
need this option.
-f config
Specifies a configuration file as an alternative to the default
/etc/rsyslog.conf.
-h
By default, rsyslogd does not forward messages sent to it that were
destined for another host. This option will allow the daemon to
forward logs received remotely to other forwarding hosts that have
been configured.
-l hostlist
This option lets you list the hosts for which only the simple hostname
should be logged and not the fully qualified domain name (FQDN).
You can list multiple hosts, as long as they are separated by a colon,
like so:
-l ubuntu-serverA:serverB
-m interval
By default, rsyslogd generates a log entry every 20 minutes as a “just
so you know I’m running” message. This is for systems that might not
be busy. (If you’re watching the system log and don’t see a single
message in more than 20 minutes, you’ll know for a fact that
something has gone wrong.) By specifying a numeric value for
interval, you can indicate the number of minutes rsyslogd should
wait before generating another message. Setting a value of zero for
this option turns it off completely.
-s domainlist
If you are receiving rsyslogd entries that show the entire FQDN, you
can have rsyslogd strip off the domain name and leave just the
hostname. Simply list the domain names to remove in a colon-separated
list as the parameter to the -s option. Here’s an example:
-s example.com:domain.com
Table 8-4. rsyslogd Command-Line Parameters
Configuring the Logging Daemon
The /etc/rsyslog.conf file contains the configuration information that rsyslogd needs to run. The
default configuration file that ships with most systems is sufficient for most standard needs. But you
may find that you have to tweak the file a little if you want to do any additional fancy things with your
logs—such as sending local log messages to remote logging machines that can accept them, logging to
a database, reformatting logs, and so on.
Log Message Classifications
A basic understanding of how log messages are classified in the traditional syslog daemon way is
also useful in helping you understand the configuration file format for rsyslogd.
Each message has a facility and a priority. The facility tells you from which subsystem the
message originated, and the priority tells you the message’s importance. These two values are
separated by a period. Both values have string equivalents, making them easier to remember. The
combination of the facility and priority makes up the “selector” part of a rule in the configuration file.
The string equivalents for facility and priority are listed in Tables 8-5 and 8-6, respectively.
Facility String Equivalent
Description
auth
Authentication messages
authpriv
Essentially the same as auth
cron
Messages generated by the cron subsystem
daemon
Generic classification for service daemons
kern
Kernel messages
lpr
Printer subsystem messages
mail
Mail subsystem messages
mark
Obsolete, but some books still discuss it; syslogd simply ignores it
news
Messages through the Network News Transfer Protocol (NNTP)
subsystem
security
Same thing as auth; should not be used
syslog
Internal messages from syslog itself
user
Generic messages from user programs
uucp
Messages from the UUCP (UNIX to UNIX copy) subsystem
local0-local9
Generic facility levels whose importance can be decided based on
your needs
Table 8-5. String Equivalents for the Facility Value in /etc/rsyslog.conf
Priority String Equivalent
Description
debug
Debugging statements
info
Miscellaneous information
notice
Important statements, but not necessarily bad news
warning
Potentially dangerous situation
warn
Same as warning; should not be used
err
An error condition
error
Same as err; should not be used
crit
Critical situation
alert
A message indicating an important occurrence
emerg
An emergency situation
Table 8-6. String Equivalents for Priority Levels in /etc/rsyslog.conf
NOTE The priority levels are in the order of severity according to syslogd. Thus, debug is not
considered severe at all, and emerg is the most crucial. For example, the combination facility-and-priority
string mail.crit indicates there is a critical error in the mail subsystem (for example, it has
run out of disk space). syslogd considers this message more important than mail.info, which may
simply note the arrival of another message.
In addition to the priority levels in Table 8-6, rsyslogd understands wildcards. Thus, you can
define a whole class of messages; for instance, mail.* refers to all messages related to the mail
subsystem.
Format of /etc/rsyslog.conf
rsyslogd’s configuration relies heavily on the concepts of templates. To help you understand the
syntax of rsyslogd’s configuration file, let’s begin by stating a few key concepts:
Templates define the format of log messages. They can also be used for dynamic filename
generation. Templates must be defined before they are used in rules. A template is made of
several parts: the template directive, a descriptive name, the template text, and possibly
other options.
Any entry in the /etc/rsyslog.conf file that begins with a dollar ($) sign is a directive.
Log message properties refer to well-defined fields in any log message. Example common
message properties are shown in Table 8-7.
Property Name (propname)
Description
msg
The MSG part of the message; the actual log message
rawmsg
The message exactly as it was received from the socket
HOSTNAME
Hostname from the message
FROMHOST
Hostname of the system from which the message was received (might
not necessarily be the original sender)
syslogtag
TAG from the message
PRI-text
The PRI part of the message in a textual form
syslogfacility-text
The facility from the message in text form
syslogseverity-text
Severity from the message in text form
timereported
Timestamp from the message
MSGID
The contents of the MSGID field
Table 8-7. rsyslog’s Message Property Names
The percentage sign (%) is used to enclose log message properties.
Properties can be modified by the use of property replacers.
Any entry that begins with a pound sign (#) is a comment and is ignored. Empty lines are also
ignored.
rsyslogd Templates
The traditional syslog.conf file can be used with the new rsyslog daemon without any modifications.
rsyslogd’s configuration file is named /etc/rsyslog.conf. As mentioned, rsyslogd relies on the use of
templates, and the templates define the format of logged messages. The use of templates is what
allows a traditional syslog.conf configuration file syntax to be used in rsyslog.conf.
Templates that support the syslogd log message format are hard-coded into rsyslogd and are used by
default.
A sample template that supports the use of the syslogd message format is shown here:
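The template in question looks essentially like this (reconstructed from the field-by-field description that follows):
$template TraditionalFormat,"%timegenerated% %HOSTNAME% %syslogtag%%msg%\n"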
The various fields of this sample template are explained in the following list and in Table 8-7.
$template This directive implies that the line is a template definition.
TraditionalFormat This is a descriptive template name.
%timegenerated% This specifies the timegenerated property.
%HOSTNAME% This specifies the HOSTNAME property.
%syslogtag% This specifies the syslogtag property.
%msg% This specifies the msg property.
\n The backslash is an escape character. Here, the \n implies a new line.
<options> This entry is optional. It specifies options influencing the template as a whole.
rsyslogd Rules
Each rule in the rsyslog.conf file is broken down into a selector field, an action field (or target field),
and an optional template name. Specifying a template name after the last semicolon will assign the
respective action to that template. Whenever a template name is missing, a hard-coded template is
used instead. It is, of course, important that you make sure that the desired template is defined before
referencing it.
Here is the format for each line in the configuration file:
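In rough form (the trailing template name is optional):
facility.priority        action        ;template-name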
Here’s an example:
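Something along these lines (the rule itself is illustrative):
mail.info        /var/log/messages;TraditionalFormat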
Selector Field The selector field specifies the combination of facilities and priorities. Here’s an
example selector field entry:
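For instance:
mail.info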
Here, mail is the facility and info is the priority.
Action Field The action field of a rule describes the action to be performed on a message. This
action can range from simple things such as writing the logs to a file or slightly more complex things
such as writing to a database table or forwarding to another host. Here’s an example action field:
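Such as:
/var/log/messages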
This action example indicates that the log messages should be written to the file named
/var/log/messages.
Other common possible values for the action field are described in Table 8-8.
Action Field
Description
Regular file (e.g.,
/var/log/messages)
A regular file. A full path name to the file should be specified and
should begin with a slash (/). This field can also refer to device files,
such as .tty files, or the console, such as /dev/console.
Named pipe (e.g.,
|/tmp/mypipe)
A named pipe. A pipe symbol ( | ) must precede the path to the named
pipe (First In First Out, or FIFO). This type of file is created with the
mknod command. With rsyslogd feeding one side of the pipe, you can
run another program that reads the other side of the pipe. This is an
effective way to have programs parsing log output.
@loghost or @@loghost
A remote host. The at (@) symbol must begin this type of action,
followed by the destination host. A single @ sign indicates that the log
messages should be sent via traditional UDP. And double at (@@)
symbols imply that the logs should be transmitted using TCP instead.
List of users (e.g., yyang,
dude, root)
This type of action indicates that the log messages should be sent to
the list of currently logged-on users. The list of users is separated by
commas (,). Specifying an asterisk (*) symbol will send the specified
logs to all currently logged-on users.
Discard
This action means that the logs should be discarded and no action
should be performed on them. This type of action is specified by the
tilde symbol (~) in the action field.
Database table (e.g.,
>dbhost,dbname,dbuser,
dbpassword;<dbtemplate>)
This type of action is one of the advanced/new features that rsyslogd
supports natively. It allows the log messages to be sent directly to a
configured database table. This type of location needs to begin with
the greater-than symbol (>). The parameters specified after the > sign
follow a strict order: After the > sign, the database hostname (dbhost)
must be given, a comma, the database name (dbname), another comma,
the database user (dbuser), a comma, and then the database user’s
password (dbpassword).
An optional template name (dbtemplate) can be specified if a
semicolon is specified after the last parameter.
Table 8-8. Action Field Descriptions
Sample /etc/rsyslog.conf File
Following is a complete sample rsyslog.conf file. The sample is interspersed with comments that
explain what the following rules do.
The cron Program
The cron program allows any user in the system to schedule a program to run on any date, at any time,
or on a particular day of week, down to the minute. Using cron is an extremely efficient way to
automate your system, generate reports on a regular basis, and perform other periodic chores. (Not-so-honest
of a meeting!)
Like the other services we’ve discussed in this chapter, cron is started by the boot scripts and is
most likely already configured for you. A quick check of the process listing should show it quietly
running in the background:
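For example (the PID is whatever your system happened to assign):
[root@server ~]# pgrep -l crond
1923 crond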
The cron service works by waking up once a minute and checking each user’s crontab file. This
file contains the user’s list of events that he or she wants executed at a particular date and time. Any
events that match the current date and time are executed.
The crond command itself requires no command-line parameters or special signals to indicate a
change in status.
The crontab File
The tool that allows you to edit entries to be executed by crond is crontab. Essentially, all it does is
verify your permission to modify your cron settings and then invoke a text editor so you can make
your changes. Once you’re done, crontab places the file in the right location and brings you back to a
prompt.
Whether or not you have appropriate permission is determined by crontab by checking the
/etc/cron.allow and /etc/cron.deny files. If either of these files exists, you must be explicitly listed
there for your actions to be effected. For example, if the /etc/cron.allow file exists, your username
must be listed in that file in order for you to be able to edit your cron entries. On the other hand, if the
only file that exists is /etc/cron.deny, unless your username is listed there, you are implicitly allowed
to edit your cron settings.
The file listing your cron jobs (often referred to as the crontab file) is formatted as follows. All
values must be listed as integers.
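The fields, in order, are:
Minute   Hour   Day   Month   Day_Of_Week   Command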
If you want to have multiple entries for a particular column (for instance, you want a program to
run at 4:00 A.M., 12:00 P.M., and 5:00 P.M.), then you need to include each of these time values in a
comma-separated list. Be sure not to type any spaces in the list. For the program running at 4:00 A.M.,
12:00 P.M., and 5:00 P.M., the Hour values list would read 4,12,17. Newer versions of cron allow
you to use a shorter notation for supplying fields. For example, if you want to run a process every two
minutes, you just need to put */2 as the first entry. Notice that cron uses military time format.
For the Day_Of_Week entry, 0 represents Sunday, 1 represents Monday, and so on, all the way to
6 representing Saturday.
Any entry that has a single asterisk (*) wildcard will match any minute, hour, day, month, or day
of week when used in the corresponding column.
When the dates and times in the file match the current date and time, the command is run as the
user who set the crontab. Any output generated is e-mailed back to the user.
Obviously, this can result in a mailbox full of messages, so it is important to be thrifty with your
reporting. A good way to keep a handle on volume is to output only error conditions and have any
unavoidable output sent to /dev/null.
Let’s look at some examples. The following entry runs the program /bin/ping -c 5 server-B every
four hours:
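For example:
0 0,4,8,12,16,20 * * * /bin/ping -c 5 server-B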
Here’s the same command using the shorthand method:
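Using the */n notation, that becomes:
0 */4 * * * /bin/ping -c 5 server-B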
Here is an entry that runs the program /usr/local/scripts/backup_level_0 at 10:00 P.M. every
Friday night:
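Using 5 for Friday:
0 22 * * 5 /usr/local/scripts/backup_level_0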
And finally, here’s a script to send out an e-mail at 4:01 A.M. on April 1 (whatever day that might
be):
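One way to write it (the mail command, recipient, and message file are invented for illustration):
1 4 1 4 * /bin/mail -s "Happy April Fools!" root < /home/yyang/april-fools.txt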
NOTE When crond executes commands, it does so with the sh shell. Thus, any environment
variables that you might be accustomed to might not work within cron.
Editing the crontab File
Editing or creating a cron job is as easy as editing a regular text file. But you should be aware of the
fact that the program will, by default, use an editor specified by the EDITOR or VISUAL environment
variable. On most Linux systems, the default editor is vi. But you can always change this default to
any editor you are comfortable with by setting the EDITOR or VISUAL environment variable.
Now that you know the format of the crontab configuration file, you need to edit the file. You
don’t do this by editing the file directly; instead, you use the crontab command to edit your crontab
file:
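That is:
[yyang@server ~]$ crontab -e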
To list what is in your current crontab file, just give crontab the -l argument to display the
content:
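For example (the prompt shown is illustrative):
[yyang@server ~]$ crontab -l
no crontab for yyang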
According to this output, the user yyang does not currently have anything in the crontab file.
Summary
In this chapter, we discussed some important system services that come with most Linux systems.
These services do not require network support and can vary from host to host, making them useful,
since they can work whether or not the system is in multiuser mode.
Here’s a quick recap of the chapter:
init is the mother of all processes in the system, with a PID of 1. On pure System V–based
distros, it also controls runlevels and can be configured through the /etc/inittab file.
upstart is an alternative program that aims to replace the functionality of init on some Linux
distributions. upstart also offers additional functionality and improvement.
systemd is another alternative to init and upstart. It is a system and service manager for
Linux-based operating systems. A majority of the popular distros are standardizing around it.
It offers several benefits and advanced features in comparison to any of the currently
available solutions.
inetd, although barely used anymore, is the original super-server that listens to server
requests on behalf of a large number of smaller, less frequently used services. When it
accepts a request for one of those services, inetd starts the actual service and quietly
forwards data between the network and actual service. Its configuration file is
/etc/inetd.conf.
xinetd is the modern replacement for the classic inetd super-server. It offers more
configuration options and better built-in security. Its main configuration file is
/etc/xinetd.conf.
rsyslog is the system-wide logging daemon used on Fedora, openSUSE, Ubuntu, and other
popular distros. It can act as a drop-in replacement for the more common and traditional
sysklog daemon. Some of the advanced features of rsyslogd include writing logs directly to
a configured database and allowing other extensive manipulation of log messages.
Finally, the cron service allows you to schedule events to occur at certain dates and times,
which is great for periodic events, such as backups and e-mail reminders. All the
configuration files on which it relies are handled via the crontab program.
In each section of this chapter, we discussed how to configure a different service, and even
suggested some uses beyond the default settings that come with the system. Try poking around these
services and familiarize yourself with what you can accomplish with them. Many powerful
automation, data collection, and analysis tools have been built around these basic services—as well
as many wonderfully silly and useless things. Don’t be afraid to have fun with them!
CHAPTER 9
The Linux Kernel
One of Linux’s greatest strengths is that its source code is available to anyone who wants it.
The GNU GPL (General Public License) under which Linux is distributed even allows you to
tinker with the source code and distribute your changes! Real changes to the source code (at
least, those to be taken seriously) go through the process of joining the official kernel tree. This
requires extensive testing and proof that the changes will benefit Linux as a whole. At the end of the
approval process, the code gets a final yes or no from a core group of the Linux project’s original
developers. It is this extensive review process that keeps the quality of Linux’s code so noteworthy.
For system administrators who have used other proprietary operating systems, this approach to
code control is a significant departure from the philosophy of waiting for “the” company to release a
patch, a service pack, or some sort of hotfix.
Instead of having to wade through public relations, customer service, sales engineers, and other
front-end units behind a proprietary operating system, in the Linux world you have the option of
contacting the author of a kernel subsystem directly and explaining your problem. A patch can be
created and sent to you before the next official release of the kernel to get you up and running.
Of course, the flip side of this working arrangement is that you need to be able to compile a kernel
yourself rather than rely on someone else to supply precompiled code. However, you won’t have to
do this often, because production environments, once stable, rarely need a kernel compile. But if need
be, you should know what to do. Luckily, it’s not difficult.
In this chapter, we’ll walk through the process of acquiring a kernel source tree, configuring it,
compiling it, and, finally, installing the end result.
What Exactly Is a Kernel?
Before we jump into the process of compiling, let’s back up a step and make sure you’re clear on the
concept of what a kernel is and the role it plays in the system. Most often, when people say “Linux,”
they are usually referring to a “Linux distribution”—for example, Debian is a type of Linux
distribution. As discussed in Chapter 1, a distribution comprises everything necessary to get Linux to
exist as a functional operating system. Distributions make use of code from various open source
projects that are independent of Linux; in fact, many of the software packages maintained by these
projects are used extensively on other UNIX-like platforms as well. The GNU C Compiler, for
example, which comes with most Linux distributions, also exists on many other operating systems
(probably more systems than most people realize).
So, then, what does make up the pure definition of Linux? The kernel. The kernel of any operating
system is the core of all the system’s software. The only thing more fundamental than the kernel is the
system hardware itself.
The kernel has many jobs. The essence of its work is to abstract the underlying hardware from the
software and provide a running environment for application software through system calls.
Specifically, the environment must handle issues such as networking, disk access, virtual memory,
and multitasking—a complete list of these tasks would take up an entire chapter in itself! Today’s
Linux kernel (version 3*, where the asterisk is a wildcard that represents the complete version
number of the kernel) contains more than 6 million lines of code (including device drivers). By
comparison, the sixth edition of UNIX from Bell Labs in 1976 had roughly 9000 lines. Figure 9-1
illustrates the kernel’s position in a complete system.
Figure 9-1. A visual representation of how the Linux kernel fits into a complete system
Although the kernel is a small part of a complete Linux distribution, it is by far the most critical
element. If the kernel fails or crashes, the rest of the system goes with it. Happily, Linux can boast of
its kernel stability. Uptimes (the length of time in between reboots) for Linux systems are often
expressed in years.
CAUTION The kernel is the first thing that loads when a Linux system is booted (after the boot
loader, of course!). If the kernel doesn’t work right, it’s unlikely that the rest of the system will boot.
Be sure to have an emergency or rescue boot medium handy in case you need to revert to an old
configuration. (See the section on GRUB in Chapter 6.)
Finding the Kernel Source Code
Your distribution of Linux probably has the source code for the specific kernel version(s) it supports
available in one form or another. These could be in the form of a compiled binary package (*.rpm), a
source RPM (*.src.rpm), or the like.
If you need to download a different (possibly newer) version than the one your particular Linux
distribution provides, the first place to look for the source code is at the official kernel web site:
www.kernel.org. This site maintains a listing of web sites mirroring the kernel source, as well as tons
of other open source software and general-purpose utilities.
The main kernel.org site is mirrored around different parts of the world. The list of mirrors is
maintained at www.kernel.org/mirrors/. Although you can connect to any of the mirrors, you’ll most
likely get the best performance by sticking to your own country or any country closest to you.
Getting the Correct Kernel Version
The web site listing of kernels available will contain folders for v1.0, v1.1, v2.5, v2.6, v3.0, v3.6,
and so forth. Before you follow your natural inclination to get the latest version, make sure you
understand how the Linux kernel versioning system works.
Because Linux’s development model encourages public contributions, the latest version of the
kernel must be accessible to everyone, all the time. This presents a problem, however: Software that
is undergoing significant updates may be unstable and not of production quality.
To circumvent this problem, early Linux developers adopted a system of using odd-numbered
kernels (1.1, 1.3, 2.1, 2.3, and so on) to indicate a design-and-development cycle. Thus, the odd-numbered
situations for which reliability is a must. These development kernels are typically released at a high
rate because there is so much activity around them—new versions of development kernels can be
released as often as twice a week!
On the other hand, even-numbered kernels (1.0, 1.2, 2.0, 2.2, 2.4, and 2.6) are considered ready-for-production systems. They have been allowed to mature under the public’s usage and scrutiny.
Unlike development kernels, production kernels are released at a much slower rate and contain
mostly bug fixes.
Alas—that was then and this is now. The previous kernel naming and versioning convention
ended with the 2.6 series kernel. The latest of the Linux kernels is Linux 3.x series.
TIP Understanding the naming convention and philosophical reasoning behind the older Linux
kernel versions such as the Linux 2.6 series is important because, even though the names are no longer
current, you are guaranteed to find countless instances (such as in smartphones, servers, desktops,
embedded devices, and so on) of those kernels out in the wild, because of their massive adoption and
large user base. And this is guaranteed to remain true for a very long time to come.
The current convention is to name and number major new kernel releases as “Linux 3.x”. Thus the
first of this series will be Linux version 3.0 (same as 3.0.0), the next will be Linux version 3.1 (same
as 3.1.0), followed by Linux version 3.2, and so on and so forth.
But wait, because it doesn’t end there—any minor changes or updates within each major release
version will be reflected by increments to the third digit. These are commonly referred to as stable
point releases. Thus, the next stable point release for the 3.0.0 series kernel will be Linux version
3.0.1, followed by version 3.0.2, and so on and so forth. Another way of stating this is to say, for
example, that Linux version 3.0.4 will be the fourth stable release based on the Linux 3.0.0 series.
The version of the kernel that we are going to use in the following section is version 3.2, which is
available at www.kernel.org/pub/linux/kernel/v3.x/linux-3.2.tar.bz2.
TIP You can use the wget utility to download the kernel source quickly into your current working
directory by typing the following:
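Something like the following, using the URL given earlier:
wget http://www.kernel.org/pub/linux/kernel/v3.x/linux-3.2.tar.bz2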
Unpacking the Kernel Source Code
Most of the software packages you have dealt with so far have probably been Red Hat Package
Manager (RPM) or .deb packages, and you’re most likely accustomed to using the tools that came
with the system (such as RPM, Advanced Packaging Tool [APT], Yum, or YaST) to manage the
packages. Kernel source code is a little different and requires some user interaction.
The kernel source consists of a bunch of different files, and because of the sheer number and size
of these files collectively, it is useful to compress the files and put them all in a single directory
structure. The kernel source that you will download from the Internet is a file that has been
compressed and tarred. Therefore, to use the source, you need to decompress and untar the source
file. This is what it means to “unpack the kernel.” Overall, it’s really a straightforward process.
The traditional location for the kernel source tree on the local file system is the /usr/src directory.
For the remainder of this chapter, we’ll assume you are working out of the /usr/src directory.
NOTE Some Linux distributions have a symbolic link under the /usr/src directory. This symbolic
link is usually named “linux” and is usually a link to a default or the latest kernel source tree. Some
third-party software packages rely on this link to compile or build properly!
Let’s go through the steps to unpack the kernel. First, copy the kernel tarball that you downloaded
earlier into the /usr/src directory:
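Assuming the tarball sits in your current directory:
cp linux-3.2.tar.bz2 /usr/src/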
Change your working directory to the /usr/src/ directory and use the tar command to unpack and
decompress the file:
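For instance:
cd /usr/src/
tar xvjf linux-3.2.tar.bz2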
You might hear your hard disk whir for a bit as this command runs—the kernel source is, after all,
a large file!
TIP Take a moment to check out what’s inside the kernel source tree. At the very least, you’ll get a
chance to see what kind of documentation ships with a stock kernel. A good portion of the kernel
documentation is conveniently stored in the Documentation directory at the root of the kernel source
tree.
Building the Kernel
So now you have an unpacked kernel tree just waiting to be built. In this section, we’re going to
review the process of configuring and building a kernel. This is in contrast to Windows-based
operating systems, which come preconfigured and therefore contain support for many features you
may or may not want.
The Linux design philosophy allows the individual to decide on the important parts of the kernel.
For example, if you don’t have a SCSI subsystem, what’s the point in wasting memory to support it?
This individualized design has the important benefit of letting you thin down the feature list so that
Linux can run as efficiently as possible. This is also one of the reasons why it is possible to run Linux
in various hardware setups, from low-end systems, to embedded systems, to high-end systems. You
may find that a box incapable of supporting a Windows-based server is more than capable of
supporting a Linux-based OS.
Two steps are required in building a kernel: configuring and compiling. We won’t get into the
specifics of configuration in this chapter, which would be difficult because of the fast-paced
evolution of the Linux kernel. However, once you understand the basic process, you should be able to
apply it from version to version. For the sake of this discussion, we’ll cite examples from the v3.*
kernel that we unpacked in the previous section.
The first step in building the kernel is configuring its features. Usually, your desired feature list
will be based on whatever hardware you need to support. This, of course, means that you’ll need a
list of that hardware.
On a system that is already running Linux, the following command will list all hardware
connected to the system via the Peripheral Component Interconnect (PCI) bus:
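The command takes no arguments in its simplest form:

   lspci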
NOTE If the lspci command is missing on your system, you can install the program by installing
the pciutils*.rpm package.
You can alternatively use the lshw command to obtain detailed information about the hardware
setup on your system:
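Again, the simplest invocation is just the bare command (run as root for the most complete output):

   lshw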
NOTE If the lshw command is missing on your system, you can install the program by installing the
package lshw*.rpm.
Having a better understanding of what constitutes your underlying hardware can help you better
determine what you need in your custom kernel. You’re ready to start configuring the kernel.
Avoid Needless Upgrades
Bear in mind that if you have a working system that is stable and well behaved, there is little
reason to upgrade the kernel unless one of these conditions holds for you:
A security fix is affecting your system and must be applied.
You need a specific new feature in a stable release.
A specific bug fix affects you.
In the case of a security fix, decide whether the risk really affects you—for example, if the
security issue is found in a device driver that you don’t use, then there is no reason to upgrade. In
the case of a bug fix release, read carefully through the release notes and decide if the fixes
really affect you—if you have a stable system, upgrading the kernel with patches you never use
may be pointless. On production systems, the kernel shouldn’t simply be upgraded just to have
“the latest kernel”; you should have a truly compelling reason to upgrade.
Preparing to Configure the Kernel
With a rough idea of the types of hardware and features that our new kernel needs to support, we can
begin the actual configuration. But first, some background information.
The Linux kernel source tree contains several files named Makefile (a makefile is simply a text
file that describes the relationships among the files in a program). These makefiles help to glue
together the thousands of other files that make up the kernel source. What is more important to us here
is that the makefiles also contain targets. The targets are the commands, or directives, that are
executed by the make program.
The Makefile in the root of the kernel source tree contains specific targets that can be used in
prepping the kernel build environment, configuring the kernel, compiling the kernel, installing the
kernel, and so on. Some of the targets are discussed in more detail here:
make mrproper: This target cleans up the build environment of any stale files and dependencies that might have been left over from a previous kernel build. All previous kernel configurations will be cleaned (deleted) from the build environment.
make clean: This target does not do as thorough a job as the mrproper target. It deletes only most generated files. It does not delete the kernel configuration file (.config).
make menuconfig: This target invokes a text-based editor interface with menus, option lists, and text-based dialog boxes for configuring the kernel.
make xconfig: This is an X Window System–based kernel configuration tool that relies on the Qt graphical development libraries. These libraries are used by KDE-based applications.
make gconfig: This target also invokes an X Window System–based kernel configuration tool, but it relies on the GTK (GIMP) toolkit. This GTK toolkit is heavily used in the GNOME desktop world.
make help: This target will show you all the other possible make targets and also serves as a quick online help system.
To configure the kernel in this section, we will use only one of the targets. In particular, we will
use the make xconfig command. The xconfig kernel config editor is one of the more popular tools
for configuring the Linux 3.x–series kernels. The graphical editor has a simple and clean interface and
is almost intuitive to use.
We need to change (cd) into the kernel source directory, after which we can begin the kernel
configuration. But before beginning the actual kernel configuration, you should clean (prepare) the
kernel build environment by using the make mrproper command:
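Assuming the source was unpacked into /usr/src/linux-3.2, the commands would look like this:

   cd /usr/src/linux-3.2
   make mrproper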
Kernel Configuration
Next, we will step through the process of configuring a Linux 3.* series kernel. To explore some of
the innards of this process, we will enable the support of a specific feature that we’ll pretend must be
supported on the system. Once you understand how this works, you can apply the same procedure to
add support for any other new kernel feature that you want. Specifically, we’ll enable support for the
NTFS file system into our custom kernel.
Most modern Linux distros that ship with the 3.* series kernels (remember that the asterisk
symbol is a wildcard that represents the complete version number of the kernel) also have a kernel
configuration file for the running kernel available on the local file system as a compressed or regular
file. On our sample system that runs the Fedora distro, this file resides in the /boot directory and is
usually named something like config-3.*. The configuration file contains a list of the options and
features that were enabled for the particular kernel it represents. A config file similar to this one is
what we aim to create through the process of configuring the kernel. The only difference between the
file we’ll create and the ready-made one is that we have added further customization to ours.
TIP Using a known, preexisting config file as a framework for creating our own custom file helps
ensure that we don’t waste too much time duplicating the efforts that other people have already put
into finding what works and what doesn’t work!
The following steps cover how to compile the kernel after you have first gone through the
configuration of the kernel. We will be using a graphical kernel configuration utility, so your X
Window System needs to be up and running.
1. To begin, we’ll copy over and rename the preexisting config file from the /boot directory into
our kernel build environment:
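Run from the root of the kernel source tree, a command along these lines will do:

   cp /boot/config-`uname -r` .config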
We use `uname -r` here to help us obtain the configuration file for the running kernel. The
uname -r command prints the running kernel’s release. Using it helps ensure that we are
getting the exact version we want, just in case other versions are present.
NOTE The Linux kernel configuration editor specifically looks for and generates a file named
.config at the root of the kernel source tree. This file is hidden.
2. Launch the graphical kernel configuration utility:
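While still in the root of the kernel source tree, type:

   make xconfig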
A window similar to this will appear:
If the preceding command complains about some missing dependencies, you probably don’t
have the appropriate Qt development environment and a few other necessary packages.
Assuming that you are connected to the Internet and that you are running a Fedora distro, you
can take care of its complaints by using Yum to install the proper package(s) over the Internet
by typing the following:
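The exact package names here are an assumption and may vary between Fedora releases, but something along these lines should pull in a C++ compiler and the Qt development files that the configuration tool needs:

   yum install gcc-c++ qt-devel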
Or, on an openSUSE system, use YaST to install the required dependencies:
The kernel configuration window that appears is divided into three panes. The left pane
shows an expandable tree-structured list of the overall configurable kernel options. The
upper-right pane displays the detailed configurable options of the parent option that currently
has the focus in the left pane. Finally, the lower-right pane displays useful help information
for the currently selected configuration item.
3. We will examine one very important option a little more closely by selecting it in the left
pane. Click the Enable Loadable Module Support item in the left pane. Make sure the check
box is ticked to enable the option. On almost all Linux distributions, you will see that the
support for this feature is enabled by default. Now study the inline help information that
appears in the lower-right pane, as shown in the following illustration:
4. Next we’ll add support for NTFS into our custom kernel. In the left pane, scroll through the
list of available sections, and select and expand the File Systems section. Then select
DOS/FAT/NT Filesystems under that section.
5. In the upper-right pane, click the box next to the NTFS File System Support option so that a
little dot appears in it. Then select the boxes beside the NTFS Debugging Support and NTFS
Write Support options. A check mark should appear in each box, like the ones shown here,
when you are done:
NOTE For each option, in the upper-right pane, a blank box indicates that the feature in question is
disabled. A box with a check mark indicates that the feature is enabled. A box with a dot indicates
that the feature is to be compiled as a module. Selecting the box repeatedly will cycle through the
three states.
6. Finally, save your changes to the .config file in the root of your kernel source tree. From the
menu bar of the kernel configuration window, choose File | Save.
TIP To view the results of the changes you made using the qconf GUI tool, use the grep utility to
view the .config file that you saved directly. Type the following:
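For example, to confirm the NTFS-related options that were just enabled:

   grep -i ntfs .config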
7. Close the kernel configuration window when you are done.
A Quick Note on Kernel Modules
Loadable module support is a kernel feature that allows the dynamic loading (or removal) of
kernel modules. Kernel modules are small pieces of compiled code that can be dynamically
inserted into the running kernel, rather than being permanently built into the kernel. Features not
often used can thus be enabled, but they won’t occupy any room in memory when they aren’t
being used. Thankfully, the kernel can automatically determine what to load and when. Naturally,
not every feature is eligible to be compiled as a module. The kernel must know a few things
before it can load and unload modules, such as how to access the hard disk and parse through the
file system where the loadable modules are stored. Some kernel modules are also commonly
referred to as drivers.
Compiling the Kernel
In the preceding section, we walked through the process of creating a configuration file for the custom
kernel that we want to build. In this section, we will perform the actual build of the kernel. But before
doing this, we will add one more simple customization to the entire process.
The final customization will be to add an extra piece of information used in the final name of our
kernel. This will help us be able to differentiate this kernel absolutely from any other kernel with the
same version number. We will add the tag “custom” to the kernel version information. This can be
done by editing the main Makefile and appending the tag that we want to the EXTRAVERSION
variable.
The compilation stage of the kernel-building process is by far the easiest, but it also takes the
most time. All that is needed at this point is simply to execute the make command, which will then
automatically generate and take care of any dependency issues, compile the kernel itself, and compile
any features (or drivers) that were enabled as loadable modules.
Because of the amount of code that needs to be compiled, be prepared to wait a few minutes, at
the very least, depending on the processing power of your system. Let’s dig into the specific steps
required to compile your new kernel.
1. First we’ll add an extra piece to the identification string for the kernel we are about to build.
While still in the root of the kernel source tree, open up the Makefile for editing with any text
editor. The variable we want to change is close to the top of the file. Change the line in the
file that looks like this:
To this:
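In a stock 3.2 source tree, the EXTRAVERSION value is empty, so the change amounts to the following before-and-after sketch:

   # Before:
   EXTRAVERSION =
   # After:
   EXTRAVERSION = -custom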
2. Save your changes to the file, and exit the text editor.
TIP Most modern systems have more than a single central processing unit (CPU) core. In addition, some server-grade hardware might even have more than one physical CPU with multiple cores. You can take advantage of all that extra processing power and speed things up when performing CPU-intensive operations like compiling the kernel. To do this, you can pass a parameter to the make command that specifies the number of jobs to run simultaneously. The jobs are then distributed and executed simultaneously across the available CPU cores. The syntax for the command is make -jN, where N is the number of jobs to run simultaneously. For example, if you have a quad-core CPU, you can type:
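For a four-core machine, that would be:

   make -j4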
3. The only command that is needed here to compile the kernel is the make command:
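That is, while still in the root of the kernel source tree:

   make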
4. The end product of this command (that is, the kernel) is sitting pretty and waiting in the path arch/x86/boot/bzImage, relative to the root of the kernel source tree.
5. Because we compiled portions of the kernel as modules (for example, the NTFS module), we
need to install the modules. Type the following:
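The usual make target for this step is:

   make modules_install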
On a Fedora system, this command will install all the compiled kernel modules into the
/lib/modules/<new_kernel-version> directory. In this example, this path will translate to the
/lib/modules/3.2.0-custom/ directory. This is the path from which the kernel will load all loadable
modules, as needed.
Installing the Kernel
So now you have a fully compiled kernel just waiting to be installed. You probably have a couple of
questions: Just where is the compiled kernel, and where the heck do I install it?
The first question is easy to answer. Assuming you have a PC and are working out of the
/usr/src/<kernel-source-tree>/ directory, the compiled kernel that was created in the previous
exercise will be called /usr/src/<kernel-source-tree>/arch/x86/boot/bzImage or, to be precise,
/usr/src/linux-3.2/arch/x86/boot/bzImage. The corresponding map file for this will be located at
/usr/src/<kernel-source-tree>/System.map. You’ll need both files for the install phase.
The System.map file is useful when the kernel is misbehaving and generating “Oops” messages.
An “Oops” is generated on some kernel errors because of kernel bugs or faulty hardware. This error
is akin to the Blue Screen of Death (BSOD) in Microsoft Windows. These messages include a lot of
detail about the current state of the system, including several hexadecimal numbers. System.map
gives Linux a chance to turn those hexadecimal numbers into readable names, making debugging
easier. Although this is mostly for the benefit of developers, it can be handy when you’re reporting a
problem.
Let’s go through the steps required to install the new kernel image.
1. While in the root of your kernel build directory, copy and rename the bzImage file into the
/boot directory:
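In placeholder form, the command looks like this:

   cp arch/x86/boot/bzImage /boot/vmlinuz-<kernel-version>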
Here, kernel-version is the version number of the kernel. For the sample kernel we are
using in this exercise, the filename would be vmlinuz-3.2.0-custom. So here’s the exact
command for this example:
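   cp arch/x86/boot/bzImage /boot/vmlinuz-3.2.0-custom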
NOTE The decision to name the kernel image vmlinuz-3.2.0-custom is somewhat arbitrary. It’s
convenient, because kernel images are commonly referred to as vmlinuz, and the suffix of the version
number is useful when you have multiple kernels available. Of course, if you want to have multiple
versions of the same kernel (for instance, one with SCSI support and the other without it), then you
will need to design a more representative name. For example, you can choose a name like vmlinuz-3.2.0-wireless for the kernel for a laptop running Linux that has special wireless capabilities.
2. Now that the kernel image is in place, copy over and rename the corresponding System.map
file into the /boot directory using the same naming convention:
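For our sample kernel, that would be:

   cp System.map /boot/System.map-3.2.0-custom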
3. With the kernel in place, the System.map file in place, and the modules in place, we are now
ready for the final step. Type the following:
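In placeholder form, the command looks roughly like this; the options shown (--mkinitrd, --depmod, and --install) are the commonly used ones, and on newer systems an additional --dracut option may also be needed, so treat this as a sketch:

   new-kernel-pkg --mkinitrd --depmod --install <kernel-version>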
Here, kernel-version is the version number of the kernel. For the sample kernel we are using
in this exercise, the kernel version is 3.2.0-custom. So the exact command for this example is
this:
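   new-kernel-pkg --mkinitrd --depmod --install 3.2.0-custom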
The new-kernel-pkg command used here is a nifty little shell script. It might not be available in
every Linux distribution, but it is available in Fedora, RHEL, and openSUSE. It automates a lot of the
final things we’d ordinarily have to do manually to set up the system to boot the new kernel we just
built. In particular, it does the following:
It creates the appropriate initial RAM disk image (the initrd image—that is, the
/boot/initrd-<kernel-version>.img file). To do this manually on systems where new-kernel-pkg is not available, use the mkinitrd command.
It runs the depmod command (which creates a list of module dependencies).
It updates the boot loader configuration. For systems running the legacy versions of GRUB,
this will be the /boot/grub/grub.conf or /boot/grub/menu.lst file. And for systems running
the newer versions of GRUB2, the file will be /boot/grub2/grub.cfg.
On a Fedora system running the legacy version of GRUB, a new entry similar to the one shown
here will be automatically added to the grub.conf file after running the preceding command:
On systems running the newer GRUB2, a new entry similar to the one here will be added to the
/boot/grub2/grub.cfg file:
NOTE The one thing that the new-kernel-pkg command does not do is automatically make the
most recent kernel installed the default kernel to boot. So you might have to select the kernel that you
want to boot manually from the boot loader menu while the system is booting up. Of course, you can
change this behavior by manually editing the /boot/grub/menu.lst file using any text editor (see
Chapter 6).
Booting the Kernel
The next stage is to test the new kernel to make sure that your system can indeed boot with it.
1. Assuming you did everything the exact way that the doctor prescribed and that everything
worked out exactly as the doctor said it would, you can safely reboot the system and select the
new kernel from the boot loader menu during system bootup:
2. After the system boots up, you can use the uname command to find out the name of the current
kernel:
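If the custom kernel is running, the following should report 3.2.0-custom:

   uname -r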
3. You will recall that one of the features that we added to our new kernel was the ability to
support the NTFS file system. Make sure that the new kernel does indeed have support for
NTFS by displaying information about the NTFS module:
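One way to do this is with the modinfo utility:

   modinfo ntfs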
TIP Assuming you indeed have an NTFS-formatted file system that you want to access, you can
manually load the NTFS module by typing this:
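As the root user:

   modprobe ntfs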
The Author Lied—It Didn’t Work!
The kernel didn’t fly, you say? It froze in the middle of booting? Or it booted all the way and then
nothing worked, right? First and foremost, don’t panic. This kind of problem happens to everyone,
even the pros. After all, they’re more likely to try untested software first. So don’t worry—the
situation is most definitely reparable.
First, notice that a new entry was added to the boot loader configuration file
(/boot/grub/menu.lst file for GRUB legacy systems or /boot/grub2/grub.cfg for GRUB2 systems)
and any existing entry was not removed. This allows you to safely fall back to the old kernel that you
know works and boot into it. Reboot, and at the GRUB menu, select the name of the previous kernel
that was known to work. This action should bring you back to a known system state.
Now go back to the kernel configuration and verify that all the options you selected will work for
your system. For example, did you accidentally enable support for the Sun UFS file system instead of
Linux’s ext4 file system? Did you set any options that depended on other options being set?
Remember to view the informative Help screen for each kernel option in the configuration interface,
making sure that you understand what each option does and what you need to do to make it work right.
When you’re sure you have your settings right, step through the compilation process again and
reinstall the kernel. Creating an appropriate initial RAM disk image (initrd file) is also important
(see man mkinitrd). If you are running GRUB legacy, you simply need to edit the /boot/grub/menu.lst
file, create an appropriate entry for your new kernel, and then reboot and try again.
Don’t worry—each time you compile a kernel, you’ll get better at it. When you do make a
mistake, it’ll be easier to go back, find it, and fix it.
Patching the Kernel
Like any other operating system, Linux periodically requires upgrades to fix bugs, improve
performance, improve security, and add new features. These upgrades come out in two forms: in the
form of a complete new kernel release and in the form of a patch. The complete new kernel works
well for people who don’t have at least one complete kernel already downloaded. For those who do
have a complete kernel already downloaded, patches are a much better solution because they contain
only the changed code and, as such, are quicker to download.
Think of a patch as comparable to a Windows hotfix or service pack. By itself, it’s useless, but
when added to an existing version of Windows, you (hopefully) get an improved product. The key
difference between hotfixes and patches is that patches contain the changes in the source code that
need to be made. This allows you to review the source code changes before applying them. This is
much nicer than hoping a fix won’t break the system!
You can find out about new patches to the kernel at many Internet sites. Your distribution vendor’s
web site is a good place to start; it’ll list not only kernel updates, but also patches for other packages.
A primary source is the official Linux Kernel archive at www.kernel.org. (That’s where we got the
complete kernel to use as the installation section’s example.)
In this section, you’ll learn how to apply a patch to update Linux kernel source version 3.2 to
version 3.2.3. The exact patch file that we will use is named patch-3.2.3.bz2.
Downloading and Applying Patches
Patch files are located in the same directory from which the kernel is downloaded. This applies to
each major release of Linux; so, for example, the patch to update Linux version 3.0 to Linux version
3.0.11 might be located at www.kernel.org/pub/linux/kernel/v3.0/patch-3.0.11.bz2. The test patches
(or point release candidates) are stored at the www.kernel.org web site under the
/pub/linux/kernel/v<X.Y>/testing/ directory—where X and Y represent the kernel version number.
Each patch filename is prefixed with the string “patch” and suffixed with the Linux version
number being installed by the patch.
Note that when dealing with patches related to major kernel versions in the linux-3.X series, these
need to be applied in an incremental manner. Each major patch brings Linux up by only one version.
This means, for example, that to go from linux-3.1 to linux-3.3 you’ll need two patches, patch-3.2
and patch-3.3, and these patches must be applied in order (incrementally).
Note also that when dealing with patches within the 3.X.Y kernels (the stable release kernels) the
patches are not incremental. Thus the patch-X.Y.Z file can only be applied to the base linux-X.Y. For
example, if you have linux-3.2.3 and want to bring it up to Linux version 3.2.5, you'll first
need to revert your stable linux-3.2.3 kernel source back to its base linux-3.2 and then apply the new
patch for linux-3.2.5 (patch-3.2.5).
Patch files are stored on the server in a compressed format. In this example, we'll be using patch-3.2.3.bz2 (obtained from www.kernel.org/pub/linux/kernel/v3.x/patch-3.2.3.bz2). You will also need
the actual kernel source tarball that you want to upgrade. In this example, we’ll use the kernel source
that was downloaded from www.kernel.org/pub/linux/kernel/v3.x/linux-3.2.tar.gz.
Once you have the files from the www.kernel.org site (or mirror), move them to the /usr/src
directory. We’ll assume that you unpacked the kernel source that you want to upgrade into the
/usr/src/linux-3.2 directory. You will next decompress the patch using the bzip2 utility, and then
pipe the resulting output to the patch program, which will then do the actual work of
patching/updating your kernel.
1. Copy the compressed patch file that you downloaded into a directory one level above the root
of your target kernel source tree. Assuming, for example, that the kernel you want to patch has
been untarred into the /usr/src/linux-3.2/ directory, you would copy the patch file into the
/usr/src/ directory.
2. First, change your current working directory to the top level of the kernel source tree. This
directory in our example is /usr/src/linux-3.2/. Type the following:
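The command is simply:

   cd /usr/src/linux-3.2/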
3. It is a good idea to do a test run of the patching process to make sure there are no errors and
that the new patch will indeed apply cleanly:
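One way to do a dry run is to pipe the decompressed patch into the patch program with its --dry-run option:

   bzip2 -dc ../patch-3.2.3.bz2 | patch -p1 --dry-run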
Kernel Release Candidates
You might sometimes see kernel patch files with names like patch-3.6-rc2.bz2 available at the
www.kernel.org web site. The “rc2” in this example, which makes up part of the patch name and
version (and hence, the final kernel version), means that the patch in question is the “release
candidate 2” patch that can be used to upgrade the appropriate kernel source tree to Linux kernel
version 3.6-rc2. The same goes for a patch file named patch-3.6-rc6.bz2—which will be a
“release candidate 6”—and so on.
The -rcX patches are not incremental. They can be applied to “base” kernel versions. For
example, an -rc7 patch named patch-2.6.39-rc7 should be applied on top of the base 2.6.38
kernel source. This could require that any patches that might have been applied on top of the
2.6.38 kernel be removed first. So assuming we are currently running a kernel version 2.6.38.8,
we need to first download patch-2.6.38.8.bz2 (from
ftp://ftp.kernel.org/pub/linux/kernel/v2.6/patch-2.6.38.8.bz2), decompress the file (bunzip2
patch-2.6.38.8.bz2), and finally use the patch command patch -p1 -R < ../patch-2.6.38.8 to downgrade/revert to a base 2.6.38 kernel.
4. Assuming the preceding command ran successfully without any errors, you’re now ready to
apply the patch. Run this command to decompress the patch and apply it to your kernel:
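For example:

   bzip2 -dc ../patch-3.2.3.bz2 | patch -p1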
Here, ../patch-3.2.3.bz2 is the name and path to the patch file. A stream of filenames is
printed out to your screen. Each of those files has been updated by the patch file. If any
problems occurred with the upgrade, you will see them reported here.
If the Patch Worked
If the patch worked and you received no errors, you’re just about done! You can rename the directory
holding the patched kernel source tree to reflect the new version. Here’s an example:
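Continuing with our 3.2.3 patch:

   mv /usr/src/linux-3.2 /usr/src/linux-3.2.3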
All that finally needs to be done is to recompile the kernel. Just follow the steps in the section
“Compiling the Kernel,” earlier in this chapter.
If the Patch Didn’t Work
If you received errors during the process of patching the kernel, don’t despair. This probably means
one of two things:
The patch version number cannot be applied to the kernel version number (for instance, you
tried to apply patch-2.6.50.bz2 to Linux-2.6.60).
The kernel source itself has changed. (This happens to developers who forget that they made
changes!)
The easiest way to fix either situation is to erase the kernel located in the directory where you
unpacked it and then unpack the full kernel there again. This will ensure that you have a pristine
kernel. Then apply the patch. It’s tedious, but if you’ve done it once, it’s easier and faster the second
time. Finally, a vanilla kernel source tree contains great documentation about kernel patching. The file
is usually found here: <kernel-source>/Documentation/applying-patches.txt.
TIP You can usually back out of (remove) any patch that you apply by using the -R option with the
patch command. For example, to back out of a patch version 2.6.39 that was applied to Linux kernel
version 2.6.38, while in the root of the kernel source tree, you would type this:
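Assuming the decompressed patch file sits one directory above the source tree:

   patch -p1 -R < ../patch-2.6.39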
Remember that backing out of a patch can be risky at times, and it doesn’t always work—that is, your
mileage may vary!
Summary
In this chapter, we discussed the process of configuring and compiling the Linux kernel. This isn’t
exactly a trivial process, but doing it gives you the power of fine-grained control over your computer
that simply isn’t possible with most other operating systems. Compiling the kernel is basically a
straightforward process. The Linux development community has provided excellent tools that make
the process as painless as possible. In addition to compiling kernels, we walked through the process
of upgrading kernels using the patches available from the Linux Kernel web site, www.kernel.org.
When you compile a kernel for the first time, do it on a non-production machine, if possible. This
gives you a chance to take your time and fiddle with the many operational parameters that are
available. It also means you won’t annoy your users if something goes wrong!
For programmers curious about the kernel’s innards, many references are available in the form of
books and web sites, and, of course, the source code itself is the ultimate documentation.
CHAPTER
10
Knobs and Dials: Virtual File
Systems
Most operating systems offer a mechanism by which the insides of the OS can be probed and
operational parameters can be set when needed. In Linux, this mechanism is provided by the
so-called virtual or pseudo-file systems. The proc file system is a popular virtual file system
on Linux-based OSs. The /proc directory is the mount point for the proc file system, and thus the two
terms (proc vs. /proc) are often used interchangeably.
Other popular operating systems also make use of virtual file systems in different forms and to
varying degrees. For example, Microsoft Windows systems make use of the Registry, which allows
manipulation of system runtime parameters to some degree. Solaris OS also makes use of the proc file
system and can be manipulated through the use of the ndd tool.
In this chapter, we discuss the proc file system and how it works under Linux. We’ll step through
some overviews and study some interesting entries in /proc, and then we’ll demonstrate some
common administrative tasks using /proc. We’ll end with a brief mention of the SysFS file system and
the cgroup virtual file system.
What’s Inside the /proc Directory?
Because the Linux kernel is such a key component in server operations, it’s important that there be a
method for exchanging information with the kernel. Traditionally, this is done through system calls—
special functions written for programmers to use in requesting the kernel to perform functions on their
behalf. In the context of system administration, however, system calls mean a developer needs to
write a tool for us to use (unless, of course, you like writing your own tools). When all you need is a
simple tweak or to extract some statistics from the kernel, having to write a custom tool is a lot more
effort than should be necessary.
To improve communication between users and the kernel, the proc file system was created. The
entire file system is especially interesting because it doesn’t really exist on disk anywhere; it’s purely
an abstraction of kernel information. All of the files in the directory correspond either to a function in
the kernel or to a set of variables in the kernel.
NOTE The fact that proc is abstract doesn’t mean it isn’t a file system. It does mean that a special
file system had to be developed to treat proc differently from normal disk-based file systems.
For example, to see a report on the type of processor on a system, we can consult one of the files
under the /proc directory. The particular file that holds this information is the /proc/cpuinfo file and
can be viewed with this command:
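The command is simply:

   cat /proc/cpuinfo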
The kernel will dynamically create the report, showing processor information, and hand it back to
cat so that we can see it. This is a simple yet powerful way for us to examine and query the kernel.
The /proc directory supports an easy-to-read hierarchy using subdirectories, and, as such, finding
information is easy. The directories under /proc are also organized such that files containing
information about similar topics are grouped together. For example, the /proc/scsi directory offers
reports about the Small Computer System Interface (SCSI) subsystem.
Even more of an advantage is that the flow of information goes both ways: The kernel can
generate reports for us, and we can easily pass information back into the kernel. For instance,
performing an ls -l in the /proc/sys/net/ipv4 directory will show us a lot of files that are not read-only, but read/write, which means some of the values stored in those files can be altered on the fly.
“Hey! Most of the /proc files have 0 bytes, and one is huge! What gives?” Don’t worry if you’ve
noticed all those 0-byte files—most of the files in /proc are 0 bytes because /proc doesn’t really exist
on disk. When you use cat to read a /proc file, the content of the file is dynamically generated by a
special program inside the kernel. As a result, the report is never saved back to disk and thus does not
take up space. Think of it in the same light as Common Gateway Interface (CGI) scripts for web sites,
where a web page generated by a CGI script isn’t written back to the server’s disk, but is regenerated
every time a user visits the page.
CAUTION That one huge file you see in /proc is /proc/kcore, which is really a pointer to the
contents of RAM. So if you have 10GB of RAM, the /proc/kcore file is also approximately 10GB!
But don’t worry about the size, because it isn’t occupying any space on your disk-based file systems.
Reading /proc/kcore is like reading the raw contents of memory (and, of course, requires root
permissions).
Tweaking Files Inside of /proc
As mentioned in the preceding section, some of the files under the /proc directory (and
subdirectories) have a read/write mode. Let us examine one of these directories a little more closely.
The files in /proc/sys/net/ipv4 represent parameters in the TCP/IP stack that can be “tuned”
dynamically. Use the cat command to look at a particular file, and you’ll see that most of the files
contain nothing but a single number. But by changing these numbers, you can affect the behavior of the
Linux TCP/IP stack!
For example, the file /proc/sys/net/ipv4/ip_forward contains a 0 (Off) by default. This tells
Linux not to perform IP forwarding when there are multiple network interfaces. But if you want to set
up something like a Linux router, you need to allow forwarding to occur. In this situation, you can edit
the /proc/sys/net/ipv4/ip_forward file and change the number to 1 (On).
A quick way to make this change is by using the echo command, like so:
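Run as the root user:

   echo 1 > /proc/sys/net/ipv4/ip_forward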
CAUTION Be very careful when tweaking parameters in the Linux kernel. There is no safety net to
keep you from making the wrong settings for critical parameters, which means it’s entirely possible
that you can crash your system. If you aren’t sure about a particular item, it’s safer to leave it be until
you’ve found out for sure what it’s for.
Some Useful /proc Entries
Table 10-1 lists some /proc entries that you may find useful in managing your Linux system. Note that
this is a far cry from an exhaustive list. For more detail, peruse the directories yourself and see what
you find. Or you can also read the proc.txt file in the Documentation directory of the Linux kernel
source code.
/proc/cpuinfo: Information about the CPU(s) in the system.
/proc/interrupts: Interrupt request (IRQ) usage in your system.
/proc/ioports: Displays a listing of the registered port regions used for I/O communication with devices.
/proc/iomem: Displays the current map of the system's memory for each physical device.
/proc/mdstat: Status of Redundant Array of Inexpensive Disks (RAID) configuration.
/proc/meminfo: Status of memory usage.
/proc/kcore: Represents the physical memory of the system. Unlike the other files under /proc, this file has a size associated with it that is usually equal to the total amount of physical RAM available.
/proc/modules: Shows the currently loaded kernel modules. Same information produced as output from lsmod.
/proc/buddyinfo: Information stored in this file can be used for diagnosing memory fragmentation issues.
/proc/cmdline: Displays parameters passed to the kernel when the kernel started up (boot-time parameters).
/proc/swaps: Status of swap partitions, volumes, and/or files.
/proc/version: Current version number of the kernel, the machine on which it was compiled, and the date and time of compilation.
/proc/scsi/*: Information about all of the SCSI devices.
/proc/net/arp: Address Resolution Protocol (ARP) table (same as output from arp -a).
/proc/net/dev: Information about each network device (packet counts, error counts, and so on).
/proc/net/snmp: Simple Network Management Protocol (SNMP) statistics about each protocol.
/proc/net/sockstat: Statistics on network socket utilization.
/proc/sys/fs/*: Settings for file system utilization by the kernel. Many of these are writable values; be careful about changing them unless you are sure of the repercussions of doing so.
/proc/sys/net/core/netdev_max_backlog: When the kernel receives packets from the network faster than it can process them, it places them on a special queue. By default, a maximum of 300 packets is allowed on the queue. Under extraordinary circumstances, you may need to edit this file and change the value for the allowed maximum.
/proc/sys/net/ipv4/icmp_echo_ignore_all: Default = 0, meaning that the kernel will respond to Internet Control Message Protocol (ICMP) echo (ping) requests. Set this to 1 to tell the kernel to stop replying to those messages.
/proc/sys/net/ipv4/icmp_echo_ignore_broadcasts: Default = 0, meaning that the kernel will allow ICMP responses to be sent to broadcast or multicast addresses.
/proc/sys/net/ipv4/ip_forward: Default = 0, meaning the kernel will not forward packets between network interfaces. To allow forwarding (such as for routing), change this to 1.
/proc/sys/net/ipv4/ip_local_port_range: Range of ports Linux will use when originating a connection. Default = 32768–61000.
/proc/sys/net/ipv4/tcp_syncookies: Default = 0 (Off). Change to 1 (On) to enable protection for the system against SYN flood attacks.
Table 10-1. Useful Entries Under /proc
Unless otherwise stated, you can simply use the cat program to view the contents of a particular
file in the /proc directory.
Enumerated /proc Entries
A listing of the /proc directory will reveal a large number of directories whose names are just
numbers. These numbers are the process identifications (PIDs) for each running process in the system.
Within each of the process directories are several files describing the state of the process. This
information can be useful in finding out how the system perceives a process and what sort of
resources the process is consuming. (From a programmer’s point of view, the process files are also
an easy way for a program to get information about itself.)
For example, here’s a long listing of some of the files under /proc:
If you look a little closer at the folder named 1 in this output, you will notice that this particular
folder represents the information about the init process or the process with the process identification
number of 1 (PID = 1).
Here’s a listing of the files under /proc/1/:
Again, as you can see from the output, the /proc/1/exe file is a soft link that points to the actual executable for the init program (/sbin/init). On distributions using the systemd service manager, the link will instead point to /bin/systemd (see Chapter 8). The same logic applies to the other numeric-named directories under /proc; that is, they each represent a running process.
Common proc Settings and Reports
As mentioned, the proc file system is a virtual file system, and as a result, changes to default settings
in /proc do not survive reboots. If you need a change to a value under /proc to be automatically
set/enabled between system reboots, you can either edit your boot scripts so that the change is made
at boot time or use the sysctl tool. The former approach can, for example, be used to enable IP
packet-forwarding functionality in the kernel every time the system is booted. On a Fedora or other
Red Hat–based distro, you can add the following line to the end of your /etc/rc.d/rc.local file:
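For the IP forwarding example, that line would be:

   echo 1 > /proc/sys/net/ipv4/ip_forward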
TIP On an Ubuntu system or other Debian-based distro, the equivalent of the /etc/rc.d/rc.local file
will be the /etc/rc.local file.
Most Linux distributions now have a more graceful way of making persistent changes to the proc
file system. In this section, we’ll look at a tool that can be used to make changes interactively in real
time to some variables stored in the proc file system.
The sysctl utility is used for displaying and modifying kernel parameters in real time.
Specifically, it can be used to tune parameters that are stored under the /proc/sys/ directory of the
proc file system. A summary of its usage and options is shown here:
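In rough outline (see the sysctl(8) man page on your system for the exact synopsis):

   sysctl [options] variable[=value] ...
   sysctl -p [filename]
   sysctl -a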
Some of the possible options are listed in the following table:
variable[=value]: Used to set or display the value of a key, where variable is the key and value is the value to which the key is set. For instance, for a key called kernel.hostname, a possible value might be server.example.com.
-n: Disables printing of the key name when printing values.
-e: This option is used to ignore errors about unknown keys.
-w: Use this option when you want to change a sysctl setting.
-p <filename>: Loads in sysctl settings from the file specified, or /etc/sysctl.conf if no filename is given.
-a: Displays all values currently available.
We will use actual examples to demonstrate how to use the sysctl tool. Most of the examples
shown here are Linux distribution–independent—the only differences you might encounter are that
some distros might ship with some of the options already enabled or disabled. The examples
demonstrate a few of the many things you can do with proc to complement day-to-day administrative
tasks. Reports and tunable options available through proc are especially useful in network-related
tasks. The examples also provide some background information about the proc setting that we want to
tune.
SYN Flood Protection
When TCP initiates a connection, the first thing it does is send a special packet to the destination,
with the flag set to indicate the start of a connection. This flag is known as the SYN flag. The
destination host responds by sending an acknowledgment packet back to the source, called
(appropriately) a SYNACK. Then the destination waits for the source to return an acknowledgment,
showing that both sides have agreed on the parameters of their transaction. Once these three packets
are sent (this process is called the “three-way handshake”), the source and destination hosts can
transmit data back and forth.
Because it’s possible for multiple hosts to contact a single host simultaneously, it’s important that
the destination host keep track of all the SYN packets it gets. SYN entries are stored in a table until
the three-way handshake is complete. Once this is done, the connection leaves the SYN tracking table
and moves to another table that tracks established connections.
A SYN flood occurs when a source host sends a large number of SYN packets to a destination
with no intention of responding to the SYNACK. This results in overflow of the destination host’s
tables, thereby making the operating system unstable. Obviously, this is not a good thing.
Linux can prevent SYN floods by using a syncookie, a special mechanism in the kernel that tracks
the rate at which SYN packets arrive. If the syncookie detects the rate going above a certain
threshold, it aggressively begins to get rid of entries in the SYN table that don’t move to the
“established” state within a reasonable interval. A second layer of protection is in the table itself: If
the table receives a SYN request that would cause the table to overflow, the request is ignored. This
means it may happen that a client will be temporarily unable to connect to the server—but it also
keeps the server from crashing altogether and kicking everyone off!
First use the sysctl tool to display the current value for the tcp_syncookie setting:
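The command and its typical output on a system where the setting is off look like this:

   sysctl net.ipv4.tcp_syncookies
   net.ipv4.tcp_syncookies = 0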
The output shows that this setting is currently disabled (value=0). To turn on tcp_syncookie
support, enter this command:
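As root:

   sysctl -w net.ipv4.tcp_syncookies=1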
Because /proc entries do not survive system reboots, you should add the following line to the end
of your /etc/sysctl.conf configuration file. To do this using the echo command, type the following:
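For example:

   echo "net.ipv4.tcp_syncookies = 1" >> /etc/sysctl.conf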
NOTE You should, of course, first make sure that the /etc/sysctl.conf file does not already contain
an entry for the key that you are trying to tune. If it does, you can simply manually edit the file and
change the value of the key to the new value.
Issues on High-Volume Servers
Like any operating system, Linux has finite resources. If the system begins to run short of resources
while servicing requests (such as web access requests), it will begin refusing new service requests.
The /proc entry /proc/sys/fs/file-max specifies the maximum number of open files that Linux can
support at any one time. The default value on our Fedora system was 41962, but this may be quickly
exhausted on a busy system with a lot of network connections. Raising it to a larger number, such as
88559, can be useful. Using the sysctl command again, type the following:
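As root:

   sysctl -w fs.file-max=88559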
Don’t forget to append your change to the /etc/sysctl.conf file if you want the change to be
persistent.
Debugging Hardware Conflicts
Debugging hardware conflicts is always a chore. You can ease the burden by using some of the
entries in /proc. These two entries are specifically designed to tell you what’s going on with your
hardware:
/proc/ioports tells you the relationships of devices to I/O ports and whether any conflicts
exist. With Peripheral Component Interconnect (PCI) devices becoming dominant, this isn’t
as big an issue. Nevertheless, as long as you can buy a new motherboard with Industry
Standard Architecture (ISA) slots, you’ll always want to have this option.
/proc/interrupts shows you the association of interrupt numbers to hardware devices. Again,
like /proc/ioports, PCI is making this less of an issue.
SysFS
SysFS (short for system file system) is similar to the proc file system. The major similarities between
the two are that they are both virtual file systems (in-memory file system) and they both provide a
means for information (data structures, actually) to be exported from within the kernel to the user
space. SysFS is usually mounted at the /sys mount point. The SysFS file system can be used to obtain
information about kernel objects, such as devices, modules, the system bus, firmware, and so on. This
file system provides a view of the device tree (among other things) as the kernel sees it. This view
displays most of the known attributes of detected devices, such as the device name, vendor name, PCI
class, IRQ and Direct Memory Access (DMA) resources, and power status. Some of the information
that used to be available in the old Linux 2.4–series kernel versions under the proc file system can
now be found under SysFS. It provides a lot of useful information in an organized (hierarchical)
manner.
Virtually all modern Linux distros have switched to using udev to manage devices. udev is used
for managing device nodes under the /dev directory. This function used to be performed by the devfs.
The new udev system allows the consistent naming of devices, which, in turn, is useful for the hotplugging of devices. udev is able to do all these wonderful things primarily because of SysFS—udev
does this by monitoring the /sys directory. Using the information gleaned from the /sys directory,
udev can dynamically create and remove device nodes as they are attached to or detached from the
system.
Another purpose of SysFS is that it provides a uniform view of the device space, thus providing a
sharp contrast to what was previously seen under the /dev directory. Administrators familiar with
Solaris will find themselves at home with the naming conventions used. The key difference between
Solaris and Linux, however, is that the representations under SysFS do not provide means to access
the device through the device driver. For device driver–based access, administrators will need to
continue using the appropriate /dev entry.
A listing of the top level of the sysfs directory shows these directories:
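The listing can be produced with:

   ls /sys/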
The contents of some of the top-level directories under /sys are described as follows:
block: Contains a listing of the block devices (such as sda, sr0, fd0) detected on the system. Attributes that describe various things (such as size, partitions, and so on) about the block devices are also listed under each block device.
bus: Contains subdirectories for the physical buses detected and registered in the kernel.
class: Describes a type or class of device, such as an audio, graphics, printer, or network device. Each device class defines a set of behaviors to which devices in that class conform.
devices: Lists all detected devices and contains a listing of every physical device that is detected by the physical bus types registered with the kernel.
firmware: Lists an interface through which firmware can be viewed and manipulated.
module: Lists all loaded modules in subdirectories.
power: Holds files that can be used to manage the power state of certain hardware.
A deeper look into the /sys/devices directory reveals this listing:
If we look at a sample representation of a device connected to the PCI bus on our system, we’ll
see these elements:
The topmost element under the devices directory in the preceding output describes the PCI
domain and bus number. The particular system bus here is the pci0000:00 PCI bus, where “0000” is
the domain number and the bus number is “00.” The functions of some of the other files are listed
here:
class: PCI class
config: PCI config space
detach_state: Connection status
device: PCI device
irq: IRQ number
local_cpus: Nearby CPU mask
resource: PCI resource host address
resource0 (resource0…n): PCI resource zero (through n)
vendor: PCI vendor ID (a list of vendor IDs can be found at /usr/share/hwdata/pci.ids)
cgroupfs
Control groups (cgroups) provide a mechanism for managing system resources on a Linux-based
system. Resources such as memory allocation, process scheduling, disk I/O access to blocked
devices, and network bandwidth can all be controlled and allocated via cgroups.
The resources are managed by so-called “resource controllers” (also known as subsystems or
modules). Following are some common cgroup subsystems that can be used to control specific
systems tasks and processes:
blkio (block input and output controller)
cpuacct (CPU accounting controller)
cpuset (CPUs and memory nodes controller)
freezer (suspending, resuming, and check-pointing tasks)
memory (memory controller)
net_cls (network traffic controller)
devices (tracking, granting, or denying access to creation or use of device files)
To view a list of the subsystems supported on your system, type the following:
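One way to do this is to query the cgroups pseudo-file under /proc:

   cat /proc/cgroups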
TIP The libcgroup package provides various tools and libraries that can be used for manipulating, controlling, monitoring, and administering control groups. On RPM-based distros such as Fedora, CentOS, and RHEL, you can install the libcgroup package by running yum -y install libcgroup.
cgroups make use of the cgroup pseudo-file system (cgroupfs). This cgroupfs makes use of the
Linux virtual file system (VFS) abstraction. cgroupfs provides a hierarchy of sorts for managing,
grouping, and partitioning tasks and processes running on a system. Subsystems are attached to
directories mounted under the cgroupfs, and then different constraints can be applied to these by
placing processes and tasks in control groups. In other words, the system administrator can use the
cgroupfs to assign resource constraints to a task or group of tasks.
If you have the libcgroup-tools package installed on a Fedora distro, you can use the lssubsys
utility to list the cgroup VFS hierarchies of the subsystems as well as their corresponding mount
points.
To use lssubsys, type the following:
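Assuming the -a (all subsystems) and -m (show mount points) options of your lssubsys version:

   lssubsys -am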
On our sample Fedora server, the location of the mount points for the cgroupfs hierarchy is
determined by the /etc/cgconfig.conf configuration file. The file has sample entries like those shown
here:
In practical terms, cgroups can be used to isolate memory-hungry applications and force them to use only a fixed amount of memory, thereby making other user/system applications appear more
responsive. Chapter 8 discussed the new systemd service manager, which makes extensive use of
cgroups to speed up the system boot process as well as to manage the starting up and stopping of
system services and daemons.
Summary
In this chapter, you learned about the proc file system and how you can use it to get a peek inside the
Linux kernel, as well as to influence the kernel’s operation. The tools used to accomplish these tasks
are relatively trivial (echo and cat), but the concept of a pseudo-file system that doesn’t exist on disk
can be a little difficult to grasp.
Looking at proc from a system administrator’s point of view, you learned to find your way around
the proc file system and how to get reports from various subsystems (especially the networking
subsystem). You learned how to set kernel parameters to accommodate possible future enhancements.
Finally, brief mention was made of the SysFS virtual file system and the all-new (and very important)
cgroup file system.
PART III
Networking and Security
CHAPTER
11
TCP/IP for System
Administrators
Network awareness has been a key feature of UNIX since its inception. A UNIX system that is
not connected to a network is like a race car without a race track. Linux inherits that legacy
and keeps it going.
To be a system administrator today, you must have a reasonably strong understanding of the
network and the protocols used to communicate over the system network. After all, if your server is
(or is not) receiving or sending information, you are responsible.
This chapter provides an introduction to the guts of the Transmission Control Protocol/Internet
Protocol, better known as TCP/IP. We’ll tackle the contents in two parts: First, we will walk through
the details of packets, Ethernet, TCP/IP, and some related protocol details. This part may seem a little
tedious at first, but perseverance will pay off in the second part. The second part will walk through
several examples of common problems and how you can quickly identify them with your newfound
knowledge of TCP/IP. Along the way, we will use a wonderful tool called tcpdump, which you’ll
find indispensable by the end of the chapter.
Please note that the intent of this chapter is not to be a complete replacement for the many books
on TCP/IP, but rather an introduction from the standpoint of someone who needs to learn about system
administration. If you want a more complete discussion on TCP/IP, we highly recommend TCP/IP
Illustrated, Vol. 1, by Richard Stevens (Addison-Wesley, 1994).
The Layers
TCP/IP is built in layers, thus the references to TCP/IP stacks. In this section, we take a look at what
the TCP/IP layers are, their relationship to one another, and, finally, why they really don’t match the
International Organization for Standardization (ISO) seven-layer Open Systems Interconnection (OSI)
model. We’ll also translate the OSI layers into meanings that are relevant to your network.
Packets
At the bottom of the layering system is the smallest unit of data that networks like dealing with:
packets. Packets contain the data that we want to transmit between our systems as well as some
control information that helps networking gear determine where the packet should go.
NOTE The terms “packet” and “frame” are often interchanged in network discussions. In these
situations, people referring to a frame often mean a packet. The difference is subtle. A frame is the
space in which packets go on a network. At the hardware level, frames on a network are separated by
preambles and post-ambles that tell the hardware where one frame begins and ends. A packet is the
data that is contained within the frame.
A typical TCP/IP packet flowing in an Ethernet network looks like that shown in Figure 11-1.
Figure 11-1. A TCP/IP packet on an Ethernet network
As you can see in Figure 11-1, packets are layered by protocol, with the lowest layers coming
first. Each protocol uses a header to describe the information needed to move data from one host to
the next. Packet headers tend to be small—the headers for TCP, IP, and Ethernet in their simplest and
most common combined form take only 54 bytes of space from the packet. This leaves the rest of the
1446 bytes of the packet to data.
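In other words, for the simplest case: 14 bytes (Ethernet header) + 20 bytes (IP header) + 20 bytes (TCP header) = 54 bytes, which leaves 1500 – 54 = 1446 bytes for data.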
Figure 11-2 illustrates how a packet is passed up the protocol stack. Let’s look into this process a
little more closely.
Figure 11-2. The path of a packet through the Linux networking stack
When a host’s network card receives a packet, it first checks to see if it is supposed to accept the
packet. This is done by looking at the destination addresses located in the packet’s headers. (More
about that in “Headers,” later in the chapter.) If the network card thinks it should accept the packet, it
keeps a copy of it in its memory and generates an interrupt to the operating system.
Frames Under Ethernet
In the last few years, the Ethernet specification has been updated to allow frames larger than
1518 bytes. These frames, appropriately called jumbo frames, can hold up to 9000 bytes. This,
conveniently, is enough space for a complete set of TCP/IP headers, Ethernet headers, Network
File System (NFS) control information, and one page of memory (4K to 8K, depending on your
system’s architecture; Intel uses 4K pages). Because servers can now push one complete page of
memory out of the system without having to break it up into tiny packets, throughput on some
applications (such as remote disk service) can go through the roof!
The downside to this is that very few people use jumbo frames, so you need to make sure
your network cards are compatible with your switches, and so on.
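If your NICs and switches do support jumbo frames, enabling them is usually just a matter of raising the interface MTU, along these lines (eth0 is an assumed interface name):
# ip link set dev eth0 mtu 9000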
Upon receiving this interrupt, the operating system calls on the device driver of the network
interface card (NIC) to process the new packet. The device driver copies the packet from the NIC’s
memory to the system’s memory. Once it has a complete copy, it can examine the packet and
determine what type of protocol is being used. Based on the protocol type, the device driver makes a
note to the appropriate handler for that protocol that it has a new packet to process. The device driver
then puts the packet in a place where the protocol’s software (“the stack”) can find it and returns to
the interrupt processing.
Note that the stack does not begin processing the packet immediately. This is because the
operating system may be doing something important that it needs to finish before letting the stack
process the packet. Because it is possible for the device driver to receive many packets from the NIC
quickly, a queue exists between the driver and the stack software. The queue simply keeps track of
the order in which packets arrive and notes where they are in memory. When the stack is ready to
process those packets, it grabs them from the queue in the appropriate order.
As each layer processes the packet, appropriate headers are removed. In the case of a TCP/IP
packet over Ethernet, the driver will strip the Ethernet headers, IP will strip the IP headers, and TCP
will strip the TCP headers. This will leave just the data that needs to be delivered to the appropriate
application.
TCP/IP Model and the OSI Model
The TCP/IP model is an architectural model that helps describe the components of the TCP/IP
protocol suite. It is also known by other names, including the Internet reference model and the
Department of Defense (DoD) ARPANET reference model. The original TCP/IP model (RFC 1122) loosely
identifies four layers: Link layer, Internet layer, Transport layer, and Application layer.
The ISO’s OSI (Open Systems Interconnection) model is a well-known reference model for
describing the various abstraction layers in networking. The OSI model has seven layers: Physical
layer, Data Link layer, Network layer, Transport layer, Session layer, Presentation layer, and
Application layer.
The TCP/IP model was created before the OSI model. Unfortunately, the newer OSI model does
not have a convenient one-to-one mapping to the original TCP/IP model. Fortunately, there doesn’t
have to be one to make the concepts useful. Software and hardware network vendors managed to
make a mapping, and a general understanding of what each layer of the OSI model represents in each
layer of the TCP/IP model has emerged. Figure 11-3 shows the relative mapping between the OSI
model and the TCP/IP model.
Figure 11-3. The OSI reference model and the TCP/IP model
The following section discusses the layers of the OSI model in more detail.
Layer 1 (The Wire)
This is the Physical layer. It describes the actual medium on which the data flows. In a network
infrastructure, a pile of CAT 5 Ethernet cable and the signaling protocol are considered part of the
Physical layer.
Layer 2 (Ethernet)
This is the Data Link layer. It is used to describe the Ethernet protocol. The difference between the
OSI’s view of Layer 2 and Ethernet is that Ethernet concerns itself only with sending frames and
providing a valid checksum for them. The purpose of the checksum is to allow the receiver to
validate whether the data arrived as it was sent. This is done by computing the Cyclic Redundancy
Check (CRC) of the packet contents and comparing it against the checksum that was provided by
the sender. If the receiver gets a corrupted frame (that is, the checksums do not match), the packet is
dropped here. From the Linux point of view, it should not receive a packet that the NIC knows is
corrupted.
Although the OSI model formally specifies that Layer 2 should handle the automatic
retransmission of a corrupted packet, Ethernet does not do this. Instead, Ethernet relies on higher
level protocols (TCP in this case) to handle retransmission.
Ethernet’s primary responsibility is simple: Get the packet from one host on a local area network
(LAN) to another host on a LAN. Ethernet has no concept of a global network because of limitations
on the timing of packets, as well as the number of hosts that can exist on a single network segment.
You’ll be hard-pressed to find more than 200 or so hosts on any given segment due to bandwidth issues
and simple management issues. It’s easier to manage smaller groups of machines.
NOTE Ethernet is increasingly used in metro area networks (MANs) and wide area networks
(WANs) as a framing protocol for connectivity. Although the distance may be great between two
endpoints, these networks are not the standard broadcast-style Ethernet that you see in a typical
switch or hub. Rather, networking vendors have opted to maintain the Layer 2 framing information as
Ethernet so that routers don’t need to fragment packets between networks. From a system
administrator’s point of view, don’t be concerned if your network provider says they use Ethernet in
their WAN/MAN—they haven’t strung together hundreds of switches to make the distance!
Layer 3 (IP)
This is the Network layer. And this is the layer at which the Internet Protocol (IP) exists. IP is wiser
to the world around it than Ethernet. IP understands how to communicate with hosts inside the
immediate LAN as well as with hosts that are not directly connected to you (for example, hosts on
other subnets, the Internet, via routers, and so on). This means that an IP packet can make its way to
any other host, so long as a path (route) exists to the destination host.
IP understands how to get a packet from one host to another. Once a packet arrives at the host,
there is no information in the IP header to tell it to which application to deliver the data. The reason
why IP does not provide any more features than those of a simple transport protocol is that it was
meant to be a foundation upon which other protocols can rest. Of the protocols that use IP, not all of
them need reliable connections or guaranteed packet order. Thus, it is the responsibility of higher
level protocols to provide additional features if needed.
Layer 4 (TCP, UDP)
This is the Transport layer. TCP and User Datagram Protocol (UDP) are mapped to the Transport
layer. TCP actually maps to this OSI layer quite well by providing a reliable transport for one
session—that is, a single connection from a client program to a server program. For example, using
Secure Shell (SSH) to connect to a server creates a session. You can have multiple windows running
SSH from the same client to the same server, and each instance of SSH will have its own session.
In addition to sessions, TCP handles the ordering and retransmission of packets. If a series of
packets arrives out of order, the stack will put them back into order before passing them up to the
application. If a packet arrives with any kind of problem or goes missing altogether, TCP will
automatically request that the sender retransmit. Finally, TCP connections are also bidirectional.
This means that the client and server can send and receive data on the same connection.
UDP, by comparison, doesn’t map quite as nicely to OSI. Although UDP understands the concept
of sessions and is bidirectional, it does not provide reliability. In other words, UDP won’t detect lost
or duplicate packets the way TCP does.
Layers 5–7 (HTTP, SSL, XML)
Technically, OSI’s Layers 5–7 each has a specific purpose, but in TCP/IP model lingo, they’re all
clumped together into the Application layer. Strictly speaking, all applications that use TCP or UDP sit
here; however, the marketplace generally calls Hypertext Transport Protocol (HTTP) traffic Layer 7.
Why Use UDP at All?
UDP’s seeming limitations are also its strengths! UDP is a good choice for two types of traffic:
short request/response transactions that fit in one packet (such as Domain Name System [DNS])
and streams of data that are better off skipping lost data and moving on (such as streaming audio
and video). In the first case, UDP is better, because a short request/response usually doesn’t
merit the overhead that TCP requires to guarantee reliability. The application is usually better
off adding additional logic to retransmit on its own in the event of lost packets.
In the case of streaming data, developers actually don’t want TCP’s reliability. They would
prefer that lost packets are simply skipped on the (reasonable) assumption that most packets will
arrive in the desired order. This is because human listeners/viewers are much better at handling
(and much less annoyed by!) short drops in audio than they are by delays.
Secure Sockets Layer (SSL) is a bit of an odd bird and is not commonly associated with any
layer. It sits squarely between Layer 4 (TCP) and Layer 7 (Application, typically HTTP), and can be
used to encrypt arbitrary TCP streams. In general, SSL is not referred to as a layer. You should note,
however, that SSL can encrypt arbitrary TCP connections, not just HTTP. Many protocols, such as
Post Office Protocol (POP) and Internet Message Access Protocol (IMAP), offer SSL as an
encryption option, and the emergence of SSL-virtual private network (VPN) technology shows how
SSL can be used as an arbitrary tunnel.
Extensible Markup Language (XML) data can also be confusing. To date, there is no framing
protocol for XML that runs on top of TCP directly. Instead, XML data uses existing protocols, such as
HTTP, Dual Independent Map Encoding (DIME), and Simple Mail Transfer Protocol (SMTP).
(DIME was created specifically for transmitting XML.) For most applications, XML uses HTTP,
which, from a layering point of view, looks like this:
Ethernet -> IP -> TCP -> HTTP -> XML
XML can wrap other XML documents within it. For example, Simple Object Access Protocol
(SOAP) can wrap digital signatures within it. For additional information on XML itself, take a look at
www.oasis-open.org and www.w3c.org.
NOTE You may hear references to “Layer 8” from time to time. This is more of a humorous
reference/sarcasm. Layer 8 typically refers to the “political” or “financial” layer, meaning that above
all networks are people. And people, unlike networks, are nondeterministic. What might make good
technical sense for the network doesn’t always make sense from the upper management’s perspective.
Here’s a simple example: Two department heads within the same company don’t get along with each
other. When they find out they share the network, they may demand to get their own infrastructure
(routers, switches, and so on) and get placed on different networks, yet at the same time be able to
communicate with each other—through secure firewalls only. What might have been a nice, simple
(and functional) network is now much more complex than it needs to be, all because of Layer 8.
ICMP
The Internet Control Message Protocol (ICMP) was designed specifically to let one host
communicate to another host about the state of the network. Because the data is used only by the
operating system and not by users, ICMP does not support the concept of port numbers, reliable
delivery, or guaranteed order of packets.
Every ICMP packet contains a type that tells the recipient the nature of the message. The
most popular type is “Echo-Request,” which is used by the infamous ping program. When a host
receives the ICMP “Echo-Request” message, it responds with an ICMP “Echo-Reply” message.
This allows the sender to confirm that the other host is up, and because we can see how long it
takes the message to be sent and replied to, we get an idea of the latency of the network between
the two hosts.
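For example, to send four Echo-Request messages to a neighboring host and see the round-trip times, you can run something like the following (the target address is just an example):
$ ping -c 4 192.168.1.1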
Headers
Earlier in the chapter, we learned that a TCP/IP packet over Ethernet was a series of headers for
each protocol, followed by the actual data being sent. Packet headers, as they are typically called,
are simply those pieces of information that tell the protocol how to handle the packet.
In this section we look at each of these headers (Ethernet, IP, TCP, UDP) using the tcpdump tool.
Most Linux distributions have it preinstalled, but if you don’t, you can quickly install it using the
package management suite in your Linux distro.
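For example, on an RPM-based distro or a Debian-based distro, respectively, one of the following should do the trick:
# yum install tcpdump
# apt-get install tcpdump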
NOTE You must have superuser privileges to run the tcpdump command.
Ethernet
Ethernet has an interesting history. As a result, there are two types of Ethernet headers: 802.3 and
Ethernet II. Thankfully, although they both look similar, you can use a simple test to tell them apart.
Let’s begin by looking at the contents of the Ethernet header (see Figure 11-4).
Figure 11-4. The Ethernet header
The Ethernet header contains three entries: the destination address, the source address, and the
packet’s protocol type. Ethernet addresses—also called Media Access Control (MAC) addresses; no
relation to the Apple Macintosh—are 48-bit (6-byte) numbers that uniquely identify every Ethernet
card in the world. Although it is possible to change the MAC address of an interface, this is not
recommended, as the default is guaranteed to be unique, and all MAC addresses on a LAN segment
should be unique.
NOTE A packet that is sent as a broadcast (meaning all network cards should accept this packet) has
the destination address set to ff:ff:ff:ff:ff:ff.
The packet’s protocol type is a 2-byte value that tells us what protocol this packet should be
delivered to on the receiver’s side. For IP packets, this value is hex 0800 (decimal 2048).
The packet we have just described here is an Ethernet II packet. (Typically, it is just called
Ethernet.) In 802.3 packets, the destination and source MAC addresses remain in place; however, the
next 2 bytes represent the length of the packet. The way you can tell the difference between the two
types of Ethernet is that no valid protocol type has a value of less than 1500. Thus, any Ethernet
header where the protocol type is less than 1500 is really an 802.3 packet. Realistically, you
probably won’t see many (if any) 802.3 packets anymore.
Viewing Ethernet Headers
To see the Ethernet headers on your network, run the following command:
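For example, tcpdump's -e switch prints the link-level (Ethernet) header on each line:
# tcpdump -e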
This tells tcpdump to dump the Ethernet headers along with the TCP and IP headers.
Now generate some traffic by visiting a web site, or use SSH to communicate with another host.
Doing so will generate output like this:
The start of each line is a timestamp of when the packet was seen. The next two entries in the
lines are the source and destination MAC addresses, respectively, for the packet. In the first line, the
source MAC address is 0:d0:b7:6b:20:17 and the destination MAC address is 0:10:4b:cb:15:9f.
After the MAC address is the packet’s type. In this case, tcpdump saw 0800 and automatically
converted it to ip for us so that it would be easier to read. If you don’t want tcpdump to convert
numbers to names for you (especially handy when your DNS resolution isn’t working), you can run
this:
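For example, by adding the -n switch to the previous command:
# tcpdump -e -n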
The -n option tells tcpdump to not do name resolution. The same two preceding lines without name
resolution would look like this:
Notice that in each line of the new output, the host name server became 10.2.2.1 and the port
number ssh became 22. We will discuss the meaning of the rest of the lines in the section “TCP,”
later in this chapter.
IP (IPv4)
The Internet Protocol has a slightly more complex header than Ethernet, as you can see in Figure 11-5.
Let’s step through what each of the header values signifies.
Figure 11-5. The IP header
The first value in the IP header is the version number.
NOTE The version of IP that is in most common use today is version 4 (IPv4); however, you will be
seeing more of version 6 (IPv6) over the next few years. Version 6 offers many improvements (and
changes) over version 4, such as an increase in the usable address space, integrated security, more
efficient routing, and auto-configuration.
The next value is the length of the IP header itself. You need to know the length of the header
because optional parameters may be appended to the base header. The header length tells you how
many, if any, options are there. To get the byte count of the total IP header length, multiply this number
by 4. Typical IP headers will have the header length value set to 5, indicating that there are 20 bytes
in the complete header.
The Type of Service (ToS) header tells IP stacks what kind of treatment should be given to the
packet. As of this writing, the only defined values are minimized delay, maximized throughput,
maximized reliability, and minimized cost. See RFCs 1340 (www.faqs.org/rfcs/rfc1340.html) and
1349 (www.faqs.org/rfcs/rfc1349.html) for more details. The use of ToS bits is sometimes referred
to as “packet coloring”; they are used by networking devices for the purpose of rate shaping and
prioritization.
The total length value tells you how long the complete packet is, including the IP and TCP
headers, but not including the Ethernet headers. This value is represented in bytes. An IP packet
cannot be longer than 65,535 bytes.
The identification number field is supposed to be a unique number used by a host to identify a
particular packet. The flags in the IP packet indicate whether the packet is fragmented. Fragmentation
occurs when an IP packet is larger than the smallest maximum transmission unit (MTU) between two
hosts. MTU defines the largest packet that can be sent over a particular network. For example,
Ethernet’s MTU is 1500 bytes. Thus, if we have a 4000-byte (3980 byte data + 20 byte IP header) IP
packet that needs to be sent over Ethernet, the packet will be fragmented into three smaller packets.
The first packet can be 1500 bytes (1480 byte data + 20 byte IP header), the second packet can also
be 1500 bytes (1480 byte data + 20 byte IP header), and the last packet will be 1040 bytes (1020 byte
data + 20 byte IP header).
The fragment offset value tells you which part of the complete packet you are receiving.
Continuing with the 4000-byte IP packet example, the first fragment will include bytes 0–1479 of data
and will have an offset value of 0. The second fragment will include bytes 1480–2959 of data and
will have an offset value of 185 (or 1480/8). And the third and final fragment will include bytes
2960–3999 of data and will have an offset value of 370 (or 2960/8). The receiving IP stack will take
these three packets and reassemble them into one large packet before passing it up the stack.
NOTE IP fragments don’t happen too frequently over the Internet anymore. Thus, many firewalls take
a paranoid approach about dealing with IP fragments, since they can be a source of denial-of-service
(DoS) attacks.
The time-to-live (TTL) field is a number between 0 and 255 that signifies how much time a
packet is allowed to have on the network before being dropped. The idea behind this is that in the
event of a routing error, where the packet is going around in a circle (also known as a “routing
loop”), the TTL would cause the packet to time out eventually and be dropped, thus keeping the
network from becoming completely congested with circling packets. As each router processes the
packet, the TTL value is decreased by one. When the TTL reaches zero, the router at which this
happens sends a message via the ICMP protocol (refer to “ICMP” earlier in the chapter), informing
the sender of this.
NOTE Layer 2 switches do not decrement the TTL; only routers do. Layer 2 switch
loop detection does not rely on tagging packets, but instead uses the switches’ own protocol for
communicating with other Layer 2 switches to form a “spanning tree.” In essence, a Layer 2 switch
maps all adjacent switches and sends test packets (bridge protocol data units, or BPDUs) and looks
for test packets generated by itself. When a switch sees a packet return to it, a loop is found and the
offending port is automatically shut down to normal traffic. Tests are constantly run so that if the
topology changes or the primary path for a packet fails, ports that were shut down to normal traffic
may be reopened.
The protocol field in the IP header tells you to which higher level protocol this packet should be
delivered. Typically, this has a value for TCP, UDP, or ICMP. In the tcpdump output you’ve seen, it
is this value that determines whether the output reads udp or tcp after displaying the source and
destination IP/port combination.
The last small value in this IP header is the checksum. This field holds the sum of every byte in
the IP header, including any options. When a host builds an IP packet to send, it computes the IP
checksum and places it into this field. The receiver can then do the same math and compare values. If
the values mismatch, the receiver knows that the packet was corrupted during transmission. (For
example, a lightning strike creating an electrical disturbance might create packet corruption; same
thing with a bad connection in the wire between the NIC and the transmission media.)
Finally come the numbers that matter the most in an IP header: the source and destination IP
addresses. These values are stored as 32-bit integers instead of the more human-readable dotted-decimal notation. For example, instead of 192.168.1.1, the value would be hexadecimal c0a80101 or
decimal 3232235777.
tcpdump and IP
By default, tcpdump doesn’t dump all the details of the IP header. To see everything, you need to
specify the -v option. The tcpdump program will continue displaying all matching packets until you
press CTRL-C to stop the output. You can ask tcpdump to stop automatically after a fixed number of
packets by using the -c parameter followed by the number of packets to look for. Finally, you can
remove the timestamp for brevity by using the -t parameter.
Assuming we want to see the next two IP packets without any DNS decoding, we would use the
following parameters:
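A command along these lines does the job (the ip filter keyword, an assumption here, simply restricts the capture to IP traffic):
# tcpdump -v -n -t -c 2 ip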
The output shows a ping packet sent and returned. Here’s the format of this output:
Here, src and dest refer to the source and destination of the packet, respectively. For TCP and UDP
packets, the source and destination will include the port number after the IP address. The tail end of
the line shows the TTL, IP ID, and length, respectively. Without the -v option, the TTL is shown only
when it is equal to 1.
TCP
The TCP header is similar to the IP header in that it packs quite a bit of information into a little bit of
space. Let’s start by reviewing Figure 11-6.
Figure 11-6. The TCP header
The first two pieces of information in a TCP header are the source and destination port numbers.
Because these are only 16-bit values, their range is 0 to 65535. Typically, the source port is a value
greater than 1024, since ports 1 to 1023 are reserved for system use on most operating systems
(including Linux, Solaris, and the many variants of Microsoft Windows). On the other hand, the
destination port is typically low; most of the popular services reside there, although this is not a
requirement.
In this section, we will be walking through the different fields of TCP header in Figure 11-6 as
well as examining the fields as they are seen in an actual tcpdump capture. The output of the
command tcpdump -n -t -v is shown here:
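Pieced together from the fields we examine next, the interesting part of that output looks roughly like this (the exact formatting varies between tcpdump versions):
192.168.1.1.2046 > 192.168.1.12.79: Flags [P.], cksum 0xf4b1, seq 1:6, ack 1, win 5740, length 5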
192.168.1.1.2046 > 192.168.1.12.79 We’ve already explained the starting fields of the output. It is
simply the source and destination IP address and port number combination. The port numbers are
appended immediately after the IP address. The source port number is 2046 and the destination port
number is 79.
Flags [P.] This next part is a bit tricky. TCP uses a series of flags to indicate whether the packet is
supposed to initiate a connection, contain data, or terminate a connection. The flags (in the order they
appear) are Urgent (URG), Acknowledge (ACK), Push (PSH), Reset (RST), Synchronize (SYN), and
Finish (FIN). Their meanings are as follows:
Flag    Meaning
URG     Implies that urgent data in the packet should receive priority processing
ACK     Acknowledges successfully received data
PSH     Requests that any received data be processed immediately
RST     Immediately terminates the connection
SYN     Requests that a new connection starts
FIN     Requests that a connection finishes
These flags are typically used in combination with one another. For example, it is common to see
PSH and ACK together. Using this combination, the sender essentially tells the receiver two things:
Data in this packet needs to be processed.
I am acknowledging that I have received data from you successfully.
You can see which flags are in a packet in tcpdump’s output immediately after the destination IP
address and port number. Here’s an example:
In this line, the flag is P for PSH. tcpdump uses the first character of the flag’s name to indicate the
flag’s presence (such as S for SYN or F for FIN). The only exception to this is ACK, which is
actually spelled out as ack later in the line. (If the packet has only the ACK bit set, a period is used as
a placeholder where the flags are usually printed.) ACK is an exception, because it makes it easier to
find what the acknowledgment number is for that packet. (See the discussion on acknowledgment
numbers later in this section; we will discuss flags in greater detail when we discuss connection
establishment and teardown.)
cksum 0xf4b1 The next element in the TCP header is the checksum. This is similar to the IP
checksum in that its purpose is to provide the receiver a way of verifying that the data received isn’t
corrupted. Unlike the IP checksum, the TCP checksum actually takes into account both the TCP header
and the data being sent. (Technically, it also includes the TCP pseudo-header, but being system
administrators, we can safely gloss over it for now.)
seq 1:6 In tcpdump’s output, we see sequence numbers in packets containing data. Here’s the
format:
starting number : ending number
The sequence numbers in our sample tcpdump output are 1:6, meaning that the data started at
sequence number 1 and ended at sequence number 6. These values are used by TCP to ensure that the
order of packets is correct. In day-to-day administrative tasks, you shouldn’t have to deal with them.
NOTE To make the output more readable, tcpdump uses relative values. Thus, a sequence number of
1 really means that the data contained within the packet is the first byte being sent. If you want to see
the actual sequence number, use the -S option.
ack 1 In this sample output, we also see the acknowledgment number. When the packet has the
acknowledgment flag set, it can be used by the receiver to confirm how much data has been received
from the sender (refer to the discussion of the ACK flag earlier in this section) and also to let the sender
know which packets have been properly received.
tcpdump prints ack, followed by the acknowledgment number, when it sees a packet with the
acknowledgment bit set.
In this case, the acknowledgment number is 1, meaning that 192.168.1.1 is acknowledging the first
byte sent to it by 192.168.1.12 in the current connection.
win 5740 The next entry in the header is the window size. TCP uses a technique called sliding
window, which allows each side of a connection to tell the other how much buffer space it has
available for dealing with connections. When a new packet arrives on a connection, the available
window size decreases by the size of the packet until the operating system has a chance to move the
data from TCP’s input buffer to the receiving application’s buffer space. Window sizes are computed
on a connection-by-connection basis.
Let’s look at a truncated output from tcpdump -n -t as an example:
In the first line, 192.168.1.1 tells 192.168.1.12 that it currently has 32,120 bytes available in its
buffer for this particular connection.
In the second packet, 192.168.1.12 sends 493 bytes to 192.168.1.1. (At the same time,
192.168.1.12 tells 192.168.1.1 that its available window is 17,520 bytes.)
192.168.1.1 responds to 192.168.1.12 with an acknowledgment saying it has properly accepted
everything up to the 495th byte in the stream, which in this case includes all of the data that has been
sent by 192.168.1.12. It’s also acknowledging that its available window is now 31,626, which is
exactly the original window size (32,120) minus the amount of data that has been received (493
bytes).
A few moments later, in the fourth line, 192.168.1.1 sends a note to 192.168.1.12 stating that it
has successfully transferred the data to the application’s buffer and that its window is back to 32,120.
A little confusing? Don’t worry too much about it. As a system administrator, you shouldn’t have
to deal with this level of detail, but it is helpful to know what the numbers mean.
NOTE You may have noticed an off-by-one error in the math here. 32,120 – 493 is 31,627, not
31,626. This has to do with the nuances of sequence numbers, calculations of available space, and
other factors. For the full ugliness of how the math works, read RFC 793 (ftp://ftp.isi.edu/in-notes/rfc793.txt).
length 5 At the end of the output, you can see the length of the data being sent (5 in this example).
Similar to IP’s header length, TCP’s header length tells us how long the header is, including any TCP
options. Whatever value appears in the header length field is multiplied by 4 to get the byte value.
Finally, the last notable piece of the TCP header is the urgent pointer (see Figure 11-6). The
urgent pointer points to the offset of the octet following important data. This value is observed when
the URG flag is set and tells the receiving TCP stack that some important data is present. The TCP
stack is supposed to relay this information to the application so that it knows it should treat that data
with special importance.
In reality, you’ll be hard pressed to see a packet that uses the URG bit. Most applications have no
way of knowing whether data sent to them is urgent or not, and most applications don’t really care. As
a result, a small chord of paranoia should strike you if you do see urgent flags in your network. Make
sure it isn’t part of a probe from the outside trying to exploit bugs in your TCP stack and cause your
servers to crash. (Don’t worry about Linux—it knows how to handle the urgent bit correctly.)
UDP
In comparison to TCP headers, UDP headers are much simpler. Let’s start by looking at Figure 11-7.
Figure 11-7. The UDP packet header
The first fields in the UDP header are the source and destination port numbers. These are
conceptually the same thing as the TCP port numbers. In tcpdump output, they appear in a similar
manner. Let’s look at a DNS query to resolve www.example.com into an IP address as an example
with the command tcpdump -nn -t port 53:
In this output, you can see that the source port of this UDP packet is 1096 and the destination port is
53. The rest of the line is the DNS request in a human-readable form. The next field in the UDP
header is the length of the packet. tcpdump does not display this information.
Finally, the last field is the UDP checksum. This is used by UDP to validate that the data has
arrived to its destination without corruption. If the checksum is corrupted, tcpdump will tell you.
A Complete TCP Connection
As we discussed earlier, TCP supports the concept of a connection. Each connection must go through
a sequence to get established; after both sides are done sending data, they must go through another
sequence to close the connection. In this section, we review the complete process of a simple HTTP
request and view the process as seen by tcpdump. Note that all of the tcpdump logs in this section
were generated with the tcpdump -nn -t port 80 command. Unfortunately, because of the
complex nature of TCP, we cannot cover every possible scenario that a TCP connection can take.
However, the coverage provided here should be enough to help you determine when things are going
wrong at the network level rather than at the server level.
Opening a Connection
TCP undergoes a three-way handshake for every connection that it opens. This allows both sides to
send each other their state information and give each other a chance to acknowledge the receipt of that
data.
The first packet is sent by the host that wants to open the connection with a server. For this
discussion, we will call this host the client. The client sends a TCP packet over IP and sets the TCP
flag to SYN. The sequence number is the initial sequence number that the client will use for all of the
data it will send to the other host (which we’ll call the server).
The second packet is sent from the server to the client. This packet contains two TCP flags set:
SYN and ACK. The purpose of the ACK flag is to tell the client that it has received the first SYN
packet. This is double-checked by placing the client’s sequence number in the acknowledgment field.
The purpose of the SYN flag is to tell the client with which sequence number the server will be
sending its responses.
Finally, the third packet goes from the client to the server. It has only the ACK bit set in the TCP
flags for the purpose of acknowledging to the server that it received its SYN. This ACK packet has
the client’s sequence number in the sequence number field and the server’s sequence number in the
acknowledgment field.
Sound a little confusing? Don’t worry—it is. Let’s try to clarify it with a real example from
tcpdump. The first packet is sent from 192.168.1.1 to 207.126.116.254, and it looks like this (note
that both lines are actually one long line):
You can see the client’s port number is 1367 and the server’s port number is 80 (HTTP). The S
means that the SYN bit is set and that the sequence number is 2524389053. The length 0 at the end
of the output means that there is no data in this packet. After the window is specified as being 32,120
bytes large, you can see that tcpdump has shown which TCP options were part of the packet. The
only option worth noting as a system administrator is the MSS (Maximum Segment Size) value. This
value tells you the maximum size that TCP is tracking for a nonsegmented packet for that given
connection. Connections that require small MSS values because of the networks that are being
traversed typically require more packets to transmit the same amount of data. More packets mean
more overhead, and that means more CPU cycles required to process a given connection.
Notice that no acknowledgment bit is set and there is no acknowledgment field to print. This is
because the client has no sequence number to acknowledge yet! Time for the second packet from the
server to the client:
Like the first packet, the second packet has the SYN bit set, meaning that it is telling the client
what it will start its sequence number with (in this case, 1998624975). It’s OK that the client and
server use different sequence numbers. What’s important, though, is that the server acknowledges
receiving the client’s first packet by turning on the ACK bit and setting the acknowledgment field to
2524389054 (the sequence number that the client used to send the first packet plus one).
Now that the server has acknowledged receiving the client’s SYN, the client needs to
acknowledge receiving the server’s SYN. This is done with a third packet that has only the ACK bit
set in its TCP flags. This packet looks like this:
You can clearly see that there is only one TCP bit set: ACK (indicated by the dot). The value of
the acknowledgment field is shown as a 1. But wait! Shouldn’t it be acknowledging 1998624975?
Well, don’t worry—it is. tcpdump has been kind enough to switch automatically into a mode that
prints out the relative sequence and acknowledgment numbers instead of the absolute numbers. This
makes the output much easier to read. So in this packet, the acknowledgment value of 1 means that it
is acknowledging the server’s sequence number plus one. We now have a fully established
connection.
So why all the hassle to start a connection? Why can’t the client just send a single packet over to
the server stating, “I want to start talking—okay?” and have the server send back an “okay”? The
reason is that without all three packets going back and forth, neither side is sure that the other side
received the first SYN packet—and that packet is crucial to TCP’s ability to provide a reliable and
in-order transport.
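To recap the three-way handshake in compact form, using the sequence numbers from our example:
Client -> Server: SYN, seq 2524389053
Server -> Client: SYN + ACK, seq 1998624975, ack 2524389054
Client -> Server: ACK, ack 1998624976 (shown by tcpdump as the relative value 1)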
Transferring Data
With a fully established connection in place, both sides are able to send data. Since we are using an
HTTP request as an example, we will first see the client generate a simple request for a web page.
The tcpdump output looks like this:
Here we see the client sending 7 bytes to the server with the PSH bit set. The intent of the PSH bit
is to tell the receiver to process the data immediately, but because of the nature of the Linux network
interface to applications (sockets), setting the PSH bit is unnecessary. Linux (like all socket-based
operating systems) automatically processes the data and makes it available for the application to read
as soon as it can.
Along with the PSH bit is the ACK bit, because TCP always sets the ACK bit on outgoing
packets. The acknowledgment value is set to 1, which, based on the connection setup we observed in
the previous section, means that there has been no new data that needs acknowledging.
Given that this is an HTTP transfer, it is safe to assume that since it is the first packet going from
the client to the server, it is probably the request itself.
Now the server sends a response to the client with this packet:
Here the server is sending 766 bytes to the client and acknowledging the first 8 bytes that the
client sent to the server. This is probably the HTTP response. Since we know that the web page we
requested is small, this is probably all of the data that is going to be sent in this request.
The client acknowledges this data with the following packet:
This is a pure acknowledgment, meaning that the client did not send any data, but it did
acknowledge up to the 767th byte that the server sent.
The process of the server sending some data and then getting an acknowledgment from the client
can continue as long as there is data that needs to be sent.
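In compact form, the exchange just described looks like this (relative sequence numbers, as tcpdump displays them):
Client -> Server: PSH + ACK, 7 bytes of request data (seq 1:8), ack 1
Server -> Client: 766 bytes of response data (seq 1:767), ack 8
Client -> Server: ACK, ack 767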
Closing the Connection
TCP connections have the option of ending ungracefully. That is to say, one side can tell the other
“stop now!” Ungraceful shutdowns are accomplished with the RST (reset) flag, which the receiver
does not acknowledge upon receipt. This is to keep both hosts from getting into an “RST war,” where
one side resets and the other side responds with a reset, thus causing a never-ending ping-pong effect.
Let’s start with examining a clean shutdown of the HTTP connection we’ve been observing so far.
In the first step in shutting down a connection, the side that is ready to close the connection sends a
packet with the FIN bit set, indicating that it is finished. Once a host has sent a FIN packet for a
particular connection, it is not allowed to send anything other than acknowledgments. This also means
that even though it might be finished, the other side may still send it data. It is not until both sides send
a FIN that both sides are finished. And like the SYN packet, the FIN packet must receive an
acknowledgment.
In the next two packets, we see the server tell the client that it is finished sending data and the
client acknowledges this:
We then see the reverse happen. The client sends a FIN to the server, and the server
acknowledges it:
And that’s all there is to a graceful connection shutdown.
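In compact form, the whole graceful teardown is simply this:
Server -> Client: FIN + ACK
Client -> Server: ACK
Client -> Server: FIN + ACK
Server -> Client: ACK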
As I indicated earlier, an ungraceful shutdown is simply one side sending another the RST packet,
which looks like this:
In this example, 192.168.1.1 is ending a connection with 207.126.116.254 by sending a reset.
After receiving this packet, a run of netstat on 207.126.116.254 (which happens to be another Linux
server) affirmed that the connection was completely closed.
How ARP Works
The Address Resolution Protocol (ARP) is a mechanism that allows IP to map Ethernet addresses to
IP addresses. This is important, because when you send a packet on an Ethernet network, it is
necessary to put in the Ethernet address of the destination host.
The reason we separate ARP from Ethernet, IP, TCP, and UDP is that ARP packets do not go up
the normal packet path. Instead, because ARP has its own Ethernet header type (0806), the Ethernet
driver sends the packet to the ARP handler subsystem, which has nothing to do with TCP/IP.
The basic steps of ARP are as follows:
1. The client looks in its ARP cache to see if it has a mapping between its IP address and its
Ethernet address. (You can see your ARP cache by running arp -a on your system.)
2. If an Ethernet address for the requested IP address is not found, a broadcast packet is sent
out requesting a response from the person with the IP we want.
3. If the host with that IP address is on the LAN, it will respond to the ARP request, thereby
informing the sender of its Ethernet address/IP address combination.
4. The client saves this information in its cache and is now ready to build a packet for
transmission.
Here’s an example of this from tcpdump with the command tcpdump -e -t -n arp:
The first packet is a broadcast packet asking all of the hosts on the LAN for 192.168.1.1’s
Ethernet address. The second packet is a response from 192.168.1.1 giving its IP/MAC address
mapping.
This, of course, begs the question, “If we can find the MAC address of the destination host using a
broadcast, why can’t we just send all packets to the broadcast?” The answer has two parts. The first
is that the broadcast packet requires that hosts on the LAN receiving the packet take a moment and
process it. This means that if two hosts are having an intense conversation (such as a large file
transfer), all the other hosts on the same LAN would incur a lot of overhead checking on packets that
don’t belong to them. The second part is that networking hardware (such as switches) relies on
Ethernet addresses to forward packets quickly to the right place and to minimize network congestion.
Any time a switch sees a broadcast packet, it must forward that packet to all of its ports. This makes a
switch no better than a hub.
“Now, if I need the MAC address of the destination host to send a packet to it, does that mean I
have to send an ARP request to hosts that are sitting across the Internet?” The answer is a reassuring
no.
When IP figures out where a packet should head off to, it first checks the routing table. If it can’t
find the appropriate route entry, IP looks for a default route. This is the path that, when all else fails,
should be taken. Typically, the default route points to a router or firewall that understands how to
forward packets to the rest of the world.
This means that when a host needs to send something to another server across the Internet, it only
needs to know how to get the packet to the router, and, therefore, it only needs to know the MAC
address of the router.
To see this happen on your network, do a tcpdump on your host and then visit a web site that is
elsewhere on the Internet, such as www.kernel.org. You will see an ARP request from your machine
to your default route, a reply from your default route, and then the first packet from your host with the
destination IP of the remote web server.
The ARP Header: ARP Works with Other Protocols, Too!
The ARP protocol is not specific to Ethernet and IP. To see why, let’s take a quick peek at the ARP
header (see Figure 11-8).
Figure 11-8. The ARP packet header
The first field that we see in the ARP header (which follows the Ethernet header) is the hard type.
The hard type field specifies the type of hardware address. (Ethernet has the value of 1.)
The next field is the prot type. This specifies the protocol address being mapped. In the case of
IP, this is set to 0800 (hexadecimal).
The hard size and prot size fields that immediately follow tell ARP the size of the addresses it is
mapping. Ethernet has a size of 6, and IP has a size of 4.
The op field tells ARP what needs to be done. ARP requests are 1, and ARP replies are 2.
Finally, there are the fields that we are trying to map. A request has the sender’s Ethernet and IP
addresses as well as the destination IP address filled in. The reply fills in the destination Ethernet
address and responds to the sender.
NOTE A variant of ARP, called RARP (which stands for Reverse ARP), has different values for the
op field.
Bringing IP Networks Together
Now that you have some of the fundamentals of TCP/IP under your belt, let’s take a look at how they
work to let you glue networks together. This section will cover the differences between hosts and
networks, and netmasks, static routing, and some basics in dynamic routing.
The purpose of this section is not to show you how to configure a Linux router, but to introduce
the concepts. Although you might find it less exciting than actually getting down and dirty, you’ll find
that understanding the basics makes the other stuff a little more interesting. More important, should
you be looking to apply for a Linux system administrator’s job, these could be things that pop up in
interview questions.
Hosts and Networks
The Internet is a large group of interconnected networks. All of these networks have agreed to
connect with some other network, thus allowing everyone to connect to one another. Each of these
component networks is assigned a network address.
Traditionally, in a 32-bit IP address, the network component typically takes up 8, 16, or 24 bits to
encode a class A, B, or C network, respectively. Since the remainder of the bits in the IP address is
used to enumerate the host within the network, the fewer bits that are used to describe the network, the
more bits are available to enumerate the hosts. For example, class A networks have 24 bits left for
the host component, which means there can be upward of 16,777,214 hosts within that network.
(Classes B and C have 65,534 and 254 nodes, respectively.)
NOTE There are also class D and class E ranges. Class D is used for multicast, and class E is
reserved for experimental use.
In order to better organize the various classes of networks, it was decided early in IP’s life that
the first few bits would decide to which class the address belonged. For the sake of readability, the
class of an address is usually described in terms of its first octet.
NOTE An octet is 8 bits, which in the typical dotted-decimal notation of IP means the number before
a dot. For example, in the IP address 192.168.1.42, the first octet is 192, the second octet is 168, and
so on.
The ranges are as follows:
Class   Octet Range
A       0–126
B       128–191
C       192–223
You probably noted some gaps in the ranges. This is because some addresses are
reserved for special uses. The first such address is one you’ll likely find familiar: 127.0.0.1. This
is also known as the loopback address. It is set up on every host using IP so that it can refer to itself.
It seems a bit odd to do it this way, but just because a system is capable of speaking IP doesn’t mean
it has an IP address allocated to it! On the other hand, the 127.0.0.1 address is virtually guaranteed.
(If it isn’t there, more likely than not, something has gone wrong.)
Three other ranges are notable and they are considered private IP address blocks. These ranges
are not allowed to be allocated to anyone on the Internet, and, therefore, you may use them on your
internal networks. They include
Every IP address in the 10.0.0.0 network
The 172.16 – 172.31 networks
The 192.168 network
NOTE We define internal networks as networks that are behind a firewall—not really connected to
the Internet—or that have a router performing network address translation at the edge of the network
connecting to the Internet. (Most firewalls perform this address translation as well.)
Subnetting
Imagine a network with a few thousand hosts on it, which is normal in most medium- to large-sized
companies. Trying to tie them all together into a single large network would probably lead you to pull
out all your hair, bang your head on the wall, or possibly both. And that’s just the figurative stuff.
The reasons for not keeping a network as a single large entity range from technical issues to
political ones. On the technical front, there are limitations to every technology on how large a
network can get before it becomes too large. Ethernet, for instance, cannot have more than 1024 hosts
on a single collision domain. Realistically, having more than a dozen on an even mildly busy network
will cause serious performance issues. Even migrating hosts to switches doesn’t solve the entire
problem, since switches, too, have limitations on how many hosts they can deal with.
Of course, you’re likely to run into management issues before you hit limitations of switches;
managing a single large network is difficult. Furthermore, as an organization grows, individual
departments will begin compartmentalizing. Human resources is usually the first candidate to need a
secure network of its own so that nosy engineers don’t peek into things they shouldn’t. To support a
need like that, you need to create subnetworks, a task more commonly referred to as subnetting.
Assuming our corporate network is 10.0.0.0, we could subnet it by setting up smaller class C
networks within it, such as 10.1.1.0, 10.1.2.0, 10.1.3.0, and so on. These smaller networks would
have 24-bit network components and 8-bit host components. Since the first 8 bits would be used to
identify our corporate network, we could use the remaining 16 bits of the network component to
specify the subnet, giving us 65,534 possible subnetworks. Of course, you don’t have to use all of
them!
NOTE As you’ve seen earlier in this chapter, network addresses have the host component of an IP
address typically set to all zeros. This convention makes it easy for other humans to recognize which
addresses correspond to entire networks and which addresses correspond specifically to hosts.
Netmasks
The purpose of a netmask is to tell the IP stack which part of the IP address is the network and which
part is the host. This allows the stack to determine whether a destination IP address is on the LAN or
if it needs to be sent to a router for forwarding elsewhere.
The best way to start looking at netmasks is to look at IP addresses and netmasks in their binary
representations. Let’s look at the 192.168.1.42 address with the netmask 255.255.255.0:
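In binary, the two look like this:
IP address 192.168.1.42     11000000 10101000 00000001 00101010
Netmask 255.255.255.0       11111111 11111111 11111111 00000000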
In this example, we want to find out what part of the IP address 192.168.1.42 is network and what
part is host. Now, according to the definition of netmask, those bits that are zero are part of the host.
Given this definition, we see that the first three octets make up the network address and the last octet
makes up the host.
In discussing network addresses with other people, you’ll often find it handy to be able to state
the network address without having to give the original IP address and netmask. Thankfully, this
network address is computable, given the IP address and netmask, using a bitwise AND operation.
The way the bitwise AND operation works can be best explained by observing the behavior of
two bits being ANDed together. If both bits are 1, then the result of the AND is also 1. If either bit (or
both bits) is zero, the result is zero. You can see this more clearly in this table:
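Bit A   Bit B   A AND B
0       0       0
0       1       0
1       0       0
1       1       1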
So computing the bitwise AND operation on 192.168.1.42 and 255.255.255.0 yields the bit
pattern 11000000 10101000 00000001 00000000. Notice that the first three octets remained identical
and the last octet became all zeros. In dotted-decimal notation, this reads 192.168.1.0.
NOTE Remember that we usually need to give up one IP to the network address and one IP to the
broadcast address. In this example, the network address is 192.168.1.0 and the broadcast address is
192.168.1.255.
Let’s walk through another example. This time, we want to find the address range available to us
for the network address 192.168.1.176 with a netmask of 255.255.255.240. (This type of netmask is
commonly given by ISPs to business digital subscriber line [DSL] and T1 customers.)
A quick breakdown of the last octet in the netmask shows us that the bit pattern for 240 is
11110000. This means that the first three octets of the network address, plus four bits into the fourth
octet, are held constant (255.255.255.240 in binary is 11111111 11111111 11111111 11110000).
Since the last four bits are variable, we know we have 16 possible addresses (2^4 = 16). Thus, our
range goes from 192.168.1.176 through 192.168.1.191, 16 addresses in all (192 – 176 = 16).
Because it is so tedious to type out complete netmasks, most people use the abbreviated format,
where the network address is followed by a slash and the number of bits in the netmask. So the
network address 192.168.1.0 with a netmask of 255.255.255.0 would be abbreviated to
192.168.1.0/24.
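Likewise, the 192.168.1.176 network with the 255.255.255.240 netmask from the earlier example would be written as 192.168.1.176/28, since that netmask contains 24 + 4 = 28 one-bits.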
NOTE The process of using netmasks that do not fall on the class A, B, or C boundaries is also
known as classless interdomain routing (CIDR). You can read more about CIDR in RFC 1817
(www.rfc-editor.org/rfc/rfc1817.txt).
Static Routing
When two hosts on the same LAN want to communicate, it is quite easy for them to find each other:
Simply send out an ARP message, get the other host’s MAC address, and be done with it. But when
the second host is not local, things become trickier.
To get two or more LANs to communicate with one another, a router needs to be put into place.
The purpose of the router is to know about the topology of multiple networks. When you want to
communicate with another network, your machine will set the destination IP as the host on the other
network, but the destination MAC address will be for the router. This allows the router to receive the
packet and examine the destination IP, and since it knows that IP is on the other network, it will
forward the packet. The reverse is also true for packets that are coming from the other network to
your network (see Figure 11-9).
Figure 11-9. Two networks connected by a router
In turn, the router must know what networks are plugged into it. This information is called a
routing table. When the router is manually informed about what paths it can take, the table is called
static, thus the term static routing. Once routes are plugged into the routing table by a human, they
cannot be changed until a human operator comes back to change them.
Unfortunately, commercial-grade routers can be rather expensive devices. They are typically
dedicated pieces of hardware that are highly optimized for the purpose of forwarding packets from
one interface to another. You can, of course, make a Linux-based router (I discuss this in Chapter 12)
using a stock PC that has two or more network cards. Such configurations are fast and cheap enough
for small to medium-sized networks. In fact, many companies are already starting to do this, since
older PCs that are too slow to run the latest web browsers and word-processing applications are still
plenty fast to perform routing.
As with any advice, take it within the context of your requirements, budget, and skills. Open
source and Linux are great tools, but like anything else, make sure you’re using the right tool for the
job.
Routing Tables
As mentioned earlier, routing tables are lists of network addresses, netmasks, and destination
interfaces. A simplified version of a table might look like this:
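Network Address    Netmask           Destination Interface
192.168.1.0        255.255.255.0     Interface 1
192.168.2.0        255.255.255.0     Interface 2
192.168.3.0        255.255.255.0     Interface 3
Default            0.0.0.0           Interface 4
(The 192.168.3.0 entry is included here only to round out the example.)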
When a packet arrives at a router that has a routing table like this, it will go through the list of
routes and apply each netmask to the destination IP address. If the resulting network address is equal
to the network address in the table, the router knows to forward the packet on to that interface.
So let’s say that the router receives a packet with the destination IP address set to 192.168.2.233.
The first table entry has the netmask 255.255.255.0. When this netmask is applied to 192.168.2.233,
the result is not 192.168.1.0, so the router moves on to the second entry. Like the first table entry, this
route has the netmask of 255.255.255.0. The router will apply this to 192.168.2.233 and find that the
resulting network address is equal to 192.168.2.0. So now the appropriate route is found. The packet
is forwarded out of interface 2.
If a packet arrives that doesn’t match the first three routes, it will match the default case. In our
sample routing table, this will cause the packet to be forwarded to interface 4. More than likely, this
is a gateway to the Internet.
Limitations of Static Routing
The example of static routing we’ve used is typical of smaller networks. Static routing is useful when
only a handful of networks need to communicate with one another and they aren’t going to change
often.
There are, however, limitations to this technique. The biggest limitation is human—you are
responsible for updating all of your routers with new information whenever you make any changes.
Although this is usually easy to do in a small network, it does leave room for error. Furthermore, as your network grows and more routes are added, the routing table becomes increasingly difficult to manage this way.
The second limitation, which is almost as significant, is that the time it takes the router to process a packet is roughly proportional to the number of routes that exist. With only three or four routes, this
isn’t a big deal. But as you start getting into dozens of routes, the overhead can become noticeable.
Given these two limitations, it is best to use static routes only in small networks.
Dynamic Routing with RIP
As networks grow, the need to subnet them grows, too. Eventually, you’ll find that you have a lot of
subnets that can’t all be tracked easily, especially if they are being managed by different
administrators. One subnet, for instance, might need to break its network in half for security reasons.
In a situation this complex, going around and telling everyone to update their routing tables would be
a real nightmare and would lead to all sorts of network headaches.
The solution to this problem is to use dynamic routing. The idea behind dynamic routing is that
each router knows only immediately adjacent networks when it starts up. It then announces to other
routers connected to it what it knows, and the other routers reply back with what they know. Think of
it as “word of mouth” advertising for your network. You tell the people around you about your
network, they then tell their friends, and their friends tell their friends, and so on. Eventually,
everyone connected to the network knows about your new network.
On campus-wide networks (such as a large company with many departments), you’ll typically see
this method of announcing route information. As of this writing, the two most commonly used routing
protocols are Routing Information Protocol (RIP) and Open Shortest Path First (OSPF).
RIP is currently up to version 2. It is a simple protocol that is easy to configure. Simply tell the
router information about one network (making sure each subnet in the company has a connection to a
router that knows about RIP), and then have the routers connect to one another. RIP broadcasts happen
at regular time intervals (usually less than a minute), and in only a few minutes, the entire campus
network knows about you.
Let’s see how a smaller campus network with four subnets would work with RIP. Figure 11-10
shows how the network is connected.
Figure 11-10. A small campus network using RIP
NOTE For the sake of simplicity, we’re serializing the events. In reality, many of these events would
happen in parallel.
As illustrated in this figure, router 1 would be told about 192.168.1.0/24 and about the default
route to the Internet. Router 2 would be told about 192.168.2.0/24, router 3 would know about
192.168.3.0/24, and so on. At startup, each router’s table looks like this:
Router 1: 192.168.1.0/24; Internet gateway
Router 2: 192.168.2.0/24
Router 3: 192.168.3.0/24
Router 4: 192.168.4.0/24
Router 1 then makes a broadcast stating what routes it knows about. Since routers 2 and 4 are
connected to it, they update their routes. This makes the routing tables look like this (new routes marked with an asterisk):
Router 1: 192.168.1.0/24; Internet gateway
Router 2: 192.168.2.0/24; *192.168.1.0/24 via router 1; *Internet gateway via router 1
Router 3: 192.168.3.0/24
Router 4: 192.168.4.0/24; *192.168.1.0/24 via router 1; *Internet gateway via router 1
Router 2 then makes its broadcast. Routers 1 and 3 see these packets and update their tables as
follows (new routes marked with an asterisk):
Router 1: 192.168.1.0/24; Internet gateway; *192.168.2.0/24 via router 2
Router 2: 192.168.2.0/24; 192.168.1.0/24 via router 1; Internet gateway via router 1
Router 3: 192.168.3.0/24; *192.168.2.0/24 via router 2; *192.168.1.0/24 via router 2; *Internet gateway via router 2
Router 4: 192.168.4.0/24; 192.168.1.0/24 via router 1; Internet gateway via router 1
Router 3 then makes its broadcast, which routers 2 and 4 hear. This is where things get
interesting, because this introduces enough information to open up multiple routes to the same
destination. The routing tables now look like this (new routes marked with an asterisk):
Router 1: 192.168.1.0/24; Internet gateway; 192.168.2.0/24 via router 2
Router 2: 192.168.2.0/24; 192.168.1.0/24 via router 1; Internet gateway via router 1; *192.168.3.0/24 via router 3
Router 3: 192.168.3.0/24; 192.168.2.0/24 via router 2; 192.168.1.0/24 via router 2; Internet gateway via router 2
Router 4: 192.168.4.0/24; *192.168.1.0/24 via router 1 or 3; *Internet gateway via router 1 or 3; *192.168.3.0/24 via router 3; *192.168.2.0/24 via router 3
Next, router 4 makes its broadcast. Routers 1 and 3 hear this and update their tables to the
following (new routes marked with an asterisk):
Router 1: 192.168.1.0/24; Internet gateway; *192.168.2.0/24 via router 2 or 4; *192.168.3.0/24 via router 4; *192.168.4.0/24 via router 4
Router 2: 192.168.2.0/24; 192.168.1.0/24 via router 1; Internet gateway via router 1; 192.168.3.0/24 via router 3
Router 3: 192.168.3.0/24; 192.168.2.0/24 via router 2; *192.168.1.0/24 via router 2 or 4; *Internet gateway via router 2 or 4; *192.168.4.0/24 via router 4
Router 4: 192.168.4.0/24; 192.168.1.0/24 via router 1 or 3; Internet gateway via router 1 or 3; 192.168.3.0/24 via router 3; 192.168.2.0/24 via router 3
Once all the routers go through another round of broadcasts, the complete routing tables would look like this:
Router 1: 192.168.1.0/24; Internet gateway; 192.168.2.0/24 via router 2 or 4; 192.168.3.0/24 via router 4 or 2; 192.168.4.0/24 via router 4 or 2
Router 2: 192.168.2.0/24; 192.168.1.0/24 via router 1 or 3; Internet gateway via router 1 or 3; 192.168.3.0/24 via router 3 or 1
Router 3: 192.168.3.0/24; 192.168.2.0/24 via router 2 or 4; 192.168.1.0/24 via router 2 or 4; Internet gateway via router 2 or 4; 192.168.4.0/24 via router 4 or 2
Router 4: 192.168.4.0/24; 192.168.1.0/24 via router 1 or 3; Internet gateway via router 1 or 3; 192.168.3.0/24 via router 3 or 1; 192.168.2.0/24 via router 3 or 1
Why is this mesh important? Let’s say router 2 fails. If router 3 was relying on router 2 to send
packets to the Internet, it can immediately update its tables, reflecting that router 2 is no longer
available, and then forward Internet-bound packets through router 4.
RIP’s Algorithm (and Why You Should Use OSPF Instead)
Unfortunately, when it comes to figuring out the most optimal path from one subnet to another, RIP is
not the smartest protocol. It determines which route to take based solely on the number of routers (hops) between it and the destination. Although that sounds optimal, what this algorithm
doesn’t take into account is how much traffic is on the link or how fast the link is.
Looking back at Figure 11-10, you can see where this situation might play itself out. Let’s assume
that the link between routers 3 and 4 becomes congested. Now if router 3 wants to send a packet out
to the Internet, RIP will still evaluate the two possible paths (3 to 4 to 1, and 3 to 2 to 1) as being
equidistant. As a result, the packet may end up going via router 4 when, clearly, the path through
router 2 (whose links are not congested) would be much faster.
OSPF (Open Shortest Path First) is similar to RIP in how it broadcasts information to other
routers. What makes it different is that instead of keeping track of how many hops it takes to get from
one router to another, it keeps track of how quickly each router is talking to the others. Thus, in our
example, where the link between routers 3 and 4 becomes congested, OSPF will realize that and be
sure to route a packet destined to router 1 via router 2.
Another feature of OSPF is its ability to realize when a destination address has two possible
paths that would take an equal amount of time. When it sees this, OSPF will share the traffic across
both links—a process called equal-cost multipath—thereby making optimal use of available
resources.
There are two “gotchas” with OSPF. The first is hardware support: Older or lower-end networking hardware might not offer OSPF at all, or might offer it only at a substantially higher cost. The second gotcha is complexity: RIP is much simpler to set up than OSPF. For a small network, RIP may be a better choice at first.
Digging into tcpdump
The tcpdump tool is truly one of the more powerful tools you will use as a system administrator. The
GUI equivalent of it, Wireshark, is an even better choice when a graphical front-end is available.
Wireshark offers all of the power of tcpdump, with the added bonus of richer filters, additional
protocol support, the ability to follow TCP connections quickly, and some handy statistics.
This section walks through a few examples of how you can use tcpdump.
A Few General Notes
Here are a few quick tips regarding these tools before you jump into more advanced examples.
Wireshark (The Tool Formerly Known as Ethereal)
Wireshark (which used to be known as Ethereal) is a graphical tool for taking packet traces and
decoding them. It offers a lot more features than tcpdump and is a great way to peer inside of various
protocols. You can download the latest version of Wireshark from www.wireshark.org.
An extra-nice feature of Wireshark is its cross-platform support. It can work under native
Windows, OS X, and UNIX environments. So, for example, if you have a Windows desktop and a lot
of Linux servers, you can capture packets on the server and then view/analyze them from any of the
other supported platforms.
Before you get too excited about Wireshark, don’t neglect to get your hands dirty with tcpdump,
too. In troubleshooting sessions, you don’t always have the time or luxury of pulling up Wireshark,
and if you’re just looking for a quick validation that packets are moving, starting up a GUI tool might
be a bit more than you need. The tcpdump tool offers a quick way to get a handle on the situation.
Therefore, learning it will help you get a grip on a lot of situations quickly.
TIP Your Sun Solaris friends might have spoken about snoop. The tcpdump tool and snoop, while
not identical, have a lot of similarities. Learn one, and you’ll have a strong understanding of the other.
Reading and Writing Dumpfiles
If you need to capture and save a lot of data, you’ll want to use the -w option to write all the packets
to disk for later processing. Here is a simple example:
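tcpdump -i eth0 -w /tmp/trace.pcap    # the capture file name here is arbitrary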
The tcpdump tool will continue capturing packets seen on the eth0 interface until the terminal is
closed, the process is killed, or CTRL-C is pressed. The resulting file can be loaded by Wireshark or
read by any number of other programs that can process tcpdump-formatted captures. (The packet
format itself is referred to as “pcap.”)
NOTE When the -w option is used with tcpdump, it is not necessary to issue the -n option to avoid
DNS lookups for each IP address seen.
To read back the packet trace using tcpdump, use the -r option. When you’re reading back a
packet trace, additional filters and options can be applied to affect how the packets will be displayed.
For example, to show only ICMP packets from a trace file and avoid DNS lookups (using the -n
option) as the information is displayed, do the following:
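tcpdump -n -r /tmp/trace.pcap icmp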
Capturing More or Less per Packet
By default, recent versions of tcpdump capture up to the first 65,535 bytes of each packet; older versions captured far less (often only the first 68 bytes). If you’re just looking to track some flows and see what’s happening on the wire, the default is usually good enough. However, if you need to be sure the entire packet is captured for further decoding, you may need to raise this value. Conversely, you might want to capture less of each packet to speed up the capture process and keep the trace file small.
To change the length of the packet that tcpdump captures, use the -s (snaplen) option. For
example, to capture a full 1500-byte packet and write it to disk, you could use this:
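tcpdump -i eth0 -s 1500 -w /tmp/trace.pcap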
Performance Impact
Taking a packet trace can have a performance impact, especially on a heavily loaded server. There
are two parts to the performance piece: the actual capture of packets and the decoding/printing of
packets.
The actual capture of packets, while somewhat costly, can be minimized with a good filter. In
general, unless your server load is extremely high or you’re moving a lot of traffic (a lot being
hundreds of megabits/second), this penalty is not too significant. The remaining cost comes from the
penalty of moving packets from the kernel up to the tcpdump application, which requires both a buffer
copy and a context switch.
The decoding/printing of packets, by comparison, is substantially more expensive. The decoding
itself is a small fraction of the cost, but the printing is high. If your server is loaded, you want to
avoid printing for two reasons: It generates load to format the strings that are output, and it generates
load to update your screen. The latter factor can be especially costly if you’re using a serial console,
since each byte sent over the serial port generates a high-priority interrupt (higher than the network
cards) that takes a long time to process, because serial ports are comparatively so much slower than
everything else. Printing decoded packets over a serial port can generate enough interrupt traffic to
cause network cards to drop packets as they are starved for attention from the main CPU.
To alleviate the stress of the decode/print process, use the -w option to write raw packets to disk.
The process of writing raw packets is much faster and lower in cost than printing them. Furthermore,
writing raw packets means you skip the entire decode/print step, since that is done only when you
need to see the packets.
In short, if you’re not sure, use the -w option to write the packets to disk, copy them off to another
machine, and then read them there.
Don’t Capture Your Own Network Traffic
A common mistake made when using tcpdump is to log in via the network and then start a capture.
Without the appropriate filter, you’ll end up capturing your session packets, which, in turn, if you’re
printing them to the screen, can generate new packets, which get captured again, and so on. A quick
way to skip your own traffic (and that of other administrators) is simply to skip port 22 (the SSH
port) in the capture, like so:
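tcpdump -i eth0 not port 22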
If you want to see what other people are doing on that port, add a filter that applies only to your
host. For instance, if you’re coming from 192.168.1.8, you can write this:
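tcpdump -i eth0 "not (port 22 and host 192.168.1.8)"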
(Note the addition of the quotation marks. This was done so as not to confuse the shell with the added
parentheses, whose contents are meant for tcpdump.)
Why Is DNS Slow?
Odd or intermittent problems are great reasons for using tcpdump. Using a trace of the packets
themselves, you can look at activity over a period of time and identify issues that might be masked by
other activity on the system or a lack of debugging tools.
Let’s assume for a moment that you are using the DNS server managed by your DSL provider.
Everything is working until one day, things seem to be acting up. Specifically, when you visit a web
site, the first connection seems to take a long time, but once connected, the system seems to run pretty
quickly. Every couple of sites, the connection doesn’t even work, but clicking “reload” seems to do
the trick. That means that DNS is working and connectivity is there. What gives?
Time to take a packet trace. Because this is web traffic, we know that two protocols are at work:
DNS for the hostname resolution and TCP for connection setup. That means we want to filter out all
the other noise and focus on those two protocols. Because there seems to be some kind of speed
issue, getting the packet timestamps is necessary, so we don’t want to use the -t option. Here’s the
result:
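tcpdump -n -i eth0 port 53 or tcp port 80    # assumes plain HTTP on port 80; add port 443 for HTTPS sites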
Now visit the desired web site. For this example, we’ll go to www.labmanual.org. Let’s look at
the first few UDP packets:
That’s interesting; we needed to retransmit the DNS request to get the IP address for the hostname.
Looks like some kind of connectivity problem is happening here, because we do eventually get a
response. What about the rest of the connection? Does the connectivity problem affect other activity?
Clearly, the rest of the connection went quickly. Time to poke at the DNS server.
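A few pings to the DNS server tell the story (203.0.113.53 below is only a placeholder; substitute your ISP’s DNS server address):
ping -c 10 203.0.113.53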
Yikes! We’re losing packets (50 percent packet loss), and the jitter on the wire is bad. This
explains the odd DNS behavior. Time to look for another DNS server while this issue is resolved.
Graphing Odds and Ends
When it comes to collecting network information, tcpdump is a gold mine. Presenting the data
collected using tcpdump in some kind of statistical or graphical manner may sometimes be
useful/informative (or a good time-killing exercise at any rate!). Here are a few examples of things
you can do.
Graphing Initial Sequence Numbers
The Initial Sequence Number (ISN) in a TCP connection is the sequence number specified in the SYN
packet that starts a connection. For security reasons, it is important that you have a sufficiently
random ISN so that others can’t spoof connections to your server. To see a graph of the distribution of
ISNs that your server is generating, let’s use tcpdump to capture five SYN/ACK packets sent from the
web server. To capture the data, we use the following bit of tcpdump piped to Perl:
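One possible invocation looks like this; the Perl one-liner is only a sketch that pulls the sequence number out of each decoded line:
tcpdump -l -n -S -c 5 "tcp[13] == 18" | \
    perl -ne 'print "$1\n" if /seq (\d+)/' > graphme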
The tcpdump command introduces a new parameter, -l. This parameter tells tcpdump to line-buffer its output, which is necessary when piping tcpdump’s output to another program such as Perl. I also introduce a new trick whereby we look into a specific byte offset of the TCP packet and check for a value. In this case, I used the figure of the TCP header shown earlier in the chapter to determine that the 13th byte holds the TCP flags; for a SYN/ACK, the value is 18. Each matching line is piped into a Perl script that pulls the sequence number out of the line and prints it. The resulting file, graphme, will simply be a list of numbers, one ISN per line.
We now use gnuplot (www.gnuplot.info) to graph these. We could use a spreadsheet to plot them, but depending on how many entries we have, that could be an issue. The gnuplot program
works well with large data sets, and it is free.
We start gnuplot and issue the following commands:
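# a minimal set of gnuplot directives; adjust the output format to taste
set terminal png
set output "syns.png"
plot "graphme"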
Taking a look at the generated syns.png file, we see a graph that shows a good distribution of ISN
values. This implies that it is difficult to spoof TCP connections to this host. Clearly, the more data
we have to graph here, the surer we can be of this result. Taking the data to a statistics package to
confirm the result can be equally interesting.
IPv6
IPv6, Internet Protocol version 6, is also referred to as IPng—Internet Protocol: the Next Generation.
IPv6 offers many new features and improvements over its predecessor, IPv4, including the following:
A larger address space
Built-in security capabilities; offers Network layer encryption and authentication
A simplified header structure
Improved routing capabilities
Built-in auto-configuration capabilities
IPv6 Address Format
IPv6 offers an increased address space, because it is 128 bits long (compared to the 32 bits for IPv4).
Because an IPv6 address is 128 bits long (or 16 bytes), there are about 3.4 × 10^38 possible addresses
available (compared to the roughly 4 billion available for IPv4).
A human being representing or memorizing (without error) a string of 128 bits is not easy. Therefore, several abbreviation techniques exist that shorten an IPv6 address and make it more human-friendly. First, the 128 bits are written in hexadecimal, which reduces the total length to 32 hexadecimal digits. These are arranged in eight groups of four hexadecimal digits each, with the groups separated by colons (:).
The address can be shortened further. The leading zeros of any group can simply be omitted. In addition, one or more consecutive all-zero groups can be collapsed into a pair of colons (::), as long as this is done only once in the entire address; an address that uses more than one set of double colons is ambiguous and therefore invalid.
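Putting the rules together, here is a worked example (the address is taken from the 2001:db8::/32 documentation range and is used purely for illustration):
2001:0db8:0000:0000:00ab:0000:0000:1234    (full form)
2001:db8:0:0:ab:0:0:1234                   (leading zeros omitted)
2001:db8::ab:0:0:1234                      (one run of zero groups collapsed to ::)
2001:db8::ab::1234                         (invalid: more than one ::)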
IPv6 Address Types
There are several types of IPv6 addresses. Each address type has additional special address types, or
scopes, which are used for different things. Three particularly special IPv6 address classifications
are unicast, anycast, and multicast addresses.
Unicast Addresses
A unicast address in IPv6 refers to a single network interface. Any packet sent to a unicast address is
meant for a specific interface on a host. Examples of unicast addresses include link-local addresses (such as the auto-configuration addresses in fe80::/10), global unicast addresses, site-local addresses, and special addresses such as the unspecified address (::/128) and the loopback address (::1/128).
Anycast Addresses
An anycast address is a type of IPv6 address that is assigned to multiple interfaces (possibly
belonging to different hosts). Any packet sent to an anycast address will be delivered to the closest
interface that shares the anycast type address—“closest” is interpreted according to the routing
protocol’s idea of distance or simply the most easily accessible host. Hosts in a group sharing an
anycast address have the same address prefix.
Multicast Addresses
An IPv6 multicast-type address is similar in functionality to an IPv4-type multicast address. A packet
sent to a multicast address will be delivered to all the hosts (interfaces) that have the multicast
address. The hosts (or interfaces) that make up a multicast group do not necessarily need to share the
same prefix and do not need to be connected to the same physical network.
IPv6 Backward-Compatibility
The designers of IPv6 built backward-compatibility functionality into the protocol to accommodate the various hosts or sites that are not fully IPv6-compliant or ready. The support for legacy IPv4 hosts
and sites is handled in several ways: mapped addresses (IPv4-mapped IPv6 address), compatible
addresses (IPv4-compatible IPv6 address), and tunneling.
Mapped Addresses
Mapped addresses are special unicast-type addresses used by IPv6 hosts. They are used when an
IPv6 host needs to send packets to an IPv4 host via a mostly IPv6 infrastructure. The format for a
mapped IPv6 address is as follows: the first 80 bits are all 0’s, followed by 16 bits of 1’s, and then it
ends with 32 bits of the IPv4 address.
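For example, the IPv4 address 192.168.1.42 becomes the IPv4-mapped IPv6 address ::ffff:192.168.1.42, which can also be written as ::ffff:c0a8:12a.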
Compatible Addresses
The compatible type of IPv6 address is used to support IPv4-only hosts or infrastructures—that is,
those that do not support IPv6 in any way. It can be used when an IPv6 host wants to communicate
with another IPv6 host via an IPv4 infrastructure. The first 96 bits of a compatible IPv6 address are all 0’s, and the address ends with the 32 bits of the IPv4 address.
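For example, 192.168.1.42 becomes the IPv4-compatible IPv6 address ::192.168.1.42 (or ::c0a8:12a).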
Tunneling
This method is used by IPv6 hosts that need to transmit information over a legacy IPv4 infrastructure
using configured tunnels. This is achieved by encapsulating an IPv6 packet in a traditional IPv4
packet and sending it via the IPv4 network.
Summary
This chapter covered the fundamentals of TCP/IP and other protocols, ARP, subnetting and netmasks,
and routing. It’s a lot to digest, but hopefully this simplified version should make it easier to
understand. Specifically, we discussed the following:
How TCP/IP relates to the ISO OSI seven-layer model
The composition of a packet
The specifications of packet headers and how to explore them using tcpdump
The complete process of a TCP connection setup, data transfer, and connection teardown
How to calculate netmasks
How static routing works
How dynamic routing works with RIP
Several packet analysis examples covering the use of tcpdump
A brief overview of IPv6
Because the information here is (substantially) simplified, you might want to take a look at some
other books for more information regarding this topic. This is especially important if you have
complex networks in which your machines need to participate or if you need to understand the
operation of your firewall better.
One classic book we recommend is TCP/IP Illustrated, Volume 1, by Richard Stevens (Addison-Wesley, 1994). This book covers TCP/IP in depth as well as several popular protocols that send
their data over IP. The complex subject of TCP/IP is explained in a clear and methodical manner.
As always, the manual pages for the various tools and utilities discussed will always be a good
source of information. For example, the latest version of tcpdump’s documentation (man page) can be
found at www.tcpdump.org/tcpdump_man.html.
CHAPTER 12
Network Configuration
As with most modern operating systems, Linux distributions ship with a robust set of capable
graphical tools for administrating most of the networking-related functions within the system.
Examples of these tools include NetworkManager (nm) and Wireless Interface Connection
Daemon (WICD). Invariably, the GUI tools are merely pretty front-ends for manipulating plain-text
files in the back-end.
Understanding how network configuration works under the hood in Linux distros is invaluable and can come in handy in several scenarios. First and foremost, when things are breaking and you can’t start your favorite GUI, being able to handle network configuration from the command line is crucial. Another benefit is remote administration: You might not be able to run a graphical configuration tool easily from a remote site. Issues such as firewalls and network latency will probably restrict your remote administration to the command line only. Finally, it’s always nice to be able to manage network configuration through scripts, and command-line tools are well suited for scripting.
In this chapter, we will tackle an overview of network interface drivers and the tools necessary for performing command-line administration of your network interface(s).
Modules and Network Interfaces
Network devices under Linux break the tradition of accessing all devices through the file abstraction
layer. Not until the network driver initializes the card and registers itself with the kernel does there
exist a mechanism for anyone to access the card. Typically, Ethernet devices register themselves as
being ethX, where X is the device number. The first Ethernet device is eth0, the second is eth1, and so
on.
Depending on how your kernel was compiled, the device drivers for your network interface cards
may have been compiled as a module. For most distributions, this is the default mechanism for
shipping, since it makes it much easier to probe for new hardware.
If the driver is configured as a module and you have auto-loading modules set up, you may
sometimes need to tell the kernel the mapping between device names and the module to load, or you
may simply need to pass on some special options to the module. This can be done by using (or
creating) an appropriate configuration file under the /etc/modprobe.d directory. For example, if
your eth0 device is an Intel PRO/1000 card, you would add the following line to your
/etc/modprobe.d/example.conf file:
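alias eth0 e1000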
Here, e1000 is the name of the device driver.
You can set this up for every network card that exists in the same system. For example, if you
have two network cards, one based on the DEC Tulip chipset and another on the RealTek 8169
chipset, you would need to make sure your sample module configuration file
—/etc/modprobe.d/example.conf—includes these lines:
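alias eth0 tulip
alias eth1 r8169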
Here, tulip refers to the network card with the Tulip chip on it, and r8169 refers to the RealTek
8169 card.
The udev subsystem can be used to manipulate the device name assigned to network devices
such as Ethernet cards. This can be useful in overcoming the occasional unpredictability with which
the Linux kernel names and detects network devices.
TIP
You can find a listing of all the network device drivers that are installed for your kernel in the
/lib/modules/`uname -r`/kernel/drivers/net directory, like so:
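ls /lib/modules/`uname -r`/kernel/drivers/net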
Note that backticks (versus single quotes) surround the embedded uname -r command. This will let
you be sure you are using the correct driver version for your current kernel version. If you are using a
standard installation of your distribution, you’ll find that only one subdirectory name should appear in
the /lib/modules directory. But if you have upgraded or compiled your kernel, you might find more
than one such directory.
If you want to see a driver’s description without having to load the driver itself, use the modinfo
command. For example, to see the description of the yellowfin.ko driver, type the following:
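modinfo yellowfin    # the .ko extension can be omitted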
Keep in mind that not all drivers have descriptions associated with them, but most do.
Consistent Network Device Naming
Recent versions of the Fedora Linux distribution (version 15 and later) use a new network device naming convention. The change was brought about because the way the Linux kernel
discovers and assigns names to an Ethernet interface on a system is not always completely
predictable (it is influenced by hardware initialization routines, physical PCI bus topology,
device driver code, and other factors). The new naming convention aims to guarantee that
Ethernet cards and ports will be assigned names that match their physical location on the system
board, and it specifically affects network adapters embedded on the motherboard and add-on
adapters. This new convention may or may not make its way into other mainstream Linux distros
such as Debian, openSUSE, and so on.
The new network device naming convention works like this:
The Ethernet interfaces on systems with onboard or embedded NICs are named with the prefix em<PORT_NUMBER>, where <PORT_NUMBER> is the port’s physical chassis
label. For example, the first onboard Ethernet interface will be assigned the name em1,
the second will be named em2, and so on.
Network interface cards that are plugged into PCI, PCI-e, PCI-X, and so on, slots on a
system board are named p<SLOT_NUMBER>p<PORT_NUMBER>, where
<SLOT_NUMBER> identifies the physical PCI slot and <PORT_NUMBER> identifies
the specific physical port number on the corresponding NIC. Example network device
names using this notation would be p1p1, p8p1, p7p1, and so on.
Some virtual machine guests will continue to use the traditional ethX naming convention.
And you can completely bypass the new naming scheme by passing the biosdevname=0
argument as a boot-time option to the Linux kernel in supported platforms.
Network Device Configuration Utilities (ip and ifconfig)
The ifconfig program is responsible primarily for setting up your network interface cards (NICs).
All of its operations can be performed through command-line options, as its native format has no
menus or graphical interface. Administrators who have used the Windows ipconfig program may
see some similarities, as Microsoft implemented some command-line interface (CLI) networking
tools that mimic functional subsets of their UNIX counterparts.
NOTE The ifconfig program typically resides in the /sbin directory, which is included in root’s
PATH. Some login scripts, such as those in openSUSE, do not include /sbin in the PATH for
nonprivileged users by default. Thus, you might need to invoke /sbin/ifconfig when calling on it as a
regular user. If you expect to be a frequent user of commands under /sbin, you may find it prudent to
add /sbin to your PATH.
A number of tools have been written to wrap around ifconfig’s CLI to provide menu-driven or
graphical interfaces, and many of these tools ship with the various Linux distros. Fedora, for example,
has a GUI tool called system-config-network. As an administrator, you should at least know how
to configure the network interface by hand; knowing how to do this is invaluable, as many additional
options not shown in GUIs are exposed in the CLI. For that reason, this section will cover the use of
the ifconfig command-line tool.
Another powerful program that can be used to manage network devices in Linux is the ip
program. The ip utility comes with the iproute software package. The iproute package contains
networking utilities (such as ip) that are designed to be used to take advantage of and manipulate the
advanced networking capabilities of the Linux kernel. The syntax for the ip utility is a little terser and
less forgiving than that of the ifconfig utility. But the ip command is much more powerful.
TIP Administrators still dealing with Windows may find the %SYSTEMROOT%\system32\netsh.exe
program a handy tool for exposing and manipulating the details of Windows networking via the CLI.
The following sections will use both the ifconfig command and the ip command to configure
the network devices on our sample server.
Simple Usage
In its simplest usage, all you need to do is provide the name of the interface being configured and the
IP address. The ifconfig program will deduce the rest of the information from the IP address. Thus,
you could enter the following:
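ifconfig eth0 192.168.1.42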
This will set the eth0 device to the IP address 192.168.1.42. Because 192.168.1.42 is a class C
address, the calculated default netmask will be 255.255.255.0 and the broadcast address will be
192.168.1.255.
If the IP address you are setting is a class A, B, or C address that is subnetted differently, you will
need to set the broadcast and netmask addresses explicitly on the command line, like so:
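ifconfig dev ip netmask nmask broadcast bcast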
Here, dev is the network device you are configuring, ip is the IP address to which you are setting it,
nmask is the netmask, and bcast is the broadcast address.
The following example will set the eth0 device to the IP address 1.1.1.1 with a netmask of
255.255.255.0 and a broadcast address of 1.1.1.255:
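ifconfig eth0 1.1.1.1 netmask 255.255.255.0 broadcast 1.1.1.255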
To do the same thing using the ip command, you would type this:
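ip addr add 1.1.1.1/24 broadcast 1.1.1.255 dev eth0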
TIP The ip command allows unique abbreviations to be made in its syntax. Therefore, the preceding
command could also have been shortened to this:
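ip a a 1.1.1.1/24 brd 1.1.1.255 dev eth0    # "a a" and "brd" abbreviate "addr add" and "broadcast"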
To use ip to delete the IP address created previously, type this:
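ip addr del 1.1.1.1/24 dev eth0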
To use the ip command to assign an IPv6 address (for example, 2001:DB8::1) to the interface
eth0, you would use this command:
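ip -6 addr add 2001:db8::1/64 dev eth0    # the /64 prefix length is an assumption; use whatever your network calls for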
To use ip to delete the IPv6 address created previously, type this:
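ip -6 addr del 2001:db8::1/64 dev eth0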
The ifconfig command can also be used to assign an IPv6 address to an interface. For example,
we can assign the IPv6 address 2001:DB8::3 to eth2 by running the following:
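ifconfig eth2 inet6 add 2001:db8::3/64    # again assuming a /64 prefix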
To display the IPv6 addresses on all interfaces, you can use the ip command like so:
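ip -6 addr show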
IP Aliasing
In some instances, it is necessary for a single host to have multiple IP addresses. Linux can support
this by using IP aliases. Each interface in the Linux system can have multiple IP addresses assigned.
This is done by enumerating each instance of the same interface with a colon followed by a number—
for example, eth0 is the main interface, eth0:0 is an aliased interface, eth0:1 is also an aliased
interface, eth0:2 is another aliased interface, and so on.
Configuring an aliased interface is just like configuring any other interface: Simply use ifconfig.
For example, to set eth0:0 with the address 10.0.0.2 and netmask 255.255.255.0, we would do the
following:
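ifconfig eth0:0 10.0.0.2 netmask 255.255.255.0 up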
To do the same thing using the ip command, type this:
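ip addr add 10.0.0.2/24 dev eth0 label eth0:0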
You can view your changes by typing the following:
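ifconfig eth0:0
ip addr show dev eth0    # either command shows the aliased address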
TIP You can list all the active devices by running ifconfig with no parameters. You can list all
devices, regardless of whether they are active, by using the -a option, like this: ifconfig -a.
Note that network connections made to the aliased interface will communicate on the aliased IP
address; however, in most circumstances, any connection originating from the host to another host
will use the first assigned IP of the interface. For example, if eth0 is 192.168.1.15 and eth0:0 is
10.0.0.2, a connection from the machine that is routed through eth0 will use the IP address
192.168.1.15. The exception to this behavior is for applications that bind themselves to a specific IP
address. In those cases, it is possible for the application to originate connections from the aliased IP
address. If a host has multiple interfaces, the route table will decide which interface to use. Based on
the routing information, the first assigned IP address of the interface will be used.
Confused? Don’t worry; it’s a little awkward to grasp at first. The choice of source IP is
associated with routing as well, so we’ll revisit this concept later in the chapter.
Setting up NICs at Boot Time
Unfortunately, each distribution has taken to automating its setup process for network cards a little
differently. We will cover the Fedora (and other Red Hat derivatives) specifics in the next section.
For other distributions, you need to handle this procedure in one of two ways:
Use the network administration tool that comes with that distribution to manage the network
settings. This is probably the easiest and most reliable method.
Find the startup script that is responsible for configuring network cards. (Using the grep tool
to find which script runs ifconfig works well.) At the end of the script, add the necessary
ifconfig statements. Another place to add ifconfig statements is in the rc.local script—
not as pretty, but it works equally well.
Setting up NICs Under Fedora, CentOS, and RHEL
Fedora and other Red Hat–type systems use a simple setup that makes it easy to configure network
cards at boot time. It is done through the creation of files in the /etc/sysconfig/network-scripts
directory that are read at boot time. All of the graphical tools under Fedora create and manage these
files for you; if you’re one of those people who like to get under the hood, the following sections
show how to manage the configuration files manually.
For each network interface, there is an ifcfg file in /etc/sysconfig/network-scripts. This
filename is suffixed by the name of the device; thus, ifcfg-eth0 is for the eth0 device, ifcfg-eth1 is for
the eth1 device, and so on.
If you choose to use a static IP address at installation time, the format for the interface
configuration file for eth0 will be as follows:
DEVICE=eth0
ONBOOT=yes
BOOTPROTO=none
NETMASK=255.255.255.0
IPADDR=192.168.1.100
GATEWAY=192.168.1.1
TYPE=Ethernet
HWADDR=00:0c:29:ac:5b:cd
NM_CONTROLLED=no
TIP Sometimes, if you are running other protocols—Internetwork Packet Exchange (IPX), for
instance—you might see variables that start with IPX. If you are not running or using the IPX (which
is typical), you won’t see these IPX variable entries.
If you choose to use Dynamic Host Configuration Protocol (DHCP) at installation time, your file
may look like this:
DEVICE=eth0
BOOTPROTO=dhcp
ONBOOT=yes
TYPE=Ethernet
HWADDR=00:0c:29:ac:5b:cd
NM_CONTROLLED=yes
These fields determine the IP configuration information for the eth0 device. Note how some of the
values correspond to the parameters in ifconfig.
To change the configuration information for this device, simply change the information in the ifcfg
file and run the following:
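ifdown eth0 ; ifup eth0    # bounce the interface so the new settings are read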
If you are changing from DHCP to a static IP address, simply change BOOTPROTO to equal
“none” and add lines for IPADDR, NETWORK, DNS1, DNS2, and BROADCAST.
A new interface configuration variable that you will usually find used in distros such as Fedora,
CentOS, and Red Hat Enterprise Linux (RHEL) is the NM_CONTROLLED variable. It is used for
enabling or disabling the use of the NetworkManager utility on the interface for managing network
devices and connections. This variable accepts either a “yes” or a “no.” If set to yes,
NetworkManager will need to be used to manage the interface—either from the GUI or from the
command-line equivalent (using nmcli). If set to no, NetworkManager will ignore this network
connection/device.
TIP In Fedora, RHEL, and CentOS distros, the file /usr/share/doc/initscripts-*/sysconfig.txt
explains the options and variables that can be used in the different /etc/sysconfig/network-scripts/
ifcfg-* configuration files.
If you need to configure a second network interface card (for example, eth1), you can copy the
syntax used in the original ifcfg-eth0 file by copying and renaming the ifcfg-eth0 file to ifcfg-eth1
and changing the information in the new ifcfg-eth1 file to reflect the second network card’s
information. When doing this, you have to make sure that the HWADDR variable (media access control,
or MAC, address) in the new file reflects the MAC address of the actual physical network device you
are trying to configure. Once the new ifcfg-eth1 file exists, Fedora will automatically configure it
during the next boot or the next time the network service is restarted.
If you need to activate the card immediately, run the following:
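ifup eth1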
Assuming your interface is not under the control of the NetworkManager program, you can also
restart the network service to make your changes take effect, like so:
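service network restart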
Or use this:
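/etc/init.d/network restart    # or, on systemd-based systems: systemctl restart network.service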
If, on the other hand, the network interface is under the control of the NetworkManager program
(because you have it installed, enabled, and you set NM_CONTROLLED=yes), then any changes to
the interface configuration file (such as ifcfg-eth0, ifcfg-em1, ifcfg-p6p1, and so on) will be automatically
applied to the running system without any more user input. NetworkManager is able to do this through
the magic of Linux subsystems such as udev, dbus, and so on.
TIP On a system configured to be a server, you might want to disable NetworkManager completely
so that the server’s network settings are not “auto-magically” configured or managed for you.
Assuming you have NetworkManager installed and you want to get it out of the way, first set the
parameter NM_CONTROLLED=no in the appropriate interface configuration file and then execute
the following commands:
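On a systemd-based distro such as recent Fedora, something along these lines does it:
systemctl stop NetworkManager.service
systemctl disable NetworkManager.service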
Additional Parameters
The format of the ifconfig command is as follows:
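ifconfig device address options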
Here, device is the name of the Ethernet device (for instance, eth0), address is the IP address you
want to apply to the device, and options are one of the following:
Option               Description
up                   Enables the device. This option is implicit.
down                 Disables the device.
arp                  Enables this device to answer ARP requests (default).
-arp                 Disables this device from answering ARP requests.
mtu value            Sets the maximum transmission unit (MTU) of the device to value. Under
                     Ethernet, this defaults to 1500. (See the note following the table
                     regarding certain Gigabit Ethernet cards.)
netmask address      Sets the netmask of this interface to address. If a value is not
                     supplied, ifconfig calculates the netmask from the class of the IP
                     address. A class A address gets a netmask of 255.0.0.0, class B gets
                     255.255.0.0, and class C gets 255.255.255.0.
broadcast address    Sets the broadcast address of this interface to address.
pointopoint address  Sets up a point-to-point (PPP) connection where the remote address is
                     address.
NOTE Many Gigabit Ethernet cards now support jumbo Ethernet frames. A jumbo frame is 9000
bytes in length, which (conveniently) holds one complete Network File System (NFS) packet. This
allows file servers to perform better, since they have to spend less time fragmenting packets to fit into
1500-byte Ethernet frames. Of course, your network infrastructure as a whole must support this in
order to benefit. If you have a network card and appropriate network hardware to set up jumbo
frames, it is very much worth looking into how to toggle on those features. If your Gigabit Ethernet
card supports it, you can set the frame size to 9000 bytes by changing the MTU setting when
configured with ifconfig (for example, ifconfig eth0 mtu 9000).
Network Device Configuration in Debian-Like Systems (Ubuntu, Kubuntu,
Edubuntu, and so on)
Debian-based systems such as Ubuntu use a different mechanism for managing network
configuration. Specifically, network configuration is done via the /etc/network/interfaces file.
The format of the file is simple and well-documented.
The entries in a sample /etc/network/interfaces file are discussed next. Please note that
line numbers have been added to aid readability.
1)  # The loopback network interface
2)  auto lo
3)  iface lo inet loopback
4)
5)  # The first network interface eth0
6)  auto eth0
7)  iface eth0 inet static
8)      address 192.168.1.45
9)      netmask 255.255.255.0
10)     gateway 192.168.1.1
11) iface eth0:0 inet dhcp
12)
13) # The second network interface eth1
14) auto eth1
15) iface eth1 inet dhcp
16) iface eth1 inet6 static
17)     address 2001:DB8::3
18)     netmask 64
Line 1 Any line that begins with the pound sign (#) is a comment and is ignored. Same
thing goes for blank lines.
Line 2 Lines beginning with the word auto are used to identify the physical interfaces
to be brought up when the ifup command executes, such as during system boot or when
the network run control script is run. The entry auto lo in this case refers to the
loopback device. Additional options can be given on subsequent lines in the same
stanza. The available options depend on the family and method.
Line 7 The iface directive defines the physical name of the interface being processed.
In this case, it is the eth0 interface. The iface directive in this example supports the
inet option, where inet refers to the address family. The inet option, in turn,
supports various methods, such as loopback (line 3), static (line 7), and dhcp (lines 11 and 15). The static method here is simply used to define Ethernet
interfaces with statically assigned IP addresses.
Lines 8–10 The static method specified in Line 7 allows various options, such as
address, netmask, gateway, and so on. The address option here defines the interface IP
address (192.168.1.45), the netmask option defines the subnet mask (255.255.255.0),
and the gateway option defines the default gateway (192.168.1.1).
Line 11 The iface directive is being used to define a virtual interface named eth0:0
that will be configured using DHCP.
Line 15 The iface directive defines the physical name of the interface being
processed. In this case, it is the eth1 interface. The iface directive in this example
supports the inet option, which is using the dhcp option. This means that the interface
will be dynamically configured using DHCP.
Lines 16–18 These lines assign a static IPv6 address to the eth1 interface. The
address assigned in this example is 2001:DB8::3 with the netmask 64.
After making and saving any changes to the interfaces file, the network interface can be
brought up or down using the ifup command. For example, after creating a new entry for the
eth1 device, you would type this:
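ifup eth1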
To bring down the eth1 interface, you would run this:
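ifdown eth1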
The sample interfaces file discussed here is a simple configuration. The
/etc/network/interfaces file supports a vast array of configuration options that were barely
covered here. Fortunately, the man page (man 5 interfaces) for the file is well documented.
Managing Routes
If your host is connected to a network with multiple subnets, you might need a router or gateway to
communicate with other hosts. This device sits between networks and redirects packets toward their
actual destination. (Typically, most hosts don’t know the correct path to a destination; they know only
the destination itself.)
If a host doesn’t even have the first clue about where to send a packet, it uses its default route.
This path points to a router, which ideally does have an idea of where the packet should go, or at
least knows of another router that can make smarter decisions.
NOTE On Fedora, RHEL, and CentOS systems, it is also possible to set certain system-wide
network-related values such as the default route, the hostname, NIS domain name, and so on, in the
appropriate /etc/sysconfig/network-scripts/ifcfg-* interface configuration file.
A typical single-homed Linux host knows of several standard routes. Some of the standard routes
are the loopback route, which simply points toward the loopback device. Another is the route to the
local area network (LAN) so that packets destined to hosts within the same LAN are sent directly to
them. Another standard route is the default route. This route is used for packets that are destined for
other networks outside of the LAN. Yet another route that you might see in a typical Linux routing
table is the link-local route (169.254.0.0). This is relevant in auto-configuration scenarios.
NOTE Request For Comment (RFC) 3927 offers details about auto-configuration addresses for IPv4.
RFC 4862 offers details about auto-configuration in IPv6. Microsoft refers to their implementation of
auto-configuration as Automatic Private IP Addressing (APIPA) or Internet Protocol Automatic
Configuration (IPAC).
If you set up your network configuration at install time, this setting is most likely already taken
care of for you, so you don’t need to change it. However, this doesn’t mean you can’t change it.
NOTE In some instances, you will need to change your routes by hand. Typically, this is necessary
when multiple network cards are installed into the same host, where each NIC is connected to a
different network (multi-homed). You should know how to add a route so that packets can be sent to
the appropriate network for a given destination address.
Simple Usage
The typical route command is structured as follows:
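route cmd type addy netmask mask gw gway dev dn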
The parameters are as follows:
Parameter        Description
cmd              Either add or del, depending on whether you are adding or deleting a
                 route. If you are deleting a route, the only other parameter you need is
                 addy.
type             Either -net or -host, depending on whether addy represents a network
                 address or a host address.
addy             The destination network (or host) to which you want to offer a route.
netmask mask     Sets the netmask of the addy address to mask.
gw gway          Sets the router address for addy to gway. Typically used for the default
                 route.
dev dn           Sends all packets destined to addy through the network device dn, as set
                 by ifconfig.
Here’s how to set the default route on a sample host, which has a single Ethernet device and a
default gateway at 192.168.1.1:
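route add default gw 192.168.1.1 dev eth0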
To add a default route to a system without an existing default route using the ip route utility, you
would type this:
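ip route add default via 192.168.1.1 dev eth0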
To set the default IPv6 route to point to the IPv6 gateway at the address 2001:db8::1 using the ip
command, type this:
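ip -6 route add default via 2001:db8::1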
To use the ip command to replace or change an existing default route on a host, you would use
this:
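ip route replace default via 192.168.1.1 dev eth0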
The next command line sets up a host route so that all packets destined for the remote host
192.168.2.50 are sent through the first PPP device:
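route add -host 192.168.2.50 dev ppp0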
To use ip to set a host route to a host 192.168.2.50 via the eth2 interface, you could try this:
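ip route add 192.168.2.50 dev eth2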
To use the ip command to set up an IPv6 route to a network (for example, 2001::/24) using a
specific gateway (such as 2001:db8::3), we run this command:
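ip -6 route add 2001::/24 via 2001:db8::3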
Here’s how to delete the route destined for 192.168.2.50:
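route del -host 192.168.2.50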
To delete using ip, you would type this:
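ip route del 192.168.2.50 dev eth2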
NOTE Don’t just set routes arbitrarily on production systems without having a proper understanding
of the network topology. Doing so can easily break the network connectivity. If you are using a
gateway, you need to make sure a route exists to the gateway before you reference it for another route.
For example, if your default route uses the gateway at 192.168.1.1, you need to be sure you have a
route to get to the 192.168.1.0 network first.
To delete an IPv6 route (e.g., to 2001::/24 via 2001:db8::3) using the ip command, run this:
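ip -6 route del 2001::/24 via 2001:db8::3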
Displaying Routes
You can display your route table in several ways, including using the route command, the netstat
command, and the ip route command.
route
Using route is one of the easiest ways to display your route table—simply run route without any
parameters. Here is a complete run, along with the output:
Here, you see two networks. The first is the 10.10.2.0 network, which is accessible via the first
Ethernet device, eth0. The second is the 192.168.1.0 network, which is connected via the second
Ethernet device, eth1. The third entry is the link-local destination network, which is used by hosts doing auto-configuration. The final entry is the default route. Its actual value in our example is 10.10.2.1;
however, because the IP address resolves to the host name “my-firewall” in Domain Name System
(DNS), route prints its hostname instead of the IP address.
We have already discussed the destination, gateway, netmask (referred to as genmask in this
table), and iface (interface, set by the dev option on route). The other entries in the table have the
following meanings:
Entry     Description
Flags     A summary of connection status, where each letter has a significance:
          U  The connection is up.
          H  The destination is a host.
          G  The destination is a gateway.
Metric    The cost of a route, usually measured in hops. This is meant for systems that
          have multiple paths to get to the same destination, but one path is preferred
          over the other. A path with a lower metric is typically preferred. The Linux
          kernel doesn't use this information, but certain advanced routing protocols do.
Ref       The number of references to this route. This is not used in the Linux kernel. It
          is here because the route tool itself is cross-platform. Thus, it prints this
          value, since other operating systems do use it.
Use       The number of successful route cache lookups. To see this value, use the -F
          option when invoking route.
Note that route displayed the hostnames to any IP addresses it could look up and resolve.
Although this is nice to read, it presents a problem when there are network outages and DNS or
Network Information Service (NIS) servers become unavailable. The route command will hang on,
trying to resolve hostnames and waiting to see if the servers come back to resolve them. This will go
on for several minutes until the request times out.
To get around this, use the -n option with route so that the same information is shown, but route
will make no attempt to perform hostname resolution on the IP addresses.
To view the IPv6 routes using the route command, type the following:
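route -A inet6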
netstat
Normally, the netstat program is used to display the status of all of the network connections on a
host. However, with the -r option, it can also display the kernel routing table. Note that most other
UNIX-based operating systems require that you use this method of viewing routes.
Here is an example invocation of netstat -r and its corresponding output:
In this example, you see a simple configuration. The host has a single network interface card, is
connected to the 192.168.1.0 network, and has a default gateway set to 192.168.1.1.
Like the route command, netstat can also take the -n parameter so that it does not perform
hostname resolution.
To use the netstat utility to display the IPv6 routing table, you can run the command:
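netstat -r -A inet6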
ip route
As mentioned, the iproute package provides advanced IP routing and network device configuration
tools. The ip command can also be used to manipulate the routing table on a Linux host. This is done
by using the route object with the ip command.
As with most commercial carrier-grade routing devices, a Linux-based system can actually
maintain and use several routing tables at the same time. The route command that you saw earlier
was actually displaying and managing only one of the default routing tables on the system—the main
table.
For example, to view the contents of table main (as displayed by the route command), you
would type this:
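ip route show table main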
To view the contents of all the routing tables on the system, type this:
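ip route show table all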
To display only the IPv6 routes, type this:
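ip -6 route show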
A Simple Linux Router
Linux has an impressive number of networking features, including its ability to act as a full-featured
router. For networks that need a low-cost router, a standard PC with a few network cards can work
quite nicely. Realistically, a Linux router is able to move a few hundred megabits per second,
depending on the speed of the PC, the CPU cache, the type of NIC, Peripheral Component
Interconnect (PCI) interfaces, and the speed of the front-side bus. In fact, several commercial routers
exist that are running a stripped and optimized Linux kernel under their hood with a nice GUI
administration front-end.
Routing with Static Routes
Let us assume that we want to configure a dual-homed Linux system as a router, as shown in
Figure 12-1.
Figure 12-1. Our sample network
In this network, we want to route packets between the 192.168.1.0/24 network and the
192.168.2.0/24 network. The default route is through the 192.168.1.8 router, which is performing
network address translation (NAT) to the Internet. (We discuss NAT in further detail in Chapter 13.)
For all the machines on the 192.168.2.0/24 network, we want to set their default route to 192.168.2.1
and let the Linux router figure out how to forward on to the Internet and the 192.168.1.0/24 network.
For the systems on the 192.168.1.0/24 network, we want to configure 192.168.1.15 as the default
route so that all the machines can see the Internet and the 192.168.2.0/24 network.
This requires that our Linux system have two network interfaces: eth0 and eth1. We configure
them as follows:
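ifconfig eth0 192.168.1.15 netmask 255.255.255.0 up
ifconfig eth1 192.168.2.1 netmask 255.255.255.0 up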
The result looks like this:
NOTE It is possible to configure a one-armed router where the eth0 interface is configured with
192.168.1.15 and eth0:0 is configured with 192.168.2.1. However, doing this will eliminate any
benefits of network segmentation. In other words, any broadcast packets on the wire will be seen by
both networks. Thus, it is usually preferable to put each network on its own physical interface.
When ifconfig adds an interface, it also creates a route entry for that interface based on the
netmask value. Thus, in the case of 192.168.1.0/24, a route is added on eth0 that sends all
192.168.1.0/24 traffic to it. With the two network interfaces present, let’s take a look at the routing
table:
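(Output shown here with route -n and abbreviated for clarity.)

Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
192.168.1.0     0.0.0.0         255.255.255.0   U     0      0        0 eth0
192.168.2.0     0.0.0.0         255.255.255.0   U     0      0        0 eth1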
All that is missing here is the default route to 192.168.1.8. Let’s add that using the route
command:
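route add default gw 192.168.1.8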
A quick check with ping verifies that we have connectivity through each route:
Looks good. Now it’s time to enable IP forwarding. This tells the Linux kernel that it is allowed
to forward packets that are not destined to it, if it has a route to the destination. This can be done
temporarily by setting /proc/sys/net/ipv4/ip_forward to 1 as follows:
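echo 1 > /proc/sys/net/ipv4/ip_forward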
Hosts on the 192.168.1.0/24 network should set their default route to 192.168.1.15, and hosts on
192.168.2.0/24 should set their default route to 192.168.2.1. Most importantly, don’t forget to make
the route additions and the enabling of ip_forward part of the startup scripts.
TIP Need a DNS server off the top of your head? For a quick query against an external DNS server,
try 4.2.2.1, which is currently owned by Verizon. The address has been around for a long time
(originally belonging to GTE Internet) and has numbers that are easy to remember. However, be nice
about it—a quick query or two to test connectivity is fine, but making it your primary DNS server
isn’t.
How Linux Chooses an IP Address
Now that host A has two interfaces (192.168.1.15 and 192.168.2.1) in addition to the loop-back
interface (127.0.0.1), we can observe how Linux will choose a source IP address with which to
communicate.
When an application starts, it has the option to bind to an IP address. If the application does not
explicitly do so, Linux will automatically choose the IP address on behalf of the application on a
connection-by-connection basis. When Linux is making the decision, it examines a connection’s
destination IP address, makes a routing decision based on the current route table, and then selects the
IP address corresponding to the interface from which the connection will leave. For example, if an
application on host A makes a connection to 192.168.1.100, Linux will find that the packet should go
out of the eth0 interface, and thus, the source IP address for the connection will be 192.168.1.15.
Let us assume that the application does choose to bind to an IP address. If the application were to
bind to 192.168.2.1, Linux will use that as the source IP address, regardless of from which interface
the connection will leave. For example, if the application is bound to 192.168.2.1 and a connection is
made to 192.168.1.100, the connection will leave out of eth0 (192.168.1.15) with the source IP
address of 192.168.2.1. It is now the responsibility of the remote host (192.168.1.100) to know how
to send a packet back to 192.168.2.1. (Presumably, the default route for 192.168.1.100 will know
how to deal with that case.)
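You can observe this behavior with the ping command, whose -I option binds outgoing packets to a specific source address. For example, running the following from host A forces the connection described above:

ping -I 192.168.2.1 192.168.1.100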
For hosts that have aliased IP addresses, a single interface may have many IP addresses. For
example, we can assign eth0:0 to 192.168.1.16, eth0:1 to 192.168.1.17, and eth0:2 to 192.168.1.18.
In this case, if the connection leaves from the eth0 interface and the application did not bind to a
specific interface, Linux will always choose the nonaliased IP address—that is, 192.168.1.15 for
eth0. If the application did choose to bind to an IP address—say, 192.168.1.17—Linux will use that
IP address as the source IP, regardless of whether the connection leaves from eth0 or eth1.
Hostname Configuration
A system’s host name (hostname) is the friendly name by which other systems or applications
can address the system on a network. Configuring the hostname for a system is therefore
considered an important network configuration task. You would have been prompted to create or
choose a hostname for your system during the OS installation. It is also possible that a hostname
that can be used to uniquely identify your system was automatically assigned to your system
during the initial installation.
You should take time to pick and assign hostnames to your servers that best describe their
function or role. You should also pick names that can scale easily as your collection of servers
grows. Examples of good and descriptive hostnames are webserver01.example.org,
dbserver09.example.com, logger-datacenterB, jupiter.example.org, saturn.example.org, pluto,
sergent.example.com, hr.example.com, and major.example.org.
Once you’ve settled on a hostname and naming scheme, you next need to configure the system
with the name. There is no standardized method for configuring the hostname among the various
Linux distros and so we offer the following recipes and pointers to different configuration
files/tools that can be used on the popular Linux distros:
Fedora, CentOS, and RHEL The hostname is set on these distributions by assigning
the desired value to the HOSTNAME variable in the /etc/sysconfig/network file.
openSUSE and SLE The hostname is set on these systems via the /etc/HOSTNAME
file.
Debian, Ubuntu, Kubuntu Debian-based systems use the /etc/hostname file to
configure the hostname of the system.
ALL Linux Distributions The sysctl tool can be used to temporarily change the
system hostname on the fly on virtually all Linux distros. The hostname value set
using this utility will not survive a system reboot. The syntax is:
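sysctl -w kernel.hostname=<desired_hostname>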
Summary
In this chapter you saw how the ifconfig, ip, and route commands can be used to configure the IP
addresses (IPv4 and IPv6) and route entries (IPv4 and IPv6) on Linux-based systems. We looked at
how this is done in Red Hat–like systems such as Fedora, CentOS, and RHEL. And we also looked at
how this is done in Debian-like systems such as Ubuntu. We also saw how to use these commands
together to build a simple Linux router.
Although kernel modules were covered earlier in the book, they were discussed in this chapter
again in the specific context of network drivers. Remember that network interfaces don’t follow the
same method of access as most other devices with a /dev entry.
Finally, remember that when making IP address and routing changes, you should be sure to add
any and all changes to the appropriate startup scripts. You may want to schedule a reboot if you’re on
a production system to make sure that the changes work as expected so that you don’t get caught off
guard later on.
If you’re interested in more details on routing, it is worth taking a closer look at the next chapter
and some of the advanced Linux routing features. Linux offers a rich set of functions that, while not
typically used in server environments, can be used to build powerful stand-alone appliances, routing
systems, and networks.
For anyone interested in dynamic routing using Routing Information Protocol (RIP), Open Shortest
Path First (OSPF), or Border Gateway Protocol (BGP), be sure to look into the Zebra project
(www.zebra.org) as well as its more current successor Quagga (www.quagga.net). These projects
are dedicated to building and providing highly configurable dynamic routing systems/platforms that
can share route updates with any standard router, including commercial hardware such as Cisco
hardware.
CHAPTER 13
Linux Firewall (Netfilter)
In what feels like a long, long time ago, the Internet was a pretty friendly place. The few users of
the network were focused on research and thus had better things to do than waste their time poking
at other people’s infrastructure. To the extent security was in place, it was largely to keep
practical jokers from doing silly things. Many administrators made no serious effort to secure their
systems, often leaving default administrator passwords in place.
Unfortunately, as the Internet population grew, so did the threat from the bored and malicious. The
need to put up barriers between the Internet and private networks started becoming increasingly
commonplace in the early 1990s. Papers such as “An Evening with Berferd” and “Design of a Secure
Internet Gateway” by Bill Cheswick signified the first popular idea of what is now known as a
firewall. (Both papers are available on Bill’s web site at www.cheswick.com/ches.) Since then,
firewall technology has been through a lot of changes.
The Linux firewall and packet filtering/mangling system has come a long way with these changes
as well; from an initial implementation borrowed from Berkeley Software Distribution (BSD),
through several major rewrites (kernels 2.0, 2.2, 2.4, 2.6, and 3.0) and three user-level interfaces
(ipfwadm, ipchains, and iptables). The current Linux packet filter and firewall infrastructure
(both kernel and user tools) is referred to as “Netfilter.”
In this chapter, we start with a discussion of how Linux Netfilter works, follow up with how those
terms are applied in the Linux 3.0 toolkit, and finish up with several configuration examples.
NOTE This chapter provides an introduction to the Netfilter system and demonstrates how firewalls
work, with enough guidance to secure a simple network. Entire volumes have been written about how
firewalls work, how they should be configured, and the intricacies of how they should be deployed. If
you are interested in security beyond the scope of a basic configuration, you should pick up some of
the books recommended at the end of the chapter.
How Netfilter Works
The principle behind Netfilter is simple: Provide a means of making decisions on how a
packet should flow. To make configuration easier, Netfilter provides a tool called iptables that can
be run from the command line. The iptables tool specifically manages Netfilter for Internet Protocol
version 4 (IPv4). The iptables tool makes it easy to list, add, and remove rules as necessary from
the system.
To filter and manage the firewall rules for IPv6 traffic, most Linux distros provide the iptables-ipv6 package. The command used to manage the IPv6 Netfilter sub-system is aptly named ip6tables. Most of the discussion and concepts about IPv4 Netfilter in this chapter also apply to IPv6 Netfilter.
All of the code that processes packets according to your configuration is actually run inside the
kernel. To accomplish this, the Netfilter infrastructure breaks the task down into several distinct types
of operations (tables): network address translation (nat), mangle, raw, and filter. Each operation
has its own table of operations that can be performed based on administrator-defined rules.
The nat table is responsible for handling network address translation—that is, changing a packet's source or destination IP address to a different address. The most common use for this is to
allow multiple systems to access another network (typically the Internet) from a single IP address.
When combined with connection tracking, network address translation is the essence of the Linux
firewall.
NOTE The nat table was not present in the IPv6 Netfilter sub-system (ip6tables) as of the time of this
writing.
The mangle table is responsible for altering or marking packets. The number of possible uses of
the mangle table is enormous; however, it is also infrequently used. An example of its usage would be
to change the ToS (Type of Service) bits in the IP header so that Quality of Service (QoS)
mechanisms can be applied to a packet, either later in the routing or in another system.
The raw table is used mainly for dealing with packets at a very low level. It is used for
configuring exemptions from connection tracking. The rules specified in the raw table operate at a
higher priority than the rules in other tables.
Finally, the filter table is responsible for providing basic packet filtering. This can be used to
allow or block traffic selectively according to whatever rules you apply to the system. An example of
filtering is blocking all traffic except for that destined to port 22 (SSH) or port 25 (Simple Mail
Transport Protocol, or SMTP).
A NAT Primer
Network address translation (NAT) allows administrators to hide hosts on both sides of a router so
that each side can, for whatever reason, remain blissfully unaware of the other. NAT under Netfilter
can be broken down into three categories: Source NAT (SNAT), Destination NAT (DNAT), and
Masquerading.
SNAT is responsible for changing the source IP address and port to make a packet appear to be
coming from an administrator-defined IP. This is most commonly used when a private network needs
to use an externally visible IP address. To use a SNAT, the administrator must know what the new
source IP address is when the rule is being defined. If it is not known (for example, the IP address is
dynamically defined by an Internet service provider [ISP]), the administrator should use
Masquerading (defined shortly). Another example of using SNAT is when an administrator wants to
make a specific host on one network (typically private) appear as another IP address (typically
public). SNAT, when used, needs to occur late in the packet-processing stages so that all of the other
parts of Netfilter see the original source IP address before the packet leaves the system.
DNAT is responsible for changing the destination IP address and port so that a packet is
redirected to another IP address. This is useful for situations in which administrators want to hide
servers in a private network (typically referred to as a demilitarized zone, or DMZ, in firewall
parlance) and map select external IP addresses to an internal address for incoming traffic. From a
management point of view, DNAT makes it easier to manage policies, since all externally visible IP
addresses are visible from a single host (also known as a choke point) in the network.
Finally, Masquerading is simply a special case of SNAT. This is useful for situations in which
multiple systems inside a private network need to share a single dynamically assigned IP address to
the outside world; this is the most common use of Linux-based firewalls. In such a case,
Masquerading will make all the packets appear as though they have originated from the NAT device’s
IP address, thus hiding the structure of your private network. Using this method of NAT also allows
your private network to use the RFC 1918 private IP spaces, as shown in Chapter 11
(192.168.0.0/16, 172.16.0.0/12, and 10.0.0.0/8).
Examples of NAT
Figure 13-1 shows a simple example where a host (192.168.1.2) is trying to connect to a server
(200.1.1.1). Using SNAT or Masquerading in this case would apply a transformation to the packet so
that the source IP address is changed to the NAT device’s external IP address (100.1.1.1). From the
server’s point of view, it is communicating with the NAT device, not the host directly. From the
host’s point of view, it has unobstructed access to the public Internet. If multiple clients were behind
the NAT device (say, 192.168.1.3 and 192.168.1.4), the NAT would transform all their packets to
appear as though they originated from 100.1.1.1 as well.
Figure 13-1. Using SNAT on a connection
Alas, this raises a small problem. The server is going to send back some packets—but how does
the NAT device know which packet to send to whom? Herein lies the magic: The NAT device
maintains an internal list of client connections and associated server connections called flows. Thus,
in the first example, the NAT is maintaining a record that “192.168.1.2:1025 converts to 100.1.1.1:49001, which is communicating with 200.1.1.1:80.” When 200.1.1.1:80 sends a packet back to 100.1.1.1:49001, the NAT device automatically alters the packet so that the destination IP is set to 192.168.1.2:1025 and then passes it back to the client on the private network.
In its simplest form, a NAT device is tracking only flows. Each flow is kept open so long as it
sees traffic. If the NAT does not see traffic on a given flow for some time, the flow is automatically
removed. These flows have no idea about the content of the connection itself, only that traffic is
passing between two endpoints, and it is the job of the NAT to ensure that the packets arrive as each
endpoint expects.
Now let’s look at the reverse case, as shown in Figure 13-2: A client from the Internet wants to
connect to a server on a private network through a NAT. Using DNAT in this situation, we can make
it the NAT’s responsibility to accept packets on behalf of the server, transform the destination IP of
the packets, and then deliver them to the server. When the server returns packets to the client, the
NAT engine must look up the associated flow and change the packet’s source IP address so that it
reads from the NAT device rather than from the server itself. Turning this into the IP addresses shown
in Figure 13-2, we see a server on 192.168.1.5:80 and a client on 200.2.2.2:1025. The client
connects to the NAT IP address, 100.1.1.1:80, and the NAT transforms the packet so that the
destination IP address is 192.168.1.5. When the server sends a packet back, the NAT device does the
reverse, so the client thinks that it is talking to 100.1.1.1. (Note that this particular form of NAT is
also referred to as port address translation, or PAT.)
Figure 13-2. Using DNAT on a connection
Connection Tracking and NAT
Although on the surface NAT appears to be a great way to provide security, it is unfortunately not
enough. The problem with NAT is that it doesn’t understand the contents of the flows and whether a
packet should be blocked because it is in violation of the protocol. For example, assume that we have
a network set up, as in Figure 13-2. When a new connection arrives for the web server, we know that
it must be a TCP SYN packet. There is no other valid packet for the purpose of establishing a new
connection. With a blind NAT, however, the packet will be forwarded, regardless of whether or not
it is a TCP SYN.
To make NAT more useful, Linux offers stateful connection tracking. This feature allows NAT
to examine a packet’s header intelligently and determine whether it makes sense from a TCP protocol
level. Thus, if a packet arrives for a new TCP connection that is not a TCP SYN, stateful connection
tracking will reject the packet without putting the server itself at risk. Even better, if a valid
connection is established and a malicious person tries to spoof a random packet into the flow, stateful
connection tracking will drop the packet unless it matches all of the criteria to be a valid packet
between the two endpoints (a difficult feat, unless the attacker is able to sniff the traffic ahead of
time).
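As a preview of the syntax covered later in this chapter, a typical stateful rule that accepts only packets belonging to, or related to, an already established connection looks like this:

iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT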
As we discuss NAT throughout the remainder of this chapter, keep in mind that wherever NAT
can occur, stateful connection tracking can occur.
NAT-Friendly Protocols
As we get into NAT in deeper detail, you might have noticed that we always seem to be talking about
single connections traversing the network. For protocols that need only a single connection to work,
such as HTTP, and for protocols that don’t rely on communicating the client’s or server’s real IP
address, such as SMTP, this is great. But what happens when you do have a finicky protocol that
needs multiple connections or needs to pass real IP addresses? Well, you have a slight problem in
this case—at least until you read the upcoming paragraph.
There are two solutions to handling these finicky protocols: Use an application-aware NAT or a
full application proxy. In the former case, the NAT will generally do the least possible work to make
the protocol correctly traverse the NAT, such as IP address fixes in the middle of a connection and
logically grouping multiple connections together because they are related to one another. The FTP
NAT is an example of both. The NAT must alter an active FTP stream so that the IP address that is
embedded in the packet is fixed to show the IP address of the NAT itself, and the NAT will know to
expect a connection back from the server and know to redirect it back to the appropriate client.
For more complex protocols or protocols for which full application awareness is necessary to
secure them correctly, an application-level proxy is typically required. The application proxy would
have the job of terminating the connection from the inside network and initiating it on behalf of the
client on the outside network. Any return traffic would have to traverse the proxy before going back to
the client.
From a practical point of view, few protocols actually need to traverse a NAT, and these
protocols are typically NAT-friendly already, in that they require a single client-to-server connection
only. Active FTP is the only frequently used protocol that requires a special module in
Netfilter. An increasing number of complex protocols are offering simple, NAT-friendly fallbacks
that make them easier to deploy. For example, most instant messenger, streaming media, and IP
telephony applications are offering NAT-friendly fallbacks.
As we cover different Netfilter configurations, you’ll be introduced to some of the modules that
support other protocols.
Chains
For each table there exists a series of chains that a packet goes through. A chain is simply a list of
rules that act on a packet flowing through the system.
There are five predefined chains in Netfilter: PREROUTING, FORWARD, POSTROUTING,
INPUT, and OUTPUT. Their relationship to one another is shown in Figure 13-3. You should note,
however, that the relationship between TCP/IP and Netfilter, as shown in the figure, is purely logical.
Figure 13-3. The relationship between the predefined chains in Netfilter
Each of the predefined chains can invoke rules that are in one of the predefined tables (NAT,
mangle, or filter). Not all chains can invoke any rule in any table; each chain can invoke rules only in
a defined list of tables. We will discuss which tables can be used from each chain when we explain
what each of the chains does in the sections that follow.
Administrators can add more chains to the system if they want. A packet matching a rule can then,
in turn, invoke another administrator-defined chain of rules. This makes it easy to repeat a list of rules
multiple times from different chains. You will see examples of this kind of configuration later in the
chapter.
All of the predefined chains are members of the mangle table. This means that at any point along
the path, it is possible to mark or alter the packet in an arbitrary way. The relationship among the
other tables and each chain, however, varies by chain. A visual representation of all of the
relationships is shown in Figure 13-4.
Figure 13-4. The relationship among predefined chains and predefined tables
We’ll cover each of these chains in more detail to help you understand these relationships.
PREROUTING
The PREROUTING chain is the first thing a packet hits when entering the system. This chain can
invoke rules in one of three tables: NAT, raw, and mangle. From a NAT perspective, this is the ideal
point at which to perform a Destination NAT (DNAT), which changes the destination IP address of a
packet.
Administrators looking to track connections for the purpose of a firewall should start the tracking
here, since it is important to track the original IP addresses along with any NAT address from a
DNAT operation.
FORWARD
The FORWARD chain is invoked only in the case when IP forwarding is enabled and the packet is
destined for a system other than the host itself. For example, if the Linux system has the IP address
172.16.1.1 and is configured to route packets between the Internet and the 172.16.1.0/24 network, and
a packet from 1.1.1.1 is destined to 172.16.1.10, the packet will traverse the FORWARD chain.
The FORWARD chain calls rules in the filter and mangle tables. This means that the
administrator can define packet-filtering rules at this point that will apply to any packets to or from
the routed network.
INPUT
The INPUT chain is invoked only when a packet is destined for the host itself. The rules that are run
against a packet happen before the packet goes up the stack and arrives at the application. For
example, if the Linux system has the IP address 172.16.1.1, the packet has to be destined to
172.16.1.1 in order for any of the rules in the INPUT chain to apply. If a rule drops all packets
destined to port 80, any application listening for connections on port 80 will never see any such
packets.
The INPUT chain calls on rules in the filter and mangle tables.
OUTPUT
The OUTPUT chain is invoked when packets are sent from applications running on the host itself. For
example, if an administrator on the command-line interface (CLI) tries to use Secure Shell (SSH) to
connect to a remote system, the OUTPUT chain will see the first packet of the connection. The
packets that return from the remote host will come in through PREROUTING and INPUT.
In addition to the filter and mangle tables, the OUTPUT chain can call on rules in the NAT table.
This allows administrators to configure NAT transformations to occur on outgoing packets that are
generated from the host itself. Although this is atypical, the feature does enable administrators to do
PREROUTING-style NAT operations on packets. (Remember: if the packet originates from the host,
it never has a chance to go through the PREROUTING chain.)
POSTROUTING
The POSTROUTING chain can call on the NAT and mangle tables. In this chain, administrators can
alter source IP addresses for the purposes of Source NAT (SNAT). This is also another point at
which connection tracking can happen for the purpose of building a firewall.
Installing Netfilter
The good news is that if you have a modern distribution of Linux, you should already have Netfilter
installed, compiled, and working. A quick check is simply to try running the iptables command, like
so:
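iptables -L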
On an Ubuntu system, you would run this command instead:
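sudo iptables -L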
A quick check of the IPv6 equivalent can be done with this command:
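ip6tables -L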
Note that some distributions do not include the /sbin directory in the default path, and there is a
good chance that the iptables program lives there. If you aren’t sure, try using one of the following
full paths: /sbin/iptables, /usr/sbin/iptables, /usr/local/bin/iptables, or /usr/local/sbin/iptables. The
/bin and /usr/bin directories should already be in your path and should have been checked when you
tried iptables without an absolute path.
If the command gave you a list of chains and tables, you know that Netfilter is installed. In fact,
there is a good chance that the OS installation process enabled some filters already! The Fedora,
RHEL and CentOS distros, for example, provide an option to configure a basic firewall at installation
time, and openSUSE also enables a more extensive firewall during the OS install; Ubuntu, on the
other hand, does not enable any firewall rules out of the box.
With Netfilter already present, you don’t have much else to do besides actually configuring and
using it!
The following sections offer some useful information about some of the options that can be used
when setting up (from scratch) a vanilla kernel that does not already have Netfilter enabled. The
complete process of installing Netfilter is actually two parts: enabling features during the kernel
compilation process and compiling the administration tools. Let’s examine the first part.
Enabling Netfilter in the Kernel
Most of Netfilter’s code actually lives inside of the kernel and ships with the standard kernel.org
distribution of Linux. To enable Netfilter, you simply need to enable the right options during the
kernel configuration step of compiling a kernel. If you are not familiar with the process of compiling a
kernel, see Chapter 9 for details.
Netfilter, however, has a lot of options. In this section, we cover those options and which ones
you want to select just in case you are building your kernel from scratch and want to use Netfilter.
Required Kernel Options
Three required modules must be supported: Network Packet Filtering, IP Tables, and Connection
Tracking.
The first is found under the Networking Support ∣ Networking Options ∣ Network packet
filtering framework (Netfilter) menu when configuring the kernel before compiling. This provides
the basic Netfilter framework functionality in the kernel. Without this option enabled, none of the
other options listed will work. Note that this feature cannot be compiled as a kernel module; it is
either in or out.
The second, IP tables, is found under Networking Support ∣ Networking Options ∣ Network
packet filtering framework (Netfilter) ∣ IP: Netfilter Configuration ∣ IP tables support. The
purpose of this module is to provide the IP Tables interface and management to the Netfilter system.
Technically, this module is optional, as it is possible to use the older ipchains or ipfwadm
interfaces; however, unless you have a specific reason to stick to the old interface, you should use IP
tables instead. If you are in the process of migrating from your very old ipchains/ipfwadm
configuration to IP Tables, you will want all of the modules compiled and available to you.
Finally comes the Connection Tracking option. This can be found under Networking Support ∣
Networking Options ∣ Network packet filtering framework (Netfilter) ∣ IP: Netfilter
Configuration ∣ IPv4 connection tracking support. It offers the ability to add support for intelligent
TCP/IP connection tracking and specific support for key protocols such as FTP. Like the IP Tables
option, this can be compiled as a module.
Optional but Sensible Kernel Options
With the options just named compiled into the kernel, you technically have enough to make Netfilter
work for most applications. However, a few more options can make life easier, provide additional
security, and support some common protocols. For all practical purposes, you should consider these
options as requirements. All of the following can be compiled as modules so that only those in active
use are loaded into memory:
FTP Protocol Support This option is available once Connection Tracking is selected. With
it, you can correctly handle active FTP connections through NAT. Active FTP requires that a
separate connection from the server be made back to the client when transferring data (such
as directory listings, file transfers, and so on). By default, NAT will not know what to do
with the server-initiated connection. With the FTP module, NAT will be given the
intelligence to handle the protocol correctly and make sure that the associated connection
makes it back to the appropriate client.
IRC Protocol Support This option is available once Connection Tracking is selected. If
you expect that users behind NAT will want to use Internet Relay Chat (IRC) to communicate
with others on the Internet, this module will be required to correctly handle connectivity,
IDENT requests, and file transfers.
Connection State Match This option is available once IP Tables Support is enabled. With
it, connection tracking gains the stateful functionality that was discussed in the section
“Connection Tracking and NAT” earlier in the chapter. To reiterate, it allows the matching
of packets based on their relationship to previous packets. This should be considered a
requirement for anyone configuring a system as a firewall.
Packet Filtering This option is required if you want to provide packet-filtering options.
REJECT Target Support This option is related to the Packet Filtering option in that it
provides a way of rejecting a packet based on the packet filter by sending an Internet Control
Message Protocol (ICMP) error back to the source of a packet instead of just dropping it.
Depending on your network, this may be useful; however, if your network is facing the
Internet, the REJECT option is not a good idea. It is better to drop packets you do not want
silently rather than generate more traffic.
LOG Target Support With this option, you can configure the system to log a packet that
matches a rule. For example, if you want to log all packets that are dropped, this option
makes it possible.
Full NAT This option is a requirement to provide NAT functionality in Netfilter.
MASQUERADE Target Support This option is a requirement to provide an easy way to
hide a private network through NAT. This module internally creates a NAT entry.
REDIRECT Target Support This option allows the system to redirect a packet to the NAT
host itself. Using this option allows you to build transparent proxies, which are useful when
it is not feasible to configure every client in your network with proper proxy settings or if the
application itself is not conducive to connecting to a proxy server.
NAT of Local Connections This option allows you to apply DNAT rules to packets that
originate from the NAT system itself. If you are not sure whether you’ll need this later on,
you should go ahead and compile it in.
Packet Mangling This option adds the mangle table. If you think you’ll want the ability to
manipulate or mark individual packets for options such as Quality of Service, you should
enable this module.
Other Options
Many additional options can be enabled with Netfilter. Most of them are set to compile as modules by
default, which means you can compile them now and decide whether you actually want to use them
later without taking up precious memory.
As you go through the compilation process, take some time to look at the other modules and read
their help sections. Many modules offer interesting little functions that you might find handy for doing
offbeat things that are typically not possible with firewalls. In other words, these functions really
allow you to show off the power of Netfilter and Linux.
Of course, there is a trade-off with the obscure. When a module is not heavily used, it doesn’t get
as heavily tested. If you’re expecting to run this NAT as a production system, you might want to stick
to the basics and keep things simple. Simple is easier to troubleshoot, maintain, and, of course,
secure.
Configuring Netfilter
There is a good chance that your Linux distro of choice has already configured some Netfilter settings
for you, especially if you are using a relatively recent distribution. This is usually done via a desktop
GUI tool or may have occurred during the OS installation.
From an administrative point of view, this gives you three choices: stick to the GUI for
configuring Netfilter, learn how to manage the system using the existing set of scripts, or move to the
command line.
If you choose to stick with a GUI, be aware that multiple GUIs are available for Linux in addition
to the one that might have shipped with your system. The key to your decision, however, is that once
you have made up your mind, you’re going to want to stick to it. Although it is possible to switch
between the GUI and CLI, it is not recommended, unless you know how to manage the GUI
configuration files by hand.
Managing the system using the existing set of scripts requires the least amount of changing from a
startup/shutdown script point of view, since you are using the existing framework; however, it also
means getting to know how the current framework is configured and learning how to edit those files.
Finally, ignoring the existing scripts and going with your own means you need to start from
scratch, but you will have the benefit of knowing exactly how it works, when it starts, and how to
manage it. The downside is that you will need to create all of the start and stop infrastructure as well. Because of the importance of the firewall functionality, it is not acceptable simply to add the configuration to the end of the /etc/rc.d/rc.local script, as that script runs at the end of startup. The window between a service starting and the firewall starting would leave too much time for a potential attack to happen.
Saving Your Netfilter Configuration
As you go through this chapter, you will create some custom firewall rules using the iptables
commands, possibly tweak some settings in the /proc file system, and load additional kernel modules
at boot time. To make these changes persistent across multiple reboots, you will need to save each of
these components so that they start as you expect them to at boot time.
Saving under Fedora and other Red Hat–type Linux distributions is quite straightforward. Simply
take the following steps:
1. Save your Netfilter rules to a sample plain text file named FIREWALL_RULES_FILE.txt by
running the following command:
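iptables-save > FIREWALL_RULES_FILE.txt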
On a Fedora distro, the equivalent of our sample FIREWALL_RULES_FILE.txt is
/etc/sysconfig/iptables, and so the command to run is:
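iptables-save > /etc/sysconfig/iptables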
2. Add the appropriate modules to the IPTABLES_MODULES variable in the
/etc/sysconfig/iptables-config file. For example, to add ip_conntrack_ftp and
ip_nat_ftp, make the IPTABLES_MODULES line read as follows:
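IPTABLES_MODULES="ip_conntrack_ftp ip_nat_ftp"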
TIP The configuration options for the IPv6 firewall (ip6tables) are stored in the /etc/sysconfig/ip6tables-config file. For example, the IPv6 equivalent of the IPv4 IPTABLES_MODULES directive is IP6TABLES_MODULES in the ip6tables-config file.
3. Make any changes to the kernel parameters as needed using the sysctl utility. For example,
to enable IP forwarding, you would run the following command:
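# enable forwarding immediately
sysctl -w net.ipv4.ip_forward=1
# and append the setting to /etc/sysctl.conf so that it persists (but see the following Note)
echo "net.ipv4.ip_forward = 1" >> /etc/sysctl.conf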
NOTE Some distributions already have commonly used kernel parameters defined (but disabled) in
the sysctl.conf file, so all that might be needed is to change the existing variables to the desired
value. So make sure that you examine the file for the presence of the setting that you want to change
and tweak that value, instead of appending to the file as we did previously.
For other distributions, the methods discussed here may vary. If you aren’t sure about how your
distribution works, or if it’s proving to be more headache than it is worth, simply disable the built-in
scripts from the startup sequence and add your own. If you choose to write your own script, you can
use the following outline:
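Here is a minimal sketch of what such a script might look like; the module name and default policies shown here are placeholders that you would replace with your own choices:

#!/bin/sh
# Skeleton firewall start/stop script -- adapt module names and rules to your own policy.

fw_start() {
    # Load any Netfilter helper modules you rely on
    modprobe ip_conntrack_ftp

    # Start from a clean slate
    iptables -F
    iptables -t nat -F

    # Default policies: drop inbound and forwarded traffic, allow outbound
    iptables -P INPUT DROP
    iptables -P FORWARD DROP
    iptables -P OUTPUT ACCEPT

    # Your ACCEPT and NAT rules go here
}

fw_stop() {
    # Flush everything and fall back to accept-all policies
    iptables -F
    iptables -t nat -F
    iptables -P INPUT ACCEPT
    iptables -P FORWARD ACCEPT
    iptables -P OUTPUT ACCEPT
}

case "$1" in
    start)   fw_start ;;
    stop)    fw_stop ;;
    restart) fw_stop; fw_start ;;
    *)       echo "Usage: $0 {start|stop|restart}" ;;
esac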
The iptables Command
The iptables command is the key to configuring the Netfilter system. A quick glance at its online
help with the iptables -h command shows an impressive number of configuration options. In this
section, we will walk through some of those options and learn how to use them.
At the heart of the command is the ability to define individual rules that are made a part of a rule
chain. Each individual rule has a packet-matching criterion and a corresponding action. As a packet
traverses a system, it will traverse the appropriate chains, as you saw in Figure 13-3 earlier in the
chapter. Within each chain, each rule will be executed on the packet in order. When a rule matches a
packet, the specified action is taken on the packet. These individual actions are referred to as targets.
Managing Chains
The format of the command varies by the desired action on the chain. These are the possible actions:
-A chain rule-spec            Append rule-spec to chain.
-D chain rule-spec            Delete rule-spec from chain.
-I chain [rulenum] rule-spec  Insert rule-spec at rulenum. If no rule number is specified,
                              the rule is inserted at the top of the chain.
-R chain rulenum rule-spec    Replace rulenum with rule-spec on chain.
-L [chain]                    List the rules on chain.
-F [chain]                    Flush (remove all) the rules on chain.
-Z [chain]                    Zero all the counters on chain.
-N chain                      Define a new chain called chain.
-X [chain]                    Delete chain. If no chain is specified, all nonstandard
                              chains are deleted.
-P chain target               Define the default policy for a chain. If no rules are matched
                              for a given chain, the default policy sends the packet to target.
-E chain new-chain            Rename chain to new-chain.
Recall that there are several built-in tables (NAT, filter, mangle, and raw) and five built-in chains
(PREROUTING, POSTROUTING, INPUT, FORWARD, and OUTPUT), and that Figure 13-4 shows
their relationships. As rules become more complex, however, it is sometimes necessary to break them
up into smaller groups. Netfilter lets you do this by defining your own chain and placing it within the
appropriate table.
When traversing the standard chains, a matching rule can trigger a jump to another chain in the
same table. For example, let’s create a chain called to_net10 that handles all the packets destined to the 10.0.0.0/8 network that are going through the FORWARD chain:
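iptables -t filter -N to_net10
iptables -t filter -A FORWARD -d 10.0.0.0/8 -j to_net10
iptables -t filter -A to_net10 -j RETURN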
In this example, the to_net10 chain doesn’t do anything but return control back to the FORWARD
chain.
TIP Every chain should have a default policy—that is, it must have a default action to take in the event a packet fails to match any of the rules. When you are designing a firewall, the safe approach is to set the default policy (using the -P option in iptables) for each chain to DROP and then explicitly insert ACCEPT rules for the network traffic that you do want to allow.
To create a sample chain named to_net10 for the IPv6 firewall, we would use this:
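ip6tables -N to_net10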
TIP The filter table is the default table used whenever a table name is not explicitly specified with
the iptables command. Therefore this rule
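iptables -t filter -A FORWARD -d 10.0.0.0/8 -j to_net10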
can also be written as:
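iptables -A FORWARD -d 10.0.0.0/8 -j to_net10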
Defining the Rule-Specification
In the preceding section, we made mention of rule-specification (rule-spec). The rule-spec is the list
of rules that are used by Netfilter to match on a packet. If the specified rule-spec matches a packet,
Netfilter will apply the desired action on it. The following iptables parameters make up the
common rule-specs.
-p [!] protocol This specifies the IP protocol to compare against. You can use any
protocol defined in the /etc/protocols file, such as tcp, udp, or icmp. A built-in value for
“all” indicates that all IP packets will match. If the protocol is not defined in /etc/protocols,
you can use the protocol number here. For example, 47 represents gre. The exclamation mark
(!) negates the check. Thus, specifying -p ! tcp means all packets that are not TCP. If this
option is not provided, Netfilter will assume “all.” The --protocol option is an alias for
this option. Here’s an example of its usage:
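iptables -A INPUT -p tcp --dport 80 -j ACCEPT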
For ip6tables, use this:
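ip6tables -A INPUT -p tcp --dport 80 -j ACCEPT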
These rules will accept all packets destined to TCP port 80 on the INPUT chain.
-s [!] address[/mask] This option specifies the source IP address to check against. When
combined with an optional netmask, the source IP can be compared against an entire
netblock. As with -p, the use of the exclamation mark (!) inverts the meaning of the rule.
Thus, specifying -s ! 10.13.17.2 means all packets not from 10.13.17.2. Note that the
address and netmask can be abbreviated. Here’s an example of its usage:
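# (the INPUT chain is used here as an example)
iptables -A INPUT -s 172.16.0.0/16 -j DROP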
This rule will drop all packets from the 172.16.0.0/16 network. This is the same network as
172.16.0.0/255.255.0.0.
To use ip6tables to drop all packets from the IPv6 network range 2001:DB8::/32, we
would use a rule like this:
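ip6tables -A INPUT -s 2001:DB8::/32 -j DROP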
-d [!] address[/mask] This option specifies the destination IP address to check against.
When combined with an optional netmask, the destination IP can be compared against an
entire netblock. As with -s, the exclamation mark negates the rule, and the address and
netmask can be abbreviated. Here’s an example of its usage:
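iptables -A FORWARD -d 10.100.93.0/24 -j ACCEPT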
This rule will allow all packets going through the FORWARD chain that are destined for the
10.100.93.0/24 network.
-j target This option specifies an action to “jump” to. These actions are referred to as
“targets” in iptables parlance. The targets that we’ve seen so far have been ACCEPT,
DROP, and RETURN. The first two accept and drop packets, respectively. The third is
related to the creation of additional chains.
As you saw in the preceding section, you can create your own chains to help keep things
organized and to accommodate more complex rules. If iptables is evaluating a set of rules in
a chain that is not built-in, the RETURN target will tell iptables to return back to the parent
chain. Using the earlier to_net10 example, when iptables reaches the -j RETURN, it goes
back to processing the FORWARD chain where it left off. If iptables sees the RETURN
action in one of the built-in chains, it will execute the default rule for the chain.
Additional targets can be loaded via Netfilter modules. For example, the REJECT target can
be loaded with ipt_REJECT, which will drop the packet and return an ICMP error packet
back to the sender. Another useful target is ipt_REDIRECT, which can make a packet be
destined to the NAT host itself even if the packet is destined for somewhere else.
-i interface This option specifies the name of the interface on which a packet was received.
This is handy for instances for which special rules should be applied if a packet arrives from
a physical location, such as a DMZ interface. For example, if eth1 is your DMZ interface and
you want to allow it to send packets to the host at 10.4.3.2, you can use this:
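# (the FORWARD chain is used here as an example)
iptables -A FORWARD -i eth1 -d 10.4.3.2 -j ACCEPT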
-o interface This option can also specify the name of the interface on which a packet will
leave the system. Here’s an example:
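iptables -A FORWARD -i eth0 -o eth1 -j ACCEPT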
In this example, any packets coming in from eth0 and going out to eth1 are accepted.
[!] -f This option specifies whether a packet is an IP fragment or not. The exclamation mark
negates this rule. Here’s an example:
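iptables -A INPUT -f -j DROP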
In this example, any IP fragments coming in on the INPUT chain are automatically dropped. The same option used with negation logic (matching packets that are not fragments) is shown here:
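iptables -A INPUT ! -f -j ACCEPT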
-c PKTS BYTES This option allows you to set the counter values for a particular rule when
inserting, appending, or replacing a rule on a chain. The counters correspond to the number
of packets and bytes that have traversed the rule, respectively. For most administrators, this
is a rare need. Here’s an example of its usage:
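iptables -I FORWARD -c 10 10 -f -j ACCEPT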
In this example, a new rule allowing packet fragments is inserted into the FORWARD chain,
and the packet counters are set to 10 packets and 10 bytes.
-v This option will display any output of iptables (usually combined with the -L option) to
show additional data. Here’s an example:
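iptables -L -v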
-n This option will display any hostnames or port names in their numeric form. Normally,
iptables will do Domain Name System (DNS) resolution for you and show hostnames
instead of IP addresses and protocol names (such as SMTP) instead of port numbers (25). If
your DNS system is down, or if you do not want to generate any additional packets, this is a
useful option.
Here’s an example:
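iptables -L -n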
-x This option will show the exact values of a counter. Normally, iptables will try to print
values in “human-friendly” terms and thus perform rounding in the process. For example,
instead of showing “10310,” iptables will show “10k.”
Here’s an example:
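iptables -L -v -x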
--line-numbers This option will display the line numbers next to each rule in a chain. This
is useful when you need to insert a rule in the middle of a chain and need a quick list of the
rules and their corresponding rule numbers.
Here’s an example of its usage:
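iptables -L --line-numbers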
For IPv6 firewall rules, use this:
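ip6tables -L --line-numbers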
Rule-Spec Extensions with Match
One of the most powerful aspects of Netfilter is the fact that it offers a “pluggable” design. For
developers, this means that it is possible to make extensions to Netfilter using an application
programming interface (API) rather than having to dive deep into the kernel code and hack away. For
users of Netfilter, this means a wide variety of extensions are available beyond the basic feature set.
These extensions are accomplished with the Match feature in the iptables command-line tool.
By specifying a desired module name after the -m parameter, iptables will take care of loading the
necessary kernel modules and then offer an extended command-line parameter set. These parameters
are used to offer richer packet-matching features.
In this section, we discuss the use of a few of these extensions that, as of this writing, have been
sufficiently well tested so that they are commonly included with Linux distributions.
TIP To get help for a match extension, simply specify the extension name after the -m parameter and
then give the -h parameter. For example, to get help for the ICMP module, use this:
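iptables -m icmp -h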
icmp This module provides an extra match parameter for the ICMP protocol:
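--icmp-type [!] typename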
Here, typename is the name or number of the ICMP message type. For example, to block a ping
packet, use the following:
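iptables -A INPUT -p icmp --icmp-type echo-request -j DROP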
For a complete list of supported ICMP packet types, see the module help page with the -h option.
limit This module provides a method of limiting the packet rate. It will match so long as the rate of
packets is under the limit. A secondary “burst” option matches against a momentary spike in traffic
but will stop matching if the spike sustains. The two parameters are
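--limit rate
--limit-burst number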
The rate is the sustained packet-per-second count. The number in the second parameter specifies
how many back-to-back packets to accept in a spike. The default value for number is 5. You can use
this feature as a simple approach to slowing down a SYN flood:
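# SYN packets arriving faster than this rate fall through to later rules (or the chain's default policy)
iptables -A INPUT -p tcp --syn -m limit --limit 1/s --limit-burst 5 -j ACCEPT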
This will limit the connection rate to an average of one per second, with a burst up to five
connections. This isn’t perfect, and a SYN flood can still deny legitimate users with this method;
however, it will help keep your server from spiraling out of control.
state This module allows you to determine the state of a TCP connection through the eyes of the
conntrack module. It provides one additional option:
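--state state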
Here, state is INVALID, ESTABLISHED, NEW, or RELATED. A state is INVALID if the packet in
question cannot be associated to an existing flow. If the packet is part of an existing connection, the
state is ESTABLISHED. If the packet is starting a new flow, it is considered NEW. Finally, if a
packet is associated with an existing connection (such as an FTP data transfer), it is RELATED.
Using this feature to make sure that new connections have only the TCP SYN bit set, we do the
following:
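iptables -A INPUT -p tcp ! --syn -m state --state NEW -j DROP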
Reading this example, we see that for a packet on the INPUT chain that is TCP, which does not
have the SYN flag set, and the state of a connection is NEW, we drop the packet. (Recall that
legitimate new TCP connections must start with a packet that has the SYN bit set.)
tcp This module allows you to examine multiple aspects of TCP packets. We have seen some of these
options (such as --syn) already. Here is a complete list of options:
--source-port [!] port[:port]   This option examines the source port of a TCP packet.
If a colon followed by a second port number is specified, a range of ports is checked. For
example, “6000:6010” means “all ports between 6000 and 6010, inclusive.” The
exclamation mark negates this setting. For example, --source-port ! 25 means “all
source ports that are not 25.” An alias for this option is --sport.

--destination-port [!] port[:port]   Similar to the --source-port option, this
examines the destination port of a TCP packet. Port ranges and negation are supported. For
example, --destination-port ! 9000:9010 means “all ports that are not between 9000
and 9010, inclusive.” An alias for this option is --dport.

[!] --tcp-flags mask comp   This checks the TCP flags that are set in a packet. The mask
tells the option what flags to check, and the comp parameter tells the option what flags must
be set. Both mask and comp can be a comma-separated list of flags. Valid flags are SYN,
ACK, FIN, RST, URG, PSH, ALL, and NONE, where ALL means all flags and NONE
means none of the flags. The exclamation mark negates the setting. For example, to use
--tcp-flags ALL SYN,ACK means that the option should check all flags and only the SYN and
ACK flags must be set.

[!] --syn   This checks whether the SYN flag is enabled. It is logically equivalent to
--tcp-flags SYN,RST,ACK SYN. The exclamation point negates the setting.
An example using this module checks whether a connection to DNS port 53 originates from port
53, does not have the SYN bit set, and has the URG bit set, in which case it should be dropped. Note
that DNS will automatically switch to TCP when a request is greater than 512 bytes.
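A rule implementing that example might look like this (the SYN,URG mask checks that the SYN flag is clear and the URG flag is set):

iptables -A INPUT -p tcp --sport 53 --dport 53 --tcp-flags SYN,URG URG -j DROP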
tcpmss This matches a TCP packet with a specific Maximum Segment Size (MSS). The lowest legal
limit for IP is 576, and the highest value is 1500. The goal in setting an MSS value for a connection is
to avoid packet segmentation between two endpoints. Dial-up connections tend to use 576-byte MSS
settings, whereas users coming from high-speed links tend to use 1500-byte values. Here’s the
command-line option for this setting:
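-m tcpmss [!] --mss value[:value]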
Here, value is the MSS value to compare against. If a colon followed by a second value is provided,
an entire range is checked. Here’s an example:
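# rules with no -j target simply count matching packets; the INPUT chain is used here as an example
iptables -A INPUT -p tcp -m tcpmss --mss 576
iptables -A INPUT -p tcp -m tcpmss ! --mss 576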
This will provide a simple way of counting how many packets (and how many bytes) are coming
from connections that have a 576-byte MSS and how many are not. To see the status of the counters,
use iptables -L -v.
udp Like the TCP module, the UDP module provides extra parameters to check for a packet. Two
additional parameters are provided:
--source-port [!] port[:port]   This option checks the source port of a User Datagram
Protocol (UDP) packet. If the port number is followed by a colon and another number, the
range between the two numbers is checked. If the exclamation point is used, the logic is
inverted.

--destination-port [!] port[:port]   Like the --source-port option, this option checks
the UDP destination port.
Here’s an example:
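iptables -A INPUT -p udp --dport 53 -j ACCEPT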
This example will accept all UDP packets destined for port 53. This rule is typically set to allow
traffic to DNS servers.
Cookbook Solutions
Now that you’ve made it this far into this chapter, your head is probably spinning just a bit and you
are feeling a little woozy. So many options, so many things to do, so little time!
Not to worry, because we have your back—this section offers some cookbook solutions to
common uses of the Linux Netfilter system that you can learn from and put to immediate use. Even if
you didn’t read the chapter and just landed here, you’ll find some usable cookbook solutions.
However, taking the time to understand what the commands are doing, how they are related, and how
you can change them is worthwhile. It will also turn a few examples into endless possibilities.
With respect to saving the examples for use on a production system, you will want to add the
modprobe commands to your startup scripts. In Fedora, CentOS, RHEL, and other Red Hat–type
systems, add the module name to the IPTABLES_MODULES variable in /etc/sysconfig/iptables-config.
On Ubuntu/Debian-based Linux distros, you can add modprobe directives to the firewall
configuration file /etc/default/ufw.
TIP Debian-based distributions such as Ubuntu can use a front-end program called Uncomplicated FireWall (ufw) for managing the iptables/Netfilter firewall stack. As its name implies, ufw is designed to make managing iptables rules easy (uncomplicated).
Fedora users can save their currently running iptables rules using the following built-in
iptables-save command:
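iptables-save > /etc/sysconfig/iptables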
This will write the currently running iptables rules to the /etc/sysconfig/iptables configuration file.
The IPv6 equivalent of the command to write out the IPv6 firewall rules to the configuration file
is shown here:
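ip6tables-save > /etc/sysconfig/ip6tables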
Other Linux distributions with Netfilter also have the iptables-save and ip6tables-save
commands. The only trick is to find the appropriate startup file in which to write the rules.
Rusty’s Three-Line NAT
Rusty Russell, one of the key developers of the Netfilter system, recognized that the most common use
for Linux firewalls is to make a network of systems available to the Internet via a single IP address.
This is a common configuration in home and small office networks where digital subscriber line
(DSL) or Point-to-Point Protocol over Ethernet (PPPoE) providers give only one IP address to use. In
this section, we honor Rusty’s solution and step through it here.
Assuming that you want to use your ppp0 interface as your connection to the world and use your
other interfaces (such as eth0) to connect to the inside network, run the following commands:
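# Load the NAT module (it may already be loaded, or load automatically, on newer kernels)
modprobe iptable_nat
# Masquerade everything leaving through ppp0, the Internet-facing link
iptables -t nat -A POSTROUTING -o ppp0 -j MASQUERADE
# Turn on IP forwarding
echo 1 > /proc/sys/net/ipv4/ip_forward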
This set of commands will enable a basic NAT to the Internet. To add support for active FTP
through this gateway, run the following:
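modprobe nf_nat_ftp     # on older kernels this FTP NAT helper module is named ip_nat_ftp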
If you are using Fedora, RHEL, or CentOS and want to make the iptables configuration part of
your startup script, run the following:
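service iptables save   # writes the running rules to /etc/sysconfig/iptables so the init script restores them at boot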
NOTE For administrators of other Linux distributions, you can also use the iptables-save or
ip6tables-save command (both are part of the iptables and iptables-ipv6 software suite and thus
apply to all Linux distributions). This command in conjunction with iptables-restore or
ip6tables-restore will allow you to save and restore your iptables settings easily.
Configuring a Simple Firewall
In this section, we start with a deny-all firewall for two cases: a simple network where no servers
are configured, and the same network, but with some servers configured. In the first case, I assume a
simple network with two sides: inside on the 172.16.1.0/24 network (eth1) and the Internet (eth0). Note
that by “server,” we are referring to anything that needs a connection made to it. This could, for
example, mean a Linux system running an ssh daemon or a Windows system running a web server.
Let’s start with the case where there are no servers to support.
First we need to make sure that the NAT module is loaded and that FTP support for NAT is
loaded. We do that with the modprobe commands:
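The module names shown here are typical; on some kernels the NAT module loads automatically, and the FTP helper is named ip_nat_ftp rather than nf_nat_ftp:

modprobe iptable_nat
modprobe nf_nat_ftp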
With the necessary modules loaded, we define the default policies for all the chains. For the
INPUT, FORWARD, and OUTPUT chains in the filter table, we set the destination to be DROP,
DROP, and ACCEPT, respectively. For the POSTROUTING and PREROUTING chains, we set their
default policies to ACCEPT. This is necessary for NAT to work.
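In iptables terms, that amounts to the following:

iptables -P INPUT DROP
iptables -P FORWARD DROP
iptables -P OUTPUT ACCEPT
iptables -t nat -P PREROUTING ACCEPT
iptables -t nat -P POSTROUTING ACCEPT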
With the default policies in place, we need to define the baseline firewall rule. What we want to
accomplish is simple: Let users on the inside network (eth1) make connections to the Internet, but
don’t let the Internet make connections back. To accomplish this, we define a new chain called
“block” that we use for grouping our state-tracking rules together. The first rule in that chain simply
states that any packet that is part of an established connection or that is related to an established
connection is allowed through. The second rule states that in order for a packet to create a new
connection, it cannot originate from the eth0 (Internet-facing) interface. If a packet does not match
against either of these two rules, the final rule forces the packet to be dropped.
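A sketch of that chain, using the state match, looks like this:

iptables -N block
iptables -A block -m state --state ESTABLISHED,RELATED -j ACCEPT
iptables -A block -m state --state NEW ! -i eth0 -j ACCEPT
iptables -A block -j DROP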
With the blocking chain in place, we need to call on it from the INPUT and FORWARD chains.
We aren’t worried about the OUTPUT chain, since only packets originating from the firewall itself
come from there. The INPUT and FORWARD chains, on the other hand, need to be checked. Recall
that when doing NAT, the INPUT chain will not be hit, so we need to have FORWARD do the check.
If a packet is destined to the firewall itself, we need the checks done from the INPUT chain.
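Hooking the block chain into both chains is a matter of two rules:

iptables -A INPUT -j block
iptables -A FORWARD -j block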
Finally, as the packet leaves the system, we perform the MASQUERADE function from the
POSTROUTING chain in the NAT table. All packets that leave from the eth0 interface go through this
chain.
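That is a single rule:

iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE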
With all the packet checks and manipulation behind us, we enable IP forwarding (a must for NAT
to work) and SYN cookie protection, plus we enable the switch that keeps the firewall from
processing ICMP broadcast packets (Smurf attacks).
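These switches live under /proc:

echo 1 > /proc/sys/net/ipv4/ip_forward
echo 1 > /proc/sys/net/ipv4/tcp_syncookies
echo 1 > /proc/sys/net/ipv4/icmp_echo_ignore_broadcasts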
At this point, we have a working firewall for a simple environment. If we don’t run any servers,
we can save this configuration and consider ourselves done. On the other hand, let’s assume we have
two applications that we want to make work through this firewall: a Linux system on the inside
network that we need SSH access to from remote locations and a Windows system from which we
want to run BitTorrent. Let’s start with the SSH case first.
To make a port available through the firewall, we need to define a rule that says, “If any packet on
the eth0 (Internet-facing) interface is TCP and has a destination port of 22, change its destination IP
address to 172.16.1.3.” This is accomplished by using the DNAT action on the PREROUTING chain,
since we want to change the IP address of the packet before any of the other chains see it.
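Expressed as a rule, with 172.16.1.3 being the inside Linux system running sshd, that looks like this:

iptables -t nat -A PREROUTING -i eth0 -p tcp --dport 22 -j DNAT --to-destination 172.16.1.3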
The second problem we need to solve is how to insert a rule on the FORWARD chain that allows
any packet whose destination IP address is 172.16.1.3 and destination port is 22 to be allowed. The
key word is insert (-I). If we append the rule (-A) to the FORWARD chain, the packet will instead be
directed through the block chain, because the rule iptables -A FORWARD -j block will apply first.
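So we insert the rule at the top of the chain instead:

iptables -I FORWARD -p tcp -d 172.16.1.3 --dport 22 -j ACCEPT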
We can apply a similar idea to make BitTorrent work. Let’s assume that the Windows machine
that is going to use BitTorrent is 172.16.1.2. The BitTorrent protocol uses ports 6881–6889 for
connections that come back to the client. Thus, we use a port range setting in the iptables command.
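The pair of rules mirrors the SSH case, with a port range instead of a single port:

iptables -t nat -A PREROUTING -i eth0 -p tcp --dport 6881:6889 -j DNAT --to-destination 172.16.1.2
iptables -I FORWARD -p tcp -d 172.16.1.2 --dport 6881:6889 -j ACCEPT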
Ta-da! We now have a working firewall and support for an SSH server and a BitTorrent user on
the inside of our network.
Summary
In this chapter we discussed the ins and outs of the Linux firewall, Netfilter. In particular, we
discussed the usage of the iptables and ip6tables commands. With this information, you should be
able to build, maintain, and manage a Linux-based firewall.
If it hasn’t already become evident, Netfilter is an impressively complex and rich system. Authors
have written complete books on Netfilter alone and other complete texts on firewalls. In other words,
you’ve got a good toolkit under your belt with this chapter.
In addition to this chapter, you may want to take some time to read up on more details of Netfilter.
More detailed information can be obtained from the main Netfilter web site (www.netfilter.org).
Don’t forget that security can be fun, too. The Cuckoo’s Egg by Clifford Stoll (Pocket, 2000) is a
true story of an astronomer turned hacker-catcher in the late 1980s. It makes for a great read and gives
you a sense of what the Internet was like before commercialization, let alone firewalls, became the
norm.
CHAPTER 14
Local Security
We frequently hear about newly discovered attacks (or vulnerabilities) against various
operating systems. An often important and overlooked aspect of these new attacks is the
exploit vector. In general, exploit vectors are of two types: those in which the vulnerability
is exploitable over a network and those in which the vulnerability is exploitable locally. Although
related, local security and network security require two different approaches. In this chapter, we
focus on security from a local security perspective.
Local security addresses the problem of attacks that require the attacker to be able to do
something on the system itself for the purpose of gaining root access (administrative access).
For example, a whole class of attacks take advantage of applications that create temporary files in
the /tmp directory but do not check the temporary file’s ownership, its file permissions, or whether it
is a link to another file before opening and writing to it. An attacker can create a symbolic link of the
expected temporary filename to a file that he wants to corrupt (such as /etc/passwd) and then run the
application. If the application is SetUID to root (covered later in this chapter), it will destroy the
/etc/passwd file when writing to its temporary file. The attacker can use the lack of an
/etc/passwd file to bypass other security mechanisms so that he can gain root access. This attack is
purely a local security issue, because of the existence and use of a SetUID application that is
available on the local system.
Systems that have untrustworthy users as well as lack of proper local security mechanisms can
pose a real problem and invite attacks. University environments are often ripe for these types of
attacks: students may need access to servers to complete assignments and perform other academic
work, but such a situation can be a great threat to the system: when students get bored, they may test the bounds of their access and their own creativity, and they may sometimes not think about the consequences and impacts of their actions.
Local security issues can also be triggered by network security issues. If a network security issue
results in an attacker being able to invoke any program or application on the server, the attacker can
use a local security-based exploit not only to give herself full access to the server, but also to
escalate her own privileges to the root user. “Script kiddies”—attackers who use other people’s
attack programs because they are incapable of creating their own—are known to use these kinds of
methods to gain full access to your system. In their parlance, you’ll be “owned.”
This chapter addresses the fundamentals of keeping your system secure against local security
attacks. Keep in mind, however, that a single chapter on this topic will not make you an expert.
Security is a field that is constantly evolving and requires constant updating. The McGraw-Hill
“Hacking Exposed” series of books is an excellent place to jumpstart your knowledge, and you can
pick up big security news at the BugTraq mailing list (www.securityfocus.com).
In this chapter, you will notice two recurrent themes: mitigating risk and simpler is better. The
former is another way of adjusting your investment (both in time and money), given the risk you’re
willing to take on and the risk that a server poses if compromised. And keep in mind that because you
cannot prevent all attacks, you have to accept a certain level of risk—and the level of risk you accept
will drive the investment in both time and money. So, for example, a web server dishing up your
vacation pictures on a low-bandwidth link is a lower risk than a server handling large financial
transactions for Wall Street.
The “simpler is better” logic is engineering 101—simple systems are less prone to problems,
easier to fix, easier to understand, and inevitably more reliable. Keeping your servers simple is a
desirable goal.
Common Sources of Risk
Security is the mitigation of risk. Along with every effort of mitigating risk comes an associated cost.
Costs are not necessarily financial; they can take the form of restricted access, loss of functionality, or
loss of time. Your job as an administrator is to balance the costs of mitigating risk with the potential
damage that an exploited risk can cause.
Consider a web server, for example. The risk of hosting a service that can be probed, poked at,
and possibly exploited is inherent in exposing any network accessibility. However, you may find that
the risk of exposure is low so long as the web server is maintained and immediately patched when
security issues arise. If the benefit of running a web server is great enough to justify your cost of
maintaining it, then it is a worthwhile endeavor.
In this section, we look at common sources of risk and examine what you can do to mitigate those
risks.
SetUID Programs
SetUID programs are executables that have a special attribute (flag) set in their permissions, which
allows users to run the executable in the context of the executable’s owner. This enables
administrators to make selected applications, programs, or files available with higher privileges to
normal users, without having to give those users any administrative rights. An example of such a
program is ping. Because the creation of raw network packets is restricted to the root user (creation
of raw packets allows the application to add any contents within the packet, including attacks), the
application must run with the SetUID bit enabled and the owner set to root. Thus, for example,
even though user yyang may start the ping program, the program can be run in the context of the root
user for the purpose of placing an Internet Control Message Protocol (ICMP) packet onto the network.
The ping utility in this example is said to be “SetUID root.”
The problem with programs that are running with root privileges is that they have an obligation to
be highly “conscious” of their security as well. It should not be possible for a normal user to do
something dangerous on the system by using that program. This means many checks need to be written
into the program and potential bugs must be carefully removed. Ideally, these programs should be
small and do one thing. This makes it easier to evaluate the code for potential bugs that can harm the
system or allow for a user to gain privileges that he or she should not have.
From a day-to-day perspective, it is in the administrator’s best interest to keep as few SetUID
root programs on the system as possible. The risk balance here is the availability of features/functions
to users versus the potential for bad things to happen. For some common programs such as ping,
mount, traceroute, and su, the risk is low for the value they bring to the system. Some well-known
SetUID programs, such as the X Window System, pose a low to moderate risk; however, given X
Window System’s exposure, it is unlikely to be the root of any problems. If you are running a pure
server environment and you do not need X Window System, it never hurts to remove it.
SetUID programs executed by web servers are almost always a bad thing. Use great caution with
these types of applications and look for alternatives. The exposure is much greater, since it is
possible for network input (which can come from anywhere) to trigger this application and affect its
execution.
If you find that you must run an application SetUID with root privileges, another alternative is to
find out whether it is possible to run the application in a chroot environment (discussed later in this
chapter in the section “Using chroot”).
Finding and Creating SetUID Programs
A SetUID program has a special file attribute that the kernel uses to determine whether it should
override the default permissions granted to an application. When you’re doing a directory listing, the
permissions shown on a file in its ls -l output will reveal this little fact. Here’s an example:
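ls -l /bin/ping
-rwsr-xr-x. 1 root root 40760 Jul 20  2011 /bin/ping

(The size, date, and exact path of the ping binary will vary from one distribution to another.)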
If the fourth character in the permissions field is an s, the application is SetUID. If the file’s
owner is root, then the application is SetUID root. In the case of ping, we can see that it will execute
with root permissions available to it.
Here’s another example—the Xorg (X Window) program:
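ls -l /usr/bin/Xorg
-rwsr-xr-x. 1 root root 1910348 Oct 12  2011 /usr/bin/Xorg

(Again, the size, date, and path will differ on your system.)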
As with ping, we see that the fourth character of the permissions is an s and the owner is root. The
Xorg program is, therefore, SetUID root.
To determine whether a running process is SetUID, you can use the ps command to see both the
actual user of a process and its effective user, like so:
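ps -eo pid,euser,ruser,comm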
This will output all of the running programs with their process ID (PID), effective user (euser), real
user (ruser), and command name (comm). If the effective user is different from the real user, it is
likely a SetUID program.
NOTE Some applications that are started by the root user give up their permissions to run as a less
privileged user to improve security. The Apache web server, for example, might be started by the
root user to allow it to bind to Transmission Control Protocol (TCP) port 80 (only privileged users
can bind to ports lower than 1024), but it then gives up its root permissions and starts all of its
threads as an unprivileged user (typically the user “nobody,” “apache,” or “www”).
To make a program run as SetUID, use the chmod command. Prefix the desired permissions with a
4 to turn on the SetUID bit. (Using a prefix of 2 will enable the SetGID bit, which is similar to
SetUID, but offers group permissions instead of user permissions.)
For example, if we have a program called myprogram and we want to make it SetUID root, we
would do the following:
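chown root myprogram
chmod 4755 myprogram     # the leading 4 turns on the SetUID bit; 755 is the usual rwxr-xr-x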
Ensuring that a system has only the absolutely minimum and necessary SetUID programs can be a
good housekeeping measure. A typical Linux distribution can easily have several files and
executables that are unnecessarily SetUID. Going from directory to directory to find SetUID programs
can be tiresome and error-prone. So instead of doing that manually, you can use the find command,
like so:
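find / -type f -perm -4000 -ls     # lists every regular file on the system with the SetUID bit set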
Unnecessary Processes
When stepping through startup and shutdown scripts, you may have noticed that a standard-issue Linux
system starts with a lot of processes running. The question that needs to be asked is this: Do I really
need everything I start? You might be surprised at your answer.
The underlying security issue always goes back to risk: Is the risk of running an application worth
the value it brings you? If the value a particular process brings is zero because you’re not using it,
then no amount of risk is worth it. Looking beyond security, there is the practical matter of stability
and resource consumption. If a process brings zero value, even a benign process that does nothing but
sit in an idle loop uses memory, processor time, and kernel resources. If a bug were to be found in
that process, it could threaten the stability of your server. Bottom line: If you don’t need it, don’t run
it.
If your system is running as a server, you should reduce the number of processes that are run. For
example, if there is no reason for the server to connect to a printer, disable the print services. If there
is no reason the server should accept or send e-mail, turn off the mail server component. If no
services are run from xinetd, then xinetd should be turned off. No printer? Turn off Common UNIX
Printing System (CUPS). Not a file server? Turn off Network File System (NFS) and Samba.
A Real-Life Example: Thinning Down a Server
Let’s take a look at a real-life deployment of a Linux server handling web and e-mail access
outside of a firewall and a Linux desktop/workstation behind a firewall with a trusted user. The
two configurations represent extremes: tight configuration in a hostile environment (the Internet)
and a loose configuration in a well-protected and trusted environment (a local area network, or
LAN).
The Linux server runs the latest Fedora distro. With unnecessary processes thinned down,
the server has 10 programs running, with 18 processes when no one is logged in. Of the 10
programs, only SSH, Apache, and Sendmail are externally visible on the network. The rest
handle basic management functions, such as logging (rsyslog) and scheduling (cron). Removing
nonessential services used for experimentation only (for example, Squid proxy server), the
running program count can be reduced to 7 (init, syslog, cron, SSH, Sendmail, Getty, and
Apache), with 13 processes running, 5 of which are Getty to support logins on serial ports and
the keyboard.
By comparison, a Fedora system configured for desktop usage by a trusted user that has not
been thinned down can have as many as 100 processes that handle everything from the X
Window System, to printing, to basic system management services.
For desktop systems where the risk is mitigated (for example, where the desktop sits behind
a firewall and the users are trusted), the benefits of having a lot of these applications running
might well be worth the risk. Trusted users appreciate the ability to print easily and enjoy having
access to a nice user interface, for example. For a server such as the Linux server, however, the
risk would be too great to have unnecessary programs running, and, therefore, any program or
process not needed should be removed.
Fully thinned down, the server should be running the bare minimum it needs to provide the
services required of it.
Picking the Right Runlevel
Most default Linux installations will boot straight to the X Window System. This provides a nice
startup screen, a login menu, and an overall positive desktop experience. For a server, however, all
of that is typically unnecessary for the reasons already stated.
Most Red Hat Package Manager (RPM)–based Linux distributions, such as Fedora, Red Hat
Enterprise Linux (RHEL), openSUSE, CentOS, and so on, that are configured to boot and load the X
Window (GUI) subsystem will boot to runlevel 5 (also referred to as the graphical target on systemd
enabled distros). In such distros, changing the runlevel to 3 will turn off X Window.
The /etc/inittab (or its equivalent) file traditionally controls the runlevel that such systems boot
into. For example, to make an openSUSE server boot into runlevel 3 (no GUI) instead of runlevel 5,
the /etc/inittab file needs to be edited so that the entry in the file that looks like this,
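id:5:initdefault: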
is changed to this,
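id:3:initdefault: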
Linux distros that have fully implemented the systemd service manager use the systemctl utility,
as well as a series of file system elements (soft links) to control and manage the system’s default boot
target (runlevel). Chapters 6 and 8 cover systemd in detail.
Debian-based systems such as Ubuntu use the /etc/init/rc-sysinit.conf file to control the default
runlevel that the system boots into. The default runlevel on such systems is usually runlevel 2. And the
control of whether the X Window subsystem starts up is left to the run control scripts (rc scripts).
TIP You can see what runlevel you’re in by simply typing runlevel at the prompt:
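runlevel

The first value printed is the previous runlevel (N means there was none) and the second is the current one; output such as N 3, for example, means the system booted straight into runlevel 3.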
Nonhuman User Accounts
User accounts on a server need not always correspond to humans. Recall that every process running
on a Linux system must have an owner. Running the ps auxww command on your system will show all
of the process owners on the leftmost column of its output. On your desktop system, for example, you
could be the only human user, but a look at the /etc/passwd files shows that there are several other
user accounts on the system.
For an application to drop its root privileges, it must be able to run as another user. Here is where
those extra users come into play: Each application that gives up root can be assigned another
dedicated user on the system. This user typically owns all the application’s files (including
executable, libraries, configuration, and data) and the application processes. By having each
application that drops privileges use its own user, the risk of a compromised application having
access to other application configuration files is mitigated. In essence, an attacker is limited by what
files the application can access, which, depending on the application, may be quite uninteresting.
Limited Resources
To better control the resources available to processes started by the shell, the ulimit facility can be
used. System-wide defaults can be configured using the /etc/security/limits.conf file.
ulimit options can be used to control such things as the number of files that may be open, how
much memory they may use, CPU time they may use, how many processes they may open, and so on.
The settings are read by the PAM (Pluggable Authentication Module) libraries when a user starts up.
The key to choosing ulimit values is to consider the purpose of the system. For example, in the
case of an application server, if the application is going to require a lot of processes to run, then the
system administrator needs to ensure that ulimit caps don’t cripple the functionality of the system.
Other types of servers, such as a Domain Name System (DNS) server, should not need more than a
small handful of processes.
Note a caveat here: PAM must have a chance to run to set the settings before the user does
something. If the application starts as root and then drops permissions, PAM is not likely to run. From
a practical point of view, this means that having individual per-user settings is not likely to do you a
lot of good in most server environments. What will work are global settings that apply to both root
and normal users. This detail turns out to be a good thing in the end; having root under control helps
keep the system from spiraling away both from attacks and from broken applications.
TIP A new Linux kernel feature known as control groups (cgroups) also provides the ability to
manage and allocate various system resources such as CPU time, network bandwidth, memory, and so
on. For more on cgroups, see Chapter 10.
The Fork Bomb
A common trick that students still play on other students is to log into their workstations and run
a “fork bomb.” This is a program that simply creates so many processes that it overwhelms the
system and brings it to a grinding halt. For the student victim, this is merely annoying. For a
production server, this is fatal. Here’s a simple shell-based fork bomb using the Bourne Again
Shell (BASH):
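:(){ :|:& };:     # defines a function that endlessly spawns copies of itself; do not run it on a machine you care about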
If you don’t have protections in place, this script will crash your server.
The interesting thing about fork bombs is that not all of them are intentional. Broken
applications, systems under denial-of-service (DoS) attacks, and sometimes just simple
typographical errors while entering commands can lead to bad things happening. By using the
limits described in this chapter, you can mitigate the risk of a fork bomb by restricting the
maximum number of processes that a single user can invoke. While the fork bomb can still cause
your system to become highly loaded, it will likely remain responsive enough to allow you to
log in and deal with the situation, all the while hopefully maintaining the other services offered.
It’s not perfect, but it is a reasonable balance between dealing with the malicious and not being
able to do anything at all.
The format of each line in the /etc/security/limits.conf file is as follows:
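domain    type    item    value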
Any line that begins with a pound sign (#) is a comment. The domain value holds the login of a
user or the name of a group; it can also be a wildcard (*). The type field refers to the type of limit, as
in “soft” or “hard.”
The item field refers to what the limit applies to. The following is a subset of items that an
administrator might find useful:
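core        Size of core dump files (in KB)
fsize       Maximum file size (in KB)
nofile      Maximum number of open files
cpu         Maximum CPU time (in minutes)
nproc       Maximum number of processes
maxlogins   Maximum number of simultaneous logins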
A reasonable setting for most users is simply to restrict the number of processes, unless there is a
specific reason to limit the other settings. If you need to control total disk usage for a user, you should
use disk quotas instead.
Limiting the number of processes to 128 for each user, for example, can be achieved by creating an entry like the one shown here in the /etc/security/limits.conf file:
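*    hard    nproc    128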
If you log out and log in again, you can see the limit take effect by running the ulimit command
with the -a option to see what the limits are. The max user processes entry in the following
sample output shows the change (third to last line of the output). Type the following:
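ulimit -a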
Mitigating Risk
Once you know what the risks are, mitigating them becomes easier. You might find that the risks you
see are sufficiently low, such that no additional securing is necessary. For example, a Microsoft
Windows desktop system used by a trusted, well-experienced user is a low risk for running with
administrator privileges. The risk that the user downloads and executes something that can cause
damage to the system is low. Furthermore, steps taken to mitigate the risk, such as sticking to well-trusted web sites and disabling the automatic execution of downloaded files, further alleviate the risk.
This well-experienced user may find that being able to run some additional tools and having raw
access to the system are well worth the risk of running with administrator privileges. Like any
nontrivial risk, the list of caveats is long.
Using chroot
The chroot() system call (pronounced “cha-root”) allows a process and all of its child processes to
redefine what they perceive the root directory to be. For example, if you were to chroot(“/www”)
and start a shell, you would find that using the cd command would leave you at /www. The program
would believe that /www is the root directory, but in reality, that would not be the case. This restriction applies
to all aspects of the process’s behavior: where it loads configuration files, shared libraries, data
files, and so on. The restricted environment is also commonly referred to as a “jail.”
NOTE Once executed, the change in root directory by chroot is irrevocable through the lifetime of
the process.
When the perceived root directory of the system is changed, a process has a restricted view of
what is on the system. Access to other directories, libraries, and configuration files is not available.
Because of this restriction, it is necessary for an application to have all of the files necessary for it to
work completely contained within the chroot environment. This includes any passwd files, libraries,
binaries, and data files.
Every application needs its own set of files and executables, and thus the directions for making an
application work in a chroot environment vary. However, the principle remains the same: make it
all self-contained under a single directory with a faux root directory structure.
CAUTION A chroot environment will protect against accessing files outside of the directory, but it
does not protect against system utilization, memory access, kernel access, and interprocess
communication. This means that if there is a security vulnerability that someone can take advantage of
by sending signals to another process, it will be possible to exploit it from within a chroot
environment. In other words, chroot is not a perfect cure, but is rather more of a deterrent.
An Example chroot Environment
As an example, let’s create a chroot environment for the BASH shell. We begin by creating the
directory into which we want to put everything. Because this is just an example, we’ll create a
directory in /tmp called myroot.
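mkdir /tmp/myroot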
Let’s assume we need only two programs: bash and ls. Let’s create the bin directory under
myroot
and copy the binaries over there:
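mkdir /tmp/myroot/bin
cp /bin/bash /bin/ls /tmp/myroot/bin/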
With the binaries there, we now need to check whether these binaries need any libraries. We use
the ldd command to determine what (if any) libraries are used by these two programs.
We run ldd against /bin/bash, like so:
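ldd /bin/bash

The output lists the shared libraries (.so files) that bash was linked against, along with the paths where the dynamic linker found them.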
We also run ldd against /bin/ls, like so:
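ldd /bin/ls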
Now that we know what libraries need to be in place, we create the lib64 directory and copy the
64-bit libraries over (because we are running a 64-bit operating system).
First we create the /tmp/myroot/lib64 directory:
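mkdir /tmp/myroot/lib64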
For shared libraries that /bin/bash needs, we run the following:
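The library names below are typical for a 64-bit system of this vintage; copy whatever your own ldd output listed:

cp /lib64/libtinfo.so.5 /lib64/libdl.so.2 /lib64/libc.so.6 /lib64/ld-linux-x86-64.so.2 /tmp/myroot/lib64/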
And for /bin/ls, we need the following:
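Again, substitute the exact names from your ldd output:

cp /lib64/libselinux.so.1 /lib64/librt.so.1 /lib64/libcap.so.2 /lib64/libacl.so.1 /lib64/libattr.so.1 /lib64/libdl.so.2 /lib64/libpthread.so.0 /tmp/myroot/lib64/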
CAUTION The previous copy (cp) commands were based strictly on the output of the ldd /bin/bash and ldd /bin/ls commands on our sample system. You might need to modify the names and versions of
the files that you are copying over to the chroot environment to match the exact file names that are
required on your system/platform.
Most Linux distros include a little program called chroot that invokes the chroot() system call
for us, so we don’t need to write our own C program to do it. The program takes two parameters: the
directory that we want to make the root directory and the command that we want to run in the chroot
environment. We want to use /tmp/myroot as the directory and start /bin/bash, so we run the
following:
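chroot /tmp/myroot /bin/bash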
Because there is no /etc/profile or /etc/bashrc to change our prompt, the prompt will change to
bash-4.1#. Now try an ls:
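bash-4.1# ls
bin  lib64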
Then try a pwd to view the current working directory:
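bash-4.1# pwd
/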
NOTE We didn’t need to explicitly copy over the pwd command used previously, because pwd is one
of the many BASH built-in commands. It comes with the BASH program that we already copied over.
Since we don’t have an /etc/passwd or /etc/group file in the chrooted environment (to help map
numeric user IDs to usernames), an ls -l command will show the raw user ID (UID) values for each
file. Here’s an example:
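bash-4.1# ls -l /bin
-rwxr-xr-x 1 0 0 938832 Aug 21 21:04 bash
-rwxr-xr-x 1 0 0 112408 Aug 21 21:04 ls

(Sizes and dates will reflect your own copies of the files; the point is the 0 0 where root root would normally appear.)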
With limited commands/executables in our sample chroot environment, the environment isn’t
terribly useful for practical work, which is what makes it great from a security perspective; we allow
only the minimum files necessary for an application to work, thus minimizing our exposure in the
event the application gets compromised.
Keep in mind that not all chroot environments need to have a shell and an ls command installed
—for example, if the Berkeley Internet Name Domain (BIND) DNS server needs only its own
executable, libraries, and zone files installed, then that’s all you need.
SELinux
Traditional Linux security is based on a Discretionary Access Control (DAC) model. The DAC
model allows the owner of a resource (objects) to control which users or groups (subjects) can
access the resource. It is called “discretionary” because the access control is based on the discretion
of the owner.
Another type of security model is the Mandatory Access Control (MAC) model. Unlike the DAC
model, the MAC model uses predefined policies to control user and process interactions. The MAC
model restricts the level of control that users have over the objects that they create. SELinux is an
implementation of the MAC model in the Linux kernel.
The U.S. government’s National Security Agency (NSA) has taken an increasingly public role in
information security, especially due to the growing concern over information security attacks that
could pose a serious threat to the world’s ability to function. With Linux becoming an increasingly
key component of enterprise computing, the NSA set out to create a set of patches to increase the
security of Linux. The patches have all been released under the GNU General Public License (GPL) with full
source code and are thus subject to the scrutiny of the world—an important aspect given Linux’s
worldwide presence and developer community. The patches are collectively known as “SELinux,”
short for “Security-Enhanced Linux.” The patches have been integrated into the 2.6 Linux kernel
series using the Linux Security Modules (LSM). This integration has made the patches and
improvements far-reaching and an overall benefit to the Linux community.
SELinux makes use of the concepts of subjects (users, applications, processes, and so on), objects
(files and sockets), labels (metadata applied to objects), and policies (which describe the matrix of
access permissions for subjects and objects). Given the extreme granularity of objects, it is possible
to express rich and complex rules that dictate the security model and behavior of a Linux system.
Because SELinux uses labels, it requires a file system that supports extended attributes.
NOTE The full gist of SELinux is well beyond the scope of a single section in this book. If you are
interested in learning more about SELinux, visit the SELinux Fedora project page at
http://fedoraproject.org/wiki/SELinux.
AppArmor
AppArmor is SUSE’s implementation of the MAC security model. It is SUSE’s alternative to SELinux
(which is used mainly in Red Hat–derived distros such as Fedora, CentOS, and RHEL). AppArmor’s
backers generally tout it as being easier to manage and configure than SELinux. AppArmor’s
implementation of the MAC model focuses more on protecting individual applications—hence the
name Application Armor—instead of attempting a blanket security that applies to the entire system, as
in SELinux. AppArmor’s security goal is to protect systems from attackers exploiting vulnerabilities
in specific applications that are running on the system. AppArmor is file system–independent. It is
integrated into and used mostly in SUSE’s openSUSE and SUSE Linux Enterprise (SLE), as well as
some Debian-based distros. And, of course, it can also be installed and used in other Linux
distributions.
NOTE If you are interested in learning more about AppArmor, you can find good documentation at
www.suse.com/support/security/apparmor/.
Monitoring Your System
As you become familiar with Linux, your servers, and their day-to-day operation, you’ll find that you
start getting a “feel” for what is normal. This might sound peculiar, but in much the same way you
learn to “feel” when your car isn’t quite right, you’ll know when your server is not quite the same.
Part of your getting a feel for the system requires basic system monitoring. For local system
behavior, you need to trust your underlying system as not having been compromised in any way. If
your server does get compromised and a “root kit” that bypasses monitoring systems is installed, it
can be difficult to see what is happening. For this reason, a mix of on-host and remote host-based
monitoring is a good idea.
Logging
By default, most of your log files will be stored in the /var/log directory, with the logrotate program automatically rotating (archiving) the logs on a regular basis. Although it is handy to be able to log to your local disk, it is often a better idea to have your system send its log entries to a dedicated log server. With remote logging enabled, you can be reasonably confident that any log entries sent to the log server before an attack have not been tampered with.
Because of the volume of log data that can be generated, you might find it prudent to learn some
basic scripting skills so that you can easily parse through the log data and automatically highlight and
e-mail anything that is peculiar or should warrant suspicion. For example, a filter that e-mails only error entries is useful to an administrator. This allows the administrator to track both normal and
erroneous activity without having to read through a significant number of log messages every day.
Using ps and netstat
Once you have your server up and running, take a moment to study the output of the ps auxww
command. In the future, deviations from this output should catch your attention. As part of monitoring,
you may find it useful to list periodically what processes are running and make sure that any
processes you don’t expect are there for a reason. Be especially suspicious of any packet-capture
programs, such as tcpdump, that you did not start yourself.
The same can be said about the output of the netstat -an command. Admittedly netstat’s
focus is more from a network security standpoint. Once you have a sense of what represents normal
traffic and normally open ports, any deviations from that output should trigger interest into why the
deviation is there. Did someone change the configuration of the server? Did the application do
something that was unexpected? Is there threatening activity on the server?
Between ps and netstat, you should have a fair handle on the goings-on with your network and
process list.
Using df
The df command shows the available space on each of the disk partitions that is mounted. Running df
on a regular basis to see the rate at which disk space gets used is a good way to look for any
questionable activity. A sudden change in disk utilization should spark your curiosity into where the
change came from. For example, a sudden increase in disk storage usage could be because users are
using their home directories to store vast quantities of MP3 files, movies, and so on. Legal issues
aside, there are also other pressing concerns and repercussions for such unofficial use, such as
backups and DoS issues.
The backups might fail because the tape ran out of space storing someone’s music files instead of
the key files necessary for the business. From a security perspective, if the sizes of the web or FTP
directories grow significantly without reason, it may signal trouble looming with unauthorized use of
your server.
A server whose disk becomes full unexpectedly is also a potential source of a local (and/or
remote) DoS attack. A full disk might prevent legitimate users from storing new data or manipulating
existing data on the server. The server may also have to be temporarily taken offline to rectify the
situation, thereby denying access to other services that the server should be providing.
Automated Monitoring
Most of the popular automated system-monitoring solutions specialize in monitoring network-based
services and daemons. However, most of these also have extensive local resource–monitoring
capabilities. The automated tools can monitor such things as disk usage, CPU usage, process counts,
changes in file system objects, and so on. Examples of such tools include sysinfo utilities, Nagios
plug-ins, and Tripwire.
Mailing Lists
As part of managing your system’s security, you should be subscribed to key security mailing lists,
such as BugTraq (www.securityfocus.com/archive/1). BugTraq is a moderated mailing list that
generates only a small handful of e-mails a day, most of which may not pertain to the software you are
running. However, this is where critical issues are likely to show up first. The last several significant
worms that attacked Internet hosts were dealt with in real time on these mailing lists.
In addition to BugTraq, any security lists for software for which you are responsible are musts.
Also look for announcement lists for the software you use. All of the major Linux distributions also
maintain announcement lists for security issues that pertain to their specific distributions. Major
software vendors also maintain their own lists. Oracle, for example, keeps its information online via
its MetaLink web portal and corresponding e-mail lists. Although this may seem like a lot of e-mail,
consider that most of the lists that are announcement-based are extremely low volume. In general, you
should not find yourself needing to deal with significantly more e-mail than you already do.
Summary
In this chapter you learned about securing your Linux system and mitigating risk, and you learned what
to look for when making decisions about how to balance features/functions with the need to secure.
Specifically, the chapter covered the risks inherent in SetUID programs (programs that run as root),
as well as the risks in running other unnecessary programs. It also covered approaches to mitigating
risk through the use of chroot environments and controlling access to users. We briefly discussed
two popular implementations of the MAC security model in Linux: SELinux and AppArmor. Finally,
you learned about some of the things that should be monitored as part of daily housekeeping.
In the end, you will find that maintaining a reasonably secure environment is akin to maintaining
good hygiene. Keep your server clean of unnecessary applications, make sure the environment for
each application is minimized so as to limit exposure, and patch your software as security issues are
brought to light. With these basic commonsense practices, you’ll find that your servers will be quite
reliable and secure.
On a final note, keep in mind that studying this chapter alone cannot make you a security expert,
much as the chapter on Linux firewalls won’t make you a firewall expert. Linux and the field of
security are constantly evolving and always improving. You will need to continue to make an effort to
learn about the latest technologies and expand your general security knowledge.
CHAPTER 15
Network Security
In Chapter 14, you learned that exploit vectors are of two types: those in which the vulnerability is
exploitable locally and those in which the vulnerability is exploitable over a network. The former
case was covered in Chapter 14. The latter case is covered in this chapter.
Network security addresses the problem of attackers sending malicious network traffic to your
system with the intent of either making your system unavailable (denial-of-service, or DoS, attack) or
exploiting weaknesses in your system to gain access or control of the system. Network security is not
a substitute for the good local security practices discussed in the previous chapter. Both local and
network security approaches are necessary to keep things working the way that you expect them to
work.
This chapter covers four aspects of network security: tracking services, monitoring network
services, handling attacks, and tools for testing. These sections should be used in conjunction with the
information in Chapters 13 and 14.
TCP/IP and Network Security
The following discussion assumes you have experience configuring a system for use on a TCP/IP
network. Because the focus here is on network security and not an introduction to networking, this
section discusses only those parts of TCP/IP affecting your system’s security. If you’re curious about
TCP/IP’s internal workings, read Chapter 11.
The Importance of Port Numbers
Every host on an IP-based network has at least one IP address. In addition, every Linux-based host
has many individual processes running. Each process has the potential to be a network client, a
network server, or both. With potentially more than one process being able to act as a server on a
single system, using an IP address alone to identify a network connection is not enough.
To solve this problem, TCP/IP adds a component identifying a TCP (or User Datagram Protocol
[UDP]) port. Every connection from one host to another has a source port and a destination port.
Each port is labeled with an integer between 0 and 65535.
To identify every unique connection possible between two hosts, the operating system keeps track
of four pieces of information: the source IP address, the destination IP address, the source port
number, and the destination port number. The combination of these four values is guaranteed to be
unique for all host-to-host connections. (Actually, the operating system tracks a myriad of connection
information, but only these four elements are needed for uniquely identifying a connection.)
The host initiating a connection specifies the destination IP address and port number. Obviously,
the source IP address is already known. But the source port number, the value that will make the
connection unique, is assigned by the source operating system. It searches through its list of already
open connections and assigns the next available port number.
By convention, this number is always greater than 1024 (port numbers from 0 to 1023 are
reserved for system uses and well-known services). Technically, the source host can also select its
source port number. To do this, however, another process cannot have already taken that port.
Generally, most applications let the operating system pick the source port number for them.
Given this arrangement, we can see how source host A can open multiple connections to a single
service on destination host B. Host B’s IP address and port number will always be constant, but host
A’s port number will be different for every connection. The combination of source and destination
IPs and port numbers is, therefore, unique, and both systems can have multiple independent data
streams (connections) between each other.
For a typical server application to offer services, it would usually run programs that listen to
specific port numbers. Many of these port numbers are used for well-known services and are
collectively referred to as well-known ports, because the port number associated with a service is an
approved standard. For example, port 80 is the well-known service port for HTTP.
In the “Using the netstat Command” section a bit later, we’ll look at the netstat command as an
important tool for network security. When you have a firm understanding of what port numbers
represent, you’ll be able to identify and interpret the network security statistics provided by the
netstat command.
Tracking Services
The services provided by a server are what make it a server. The ability to provide the service is
accomplished by processes that bind to network ports and listen to the requests coming in. For
example, a web server might start a process that binds to port 80 and listens for requests to download
the pages of a site it hosts. Unless a process exists to listen on a specific port, Linux will simply
ignore packets sent to that port.
This section discusses the usage of the netstat command, a tool for tracking network
connections (among other things) in your system. It is, without a doubt, one of the most useful
debugging tools in your arsenal for troubleshooting security and day-to-day network problems.
Using the netstat Command
To track what ports are open and what ports have processes listening to them, we use the netstat
command. Here’s an example:
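netstat -natu

Typical output looks something like this (trimmed; the addresses follow the example discussed below):

Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN
tcp        0      0 127.0.0.1:25            0.0.0.0:*               LISTEN
tcp        0      0 192.168.1.4:22          192.168.1.30:51234      ESTABLISHED
udp        0      0 0.0.0.0:68              0.0.0.0:*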
By default (with no parameters), netstat will provide all established connections for both
network and domain sockets. That means we’ll see not only the connections that are actually working
over the network, but also the interprocess communications (which, from a security monitoring
standpoint, might not be immediately useful). So in the command just illustrated, we have asked
netstat to show us all ports (-a)—whether they are listening or actually connected—for TCP (-t)
and UDP (-u). We have told netstat not to spend any time resolving IP addresses to hostnames (-n).
In the netstat output, each line represents either a TCP or UDP network port, as indicated by the
first column of the output. The Recv-Q (receive queue) column lists the number of bytes received by
the kernel but not read by the process. Next, the Send-Q (send queue) column tells us the number of
bytes sent to the other side of the connection but not acknowledged.
The fourth, fifth, and sixth columns are the most interesting in terms of system security. The Local
Address column tells us our server’s IP address and port number. Remember that our server
recognizes itself as 127.0.0.1 and 0.0.0.0, as well as its normal IP address. In the case of multiple
interfaces, each port being listened to will show up on all interfaces and, thus, as separate IP
addresses. The port number is separated from the IP address by a colon (:). In the output, the Ethernet
device has the IP address 192.168.1.4.
The fifth column, Foreign Address, identifies the other side of the connection. In the case of a port
that is being listened to for new connections, the default value will be 0.0.0.0:*. This IP address
means nothing, since we’re still waiting for a remote host to connect to us!
The sixth column tells us the state of the connection. The man page for netstat lists all of the
states, but the two you’ll see most often are LISTEN and ESTABLISHED. The LISTEN state means
that a process on your server is listening to the port and ready to accept new connections. The
ESTABLISHED state means just that—a connection is established between a client and server.
Security Implications of netstat’s Output
By listing all of the available connections, you can get a snapshot of what the system is doing. You
should be able to explain and account for all ports listed. If your system is listening to a port that you
cannot explain, this should raise suspicions.
Just in case you haven’t yet memorized all the well-known services and their associated port
numbers (all 25 zillion of them!), you can look up the matching information you need in the
/etc/services file. However, some services (most notably those that use the portmapper) don’t have
set port numbers but are valid services. To see which process is associated with a port, use the -p
option with netstat. Be on the lookout for odd or unusual processes using the network. For example,
if the Bourne Again Shell (BASH) is listening to a network port, you can be fairly certain that
something odd is going on.
Finally, remember that you are mostly interested in the destination port of a connection; this tells
you which service is being connected to and whether it is legitimate. The source address and source
port are, of course, important, too—especially if somebody or something has opened up an
unauthorized back door into your system. Unfortunately, netstat doesn’t explicitly tell you who
originated a connection, but you can usually figure it out if you give it a little thought. Of course,
becoming familiar with the applications that you do run and their use of network ports is the best way
to determine who originated a connection to where. In general, you’ll find that the rule of thumb is that
the side whose port number is greater than 1024 is the side that originated the connection. Obviously,
this general rule doesn’t apply to services typically running on ports higher than 1024, such as X
Window System (port 6000).
Binding to an Interface
A common approach to improving the security of a service running on your server is to make it such
that it binds only to a specific network interface. By default, applications will bind to all interfaces
(seen as 0.0.0.0 in the netstat output). This will allow a connection to that service from any
interface—so long as the connection makes it past any Netfilter rules (built-in Linux Kernel firewall
stack) you may have configured. However, if you need a service to be available only on a particular
interface, you should configure that service to bind to the specific interface.
For example, let us assume that there are three interfaces on your server:
eth0, with the IP address: 192.168.1.4
eth1, with the IP address: 172.16.1.1
lo, with the IP address: 127.0.0.1
And also assume that your server does not have IP forwarding (/proc/sys/net/ipv4/ip_forward)
enabled. In other words, machines on the 192.168.1.0/24 (eth0) side cannot communicate with
machines on the 172.16/16 side. The 172.16/16 (eth1) network represents the “safe” or “inside”
network, and, of course, 127.0.0.1 (lo or loopback) represents the host itself.
If the application binds itself to 172.16.1.1, then only those applications on the 172.16/16 network
will be able to reach the application and connect to it. If you do not trust the hosts on the
192.168.1/24 side (for example, it is a demilitarized zone, or DMZ), this is a safe way to provide
services to one segment without exposing yourself to another. For even less exposure, you can bind an
application to 127.0.0.1. By doing so, you arrange that connections will have to originate from the
server itself to communicate with the service. For example, if you need to run the MySQL database
for a web-based application and the application runs on the server, then configuring MySQL to accept
only connections from 127.0.0.1 means that any risk associated with remotely connecting to and
exploiting the MySQL service is significantly mitigated. The attacker would have to compromise your
web-based application and somehow make it query the database on the attacker’s behalf (perhaps via
a SQL injection attack).
SSH Tunneling Tricks
If you need to temporarily provide a service to a group of technically proficient users across the
Internet, binding the service to the loopback address (localhost) and then forcing the group to use
SSH tunnels is a great way to provide authenticated and encrypted access to the service.
For example, if you have a Post Office Protocol 3 (POP3) service running on your server,
you can bind the service to the localhost address. This, of course, means nobody will be able to
connect to the POP3 server via a regular interface/address. But if you run an SSH server on the
system, authenticated users can connect via SSH and set up a port-forwarding tunnel for their
remote POP3 e-mail client.
Here’s a sample command to do this from the remote SSH client:
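ssh -L 1110:127.0.0.1:110 username@popserver.example.com

Here username and popserver.example.com are placeholders for a real account and the server’s address; port 110 is the standard POP3 port on the server, and 1110 is the local port that the e-mail client will use.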
The POP3 e-mail client can then be configured to connect to the POP3 server at the IP
address 127.0.0.1 via port 1110 (127.0.0.1:1110).
Shutting Down Services
One purpose for the netstat command is to determine what services are enabled on your servers.
Making Linux distributions easier to install and manage right out of the box has led to more and more
default settings that are unsafe, so keeping track of services is especially important.
When you’re evaluating which services should stay and which should go, answer the following
questions:
Do we need the service? The answer to this question is important. In most situations, you
should be able to disable a great number of services that start up by default. A stand-alone
web server, for example, should not need to run Network File System (NFS).
If we do need the service, is the default setting secure? This question can also help you
eliminate some services—if they aren’t secure and they can’t be made secure, then chances
are they should be removed. For example, if remote login is a requirement and Telnet is the
service enabled to provide that function, then an alternative such as SSH should be used
instead, due to Telnet’s inability to encrypt login information over a network. (By default,
most Linux distributions ship with Telnet disabled and SSH enabled.)
Does the software providing the service need updates? All software needs updates from time
to time, especially software such as web and FTP servers. This is because as features get added, new
security problems creep in. So be sure to remember to track the server software’s
development and get updates as necessary.
Shutting Down xinetd and inetd Services
To shut down a service that is started via the xinetd program, simply edit the service’s configuration
file under the /etc/xinetd.d/ directory and set the value of the disable directive to Yes.
For traditional System V–based services, you can also use the chkconfig command to disable
the service managed by xinetd. For example, to disable the echo service, you would run the
following:
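On a chkconfig-based system, and assuming the xinetd-managed service is listed under the name echo, the command would look like this:

    chkconfig echo off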
On Linux distributions running systemd (such as Fedora), you can alternatively disable a service
using the systemctl command. For example, to disable the xinetd service, use the following:
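Assuming the unit is named xinetd.service, that would be:

    systemctl disable xinetd.service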
On Debian-based systems such as Ubuntu, you can use the sysv-rc-conf command (install it
with the apt-get command if you don’t have it installed) to achieve the same effect. For example, to
disable the echo service in Ubuntu, you could run the following:
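Assuming the service appears under that name in sysv-rc-conf's listing, the invocation would look something like this:

    sysv-rc-conf echo off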
TIP On older Linux distros using the inetd super-server daemon, you should edit the /etc/inetd.conf
file and comment out the service you no longer want. To disable a service, start the line with a pound
sign (#). (See Chapter 8 for more information on xinetd and inetd.)
Remember to send the HUP signal to inetd after you’ve made any changes to the /etc/inetd.conf
file and a SIGUSR2 signal to xinetd. If you are using the Fedora (or similar) distro, you can also type
the following command to reload xinetd:
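On such systems, something like one of the following should do it (the first uses the legacy service wrapper, the second sends the SIGUSR2 signal directly):

    service xinetd reload
    kill -USR2 $(pidof xinetd)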
Shutting Down Non-xinetd Services
If a service is not managed by xinetd, then a separate process or script that is started at boot time is
running it. If the service in question was installed by your distribution, and your distribution offers a
nice tool for disabling a service, you may find that to be the easiest approach.
Some Linux distros, for example, support use of the chkconfig program, which provides an easy
way to enable and disable individual services. To disable the rpcbind service from
starting in runlevels 3 and 5 on such systems, simply run the following:
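On a chkconfig-based distro, that command would be:

    chkconfig --level 35 rpcbind off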
The parameter --level refers to the specific runlevels that should be affected by the change.
Since runlevels 3 and 5 represent the two multiuser modes, we select those. The rpcbind parameter
is the name of the service as referred to in the /etc/init.d/ directory. Finally, the last parameter can be
set to “on,” “off,” or “reset.” The “on” and “off” options are self-explanatory. The “reset” option
refers to resetting the service to its native state at install time.
If you wanted to turn the rpcbind service on again, simply run this:
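Again on a chkconfig-based system, that would be:

    chkconfig --level 35 rpcbind on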
Note that using chkconfig doesn’t actually turn an already running service on or off; instead, it
defines what will happen at the next startup time. To stop the running process, use the control script in
the /etc/init.d/ directory. In the case of rpcbind, we would stop it with the following:
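Assuming the init script carries the same name as the service, that would be:

    /etc/init.d/rpcbind stop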
On Linux distributions running systemd (such as Fedora), you can alternatively stop a service
using the systemctl command. For example, to stop the rpcbind service, type the following:
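Assuming the unit is named rpcbind.service, that would be:

    systemctl stop rpcbind.service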
Shutting Down Services in a Distribution-Independent Way
To prevent a service from starting up at boot time, change the symlink (symbolic link) in the
corresponding runlevel’s rc.d directory. This is done by going to the /etc/rc.d/ directory (/etc/rc*.d/
folder in Debian), and in one of the rc*.d directories finding the symlinks that point to the startup
script. (See Chapter 6 for information on startup scripts.) Rename the symlink to start with an X
instead of an S. Should you decide to restart a service, it’s easy to rename it again starting with an S.
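For instance, a rename along these lines disables the script without deleting it (the runlevel directory and the S13 sequence number are only illustrative; use whatever symlink name you actually find):

    cd /etc/rc.d/rc3.d
    mv S13rpcbind X13rpcbind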
If you have renamed the startup script but want to stop the currently running process, use the ps
command to find the process ID number and then the kill command to terminate the process. For
example, here are the commands to kill a portmap process and the resulting output:
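A sketch of that sequence follows; the PID reported by ps will of course differ on your system, so substitute the number you actually see:

    ps -ef | grep portmap
    kill 2120        # replace 2120 with the PID reported by ps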
NOTE As always, be sure of what you’re killing before you kill it, especially on a production
server.
Monitoring Your System
The process of locking down your server isn’t just for the sake of securing your server; it gives you
the opportunity to see clearly what normal server behavior should look like. After all, once you know
what normal behavior is, unusual behavior will stick out like a sore thumb. (For example, if you
turned off your Telnet service when setting up the server, seeing a log entry for Telnet would mean
that something is wrong!)
Several free and open source commercial-grade applications exist that perform monitoring and
are well worth checking out. Here, we’ll take a look at a variety of excellent tools that help with
system monitoring. Some of these tools already come installed with your Linux distributions; others
don’t. All are free and easily acquired.
Making the Best Use of syslog
In Chapter 8, we explored rsyslogd, the system logger that saves log messages from various programs
into text files for record-keeping purposes. By now, you’ve probably seen the types of log messages
you get with rsyslog. These include security-related messages, such as who has logged into the
system, when they logged in, and so forth.
As you can imagine, it’s possible to analyze these logs to build a time-lapse image of the
utilization of your system services. This data can also point out questionable activity. For example,
why was the host crackerboy.nothing-better-to-do.net sending so many web requests in such a short
period of time? What was he looking for? Has he found a hole in the system?
Log Parsing
Doing periodic checks on the system’s log files is an important part of maintaining security.
Unfortunately, scrolling through an entire day's worth of logs is a time-consuming and mind-numbingly
boring task that might reveal few meaningful events. To ease the drudgery, pick up a text on a
scripting language (such as Perl) and write small scripts to parse out the logs. A well-designed script
works by throwing away what it recognizes as normal behavior and showing everything else. This
can reduce thousands of log entries for a day’s worth of activities down to a manageable few dozen.
This is an effective way to detect attempted break-ins and possible security gaps.
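As a trivial sketch of the idea in shell rather than Perl, the following strips a day's worth of /var/log/messages down to the lines that do not match a few known-benign patterns (the patterns shown are purely illustrative; build your own list from what is normal on your system):

    grep -Ev 'CRON|dhclient|ntpd' /var/log/messages | less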
Hopefully, it’ll become entertaining to watch the script kiddies trying and failing to break down
your walls. Several canned solutions exist that can also help make parsing through log files easier.
Examples of such programs that you might want to try out are logwatch, gnome-system-log,
ksystemlog and Splunk (www.splunk.com).
Storing Log Entries
Unfortunately, log parsing may not be enough. If someone breaks into your system, it’s likely that your
log files will be promptly erased—which means all those wonderful scripts won’t be able to tell you
a thing. To get around this, consider dedicating a single host on your network to storing log entries.
Configure your local logging daemon to send all of its messages to a separate/central loghost, and
configure the central host appropriately to accept logs from trusted or known hosts. In most instances,
this should be enough to gather, in a centralized place, the evidence of any bad things happening.
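With rsyslog, for example, forwarding everything to a central loghost is a one-line addition on each client; the hostname below is a placeholder, and the receiving host must separately be configured to accept remote log messages:

    # added to /etc/rsyslog.conf on each client -- forward all messages via UDP
    *.* @loghost.example.org:514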
If you’re really feeling paranoid, consider attaching another Linux host to the loghost using a
serial port and using a terminal emulation package, such as minicom, in log mode and then feeding all
the logs to the serially attached machine. Using a serial connection between the hosts means that the
receiving machine needs no network connectivity at all. The logging software on the loghost can be
configured to send all messages to /dev/ttyS0 if you’re using COM1, or to /dev/ttyS1 if you’re using
COM2. And, of course, do not connect the other system to the network! This way, in the event the
loghost also gets attacked, the log files won’t be destroyed. The log files will be safe residing on the
serially attached system, which is impossible to log into without physical access.
For an even higher degree of ensuring the sanctity of logs, you can connect a parallel-port printer
to another system and have the terminal emulation package echo everything it receives on the serial
port to the printer. Thus, if the serial host system fails or is damaged in some way by an attack, you’ll
have a hard copy of the logs. Note, however, that a serious drawback to using the printer for logging
is that you cannot easily search through the logs, because they exist only as hard copy!
Monitoring Bandwidth with MRTG
Monitoring the amount of bandwidth being used on your servers produces some useful information. A
common use for this is to justify the need for upgrades. By showing system utilization levels to your
managers, you’ll be providing hard numbers to back up your claims. Your data can be easily turned
into a graph, too (and everyone knows how much upper management and managers like graphs).
Another useful aspect of monitoring bandwidth is to identify bottlenecks in the system, thus helping
you balance the system load. But relative to the topic of this chapter, a useful aspect of graphing your
bandwidth is to identify when things go wrong.
Once you’ve installed a package such as MRTG (Multi-Router Traffic Grapher, available at
www.mrtg.org) to monitor bandwidth, you will quickly get a criterion for what “normal” looks like
on your site. A substantial drop or increase in utilization is something to investigate, as it may
indicate a failure or a type of attack. Check your logs, and look for configuration files with odd or
unusual entries.
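A basic MRTG setup typically involves pointing its cfgmaker helper at an SNMP-enabled device and then running mrtg against the generated configuration; something like the following, where the community string and device address are placeholders:

    cfgmaker --output=/etc/mrtg/mrtg.cfg public@192.168.1.1
    mrtg /etc/mrtg/mrtg.cfg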
Handling Attacks
Part of securing a network includes planning for the worst case: What happens if someone succeeds?
It doesn’t necessarily matter how; it just matters that they have done it. Servers are doing things they
shouldn’t, information is leaking that should not leak, or other mayhem is discovered by you, your
team, or someone else asking why you’re trying to spread mayhem.
What do you do?
Just as a facilities director plans for fires and your backup administrator plans for recovering data
if none of your systems is available, a security officer needs to plan for how to handle an attack. This
section covers key points to consider with respect to Linux. For an excellent overview on handling
attacks, visit the CERT web site at www.cert.org.
Trust Nothing (and No One)
The first thing you should do in the event of an attack is to fire everyone in the I.T. department.
Absolutely no one is to be trusted. Everyone is guilty until proven innocent. Just kidding!
But, seriously, if an attacker has successfully broken into your systems, there is nothing that your
servers can tell you about the situation that is completely trustworthy. Root kits, or tool kits that
attackers use to invade systems and then cover their tracks, can make detection difficult. With binaries
replaced, you may find that there is nothing you can do to the server itself that helps. In other words,
every server that has been successfully hacked needs to be completely rebuilt with a fresh
installation. Before doing the reinstall, you should make an effort to look back at how far the attacker
went so as to determine the point in the backup cycle when the data is certain to be trustworthy. Any
data backed up after that should be closely examined to ensure that invalid data does not make it back
into the system.
Change Your Passwords
If the attacker has gotten your root password or may have taken a copy of the password file (or
equivalent), it is crucial that all of your passwords be changed. This is an incredible hassle; however,
it is necessary to make sure that the attacker doesn’t waltz back into your rebuilt server using the
password without any resistance.
NOTE It is also a good idea to change your root password following any staff changes. It may seem
like everyone is leaving on good terms; however, later finding out that someone on your team had
issues with the company could mean that you’re already in trouble.
Pull the Plug
Once you’re ready to start cleaning up, you will need to stop any remote access to the system. You
may find it necessary to stop all network traffic to the server until it is completely rebuilt with the
latest patches before reconnecting it to the network.
This can be done by simply pulling the plug on whatever connects the box to the network. Putting
a server back onto the network when it is still getting patches is an almost certain way to find yourself
dealing with another attack.
Network Security Tools
You can use countless tools to help monitor your systems, including Nagios (www.nagios.org),
MRTG (www.mrtg.org) for graphing statistics, Big Brother (www.bb4.org), and, of course, the
various tools already mentioned in this chapter. But what do you use to poke at your system for basic
sanity checks?
In this section, we review a few tools that you can use for testing your system. Note that no one
single tool is enough, and no combination of tools is perfect—there is no secret “Hackers Testing
Tool Kit” that security professionals use. The key to the effectiveness of most tools is how you use
them and how you interpret the data gathered by the tools.
A common trend that you'll see with a few of the tools listed here is that they were not originally
designed as security tools. Several of them were created to aid in basic
diagnostics and system management. What makes these tools work well for Linux from a security
perspective is that they offer deeper insight into what your system is doing. That extra insight often
proves to be incredibly helpful.
nmap
The nmap program can be used to scan a host or a group of hosts to look for open TCP and UDP
ports. nmap can go beyond scanning and can actually attempt to connect to the remote listening
applications or ports so that it can better identify the remote application. This is a powerful and
simple way for an administrator to take a look at what the system exposes to the network and is
frequently used by both attackers and administrators to get a sense of what is possible against a host.
What makes nmap powerful is its ability to apply multiple scanning techniques. This is especially
useful because each scanning technique has its pros and cons with respect to how well it traverses
firewalls and the level of stealth desired.
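For example, a scan such as the following (run only against hosts you are authorized to test) combines a TCP SYN scan with service-version detection; the address is the sample server used earlier in this chapter:

    nmap -sS -sV 192.168.1.4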
Snort
An intrusion detection system (IDS) provides a way to monitor a point in the network surreptitiously
and report on questionable activity based on packet traces. The Snort program (www.snort.org) is an
open source IDS and intrusion prevention system (IPS) that provides extensive rule sets that are
frequently updated with new attack vectors. Any questionable activity can be sent to a logging host,
and several open source log-processing tools are available to help make sense of the information
gathered (for example, the Basic Analysis and Security Engine, or BASE).
Running Snort on a Linux system that is located at a key entry/exit point in your network is a great
way to track the activity without your having to set up a proxy for each protocol that you want to
support. A commercial version of Snort called SourceFire is also available. You can find out more
about SourceFire at www.sourcefire.com.
Nessus
The Nessus system (www.nessus.org) takes the idea behind nmap and extends it with deep
application-level probes and a rich reporting infrastructure. Running Nessus against a server is a
quick way to perform a sanity check on the server’s exposure.
Your key to understanding Nessus is understanding its output. The report will contain numerous
findings, ranging from informational notes all the way up to high-severity issues. Depending on how your
application is written and what other services you offer on your Linux system, Nessus may log false
positives or seemingly scary informational notes.
Take the time to read through each one of them and understand what the output is, as not all of the
messages necessarily reflect your situation. For example, if Nessus detects that your system is at risk
due to a hole in Oracle 8 but your server does not even run Oracle, more than likely, you have hit
upon a false positive.
Although Nessus is open source and free, it is owned and managed by a commercial company,
Tenable Network Security. You can learn more about Tenable at www.tenablesecurity.com.
Wireshark/tcpdump
You learned about Wireshark and tcpdump in Chapter 11, where we used them to study the ins and
outs of TCP/IP. Although those chapters used these tools only for troubleshooting, they are just as
valuable for performing network security functions.
Raw network traces are the food devoured by all the tools listed in the preceding sections to gain
insight into what your server is doing. However, these tools don’t have quite the insight that you do
into what your server is supposed to do. Thus, you’ll find it useful to be able to take network traces
yourself and read through them to look for any questionable activity. You may be surprised by what
you see!
For example, if you are looking at a possible break-in, you may want to start a raw network trace
from another Linux system that can see all of the network traffic of your questioned host. By capturing
all the traffic over a 24-hour period, you can go back and start applying filters to look for anything
that shouldn’t be there. Extending the example, if the server is supposed to handle only web
operations and SSH, with reverse Domain Name System (DNS) resolution turned off on both, take the
trace and apply the filter “not port 80 and not port 22 and not icmp and not arp.” Any packets that
show up in the output are suspect.
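With tcpdump, that capture-then-filter workflow might look like this; the interface name and capture file name are placeholders:

    tcpdump -i eth0 -w trace.pcap
    tcpdump -r trace.pcap 'not port 80 and not port 22 and not icmp and not arp'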
Summary
This chapter covered the basics of network security as it pertains to Linux. Using the information
presented here, you should have the knowledge you need to make an informed decision about the state
of health of your server and decide what, if any, action is necessary to secure it.
As has been indicated in other chapters, please do not consider this chapter a complete source of
network security information. Security as a field is constantly evolving and requires keeping a
watchful/careful eye toward new developments.
Be sure to subscribe to the relevant mailing lists, keep an eye on the relevant web sites, educate
yourself with additional reading materials/books, and, most important, always apply common sense.
PART IV
Internet Services
CHAPTER 16
DNS
The ability to map an unfriendly numerical IP address into a people-friendly format has been of
paramount importance since the inception of the Internet in the 1970s. Although this translation
isn’t mandatory, it does make the network much more useful and easy to work with for humans.
Initially, IP address–to–name mapping was done through the maintenance of a hosts.txt file that
was distributed via FTP to all the machines on the Internet. As the number of hosts grew (starting
back in the early 1980s), it was soon clear that a single person maintaining a single file of all of those
hosts was not a scalable way of managing the association of IP addresses to hostnames. To solve this
problem, a distributed system was devised in which each site would maintain information about its
own hosts. One host at each site would be considered authoritative, and that single host address
would be kept in a master table that could be queried by all other sites. This is the essence of the
Domain Name System (DNS).
If the information in DNS wasn’t decentralized, as it is, one other choice would be to have a
central site maintaining a master list of all hosts (numbering in the tens of millions) and having to
update those hostnames tens of thousands of times a day—an overwhelming alternative! Even more
important to consider are the needs of each site. One site might need to maintain a private DNS server
because its firewall requires that local area network (LAN) IP addresses not be visible to outside
networks, yet the hosts on the LAN must be able to find hosts on the Internet. If you’re stunned by the
prospect of having to manage this for every host on the Internet, then you’re getting the picture.
NOTE In this chapter, you will see the terms “DNS server” and “name server” used interchangeably.
Technically, “name server” is a little ambiguous because it can apply to any number of naming
schemes that resolve a name to a number and vice versa. In the context of this chapter, however,
“name server” will always mean a DNS server unless otherwise stated.
We will discuss DNS in depth, so you’ll have what you need to configure and deploy your own
DNS servers for whatever your needs might be.
The Hosts File
Not all sites run their own DNS servers. Not all sites need their own DNS servers. In sufficiently
small sites with no Internet connectivity, it’s reasonable for each host to keep its own copy of a table
matching all of the hostnames on the local network with their corresponding IP addresses. In most
Linux and UNIX systems, this table is stored in the /etc/hosts file.
NOTE You might want to keep a hosts file locally for other valid reasons, despite having access to a
DNS server. For example, a host might need to look up an IP address locally before going out to
query the DNS server. Typically, this is done so that the system can keep track of hosts it needs for
booting so that even if the DNS server becomes unavailable, the system can still boot successfully.
Less obvious might be the simple reason that you want to give a host a name but you don’t want to (or
can’t) add an entry to your DNS server.
The /etc/hosts file keeps its information in a simple tabular format in a plain-text file. The IP
address is in the first column, the canonical hostname is in the second column, and any aliases (such
as the short version of the hostname) follow in subsequent columns. Only white space separates the
fields. Pound symbols (#) at the beginning of a line represent comments. Here’s an example:
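Something along these lines illustrates the layout; the serverA entry and the 192.168.1.1 address are placeholders:

    # /etc/hosts
    127.0.0.1      localhost.localdomain      localhost
    ::1            localhost6.localdomain6    localhost6
    192.168.1.1    serverA.example.org        serverA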
In general, your /etc/hosts file should contain, at the very least, the necessary host-to-IP mappings
for the loopback interface (127.0.0.1 for IPv4 and ::1 for IPv6) and the local hostname with its
corresponding IP address.
A more robust naming service is the DNS system. The rest of this chapter will cover the use of the
DNS name service.
How DNS Works
In this section, we’ll explore some background material necessary to your understanding of the
installation and configuration of a DNS server and client.
Domain and Host Naming Conventions
Until now, you’ve most likely referenced sites by their fully qualified domain name (FQDN), like
this one: The BIND program can be www.kernel.org. Each string between the periods in this FQDN
is significant. Starting from the right and moving to the left are the top-level domain component, the
second-level domain component, and the third-level domain component. This is illustrated further in
Figure 16-1 in the FQDN for a system (serverA.example.org) and is a classic example of an FQDN.
Its breakdown is discussed in detail in the following section.
Figure 16-1. FQDN for serverA.example.org
The Root Domain
The DNS structure is like that of an inverted (upside-down) tree, which means that the root of the
tree is at the top and its leaves and branches are at the bottom! Funny sort of tree, eh?
At the top of the inverted domain tree is the highest level of the DNS structure, aptly called the
root domain and represented by the simple dot (.).
This is the dot that’s supposed to occur after every FQDN, but it is silently assumed to be present
even though it is not explicitly written. Thus, for example, the proper FQDN for www.kernel.org is
really www.kernel.org. (with the root dot at the end). And the FQDN for the popular web portal for
Yahoo! is actually www.yahoo.com. (likewise).
Incidentally, this portion of the domain namespace is managed by a bunch of special servers
known as the root name servers. At the time of this writing, a total of 13 root name servers were
managed by 13 providers.
Each provider may have multiple servers (or clusters) that are spread all over the world. The
servers are distributed for various reasons, such as security and load balancing. Also at the time of
this writing, 10 of the 13 root name servers fully support IPv6-type record sets. The root name
servers are named alphabetically, with names like a.root-servers.net, b.root-servers.net, … m.root-servers.net. The role of the root name servers will be discussed a bit later.
The Top-Level Domain Names
The top-level domains (TLDs) can be regarded as the first branches that we would meet on the way
down from the top of our inverted tree structure.
You could be bold and say that the top-level domains provide the categorical organization of the
DNS namespace. What this means in plain English is that the various branches of domain namespace
have been divided into clear categories to fit different uses (examples of such uses could be
geographical, functional, and so on). At the time of this writing, there were more than 270 top-level
domains.
The TLDs can be broken down further:
Generic top-level domain (such as .org, .com, .net, .mil, .gov, .edu, .int, .biz, and so on).
Country-code top-level domains (such as .us, .uk, .ng, and .ca, corresponding to the country
codes for the United States, the United Kingdom, Nigeria, and Canada, respectively).
The newly introduced branded top-level domains. These allow organizations to create any
TLDs with up to 64 characters. They can include generic words and brand names (such as
.coke, .pepsi, .example, .linux, .microsoft, .caffenix, .who, .unicef, .companyname, and so
on).
Other special top-level domains (such as the .arpa domain).
The top-level domain in our sample FQDN (serverA.example.org.) is “.org.”
The Second-Level Domain Names
The names at this level of the DNS make up the actual organizational boundary of the namespace.
Companies, Internet service providers (ISPs), educational communities, nonprofit groups, and
individuals typically acquire unique names within this level. Here are a few examples: redhat.com,
ubuntu.com, fedoraproject.org, labmanual.org, kernel.org, and caffenix.com.
The second-level domain in our sample FQDN (serverA.example.org.) is “example.”
The Third-Level Domain Names
Individuals and organizations that have been assigned second-level domain names can pretty much
decide what to do with the third-level names. The convention, though, is to use the third-level names
to reflect hostnames or other functional uses. It is also common for organizations to begin the
subdomain definitions from here. An example of functional assignment of a third-level domain name
is the “www” in the FQDN www.yahoo.com. The “www” here can be the actual hostname of a
machine under the umbrella of the yahoo.com domain, or it can be an alias to a real hostname.
The third-level domain name in our sample FQDN (serverA.example.org.) is “serverA.” Here, it
simply reflects the actual hostname of our system.
By keeping DNS distributed in this manner, the task of keeping track of all the hosts connected to
the Internet is delegated to each site, which takes care of its own information. The central repository
listing of all the primary name servers, called the root server, is the only list of existing domains.
Obviously, a list of such a critical nature is itself mirrored across multiple servers and multiple
geographic regions. For example, an earthquake in one part of the world might destroy the root
server(s) for that area, but all the root servers in other unaffected parts of the world can take up the
slack until the affected servers come back online. The only noticeable difference to users is likely to
be a slightly higher latency in resolving domain names. Pretty amazing, isn’t it? The inverted tree
structure of DNS is shown in Figure 16-2.
Figure 16-2. The DNS tree, two layers deep
Subdomains
“But I just saw the site www.support.example.org!” you say. “What’s the hostname component, and
what’s the domain name component?”
Welcome to the wild and mysterious world of subdomains. A subdomain exhibits all the
properties of a domain, except that it has delegated a subsection of the domain instead of all the hosts
at a site. Using the example.org site as an example, the subdomain for the support and help desk
department of Example, Inc., is support.example.org. When the primary name server for the
example.org domain receives a request for a hostname whose FQDN ends in support.example.org, the
primary name server forwards the request down to the primary name server for support.example.org.
Only the primary name server for support.example.org knows all the hosts existing beneath it—hosts
such as a system named “www” with the FQDN of www.support.example.org.
Figure 16-3 shows you the relationship from the root servers down to example.org and then to
support.example.org. The “www” is, of course, the hostname.
Figure 16-3. Concept of subdomains
To make this clearer, let's follow the path of a DNS request (a command that traces a similar path appears right after the list):
1. A client wants to visit a web site called “www.support.example.org.”
2. The query starts with the top-level domain “org.” Within “org.” is “example.org.”
3. Let’s say one of the authoritative DNS servers for the “example.org” domain is named
“ns1.example.org.”
4. Since the host ns1 is authoritative for the example.org domain, we have to query it for all
hosts (and subdomains) under it.
5. So we query it for information about the host we are interested in:
“www.support.example.org.”
6. Now ns1.example.org’s DNS configuration is such that for anything ending with a
“support.example.org,” the server must contact another authoritative server called
“dns2.example.org.”
7. The request for “www.support.example.org” is then passed on to dns2.example.org, which
returns the IP address for www.support.example.org—say, 192.168.1.10.
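You can watch a resolution path much like this one yourself with dig's +trace option, which walks from the root servers down through each delegation (the name here is the sample one from the list above):

    dig +trace www.support.example.org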
Note that when a site name appears to reflect the presence of subdomains, it doesn’t mean
subdomains in fact exist. Although the hostname specification rules do not allow periods, the
Berkeley Internet Name Domain (BIND) name server has always allowed them. Thus, from time to
time, you will see periods used in hostnames. Whether or not a subdomain exists is handled by the
configuration of the DNS server for the site. For example, www.bogus.example.org does not
automatically imply that bogus.example.org is a subdomain. Rather, it may also mean that www.bogus
is the hostname for a system in the example.org domain.
The in-addr.arpa Domain
DNS allows resolution to work in both directions. Forward resolution converts names into IP
addresses, and reverse resolution converts IP addresses back into hostnames. The process of reverse
resolution relies on the in-addr.arpa domain, where arpa is an acronym for Address and Routing
Parameter Area.
As explained in the preceding section, domain names are resolved by looking at each component
from right to left, with the suffixing period indicating the root of the DNS tree. Following this logic,
IP addresses must have a top-level domain as well. This domain is called in-addr.arpa for IPv4-type
addresses. In IPv6, the domain is called ip6.arpa.
Unlike FQDNs, IP addresses are resolved from left to right once they’re under the in-addr.arpa
domain. Each octet further narrows down the possible hostnames. Figure 16-4 provides a visual
example of reverse resolution of the IP address 138.23.169.15.
Figure 16-4. Reverse DNS resolution of 138.23.169.15
Types of Servers
DNS servers come in three flavors: primary, secondary, and caching. Another special class of name
servers consists of the so-called “root name servers.” Other DNS servers require the service
provided by the root name servers every once in a while.
The three main flavors of DNS servers are discussed next.
Primary servers are considered authoritative for a particular domain. An authoritative server is
the one on which the domain’s configuration files reside. When updates to the domain’s DNS tables
occur, they are done on this server. A primary name server for a domain is simply a DNS server that
knows about all hosts and subdomains existing under its domain.
Secondary servers work as backups and as load distributors for the primary name servers.
Primary servers know of the existence of secondaries and send them periodic notifications/alerts of
changes to the name tables. The secondary then initiates a zone transfer to pull in the actual changes.
When a site queries a secondary name server, the secondary responds with authority. However,
because it’s possible for a secondary to be queried before its primary can alert it to the latest
changes, some people refer to secondaries as “not quite authoritative.” Realistically speaking, you
can generally trust secondaries to have correct information. (Besides, unless you know which is
which, you cannot tell the difference between a query response from a primary and one received from
a secondary.)
Root Name Servers
The root name servers act as the first port of call for the topmost parts of the domain namespace.
These servers publish a file called the “root zone file” to other DNS servers and clients on the
Internet. The root zone file describes where the authoritative servers for the DNS top-level
domains (.com, .org, .ca, .ng, .hk, .uk, and so on) are located.
A root name server is simply an instance of a primary name server—it delegates every
request it gets to another name server. You can build your own root server out of BIND—
nothing terribly special about it!
Caching servers are just that: caching servers. They contain no configuration files for any
particular domain. Rather, when a client host requests a caching server to resolve a name, that server
will check its own local cache first. If it cannot find a match, it will find the primary server and ask it.
This response is then cached. Practically speaking, caching servers work quite well because of the
temporal nature of DNS requests. Their effectiveness is based on the premise that if you’ve asked for
the IP address to http://www.example.org in the past, you are likely to do so again in the near future.
Clients can tell the difference between a caching server and a primary or secondary server, because
when a caching server answers a request, it answers it “non-authoritatively.”
NOTE A DNS server can be configured to act with a specific level of authority for a particular
domain. For example, a server can be primary for example.org but be secondary for domain.com. All
DNS servers act as caching servers, even if they are also primary or secondary for any other
domains.
Installing a DNS Server
There isn’t much variety in the DNS server software available, but two particular flavors of DNS
software abound in the Linux/UNIX world: djbdns and the venerable BIND server. djbdns is a
lightweight DNS solution that claims to be a more secure replacement for BIND, which is the older
and far more popular program, used on the vast majority of name-serving machines worldwide.
BIND is currently maintained and developed by the Internet Systems Consortium (ISC). (You can
learn more about the ISC at www.isc.org.) The ISC is in charge of development of the ISC Dynamic
Host Configuration Protocol (DHCP) server/client as well as other software.
NOTE Because of the timing between writing this book and the inevitable release of newer software,
it is possible that the version of BIND discussed here will not be the same as the version to which
you will have access; but you shouldn’t worry at all, because most of the configuration directives,
keywords, and command syntax have remained much the same between recent versions of the
software.
Our sample system runs the Fedora distribution of Linux, and, as such, we will be using the
precompiled binary that ships with this OS. Software that ships with Fedora is supposed to be fairly
recent software, so you can be sure that the version of BIND referred to here is close to the latest
version that can be obtained directly from the www.isc.org site (the site even has precompiled Red
Hat Packages, or RPMs, for the BIND program).
The good news is that once BIND is configured, you’ll rarely need to concern yourself with its
operation. Nevertheless, keep an eye out for new releases. New bugs and security issues are
discovered from time to time and should be corrected. Of course, new features are released as well,
but unless you have a need for them, those releases are less critical.
The BIND program can be found under the /Packages/ directory at the root of the Fedora DVD
media. You can also download it to your local file system from any of the Fedora mirrors:
http://download.fedora.redhat.com/pub/fedora/linux/releases/<FEDORA-VERSION>/Fedora/x86_64/
If you have a working connection to the Internet, installing BIND can be as simple as running this
command:
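On a Fedora-type system, this would be something like the following (bind-utils supplies the client tools discussed later):

    yum install bind bind-utils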
If, on the other hand, you downloaded or copied the BIND binary into your current working
directory, you can install it using the rpm command:
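For example (the exact package file name will depend on the version you downloaded):

    rpm -ivh bind-9.*.rpm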
Once this command finishes, you are ready to begin configuring the DNS server.
Downloading, Compiling, and Installing the ISC BIND Software from
Source
If the ISC BIND software is not available in a prepackaged form for your particular Linux
distribution, you can always build the software from source code available from the ISC site at
www.isc.org. You might also want to take advantage of the most recent bug fixes available for
the software, which your distribution has not yet implemented. As of this writing, the most
current stable version of the software is version 9.8.1, which can be downloaded directly from
http://ftp.isc.org/isc/bind9/9.8.1/bind-9.8.1.tar.gz. Make sure that you have the openssl-devel
package installed on your RPM-based distro before attempting to compile and build BIND from
source. The equivalent package in the Debian/Ubuntu world is libssl-dev. The package ensures
that you have the necessary library/header files available to support some of the advanced
security features of BIND.
Once the package is downloaded, unpack the software as shown. For this example, we
assume the source was downloaded into the /usr/local/src/ directory. Unpack the tarball thus:
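Assuming the tarball name matches the 9.8.1 release mentioned earlier:

    cd /usr/local/src
    tar xvzf bind-9.8.1.tar.gz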
Change to the bind* subdirectory created by the preceding command. And then take a minute
to study any README file(s) that might be present.
Next configure the package with the configure command. Assuming we want BIND to be
installed under the /usr/local/named/ directory, we’ll run this:
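A minimal invocation would be:

    ./configure --prefix=/usr/local/named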
Create the directory specified by the “prefix” option, using mkdir:
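That is simply:

    mkdir /usr/local/named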
To compile and install, issue the make ; make install commands:
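Run, as stated:

    make ; make install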
The version of ISC BIND software that we built from source installs the name server
daemon (named) and some other useful utilities under the /usr/local/named/sbin/ directory. The
client-side programs (dig, host, nsupdate, and so on) are installed under the
/usr/local/named/bin/ directory.
What Was Installed
Many programs come with the main bind and bind-utils packages that were installed earlier. We are
interested in the following four tools:
/usr/sbin/named The DNS server program itself.
/usr/sbin/rndc The BIND name server control utility.
/usr/bin/host A program that performs a simple query on a name server.
/usr/bin/dig A program that performs complex queries on a name server.
The remainder of the chapter will discuss some of the programs/utilities listed here, as well as their
configuration and usage.
Understanding the BIND Configuration File
The named.conf file is the main configuration file for BIND. Based on this file’s specifications,
BIND determines how it should behave and what additional configuration files, if any, must be read.
This section of the chapter covers what you need to know to set up a general-purpose DNS server.
You’ll find a complete guide to the new configuration file format in the html directory of BIND’s
documentation.
The general format of the named.conf file is as follows:
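In outline, an entry looks like this (a schematic sketch rather than a literal configuration):

    statement {
        options;        // comments, additional options, or sub-statements
    };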
The statement keyword tells BIND we’re about to describe a particular facet of its operation,
and options are the specific commands applying to that statement. The curly braces are required so
that BIND knows which options are related to which statements; a semicolon appears after every
option and after the closing curly brace.
An example of this follows:
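A sketch of such a statement, using /var/named as an assumed working directory:

    options {
        directory "/var/named";
    };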
This is an options statement. The particular option here is the directory directive, which specifies
BIND's working directory—that is, the directory on the local file system that
will hold the name server’s configuration data.
The Specifics
This section documents the most common statements you will see in a typical named.conf file. The
best way to tackle this is to skim it first, and then treat it as a reference guide for later sections. If
some of the directives seem bizarre or don’t quite make sense to you during the first pass, don’t
worry. Once you see them in use in later sections, the hows and whys will quickly fall into place.
Comments
Comments can be in one of the following formats:
Format       Indicates
//           C++-style comments
/* … */      C-style comments
#            Perl and UNIX shell script–style comments
In the case of the first and last styles (C++ and Perl/UNIX shell), once a comment begins, it
continues until the end of the line. In regular C-style comments, the closing */ is required to indicate
the end of a comment. This makes C-style comments easier for multiline comments. In general,
however, you can pick the comment format that you like best and stick with it. No one style is better
than another.
Statement Keywords
You can use the following statement keywords:
Keyword    Description
acl        Access Control List—determines what kind of access others have to your DNS server.
include    Allows you to include another file and have that file treated as part of the normal named.conf file.
logging    Specifies what information gets logged and what gets ignored. For logged information, you can also specify where the information is logged.
options    Addresses global server configuration issues.
controls   Allows you to declare control channels for use by the rndc utility.
server     Sets server-specific configuration options.
zone       Defines a DNS zone.
The include Statement
If you find that your configuration file is starting to grow unwieldy, you may want to consider
breaking up the file into smaller components. Each file can then be included into the main named.conf
file. Note that you cannot use the include statement inside another statement.
Here’s an example of an include statement:
NOTE To all you C and C++ programmers out there: Be sure not to begin include lines with the
pound symbol (#), despite what your instincts tell you! That symbol is used to start comments in the
named.conf file.
The logging Statement
The logging statement is used to specify what information you want logged and where. When this
statement is used in conjunction with the syslog facility, you get an extremely powerful and
configurable logging system. The items logged are a number of statistics about the status of named. By
default, they are logged to the /var/log/messages file. In its simplest form, the various types of logs
have been grouped into predefined categories; for example, there are categories for security-related
logs, a general category, a default category, a resolver category, a queries category, and so on.
Unfortunately, the configurability of this logging statement comes at the price of some additional
complexity, but the default logging set up by named is good enough for most uses. Here is a simple
logging directive example:
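A minimal sketch of such a directive, using BIND's predefined default_syslog channel, with line numbers added to match the discussion that follows:

    1. logging {
    2.     category default { default_syslog; };
    3.     category queries { default_syslog; };
    4. };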
NOTE Line numbers have been added to the preceding listing to aid readability.
The preceding logging specification means that all logs that fall under the default category will be
sent to the system’s syslog (the default category defines the logging options for categories where no
specific configuration has been defined).
Line 3 in the listing specifies where all queries will be logged to; in this case, all queries will be
logged to the system syslog.
The server Statement
The server statement tells BIND specific information about other name servers it might be dealing
with. The format of the server statement is as follows:
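A schematic of the statement follows, numbered to match the discussion below; the values after each keyword are placeholders for the alternatives you can choose:

    1. server ip-address {
    2.     bogus yes|no;
    3.     keys { key_id; };
    4.     transfer-format one-answer|many-answers;
    5. };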
Here, ip-address in line 1 is the IP address of the remote name server in question.
The bogus option in line 2 tells the server whether the remote server is sending bad information.
This is useful if you are dealing with another site that may be sending you bad information due to a
misconfiguration. The keys clause in line 3 specifies a key_id defined by the key statement, which
can be used to secure transactions when talking to the remote server. This key is used in generating a
request signature that is appended to messages exchanged with the remote name server. The item in
line 4, transfer-format, tells BIND whether the remote name server can accept multiple answers
in a single query response.
A sample server entry might look like this:
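A hypothetical entry, with a made-up address:

    server 192.168.1.12 {
        bogus no;
        transfer-format many-answers;
    };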
Zones
The zone statement allows you to define a DNS zone—the definition of which is often confusing.
Here is the fine print: A DNS zone is not the same thing as a DNS domain. The difference is subtle,
but important.
Let’s review: Domains are designated along organizational boundaries. A single organization can
be separated into smaller administrative subdomains. Each subdomain gets its own zone. All of the
zones collectively form the entire domain.
For example, .example.org is a domain. Within it are the subdomains .engr.example.org,
.marketing.example.org, .sales.example.org, and .admin.example.org. Each of the four subdomains has
its own zone. And .example.org has some hosts within it that do not fall under any of the subdomains;
thus, it has a zone of its own. As a result, the example.org domain is actually composed of five zones
in total.
In the simplest model, where a single domain has no subdomains, the definition of zone and
domain are the same in terms of information regarding hosts, configurations, and so on.
The process of setting up zones in the named.conf file is discussed in the following section.
Configuring a DNS Server
Earlier, you learned about the differences between primary, secondary, and caching name servers. To
recap: Primary name servers contain the databases with the latest DNS information for a zone. When
a zone administrator wants to update these databases, the primary name server gets the update first,
and the rest of the world asks it for updates. Secondaries explicitly keep track of primaries, and
primaries notify the secondaries when changes occur. Primaries and secondaries are considered
equally authoritative in their answers. Caching name servers have no authoritative records, only
cached entries.
Defining a Primary Zone in the named.conf File
The most basic syntax for a zone entry is as follows:
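In schematic form (domain-name and path-name are placeholders described below):

    zone "domain-name" {
        type master;
        file "path-name";
    };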
The path-name refers to the file containing the database information for the zone in question. For
example, to create a zone for the domain example.org, where the database file is located in
/var/named/example.org.db, you would create the following zone definition in the named.conf file:
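A sketch of that definition:

    zone "example.org" {
        type master;
        file "example.org.db";
    };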
Note that the directory option for the named.conf file will automatically prefix the
example.org.db filename. So if you designated directory /var/named, the server software will
automatically look for example.org’s information in /var/named/example.org.db.
The zone definition created here is just a forward reference—that is, the mechanism by which
others can look up a name and get the IP address for a system under the example.org domain that your
name server manages. It’s also proper Internet behavior to supply an IP-to-hostname mapping (also
necessary if you want to send e-mail to some sites). To do this, you provide an entry in the inaddr.arpa domain.
The format of an in-addr.arpa entry is the first three octets of your IP address, reversed, followed
by in-addr.arpa. Assuming that the network address for example.org is 192.168.1, the in-addr.arpa
domain would be 1.168.192.in-addr.arpa. Thus, the corresponding zone statement in the named.conf
file would be as follows:
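That is, something like:

    zone "1.168.192.in-addr.arpa" {
        type master;
        file "example.org.rev";
    };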
Note that the filenames (example.org.db and example.org.rev) used in the zone sections here are
completely arbitrary. You are free to choose your own naming convention as long as it makes sense to
you.
The exact placement of our sample example.org zone section in the overall named.conf file will
be shown later on in the “Breaking out the Individual Steps” section.
Additional Options
Primary domains can also use the following configuration choices from the options statement:
check-names
allow-update
allow-query
allow-transfer
notify
also-notify
Using any of these options in a zone configuration will affect only that zone.
Defining a Secondary Zone in the named.conf File
The zone entry format for secondary servers is similar to that of master servers. For forward
resolution, here is the format:
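A sketch of the secondary (slave) zone format, with the placeholders explained below:

    zone "domain-name" {
        type slave;
        masters { IP-address-list; };
        file "path-name";
    };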
Here, domain-name is the exact same zone name as specified on the primary name server, IP-address-list is the list of IP addresses where the primary name server for that zone exists, and
path-name is the full path location of where the server will keep copies of the primary’s zone files.
Additional Options
A secondary zone configuration can also use some of the configuration choices from the options
statement:
check-names
allow-update
allow-query
allow-transfer
max-transfer-time-in
Defining a Caching Zone in the named.conf File
A caching configuration is the easiest of all configurations. It’s also required for every DNS server
configuration, even if you are running a primary or secondary server. This is necessary for the server
to search the DNS tree recursively to find other hosts on the Internet.
For a caching name server, we define three zone sections. Here’s the first entry:
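A sketch of that entry:

    zone "." {
        type hint;
        file "root.hints";
    };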
The first zone entry here is the definition of the root name servers. The line type hint; specifies that
this is a caching zone entry, and the line file “root.hints”; specifies the file that will prime the
cache with entries pointing to the root servers. You can always obtain the latest root hints file from
www.internic.net/zones/named.root.
The second zone entry defines the name resolution for the local host. The second zone entry is as
follows:
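A sketch, assuming the zone file is named localhost.db as referenced later in this section:

    zone "localhost" {
        type master;
        file "localhost.db";
    };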
The third zone entry defines the reverse lookup for the local host. This is the reverse entry for
resolving the local host address (127.0.0.1) back to the local hostname:
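A sketch, assuming the reverse zone file 127.0.0.rev referenced later in this section:

    zone "0.0.127.in-addr.arpa" {
        type master;
        file "127.0.0.rev";
    };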
Putting these zone entries into /etc/named.conf is sufficient to create a caching DNS server. But,
of course, the contents of the actual database files (localhost.db, 127.0.0.rev, example.org.db, and so
on) referenced by the file directive are also important. The following sections will examine the
makeup of the database file more closely.
DNS Record Types
This section discusses the makeup of the name server database files—the files that store specific
information that pertains to each zone that the server hosts. The database files consist mostly of
record types; therefore, you need to understand the meaning and use of the common record types for
DNS: SOA, NS, A, PTR, CNAME, MX, TXT, and RP.
SOA: Start of Authority
The SOA record starts the description of a site’s DNS entries. The format of this entry is as follows:
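A sketch of an SOA record follows, with placeholder names, illustrative timer values, and line numbers added to match the NOTE and discussion below:

    1. domain.name.  IN  SOA  ns.domain.name.  hostmaster.domain.name.  (
    2.     2012010100    ; serial number (YYYYMMDDxx)
    3.     10800         ; refresh rate, in seconds
    4.     1800          ; retry, in seconds
    5.     1209600       ; expire, in seconds (two weeks)
    6.     604800        ; minimum, in seconds (one week)
    7. )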
NOTE Line numbers have been added to the preceding listing to aid readability.
The first line contains some details you need to pay attention to: domain.name is, of course, to be
replaced with your domain name. This is usually the same name that was specified in the zone
directive in the /etc/named.conf file. Notice that last period at the end of domain.name. It’s
supposed to be there—indeed, the DNS configuration files are extremely picky about it. The ending
period is necessary for the server to differentiate relative hostnames from fully qualified domain
names (FQDNs); for example, it signifies the difference between serverA and serverA.example.org.
IN tells the name server that this is an Internet record. There are other types of records, but it’s
been years since anyone has had a need for them. You can safely ignore them.
SOA tells the name server this is the Start of Authority record.
The ns.domain.name. is the FQDN for the name server for this domain (that would be the server
where this file will finally reside). Again, watch out and don’t miss that trailing period.
The hostmaster.domain.name. is the e-mail address for the domain administrator. Notice the
lack of an @ in this address. The @ symbol is replaced with a period. Thus, the e-mail address
referred to in this example is hostmaster@domain.name. The trailing period is used here, too.
The remainder of the record starts after the opening parenthesis on line 1. Line 2 is the serial
number. It is used to tell the name server when the file has been updated. Watch out—forgetting to
increment this number when you make a change is a mistake frequently made in the process of
managing DNS records. (Forgetting to put a period in the right place is another common error.)
NOTE To maintain serial numbers in a sensible way, use the date formatted in the following order:
YYYYMMDDxx. The tail-end xx is an additional two-digit number starting with 00, so if you make
multiple updates in a day, you can still tell which is which.
Line 3 in the list of values is the refresh rate in seconds. This value tells the secondary DNS
servers how often they should query the primary server to see if the records have been updated.
Line 4 is the retry rate in seconds. If the secondary server tries but cannot contact the primary
DNS server to check for updates, the secondary server tries again after the specified number of
seconds.
Line 5 specifies the expire directive. It is intended for secondary servers that have cached the
zone data. It tells these servers that if they cannot contact the primary server for an update, they should
discard the value after the specified number of seconds. One to two weeks is a good value for this
interval.
The final value (line 6, the minimum) tells caching servers how long they should wait before
expiring an entry if they cannot contact the primary DNS server. Five to seven days is a good
guideline for this entry.
TIP Don’t forget to place the closing parenthesis (line 7) after the final value.
NS: Name Server
The NS record is used for specifying which name servers maintain records for this zone. If any
secondary name servers exist that you intend to transfer zones to, they need to be specified here. The
format of this record is as follows:
IN NS ns1.domain.name.
IN NS ns2.domain.name.
You can have as many backup name servers as you’d like for a domain—at least two is a good
idea. Most ISPs are willing to act as secondary DNS servers if they provide connectivity for you.
A: Address Record
This is probably the most common type of record found in the wild. The A record is used to provide a
mapping from hostname to IP address. The format of an A address is simple:
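In schematic form:

    Host_name    IN    A    IP-Address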
For example, an A record for the host serverB.example.org, whose IP address is 192.168.1.2,
would look like this:
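That is:

    serverB    IN    A    192.168.1.2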
The equivalent of the IPv4 A resource record in the IPv6 world is called the AAAA (quad-A)
resource record. For example, a quad-A record for the host serverB whose IPv6 address is
2001:DB8::2 would look like this:
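That is:

    serverB    IN    AAAA    2001:DB8::2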
Note that any hostname is automatically suffixed with the domain name listed in the SOA record,
unless this hostname ends with a period. In the foregoing example for serverB, if the SOA record
prior to it is for example.org, then serverB is understood to be serverB.example.org. If you were to
change this to serverB.example.org (without a trailing period), the name server would understand it to
be serverB.example.org.example.org.—which is probably not what you intended! So if you want to
use the FQDN, be sure to suffix it with a period.
PTR: Pointer Record
The PTR record is for performing reverse name resolution, thereby allowing someone to specify an
IP address and determine the corresponding hostname. The format for this record is similar to the A
record, except with the values reversed:
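In schematic form:

    IP-Address    IN    PTR    Host_name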
The IP-Address can take one of two forms: just the last octet of the IP address (leaving the name
server to suffix it automatically with the information it has from the in-addr.arpa domain name) or the
full IP address, which is suffixed with a period. The Host_name must have the complete FQDN. For
example, the PTR record for the host serverB would be as follows:
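Using the last-octet form inside the 1.168.192.in-addr.arpa zone, for instance:

    2    IN    PTR    serverB.example.org.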
A PTR resource record for an IPv6 address in the ip6.arpa domain is expressed similarly to the
way it is done for an IPv4 address, but in reverse order. Unlike in the normal IPv6 way, the address
cannot be compressed or abbreviated; it is expressed in the so-called “reverse nibble format” (four-bit aggregation). Therefore, with a PTR record for the host with the IPv6 address 2001:DB8::2, the
address will have to be expanded to its equivalent of 2001:0db8:0000:0000:0000:0000:0000:0002.
For example, here’s the IPv6 equivalent for a PTR record for the host serverB with the IPv6
address 2001:DB8::2 (single line):
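A sketch of what the entry might look like (written as one line):
2.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.8.b.d.0.1.0.0.2.ip6.arpa.    IN    PTR    serverB.example.org.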
MX: Mail Exchanger
The MX record is in charge of telling other sites about your zone’s mail server. If a host on your
network generates an outgoing mail message with its hostname on it, someone returning a message
would not send it back directly to that host. Instead, the replying mail server would look up the MX
record for that site and send the message there. For example, MX records are used when a user’s
desktop named pc.domain.name sends a message using its PC-based mail client/reader, which cannot
accept Simple Mail Transfer Protocol (SMTP) mail; it’s important that the replying party have a
reliable way of knowing the identity of pc.domain.name’s mail server.
The format of the MX record is as follows:
Here, domainname. is the domain name of the site (with a period at the end, of course); the weight is
the importance of the mail server (if multiple mail servers exist, the one with the smallest number has
precedence over those with larger numbers); and the Host_name is, of course, the name of the mail
server. It is important that the Host_name have an A record as well.
Here’s an example entry:
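A sketch of the general format and a matching sample entry for this chapter's example.org setup (the weight of 10 is an arbitrary choice):
domainname.     IN    MX    weight    Host_name
example.org.    IN    MX    10        smtp.example.org.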
Typically, MX records occur close to the top of DNS configuration files. If a domain name is not
specified, the default name is pulled from the SOA record.
CNAME: Canonical Name
CNAME records allow you to create aliases for hostnames. A CNAME record can be regarded as an
alias. This is useful when you want to provide a highly available service with an easy-to-remember
name, but still give the host a real name.
Another popular use for CNAMEs is to “create” a new server with an easy-to-remember name
without having to invest in a new server at all. Here’s an example: Suppose a site has a web server
with a hostname of zabtsuj-content.example.org. It can be argued that zabtsuj-content.example.org is
neither a memorable nor user-friendly name. So since the system is a web server, a CNAME record,
or alias, of “www” can be created for the host. This will simply map the user-unfriendly name of
zabtsuj-content.example.org to a more user-friendly name of www.example.org. This will allow all
requests that go to www.example.org to be passed on transparently to the actual system that hosts the
web content—that is, zabtsuj-content.example.org.
Here’s the format for the CNAME record:
For our sample scenario, the CNAME entry will be
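A sketch of the format (the placeholder names are illustrative) followed by the entry for the sample scenario:
Alias_name    IN    CNAME    Real_name
www           IN    CNAME    zabtsuj-content.example.org.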
RP and TXT: The Documentation Entries
Sometimes it’s useful to provide contact information as part of your database—not just as comments,
but as actual records that others can query. This can be accomplished using the RP (Responsible
Person) and TXT records.
A TXT record is a free-form text entry into which you can place whatever information you deem
fit. Most often, you’ll want to put only contact information in these records. Each TXT record must be
tied to a particular hostname. Here’s an example:
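A sketch of what such an entry might look like (the contact text itself is made up):
serverA    IN    TXT    "Contact: admin@example.org"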
The RP record was created as an explicit container for a host’s contact information. This record
states who the responsible person is for the specific host; here’s an example:
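A sketch of an RP entry; the first field is the responsible person's e-mail address written as a domain name (admin@example.org becomes admin.example.org.), and the second field names a host that carries a related TXT record:
serverA    IN    RP    admin.example.org.    serverA.example.org.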
As useful as these records may be, they are a rarity these days, because it is perceived that they
give away too much information about the site that could lead to social engineering–based attacks.
You may find such records helpful in your internal DNS servers, but you should probably leave them
out of anything that someone could query from the Internet.
Setting up BIND Database Files
So now you know enough about all the DNS record types to get started. It’s time to create the actual
database that will feed the server. The database file format is not too strict, but some conventions
have jelled over time. Sticking to these conventions will make your life easier and will smooth the
way for the administrator who takes over your creation.
NOTE Remember to add comments liberally to the BIND configuration files. Comments in the named.conf file can begin with a pound sign (#) or double slashes (//), while comments in the zone database files begin with a semicolon (;). Even though there isn’t much mystery to what’s going on in a DNS database file, a history of the changes is a useful reference for what was being accomplished and why.
The database files are your most important configuration files. It is easy to create the forward
lookup databases; what usually gets left out are the reverse lookups. Some tools, such as Sendmail
and TCP Wrappers, will perform reverse lookups on IP addresses to see where people are coming
from, so it is a common courtesy to have this information.
Every database file should start with a $TTL entry. This entry tells BIND what the time-to-live
value is for each individual record whenever it isn’t explicitly specified. (The time-to-live, or TTL,
in the SOA record is for the SOA record only.) After the $TTL entry is the SOA record and at least
one NS record. Everything else is optional. (Of course, “everything else” is what makes the file
useful!) You might find the following general format helpful:
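A rough skeleton of that layout (every value shown is a placeholder):
$TTL 1D
@    IN    SOA    ns.domain.name.  hostmaster.domain.name. (
                  2012022701   ; serial
                  3H           ; refresh
                  1H           ; retry
                  1W           ; expire
                  1D )         ; minimum
     IN    NS     ns.domain.name.
; A, PTR, MX, CNAME, and other records follow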
Let’s walk through the process of building a complete DNS server from start to finish to
demonstrate how the information shown thus far comes together. For this example, we will build the
DNS servers for example.org that will accomplish the following goals:
Establish two name servers: ns1.example.org and ns2.example.org.
The name servers will be able to respond to queries for IPv6 records of which they are
aware.
Act as a slave server for the sales.example.org zone, where serverB.example.org will be the
master server.
Define A records for serverA, serverB, smtp, ns1, and ns2.
Define AAAA records (IPv6) for serverA-v6 and serverB-v6.
Define smtp.example.org as the mail exchanger (MX) for the example.org domain.
Define www.example.org as an alternative name (CNAME) for serverA.example.org, and
define ftp.example.org as an alternative name for serverB.example.org.
Define contact information for serverA.example.org.
Okay, Mr. Bond, you have your instructions. Go forth and complete the mission. Good luck!
Breaking out the Individual Steps
To accomplish our goal of setting up a DNS server for example.org, we will need to take a series of
steps. Let’s walk through them one at a time.
1. Make sure that you have installed the BIND DNS server software as described earlier in the
chapter. Use the rpm command to confirm this:
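For example (the package version shown in the output will vary):
rpm -q bind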
NOTE If you only built and installed BIND from source, the preceding rpm command will not reveal
anything, because the RPM database will not know anything about it. But you would know what you
installed and where.
2. Use any text editor you are comfortable with to create the main DNS server configuration file
—the /etc/named.conf file. Enter the following text into the file:
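The exact listing is not reproduced here, but a minimal sketch of such a configuration, based on the zones and file names used in this project, might look like the following; the slave zone's backup file name and the master's address (taken from serverB's A record) are assumptions:
options {
    directory "/var/named";
};

// root hints for resolving Internet names
zone "." IN {
    type hint;
    file "root.hints";
};

zone "localhost" IN {
    type master;
    file "localhost.db";
};

zone "0.0.127.in-addr.arpa" IN {
    type master;
    file "127.0.0.rev";
};

zone "example.org" IN {
    type master;
    file "example.org.db";
};

zone "1.168.192.in-addr.arpa" IN {
    type master;
    file "example.org.rev";
};

// slave for the sales.example.org zone, mastered on serverB
zone "sales.example.org" IN {
    type slave;
    masters { 192.168.1.2; };
    file "sales.example.org.bk";
};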
3. Save the preceding file as /etc/named.conf and exit the text editor.
4. Next we’ll need to create the actual database files referenced in the file sections of the
/etc/named.conf file. In particular, the files we want to create are root.hints, localhost.db,
127.0.0.rev, example.org.db, and example.org.rev. All the files will be stored in BIND’s
working directory, /var/named/. We’ll create them as they occur from the top of the
named.conf file to the bottom.
5. Thankfully, we won’t have to create the root hints file manually. Download the latest copy of
the root hints file from the Internet. Use the wget command to download and copy it in the
proper directory:
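One way to do this (the InterNIC site is the customary source for the root hints file):
wget http://www.internic.net/domain/named.root -O /var/named/root.hints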
6. Use any text editor you are comfortable with to create the zone file for the local host. This is
the localhost.db file. Enter the following text into the file:
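A minimal sketch of such a zone file (the serial number and timer values are placeholders):
$TTL 1W
@    IN    SOA    localhost.  root.localhost. (
                  2012022701   ; serial
                  3H           ; refresh
                  1H           ; retry
                  1W           ; expire
                  1D )         ; minimum
     IN    NS     localhost.
     IN    A      127.0.0.1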
7. Save the preceding file as /var/named/localhost.db and exit the text editor.
8. Use any text editor to create the zone file for the reverse lookup zone for the local host. This
is the 127.0.0.rev file. Enter the following text into the file:
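A minimal sketch of the matching reverse zone:
$TTL 1W
@    IN    SOA    localhost.  root.localhost. (
                  2012022701   ; serial
                  3H           ; refresh
                  1H           ; retry
                  1W           ; expire
                  1D )         ; minimum
     IN    NS     localhost.
1    IN    PTR    localhost.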
TIP It is possible to use abbreviated time values in BIND. For example, 3H means 3 hours, 2W
means 2 weeks, 30M implies 30 minutes, and so on.
9. Save the preceding file as /var/named/127.0.0.rev and exit the text editor.
10. Next, create the database file for the main zone of concern—that is, the example.org domain.
Use a text editor to create the example.org.db file, and input the following text into the file:
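A sketch of what such a zone file might contain, covering the goals listed earlier; serverB's IPv4 address and serverB-v6's IPv6 address come from the examples above, while serverA's addresses, the smtp/ns1/ns2 addresses, the serial number, and the contact names are assumptions:
$TTL 1W
@             IN    SOA    ns1.example.org.  hostmaster.example.org. (
                           2012022701   ; serial
                           3H           ; refresh
                           1H           ; retry
                           1W           ; expire
                           1D )         ; minimum
              IN    NS     ns1.example.org.
              IN    NS     ns2.example.org.
              IN    MX 10  smtp.example.org.
serverA       IN    A      192.168.1.1
serverB       IN    A      192.168.1.2
smtp          IN    A      192.168.1.25
ns1           IN    A      192.168.1.1
ns2           IN    A      192.168.1.2
serverA-v6    IN    AAAA   2001:DB8::1
serverB-v6    IN    AAAA   2001:DB8::2
www           IN    CNAME  serverA.example.org.
ftp           IN    CNAME  serverB.example.org.
serverA       IN    RP     hostmaster.example.org. serverA.example.org.
serverA       IN    TXT    "Contact: hostmaster@example.org"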
11. Save the preceding file as /var/named/example.org.db and exit the text editor.
12. Finally, create the reverse lookup zone file for the example.org zone. Use a text editor to
create the /var/named/example.org.rev file, and input the following text into the file:
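A sketch of the matching reverse zone, assuming serverA at 192.168.1.1 and serverB at 192.168.1.2:
$TTL 1W
@    IN    SOA    ns1.example.org.  hostmaster.example.org. (
                  2012022701   ; serial
                  3H           ; refresh
                  1H           ; retry
                  1W           ; expire
                  1D )         ; minimum
     IN    NS     ns1.example.org.
     IN    NS     ns2.example.org.
1    IN    PTR    serverA.example.org.
2    IN    PTR    serverB.example.org.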
13. We don’t have to create any files to be secondary for sales.example.org. We need to add
only the entries we already have in the named.conf file. (Although the log files will complain
about not being able to contact the master, this is okay, since we have only shown how to set
up the primary master for the zone for which our server is authoritative.)
The next step will demonstrate how to start the named service. But because the BIND
software is so finicky about its dots and semicolons, and because you might have had to type
in all the configuration files manually, chances are great that you invariably made some typos
(or we made some typos ourselves). So your best bet will be to monitor the system log files
carefully to view error messages as they are being generated in real time.
14. Use the tail command in another terminal window to view the logs, and then issue the
command in the next step in a separate window so that you can view both simultaneously. In
your new terminal window, type the following:
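On a Fedora-type system, for example:
tail -f /var/log/messages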
15. We are ready to start the named service at this point. Use the service command to launch the
service:
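For example:
service named start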
On a systemd-enabled distro, you can use the following command instead to start up the
named service:
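For example:
systemctl start named.service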
TIP On an openSUSE system, the equivalent command will be
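If the rcnamed convenience script is present on your openSUSE release (an assumption; check your system), that would be something like:
rcnamed start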
16. If you get a bunch of errors in the system logs, you will find that the logs will usually tell you
the line number and/or the type of error. So fixing the errors shouldn’t be too hard. Just go
back and add the dots and semicolons where they ought to be. Another common error is
misspelling the configuration file’s directives—for example, writing “master” instead of
“masters”; though both are valid directives, each is used in a different context.
TIP If you have changed BIND’s configuration files (either the main named.conf or the database
file), you will need to tell it to reread them by sending the named process a HUP signal. Begin by
finding the process ID (PID) for the named process. This can be done by looking for it in /var/run/
named/named.pid. If you do not see it in the usual location, you can run the following command to get
it:
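One way to do this:
ps -C named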
The value under the PID column is the process ID of the named process. This is the PID to which you
want to send a HUP signal. You can then send it a HUP signal by typing # kill -HUP 7706. Of
course, replace 7706 with the correct process ID from your output.
17. Finally you might want to make sure that your DNS server service starts up during the next
system reboot. Use the chkconfig command:
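For example:
chkconfig named on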
On a systemd-enabled distro, you can use the following command instead to ensure that
named automatically starts up with the system boot:
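For example:
systemctl enable named.service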
The next section will walk you through the use of tools that can be used to test or query a DNS
server.
The DNS Toolbox
This section describes a few tools that you’ll want to get acquainted with as you work with DNS.
They’ll help you to troubleshoot problems more quickly.
host
The host tool is really a simple utility to use. Its functionality can, of course, be extended by using it
with its various options. Its options and syntax are shown here:
In its simplest use, host allows you to resolve hostnames into IP addresses from the command
line. Here’s an example:
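For example, querying the local name server directly:
host serverA.example.org localhost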
You can also use host to perform reverse lookups. Here’s an example:
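For example, using serverB's address from earlier:
host 192.168.1.2 localhost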
The host command can also be used to query for IPv6 records. For example, to query (on its
listening IPv6 interface) a name server (::1) for the IPv6 address for the host serverB-v6.example.org, you can run the following:
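For example:
host -t AAAA serverB-v6.example.org ::1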
To query for the PTR record for serverB-v6, you can use the following:
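For example:
host 2001:db8::2 ::1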
dig
The domain information groper, dig, is a great tool for gathering information about DNS servers.
This tool has the BIND group’s blessing and official stamp.
Its syntax and some of its options are shown here (see the dig man page for the meaning of the
various options):
Here’s dig’s usage summary:
Here, <server> is the name of the DNS server you want to query, domain is the domain name you
are interested in querying, and query-type is the name of the record you are trying to get (A, MX,
NS, SOA, HINFO, TXT, ANY, and so on).
For example, to get the MX record for the example.org domain we established in the earlier
project from the DNS server we set up, you would issue the dig command like this:
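For example:
dig @localhost example.org MX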
To query our local DNS server for the A records for the yahoo.com domain, simply type this:
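For example:
dig @localhost yahoo.com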
NOTE Notice that for the preceding command, we didn’t specify the query type—that is, we didn’t
explicitly specify an A-type record. The default behavior for dig is to assume you want an A-type
record when nothing is specified explicitly. You might also notice that we are querying our DNS
server for the yahoo.com domain. Our server is obviously not authoritative for the yahoo.com domain,
but because we also configured it as a caching-capable DNS server, it is able to obtain the proper
answer for us from the appropriate DNS servers.
To query our local IPv6-capable DNS server for the AAAA record for the host serverB-v6.example.org, type the following:
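For example:
dig @::1 serverB-v6.example.org AAAA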
To reissue one of the previous commands, but this time suppress all verbosity using one of dig’s
options (+short), type this:
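For example:
dig @::1 serverB-v6.example.org AAAA +short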
To query the local name server for the reverse lookup information (PTR RR) for 192.168.1.1,
type this:
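For example:
dig @localhost -x 192.168.1.1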
To query the local name server for the IPv6 reverse lookup information (PTR RR) for
2001:db8::2, type this:
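For example:
dig @localhost -x 2001:db8::2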
The dig program is incredibly powerful. Its options are too numerous to cover properly here.
Read the man page installed with dig to learn how to use some of its more advanced features.
nslookup
The nslookup utility is one of the tools that you will find exists across various operating system
platforms, so it is probably one of the most familiar tools for many. Its usage is quite simple, too. It
can be used both interactively and noninteractively (that is, directly from the command line).
Interactive mode is entered when no arguments are provided to the command. Typing nslookup
by itself at the command line will drop you to the nslookup shell. To get out of interactive mode, just
type exit at the nslookup prompt.
TIP When nslookup is used in interactive mode, the command to quit the utility is exit. But most
people will often instinctively issue the quit command to try to exit the interactive mode. nslookup
will think it is being asked to do a DNS lookup for the hostname “quit.” It will eventually time out.
You can create a DNS record that will immediately remind the user of the proper command to use. An
entry like this in the zone file for your domain will suffice:
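One way to do this might be a CNAME entry whose target name is simply the reminder text (it does not need to resolve to anything):
quit    IN    CNAME    use-exit-to-quit-nslookup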
With the preceding entry in the zone file, whenever anybody queries your DNS server using
nslookup interactively and then mistakenly issues the quit command, the user will get a gentle
reminder that says “use-exit-to-quit-nslookup.”
Usage for the noninteractive mode is summarized here:
For example, to use nslookup noninteractively to query our local name server for information
about the host www.example.org, you’d type this:
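For example:
nslookup www.example.org localhost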
NOTE The BIND developer group frowns on use of the nslookup utility. It has officially been
deprecated.
whois
The whois command is used for determining ownership of a domain. Information about a domain’s
owner isn’t a mandatory part of its records, nor is it customarily placed in the TXT or RP records. So
you’ll need to gather this information using the whois technique, which reports the domain’s actual owner, snail-mail address, e-mail address, and technical contact phone numbers.
Let’s try an example to get information about the example.com domain:
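For example:
whois example.com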
nsupdate
An often-forgotten but powerful DNS utility is nsupdate. It is used to submit Dynamic DNS (DDNS)
Update requests to a DNS server. It allows the resource records (RR) to be added or removed from a
zone without your needing to edit the zone database files manually. This is especially useful because
DDNS-type zones should not be edited or updated by hand, since the manual changes are bound to
conflict with the dynamic updates that are automatically maintained in journal files, which can result
in zone data being corrupt.
The nsupdate program reads input from a specially formatted file or from standard input. Here’s
the syntax for the command:
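A simplified sketch of the usage (see the nsupdate man page for the complete option list):
nsupdate [-d] [-v] [-y [hmac:]keyname:secret | -k keyfile] [filename]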
The rndc Tool
The remote name daemon control utility is handy for controlling the name server and also debugging
problems with the name server.
The rndc program can be used to manage the name server securely. A separate configuration file
is required for rndc, because all communication with the server is authenticated with digital
signatures that rely on a shared secret, which is typically stored in a configuration file named
/etc/rndc.conf. You will need to generate the secret that is shared between the utility and the name
server by using tools such as rndc-confgen (we don’t discuss this feature here).
Following is the usage summary for rndc:
You can use rndc, for example, to view the status of the DNS server:
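For example:
rndc status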
If, for example, you make changes to the zone database file (/var/named/example.org.db) for one
of the zones under your control (such as example.org) and you want to reload just that zone without
restarting the entire DNS server, you can issue the rndc command with the option shown here:
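For example:
rndc reload example.org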
CAUTION Remember to increment the serial number of the zone after making any changes to it!
Configuring DNS Clients
In this section, we’ll delve into the wild and exciting process of configuring DNS clients! Okay,
maybe it’s not that exciting—but there’s no denying the clients’ significance to the infrastructure of
any networked site.
The Resolver
So far, we’ve been studying servers and the DNS tree as a whole. The other part of this equation is,
of course, the client—the host that’s contacting the DNS server to resolve a hostname into an IP
address.
NOTE You might have noticed earlier in the section “The DNS Toolbox” that most of the queries we
were issuing were being made against the DNS server called localhost. Localhost is, of course, the
local system whose shell you are executing the query commands from. In our case, hopefully, this
system is serverA.example.org! The reason we specified the DNS server to use was that, by default,
the system will query whatever the host’s default DNS server is. And if it so happens that your host’s
DNS server is some random DNS server that your ISP has assigned you, some of the queries will fail,
because your ISP’s DNS server will not know about the zone you manage and control locally. So if
we configure our local system to use our local DNS server to process all DNS-type queries, we won’t
have to specify “localhost” manually any longer. This is called configuring the resolver.
Under Linux, the resolver handles the client side of DNS. This is actually part of a library of C
programming functions that get linked to a program when the program is started. Because all of this
happens automatically and transparently, the user doesn’t have to know anything about it. It’s simply a
little bit of magic that lets users start browsing the Internet.
From the system administrator’s perspective, configuring the DNS client isn’t magic, but it’s
straightforward. Only two files are involved: /etc/resolv.conf and /etc/nsswitch.conf.
The /etc/resolv.conf File
The /etc/resolv.conf file contains the information necessary for the client to know what its local DNS
server is. (Every site should have, at the very least, its own caching DNS server.) This file has two
lines: the first indicates the default search domain, and the second indicates the IP address of the
host’s name server.
The default search domain applies mostly to sites that have their own local servers. When the
default search domain is specified, the client side will automatically append this domain name to the
requested site and check that first. For example, if you specify your default domain to be “yahoo.com”
and then try to connect to the hostname “my,” the client software will automatically try contacting
“my.yahoo.com.” Using the same default, if you try to contact the host “www.stat.net,” the software
will try “www.stat.net.yahoo.com” (a perfectly legal hostname), find that it doesn’t exist, and then try
“www.stat.net” alone (which does exist).
Of course, you can supply multiple default domains. However, doing so will slow the query
process a bit, because each domain will need to be checked. For instance, if both example.org and
stanford.edu are specified, and you perform a query on www.stat.net, you’ll get three queries:
www.stat.net.example.org, www.stat.net.stanford.edu, and www.stat.net.
The format of the /etc/resolv.conf file is as follows:
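The two lines look like this:
search domainname
nameserver IP-address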
Here, domainname is the default domain name to search, and IP-address is the IP address of your
DNS server. For example, here’s a sample /etc/resolv.conf file:
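A sample matching this chapter's setup, where the local system runs its own name server, might be:
search example.org
nameserver 127.0.0.1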
Thus, when a name lookup query is needed for serverB.example.org, only the host part is needed
—that is, serverB. The example.org suffix will be automatically appended to the query. Of course,
this is valid only at your local site, where you have control over how clients are configured!
The /etc/nsswitch.conf File
The /etc/nsswitch.conf file tells the system where it should look up certain kinds of configuration
information (services). When multiple locations are identified, the /etc/nsswitch.conf file also
specifies the order in which the information can best be found. Typical configuration files that are set
up to use /etc/nsswitch.conf include the password file, group file, and hosts file. (To see a complete
list, open the file in your favorite text editor.)
The format of the /etc/nsswitch.conf file is simple. The service name comes first on a line (note
that /etc/nsswitch.conf applies to more than just hostname lookups), followed by a colon. Next are
the locations that contain the information. If multiple locations are identified, the entries are listed in
the order in which the system needs to perform the search. Valid entries for locations include files, nis, nisplus, dns, and [NOTFOUND=action]. Comments begin with a pound symbol (#).
For example, if you open the file with your favorite editor, you might see a line similar to this:
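For instance:
hosts:      files nisplus nis dns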
This line tells the system that all hostname lookups should first start with the /etc/hosts file. If the
entry cannot be found there, NISPLUS is checked. If the host cannot be found via NISPLUS, regular
NIS is checked, and so on. It’s possible that NISPLUS isn’t running at your site and you want the
system to check DNS records before it checks NIS records. In this case, you’d change the line to this:
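For example:
hosts:      files dns nis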
And that’s it. Save your file, and the system automatically detects the change.
The only recommendation for this line is that the hosts file (files) should always come first in the
lookup order.
What’s the preferred order for NIS and DNS? This depends on the site. Whether you want to
resolve hostnames with DNS before trying NIS will depend on whether the DNS server is closer than
the NIS server in terms of network connectivity, if one server is faster than another, or if firewall
issues, site policy issues, and other such factors exist.
Using [NOTFOUND=action]
In the /etc/nsswitch.conf file, you’ll see entries that end in [NOTFOUND=action]. This is a special
directive that allows you to stop the process of searching for information after the system has failed
all prior entries. The action can be either to return or continue. The default action is to continue.
For example, if your file contains the line
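For example:
hosts:      files [NOTFOUND=return] nis dns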
the system will try to look up host information in the /etc/hosts file only. If the requested information
isn’t found there, NIS and DNS won’t be searched.
Configuring the Client
Let’s walk through the process of configuring a Linux client to use a DNS server. We’ll assume that
we are using the DNS server on serverA and we are configuring serverA itself to be the client. This
may sound a bit odd at first, but it is important for you to understand that just because a system runs
the server does not mean it cannot run the client. Think of it in terms of running a web server—just
because a system runs Apache doesn’t mean you can’t run Firefox on the same machine and access the
web sites hosted locally on the machine via the loop-back address (127.0.0.1)!
Breaking out the steps to configuring the client, we see the following:
1. Edit /etc/resolv.conf and set the nameserver entry to point to your DNS server. Per our
example:
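Since serverA runs its own name server, the entry can simply point at the loopback address (the search line is optional but convenient):
search example.org
nameserver 127.0.0.1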
2. Look through the /etc/nsswitch.conf file to make sure that DNS is consulted for hostname
resolutions:
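One quick way to check (the hosts line should include dns, as it does in this sample output):
grep hosts /etc/nsswitch.conf
hosts:      files dns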
If you don’t have dns listed, as in this output, use any text editor to include dns on the hosts
line.
3. Test the configuration with the dig utility:
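For example:
dig +short serverA.example.org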
Notice that you didn’t have to specify explicitly the name server to use with dig (such as dig
@localhost +short serverA.example.org) for the preceding query. This is because dig will by
default use (query) the DNS server specified in the local /etc/resolv.conf file.
Summary
This chapter covered all the information you’ll need to get a basic DNS server infrastructure up and
running. We also discussed:
Name resolution over the Internet
Obtaining and installing the BIND name server
The /etc/hosts file
The process of configuring a Linux client to use DNS
Configuring DNS servers to act as primary, secondary, and caching servers
Various DNS record types for IPv4 and IPv6
Configuration options in the named.conf file
Tools for use in conjunction with the DNS server to do troubleshooting
Client-side name resolution issues
Additional sources of information
With the information available in the BIND documentation on how the server should be
configured, along with the actual configuration files for a complete server presented in this chapter,
you should be able to go out and perform a complete installation from start to finish.
Like any software, nothing is perfect, and problems can occur with BIND and the related files and
programs discussed here. Don’t forget to check out the main BIND web site (www.isc.org) as well as
the various mailing lists dedicated to DNS and the BIND software for additional information.
CHAPTER 17
FTP
The File Transfer Protocol (FTP) has existed for the Internet since around 1971. Remarkably,
the underlying protocol itself has undergone little change since then. Clients and servers, on the
other hand, have been almost constantly improved and refined. This chapter covers the Very
Secure FTP Daemon (vsftpd) software package.
The vsftpd program is a fairly popular FTP server implementation and is being used by major
FTP sites such as kernel.org, redhat.com, isc.org, and openbsd.org. The fact that these sites run the
software attests to its robustness and security. As the name implies, the vsftpd software was designed
from the ground up to be fast, stable, and secure.
NOTE Like most other services, vsftpd is only as secure as you make it. The authors of the program
have provided all of the necessary tools to make the software as secure as possible out of the box, but
a bad configuration can cause your site to become vulnerable. Remember to double-check your
configuration and test it out before going live. Also remember to check the vsftpd web site frequently
for any software updates.
This chapter will discuss how to obtain, install, and configure the latest version of vsftpd. We
will show you how to configure it for private access as well as anonymous access. And, finally,
you’ll learn how to use an FTP client to test out your new FTP server.
The Mechanics of FTP
The act of transferring a file from one computer to another may seem trivial, but in reality, it is not—
at least, not if you’re doing it right. This section steps through the details of the FTP client/server
interaction. Although this information isn’t crucial to your being able to get an FTP server up and
running, it is important when you need to consider security issues as well as troubleshooting issues—
especially troubleshooting issues that don’t clearly manifest themselves as FTP-related. (Is the
problem with the network, or is it the FTP server, or is it the FTP client?)
Client/Server Interactions
The original design of FTP, which was conceived in the early 1970s, assumed something that was
reasonable for a long time on the Internet: Internet users are a friendly, happy-go- lucky, do-no-evil
bunch.
After the commercialization of the Internet around 1990–91, the Internet became much more
popular. With the coming of the World Wide Web, the Internet’s user population and popularity
increased even more. Along with this came hitherto relatively unknown security problems. These
security problems are some of the many reasons firewalls are a standard on most networks.
The original design of FTP does not play very well with the hostile Internet environment that we
have today, which necessitates the use of firewalls. Inasmuch as FTP facilitates the exchange of files
between an FTP client and an FTP server, its design has some built-in nuances that are worthy of
further mention.
One of its nuances stems from the fact that it uses two ports: a control port (port 21) and a data
port (port 20). The control port serves as a communication channel between the client and the server
for the exchange of commands and replies, and the data port is used purely for the exchange of data,
which can be a file, part of a file, or a directory listing.
FTP can operate in two modes: active FTP mode and passive FTP mode.
Active FTP
Active FTP was traditionally used in the original FTP specifications. In this mode, the client connects
from an ephemeral port (number greater than 1024) to the FTP server’s command port (port 21).
When the client is ready to transfer data, the server opens a connection from its data port (port 20) to
the IP address and ephemeral port combination provided by the client. The key here is that the client
does not make the actual data connection to the server but instead informs the server of its own port
by issuing the PORT command; the server then connects back to the specified port. The server can be
regarded as the active party (or the agitator) in this FTP mode.
From the perspective of an FTP client that is behind a firewall, the active FTP mode poses a
slight problem: the firewall on the client side might frown upon (or disallow) connections originating
or initiated from the Internet from a privileged service port (such as data port 20) to nonprivileged
service ports on the clients it is supposed to protect.
Passive FTP
The FTP client issues the PASV command to indicate that it wants to access data in the passive mode,
and the server then responds with an IP address and an ephemeral port number on itself to which the
client can connect to transfer the data. The PASV command issued by the client tells the server to
“listen” on a data port that is not its normal data port (that is, port 20) and to wait for a connection
rather than initiate one. The key difference here is that it is the client that initiates the connection to the
port and IP address provided by the server. And in this regard, the server can be considered the
passive party in the data communication.
From the perspective of an FTP server that is behind a firewall, passive FTP mode is a little
problematic, because a firewall’s natural instinct would be to disallow connections that originate
from the Internet that are destined for ephemeral ports of the systems that it is supposed to protect. A
typical symptom of this behavior occurs when a client appears to be able to connect to the server
without a problem, but the connection seems to hang whenever an attempt to transfer data occurs.
To address some of the issues pertaining to FTP and firewalls, many firewalls implement
application-level proxies for FTP, which keep track of FTP requests and open up those high ports
when needed to receive data from a remote site.
Obtaining and Installing vsftpd
The vsftpd package is the FTP server software that ships with most popular and modern Linux
distributions. The latest version of the software can be obtained from its official web site,
http://vsftpd.beasts.org. The web site also hosts great documentation and the latest news about the
software. But because vsftpd is the FTP server solution that ships with Fedora, you can easily install
it from the installation media or directly from any Fedora software package repository. In this section
and the next, you will learn how to install/configure the software from the prepackaged binary.
Let’s start with the process of installing the software from a Red Hat Package Manager (RPM)
binary.
1. While logged into the system as the superuser, use the yum command to download and install
vsftpd simultaneously. Type the following (enter y for “yes” when prompted):
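For example:
yum install vsftpd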
NOTE Depending on your Fedora version, you can also manually download the software from a
Fedora repository on the Internet—for example, from here:
http://download.fedora.redhat.com/pub/fedora/linux/releases/<VERSION>/Fedora/x86_64/os/Package
Alternatively, you can install directly from the mounted install media (CD or DVD). The software
will be under the /<your_media_mount_point>/Packages/ directory.
2. Confirm that the software has been installed:
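For example:
rpm -q vsftpd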
On a Debian-based distribution such as Ubuntu, vsftpd can be installed by typing the
following:
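For example:
sudo apt-get install vsftpd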
Configuring vsftpd
After we have installed the software, our next step is to configure it for use. When we installed the
vsftpd software, we also installed other files and directories on the local file system. Some of the
more important files and directories that come installed with the vsftpd RPM are discussed in Table
17-1.
File: /usr/sbin/vsftpd
Description: This is the main vsftpd executable. It is the daemon itself.

File: /etc/vsftpd/vsftpd.conf
Description: This is the main configuration file for the vsftpd daemon. It contains the many directives that control the behavior of the FTP server.

File: /etc/vsftpd/ftpusers
Description: Text file that stores the list of users not allowed to log into the FTP server. This file is referenced by the Pluggable Authentication Module (PAM) system.

File: /etc/vsftpd/user_list
Description: Text file used either to allow or deny access to the users listed. Access is denied or allowed according to the value of the userlist_deny directive in the vsftpd.conf file.

File: /var/ftp
Description: This is the FTP server’s working directory.

File: /var/ftp/pub
Description: This serves as the directory that holds files meant for anonymous access to the FTP server.

Table 17-1. The vsftpd Configuration Files and Directories
The vsftpd.conf Configuration File
As stated, the main configuration file for the vsftpd FTP server is vsftpd.conf. Performing an
installation of the software via RPM will usually place this file in the /etc/vsftpd/ directory.
On Debian-like systems, the configuration file is located at /etc/vsftpd.conf. The file is quite
easy to manage and understand, and it contains pairs of options (directives) and values that are in this
simple format:
option=value
CAUTION vsftpd.conf options and values have a very finicky syntax. No space(s) should appear
between the option directive, the equal sign (=), and the value. Having any spaces therein can prevent
the vsftpd FTP daemon from starting up!
As with most other Linux/UNIX configuration files, comments in the file are denoted by lines that
begin with the pound sign (#). To see the meaning of each of the directives, you should consult the
vsftpd.conf man page, using the man command like so:
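For example:
man vsftpd.conf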
NOTE vsftpd configuration files are located directly under the /etc directory on Debian-like
systems. For example, the equivalent of the /etc/vsftpd/ftpusers in Fedora is located at /etc/
ftpusers in Ubuntu.
The options (or directives) in the /etc/vsftpd/vsftpd.conf file can be categorized according to the
role they play. Some of these categories are discussed in Table 17-2.
Table 17-2. Configuration Options for vsftpd
NOTE The possible values of the options in the configuration file can also be divided into three
categories: the Boolean options (such as YES, NO), the Numeric options (such as 007, 700), and the
String options (such as root, /etc/vsftpd.chroot_list).
Starting and Testing the FTP Server
Because it comes with some default settings that allow it to hit the ground running, the vsftpd daemon
is pretty much ready to run out of the box.
Of course, we’ll need to start the service. Once you’ve learned how to start the daemon, the rest
of this section will walk through testing the FTP server by connecting to it using an FTP client.
So let’s start a sample anonymous FTP session. But first we’ll start the FTP service.
1. Start the FTP service:
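For example:
service vsftpd start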
On Linux distributions running systemd, you can alternatively start the vsftpd daemon using
the systemctl command:
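For example:
systemctl start vsftpd.service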
TIP If the service command is not available on your Linux distribution, you might be able to
control the service by directly executing its run control script. For example, you may be able to restart
vsftpd by issuing the command
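For example:
/etc/init.d/vsftpd restart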
TIP The ftp daemon is automatically started right after installing the software in Ubuntu via apt-get. So check to make sure it isn’t already running before trying to start it again. You can examine the output of the command ps aux | grep vsftpd to check this.
2. Launch the command-line FTP client program, and connect to the local FTP server as an
anonymous user:
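For example:
ftp localhost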
3. Enter the name of the anonymous FTP user when prompted—that is, type ftp:
TIP Most FTP servers that allow anonymous logins often also permit the implicit use of the
username “anonymous.” So instead of supplying “ftp” as the username to use to connect anonymously
to our sample Fedora FTP server, we can instead use the popular username “anonymous.”
4. Enter anything at all when prompted for the password:
5. Use the ls (or dir) FTP command to perform a listing of the files in the current directory on
the FTP server:
6. Use the pwd command to display our present working directory on the FTP server:
7. Using the cd command, try to change to a directory outside of the allowed anonymous FTP
directory; for example, try to change the directory to the /boot directory of the local file
system:
8. Log out of the FTP server using the bye FTP command:
Next we’ll try to connect to the FTP server using a local system account. In particular, we’ll use
the username “yyang,” which was created in a previous chapter. So let’s start a sample authenticated
FTP session.
TIP You might have to disable SELinux temporarily on your Fedora server for the following steps.
Use the command setenforce 0 to disable SELinux.
1. Launch the command-line ftp client program again:
2. Enter yyang as the FTP user when prompted:
3. You must enter the password for the user yyang when prompted:
4. Use the pwd command to display your present working directory on the FTP server. You will
notice that the directory shown is the home directory for the user yyang.
5. Using the cd command, try to change to a directory outside of yyang’s FTP home directory;
for example, try to change the directory to the /boot directory of the local file system:
6. Log out of the FTP server using the bye FTP command:
As demonstrated by these sample FTP sessions, the default vsftpd configuration on our sample
Fedora system allows these things:
Anonymous FTP access Any user from anywhere can log into the server using the username
ftp (or anonymous), with anything at all for a password.
Local user logins All valid users on the local system with entries in the user database (the
/etc/passwd file) are allowed to log into the FTP server using their normal usernames and
passwords. This is true with SELinux in permissive mode. On our sample Ubuntu server, this
behavior is disabled out of the box.
Customizing the FTP Server
The default out-of-the-box behavior of vsftpd is probably not what you want for your production FTP
server. So in this section we will walk through the process of customizing some of the FTP server’s
options to suit certain scenarios.
Setting up an Anonymous-Only FTP Server
First we’ll set up our FTP server so that it does not allow access to users that have regular user
accounts on the system. This type of FTP server is useful for large sites that have files that should be
available to the general public via FTP. In such a scenario, it is, of course, impractical to create an
account for every single user when users can potentially number into the thousands.
Fortunately for us, vsftpd is ready to serve as an anonymous FTP server out of the box. But we’ll
examine the configuration options in the vsftpd.conf file that ensure this and also disable the options
that are not required.
With any text editor of your choice, open up the /etc/vsftpd/vsftpd.conf file for editing. Look
through the file and make sure that, at a minimum, the directives listed next are present. (If the
directives are present but commented out, you might need to remove the comment symbol [#] or
change the value of the option.)
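A sketch of the kind of directives involved; all of the names are standard vsftpd options, and the essential pair is anonymous_enable=YES together with local_enable=NO:
# run standalone and allow only anonymous logins
listen=YES
anonymous_enable=YES
local_enable=NO
# forbid uploads and any other write operations
write_enable=NO
anon_upload_enable=NO
anon_mkdir_write_enable=NO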
You will find that these options are sufficient to enable your anonymous-only FTP server, so you
may choose to overwrite the existing /etc/vsftpd/vsftpd.conf file and enter just the options shown.
This will help keep the configuration file simple and uncluttered.
TIP Virtually all Linux systems come preconfigured with a user called “ftp.” This account is
supposed to be a nonprivileged system account and is used especially for anonymous FTP-type
access. You will need this account to exist on your system in order for anonymous FTP to work. To
confirm the account exists, use the getent utility. Type
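For example; on a stock Fedora system the output typically looks something like this:
getent passwd ftp
ftp:x:14:50:FTP User:/var/ftp:/sbin/nologin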
If you don’t get output similar to this, you can quickly create the FTP system account with the
useradd command. To create a suitable ftp user, type
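A sketch of such a command (adjust the home directory and shell to your system's conventions):
useradd -r -d /var/ftp -s /sbin/nologin ftp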
If you had to make any modifications to the /etc/vsftpd/vsftpd.conf file, you need to restart the
vsftpd service:
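For example:
service vsftpd restart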
On distros using systemd as the service manager, to restart vsftpd you can run:
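For example:
systemctl restart vsftpd.service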
If the service command is not available on your Linux distribution, you may be able to control the
service by directly executing its run control script. For example, you may be able to restart vsftpd by
issuing this command:
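For example:
/etc/init.d/vsftpd restart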
Setting up an FTP Server with Virtual Users
Virtual users are users that do not actually exist—that is, these users do not have any privileges or
functions on the system other than those for which they were created. This type of FTP setup serves as
a midway point between enabling users with local system accounts access to the FTP server and
enabling only anonymous users. If there is no way to guarantee the security of the network connection
from the user end (FTP client) to the server end (FTP server), it would be foolhardy to allow users
with local system accounts to log into the FTP server. This is because the FTP transaction between
both ends usually occurs in plain text. Of course, this is relevant only if the server contains any data
of value to its owners!
The use of virtual users will allow a site to serve content that should be accessible to untrusted
users, but still make the FTP service accessible to the general public. In the event that the credentials
of the virtual user(s) ever become compromised, you can at least be assured that only minimal
damage can occur.
TIP It is also possible to set up vsftpd to encrypt all the communication between itself and any FTP
clients by using Secure Sockets Layer (SSL). This is quite easy to set up, but the caveat is that the
clients’ FTP application must also support this sort of communication—and unfortunately, not many
FTP client programs have this support. If security is a serious concern, you might consider using
OpenSSH’s sftp program instead for simple file transfers.
In this section we will create two sample virtual users named “ftp-user1” and “ftp-user2.” These
users will not exist in any form in the system’s user database (the /etc/passwd file).
The following steps detail the process to achieve this:
1. Create a plain-text file that contains the username and password combinations of the virtual
users. Each username with its associated password will be on alternating lines in the file. For
example, for the user ftp-user1, the password will be “user1,” and for the user ftp-user2, the
password will be “user2.”
Name the file plain_vsftpd.txt. Use any text editor of your choice to create the file. Here we
use vi:
2. Enter this text into the file:
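Per the username/password pairs described in step 1:
ftp-user1
user1
ftp-user2
user2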
3. Save the changes to the file, and exit the text editor.
4. Convert the plain-text file that was created in Step 1 into a Berkeley DB format (db) that can
be used with the pam_userdb.so library. The output will be saved in a file called
hash_vsftpd.db stored under the /etc directory. Type the following:
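A sketch of such a command, assuming plain_vsftpd.txt is in the current working directory:
db_load -T -t hash -f plain_vsftpd.txt /etc/hash_vsftpd.db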
NOTE On Fedora systems, you need to have the db4-utils package installed to have the db_load
program. You can quickly install it using Yum with the command yum install db4-utils. Or look
for it on the installation media. The equivalent package in Ubuntu is called db4.9-util and the binary
is named db4.9_load.
5. Restrict access to the virtual users database file by giving it more restrictive permissions.
This will ensure that it cannot be read by any casual user on the system. Type the following:
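For example:
chmod 600 /etc/hash_vsftpd.db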
6. Next, create a PAM file that the FTP service will use as the new virtual users database file.
We’ll name the file virtual-ftp and save it under the /etc/pam.d/ directory. Use any text editor
to create the file.
7. Enter this text into the file:
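A sketch of the two required lines; note that pam_userdb expects the database path without the .db suffix:
auth      required     pam_userdb.so    db=/etc/hash_vsftpd
account   required     pam_userdb.so    db=/etc/hash_vsftpd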
These entries tell the PAM system to authenticate users using the new database stored in the
hash_vsftpd.db file.
8. Make sure that the changes have been saved into a file named virtual-ftp under the
/etc/pam.d/ directory.
9. Let’s create a home environment for our virtual FTP users. We’ll cheat and use the existing
directory structure of the FTP server to create a subfolder that will store the files that we want
the virtual users to be able to access. Type the following:
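For example, using a hypothetical subfolder named private:
mkdir -p /var/ftp/private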
TIP We cheated in step 9 so that we won’t have to go through the process of creating a guest FTP
user that the virtual users will eventually map to, and also to avoid having to worry about permission
issues, since the system already has an FTP system account that we can safely leverage. Look for the
guest_username directive under the vsftpd.conf man page for further information (man
vsftpd.conf).
10. Now we’ll create our custom vsftpd.conf file that will enable the entire setup.
11. With any text editor, open the /etc/vsftpd/vsftpd.conf file for editing. Look through the file
and make sure that, at a minimum, the directives listed next are present. (If the directives are
present but commented out, you may need to remove the comment sign or change the value of
the option.) Comments have been added to explain the less-obvious directives.
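A sketch of what such a configuration might look like; every directive name is a standard vsftpd option, and the local_root path refers to the hypothetical subfolder from step 9:
# run standalone; no anonymous logins
listen=YES
anonymous_enable=NO
# non-anonymous (PAM-based) logins are required for virtual users
local_enable=YES
write_enable=NO
# map all virtual users to the unprivileged ftp system account
guest_enable=YES
guest_username=ftp
# use the PAM file created earlier under /etc/pam.d/
pam_service_name=virtual-ftp
# treat /etc/vsftpd.user_list as a list of allowed users
userlist_enable=YES
userlist_deny=NO
userlist_file=/etc/vsftpd.user_list
# restrict the virtual users to the subfolder created in step 9
local_root=/var/ftp/private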
TIP If you choose not to edit the existing configuration file and create one from scratch, you will
find that the options specified here will serve your purposes with nothing additional needed. The
vsftpd software will simply assume its built-in defaults for any option that you didn’t specify in the
configuration file! You can, of course, leave out all the commented lines to save yourself the typing.
12. We’ll need to create (or edit) the /etc/vsftpd.user_list file that was referenced in the
configuration in step 10. To create the entry for the first virtual user, type this:
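For example:
echo "ftp-user1" >> /etc/vsftpd.user_list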
13. To create the entry for the second virtual user, type this:
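For example:
echo "ftp-user2" >> /etc/vsftpd.user_list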
14. We are ready to fire up or restart the FTP server now. Type this:
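For example:
service vsftpd restart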
15. We will now verify that the FTP server is behaving the way we want it to by connecting to it
as one of the virtual FTP users. Connect to the server as ftp-user1 (remember that the FTP
password for that user is “user1”).
16. We’ll also test to make sure that anonymous users cannot log into the server:
17. We’ll finally verify that local users (for example, the user yyang) cannot log into the server:
Everything looks fine.
TIP vsftpd is an IPv6-ready daemon. Enabling the FTP server to listen on an IPv6 interface is as
simple as enabling the proper option in the vsftpd configuration file. The directive to enable is
listen_ipv6 and its value should be set to YES, like so: listen_ipv6=YES.
To have the vsftpd software support IPv4 and IPv6 simultaneously, you will need to spawn
another instance of vsftpd and point it to its own config file to support the protocol version you want.
The directive listen=YES is for IPv4.
The directives listen and listen_ipv6 are mutually exclusive and cannot be specified in the same
configuration file. On Fedora and other Red Hat–type distros, the vsftpd startup scripts will
automatically read (and start) all files under the /etc/vsftpd/ directory that end with *.conf. So, for
example, you can name one file /etc/vsftpd/vsftpd.conf and name the other file that supports IPv6
something like /etc/vsftpd/vsftpd-ipv6.conf. This is the way it’s supposed to work in theory. Your
mileage may vary.
Summary
The Very Secure FTP Daemon is a powerful FTP server offering all of the features you need for
running a commercial-grade FTP server in a secure manner. In this chapter, we discussed the process
of installing and configuring the vsftpd server on Fedora and Debian-like systems. Specifically, the
following information was covered:
Some important and often-used configuration options for vsftpd
Details about the FTP protocol and its effects on firewalls
How to set up anonymous FTP servers
How to set up an FTP server that allows the use of virtual users
How to use an FTP client to connect to the FTP server to test things out
This information is enough to keep your FTP server humming for quite a while. Of course, like
any printed media about software, this text will age and the information will slowly but surely
become obsolete. Please be sure to visit the vsftpd web site from time to time not only to learn about
the latest developments, but also to obtain the latest documentation.
CHAPTER 18
Apache Web Server
This chapter discusses the process of installing and configuring the Apache HTTP server
(www.apache.org) on your Linux server. Apache is free software released under the Apache
license. At the time of this writing, and according to a well-respected source of Internet
statistics (Netcraft, Ltd., at www.netcraft.co.uk), Apache has a web server market share of more than
50 percent. This level of acceptance and respect from the Internet community comes from the
following benefits and advantages provided by the Apache server software:
It is stable.
Several major web sites, including amazon.com and IBM, are using it.
The entire program and related components are open source.
It works on a large number of platforms (all popular variants of Linux/UNIX, some of the not-so-popular variants of UNIX, and even Microsoft Windows).
It is extremely flexible.
It has proved to be secure.
Before we get into the steps necessary to configure Apache, let’s review some of the
fundamentals of HTTP as well as some of the internals of Apache, such as its process ownership
model. This information will help you understand why Apache is set up to work the way it does.
Understanding HTTP
HTTP is a significant portion of the foundation for the World Wide Web, and Apache is a server
implementation of HTTP. Browsers such as Firefox, Opera, and Microsoft Internet Explorer are
client implementations of HTTP.
As of this writing, HTTP is at version 1.1 and is documented in RFC 2616 (for details, go to
www.ietf.org/rfc/rfc2616.txt).
Headers
When a web client connects to a web server, the client’s default method of making this connection is
to contact the server’s TCP port 80. Once connected, the web server says nothing; it’s up to the client
to issue HTTP-compliant commands for its requests to the server. Along with each command comes a
request header that includes information about the client. For example, when using Firefox under
Linux as a client, a web server might receive the following information from a client:
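An illustrative sketch of such a request (the host name and version strings here are made up):
GET / HTTP/1.1
Host: www.example.org
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:10.0) Gecko/20100101 Firefox/10.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip, deflate
Connection: keep-alive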
The first line contains the HTTP GET command, which asks the server to fetch a file. The
remainder of the information makes up the header, which tells the server about the client, the kind of
file formats the client will accept, and so forth. Many servers use this information to determine what
can and cannot be sent to the client, as well as for logging purposes.
Along with the request header, additional headers may be sent. For example, when a client uses a
hyperlink to get to the server site, a header entry showing the client’s originating site will also appear
in the header.
When it receives a blank line, the server knows a request header is complete. Once the request
header is received, it responds with the actual requested content, prefixed by a server header. The
server header provides the client with information about the server, the amount of data the client is
about to receive, the type of data coming in, and other information. For example, the request header
just shown, when sent to an HTTP server, results in the following server response header:
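An illustrative sketch of such a response header (the dates, sizes, and server version string are made up):
HTTP/1.1 200 OK
Date: Mon, 27 Feb 2012 18:30:00 GMT
Server: Apache/2.2.21 (Fedora)
Last-Modified: Fri, 24 Feb 2012 12:00:00 GMT
Content-Length: 1456
Content-Type: text/html; charset=UTF-8
Connection: close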
A blank line and then the actual content of the transmission follow the response header.
Ports
The default port for HTTP requests is port 80, but you can also configure a web server to use a
different (arbitrarily chosen) port that is not in use by another service. This allows sites to run
multiple web servers on the same host, with each server on a different port. Some sites use this
arrangement for multiple configurations of their web servers to support various types of client
requests.
When a site runs a web server on a nonstandard port, you can see that port number in the site’s
URL. For example, if the web server were listening on port 8080, the address http://www.redhat.com would instead read http://www.redhat.com:8080.
TIP Don’t make the mistake of going for “security through obscurity.” If your server is on a
nonstandard port, that doesn’t guarantee that Internet troublemakers won’t find your site. Because of
the automated nature of tools used to attack a site, it takes very little effort to scan a server and find
which ports are running web servers. Using a nonstandard port does not keep your site secure.
Process Ownership and Security
Running a web server on a Linux/UNIX platform forces you to be more aware of the traditional
Linux/UNIX permissions and ownership model. In terms of permissions, that means each process has
an owner and that owner has limited rights on the system.
Whenever a program (process) is started, it inherits the permissions of its parent process. For
example, if you’re logged in as root, the shell in which you’re doing all your work has all the same
rights as the root user. In addition, any process you start from this shell will inherit all the
permissions of the root user. Processes may give up rights, but they cannot gain rights.
NOTE There is an exception to the Linux inheritance principle. Programs configured with the SetUID
bit do not inherit rights from their parent process, but rather start with the rights specified by the
owner of the file itself. For example, the file containing the program su (/bin/su) is owned by root
and has the SetUID bit set. If the user yyang runs the program su, that program doesn’t inherit the
rights of yyang, but instead will start with the rights of the superuser (root). To learn more about
SetUID, see Chapter 4.
How Apache Processes Ownership
To carry out initial network-related functions, the Apache HTTP server must start with root
permissions. Specifically, it needs to bind itself to port 80 so that it can listen for requests and accept
connections. Once it does this, Apache can give up its rights and run as a non-root user (unprivileged
user), as specified in its configuration files. Different Linux distributions may have varying defaults
for this user, but it is usually one of the following: nobody, www, apache, wwwrun, www-data, or
daemon.
Remember that when running as an unprivileged user, Apache can read only the files that the user
has permissions to read.
Security is especially important for sites that use Common Gateway Interface (CGI) scripts. By
limiting the permissions of the web server, you decrease the likelihood that someone can send a
malicious request to the server. The server processes and corresponding CGI scripts can break only
what they can access. As user nobody, the scripts and processes don’t have access to the same key
files that root can access. (Remember that root can access everything, no matter what the
permissions.)
NOTE In the event that you decide to allow CGI scripts on your server, pay strict attention to how
they are written. Be sure it isn’t possible for input coming in over the network to make the CGI script
do something it shouldn’t. Although there are no hard statistics on this, some successful attacks on
sites are possible because of improperly configured web servers and/or poorly written CGI scripts.
Installing the Apache HTTP Server
Most modern Linux distributions come with the binary package for the Apache HTTP server software
in Red Hat Package Manager (RPM) format, so installing the software is usually as simple as using
the package management tool on the system. This section walks you through the process of obtaining
and installing the program via RPM and Advanced Packaging Tool (APT). Mention is also made of
installing the software from source code, if you choose to go that route. The actual configuration of
the server covered in later sections applies to both classes of installation (from source or from a
binary package).
On a Fedora system, you can obtain the Apache RPM in several ways. Here are some of them:
Download the Apache RPM (for example, httpd-*.rpm) for your operating system from your
distribution’s software repository. For Fedora, you can obtain a copy of the program from
http://download.fedora.redhat.com/pub/fedora/linux/releases/<VERSION>/Fedora/x86_64/os/
where <VERSION> refers to the specific version of Fedora that you are running (for
example, 16, 17, or 25).
You can install from the install media, from the /Packages/ directory on the media.
You can pull down and install the program directly from a repository using the Yum program.
This is perhaps the quickest method if you have a working connection to the Internet. And this
is what we’ll do here.
To use Yum to install the program, type the following:
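A typical invocation looks like this (run as the root user):

    yum -y install httpd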
To confirm that the software is installed, type the following:
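For example, you can query the RPM database for the package:

    rpm -q httpd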
And that’s it! You now have Apache installed on the Fedora server.
For a Debian-based Linux distribution such as Ubuntu, you can use APT to install Apache by
running:
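On Ubuntu the package is named apache2, so a command along these lines should do it:

    sudo apt-get -y install apache2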
The web server daemon is automatically started after you install using apt-get on Ubuntu
systems.
Installing Apache from Source
Just in case you are not happy with the built-in defaults that the binary Apache package forces
you to live with and you want to build your web server software from scratch, you can always
obtain the latest stable version of the program directly from the apache.org web site. The
procedure for building from source is discussed here. Please note that we use the asterisk (*)
wildcard symbol to mask the exact version of httpd software (Apache) that was used. This is
done because the exact stable version of httpd available might be different when you go through
the steps. You should therefore substitute the asterisk symbol with a proper and full version
number for the apache/httpd software package. So, for example, instead of writing httpd-2.2.21.tar.gz or httpd-2.4.0.tar.gz, we cheat and simply write httpd-2.*.
The most current version will always be available at www.apache.org/dist/httpd/.
1. We’ll download the latest program source into the /usr/local/src/ directory from the
apache.org web site. You can use the wget program to do this:
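For example (substitute the asterisk with the actual version number you are downloading):

    cd /usr/local/src
    wget http://www.apache.org/dist/httpd/httpd-2.*.tar.gz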
2. Extract the tar archive. And then change to the directory that is created during the
extraction.
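For example:

    tar xvzf httpd-2.*.tar.gz
    cd httpd-2.*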
3. Assuming we want the web server program to be installed under the /usr/local/httpd/
directory, we’ll run the configure script with the proper prefix option:
4. Run make.
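    make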
5. Create the program’s working directory (that is, /usr/local/httpd/), and then run make
install:
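For example:

    mkdir /usr/local/httpd
    make install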
Once the install command completes successfully, a directory structure will be created
under /usr/local/httpd/ that will contain the binaries, the configuration files, the log files, and so
on, for the web server.
Apache Modules
Part of what makes Apache so powerful and flexible is that its design allows extensions through
modules. Apache comes with many modules by default and automatically includes them in the default
installation.
If you can imagine “it,” you can be almost certain that somebody has probably already written a
module for “it” for the Apache web server. The Apache module application programming interface
(API) is well documented, and if you are so inclined (and know how), you can probably write your
own module for Apache to provide any functionality you want.
To give you some idea of what kinds of things people are doing with modules, visit
http://modules.apache.org. There you will find information on how to extend Apache’s capabilities
using modules. Here are some common Apache modules:
mod_cgi Allows the execution of CGI scripts on the web server
mod_perl Incorporates a Perl interpreter into the Apache web server
mod_aspdotnet Provides an ASP.NET host interface to Microsoft’s ASP.NET engine
mod_authz_ldap Provides support for authenticating users of the Apache HTTP server
against a Lightweight Directory Access Protocol (LDAP) database
mod_ssl Provides strong cryptography for the Apache web server via the Secure Sockets
Layer (SSL) and Transport Layer Security (TLS) protocols
mod_ftpd Allows Apache to accept FTP connections
mod_userdir Allows user content to be served from user-specific directories on the web
server via HTTP
If you know the name of a particular module that you want (and if the module is popular enough),
you might find that the module has already been packaged in an RPM format, so you can install it
using the usual RPM methods. For example, if you want to include the SSL module (mod_ssl) in your
web server setup, on a Fedora system, you can issue this Yum command to download and install the
module for you automatically:
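The command would look something like this:

    yum -y install mod_ssl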
Alternatively, you can go to the Apache modules project web site and search for, download,
compile, and install the module that you want.
TIP Make sure the run-as user is there! If you build Apache from source, the sample configuration
file (httpd.conf) expects that the web server will run as the user daemon. Although that user exists on
almost all Linux distributions, if something is broken along the way, you may want to check the user
database (/etc/passwd) to make sure that the user daemon does indeed exist.
Starting up and Shutting Down Apache
Starting up and shutting down Apache on most Linux distributions is easy. To start Apache on a
Fedora system or any other Red Hat–like system, use this command:
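On such systems the service is named httpd, so the command is typically:

    service httpd start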
On Linux distributions running systemd, you can alternatively start the httpd daemon using the
systemctl command like so:
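    systemctl start httpd.service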
To shut down Apache, enter this command:
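    service httpd stop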
After making a configuration change to the web server that requires you to restart Apache, type
this:
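    service httpd restart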
TIP On a system running openSUSE or SLE (SUSE Linux Enterprise), the commands to start and
stop the web server, respectively, are as follows:
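On these distributions the rc wrapper script is typically named rcapache2, so the commands would be:

    rcapache2 start
    rcapache2 stop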
TIP On a Debian system such as Ubuntu, you can start Apache by running:
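    sudo service apache2 start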
The Apache daemon can be stopped by running:
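    sudo service apache2 stop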
Starting Apache at Boot Time
After installing the web server, it’s reasonable to assume that you want the web service to be
available at all times to your users; you will therefore need to configure the system to start the service
automatically for you between system reboots. It is easy to forget to do this on a system that has been
running for a long time without requiring any reboots, because if you ever had to shut down the system
due to an unrelated issue, you might be baffled as to why the web server that has been running
perfectly since installation without incident failed to start up after starting the box. So it is good
practice to take care of this during the early stages of configuring the service.
Most Linux flavors have the chkconfig utility available, which can be used for controlling which
system services start up at what runlevels.
To view the runlevels in which the web server is configured to start up, type
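    chkconfig --list httpd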
This output shows that the web server is not configured to start up in any runlevel in its out-of-the-box state. To change this and make Apache start up automatically in runlevels 2, 3, 4, and 5, type this:
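    chkconfig --level 2345 httpd on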
On Linux distributions running systemd, you can alternatively make the httpd daemon
automatically start up with system reboots by issuing the systemctl command, like so:
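    systemctl enable httpd.service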
In Ubuntu, you can use either the sysv-rc-conf or the update-rc.d utility to manage the
runlevels in which Apache starts up.
NOTE Just in case you are working with an Apache version that you installed from source, you
should be aware that the chkconfig utility will not know about the start-up and shutdown scripts for
your web server unless you explicitly tell the utility about it. And as such, you’ll have to resort to
some other tricks to configure the host system to bring up the web server automatically during system
reboots. You may easily grab an existing start-up script from another working system (usually from
the /etc/init.d/ directory) and modify it to reflect correct paths (such as /usr/local/httpd/) for your
custom Apache setup. Existing scripts are likely to be called httpd or apache2.
Testing Your Installation
You can perform a quick test of your Apache installation using its default home page. To do this, first
confirm that the web server is up and running using the following command:
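On a Fedora-type system, a command along these lines will report the daemon's status:

    service httpd status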
You can also issue a variation of the systemctl command on systemd-aware systems to view a
nice synopsis (cgroup information, child processes, and so on) of the Apache server status, like so:
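    systemctl status httpd.service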
On our sample Fedora system, Apache comes with a default page that gets served to visitors in
the absence of a default home page (for example, index.html or index.htm). The file displayed to
visitors when there is no default home page is /var/www/error/noindex.html.
TIP If you are working with a version of Apache that you built from source, the working directory
from which web pages are served is <PREFIX>/htdocs. For example, if your installation prefix is
/usr/local/httpd/, then web pages will, by default, be under /usr/local/httpd/htdocs/.
To find out if your Apache installation went smoothly, start a web browser and tell it to visit the
web site on your machine. To do this, simply type http://localhost/ (or the Internet Protocol
Version 6 [IPv6] equivalent, http://[::1]/) in the address bar of your web browser. You should see a
page stating something to the effect that “your Apache HTTP server is working properly at your site.”
If you don’t see this, retrace your Apache installation steps and make sure you didn’t encounter any
errors in the process. Another thing to check if you can’t see the default web page is to make sure that
you don’t have any host-based firewall such as Netfilter/iptables (see Chapter 13) blocking access
to the web server.
Configuring Apache
Apache supports a rich set of configuration options that are sensible and easy to follow. This makes it
a simple task to set up the web server in various configurations.
This section walks through a basic configuration. The default configuration is actually quite good
and (believe it or not) works right out of the box, so if the default is acceptable to you, simply start
creating your HTML documents! Apache allows several common customizations. After we step
through creating a simple web page, you’ll see how to make those common customizations in the
Apache configuration files.
Creating a Simple Root-Level Page
If you like, you can start adding files to Apache right away in the /var/www/html directory for top-level pages (for a source install, the directory would be /usr/local/httpd/htdocs). Any files placed in
that directory must be world-readable.
As mentioned earlier, Apache’s default web page is index.html. Let’s take a closer look at
creating and changing the default home page so that it reads, “Welcome to webserver.example.org.”
Here are the commands:
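Something along these lines (run as root on a Fedora-type system, where DocumentRoot is /var/www/html) will do the trick:

    echo "Welcome to webserver.example.org" > /var/www/html/index.html
    chmod 644 /var/www/html/index.html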
You could also use an editor such as vi, pico, or emacs to edit the index.html file and make it
more interesting.
Apache Configuration Files
The configuration files for Apache are located in the /etc/httpd/conf/ directory on a Fedora or Red
Hat Enterprise Linux (RHEL) system, and for our sample source install, the path will be
/usr/local/httpd/conf/. The main configuration file is usually named httpd.conf on Red Hat–like
distributions such as Fedora.
On Debian-like systems, the main configuration file for Apache is named
/etc/apache2/apache2.conf.
The best way to learn more about the configuration files is to read the httpd.conf file. The default
configuration file is heavily commented, explaining each entry, its role, and the parameters you can
set.
Common Configuration Options
The default configuration settings work just fine right out of the box, and for basic needs, they may
require no further modification. Nevertheless, site administrators may need to customize their web
server or web site further.
This section discusses some of the common directives or options that are used in Apache’s
configuration file.
ServerRoot
This is used for specifying the base directory for the web server. On Fedora, RHEL, and CentOS
distributions, this value, by default, is the /etc/httpd/ directory. The default value for this directive in
Ubuntu, openSUSE, and Debian Linux distributions is /etc/apache2/.
Listen
This is the port(s) on which the server listens for connection requests. It refers to the venerable port
80 (http) for which everything good and bad on the web is so well known!
The Listen directive can also be used to specify the particular IP addresses over which the web
server accepts connections. The default value for this directive is 80 for nonsecure web
communications.
For example, to set Apache to listen on its IPv4 and IPv6 interfaces on port 80, you would set the
Listen directive to read
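    Listen 80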
To set Apache to listen on a specific IPv6 interface (such as fec0::20c:dead:beef:11cd) on port
8080, you would set the Listen directive to read
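Note that the IPv6 address must be enclosed in square brackets:

    Listen [fec0::20c:dead:beef:11cd]:8080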
ServerName
This directive defines the hostname and port that the server uses to identify itself. At many sites,
servers fulfill multiple purposes. An intranet web server that isn’t getting heavy usage, for example,
should probably share its usage allowance with another service. In such a situation, a computer name
such as “www” (fully qualified domain name, or FQDN=www.example.org) wouldn’t be a good
choice, because it suggests that the machine has only one purpose.
It’s better to give a server a neutral name and then establish Domain Name System (DNS)
Canonical Name (CNAME) entries or multiple hostname entries in the /etc/hosts file. In other words,
you can give the system several names for accessing the server, but it needs to know about only its
real name.
Consider a server whose real hostname is dioxin.eng.example.org. This server also doubles as a
web server. You might be thinking of giving it the hostname alias www.sales.example.org. However,
since dioxin will know itself only as dioxin, users who visit www.sales.example.org might be
confused by seeing in their browsers that the server’s real name is dioxin.
Apache provides a way to get around this through the use of the ServerName directive. This
works by allowing you to specify what you want Apache to return as the hostname of the web server
to web clients or visitors.
ServerAdmin
This is the e-mail address that the server includes in error messages sent to the client.
It’s often a good idea, for a couple of reasons, to use an e-mail alias for a web site’s
administrator(s). First, there might be more than one administrator. By using an alias, it’s possible for
the alias to expand out to a list of other e-mail addresses. Second, if the current administrator leaves
the company, you don’t want to have to make the rounds of all those web pages and change the name
of the site administrator.
DocumentRoot
This defines the primary directory on the web server from which HTML files will be served to
requesting clients. On Fedora distros and other Red Hat–like systems, the default value for this
directive is /var/www/html/. On openSUSE and SLE distributions, the default value for this directive
is /srv/www/htdocs.
TIP On a web server that is expected to host plenty of web content, the file system on which the
directory specified by this directive resides should have a lot of space.
MaxClients
This sets a limit on the number of simultaneous requests that the web server will service.
LoadModule
This is used for loading or adding other modules into Apache’s running configuration. It adds the
specified module to the list of active modules.
Enabling and Disabling Apache Modules
Debian-based distros have a handy set of utilities that can be used easily to enable or disable
Apache modules that are already installed. You can confirm the currently installed modules
under the /usr/lib64/apache2/modules/ directory. For example, to enable the userdir module,
simply type this:
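    sudo a2enmod userdir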
To disable the userdir module, you would use the sister command named a2dismod:
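    sudo a2dismod userdir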
Running the a2enmod command will “auto-magically” create a symbolic link under the
/etc/apache2/mods-enabled/ directory. The symlink will point to the file
/etc/apache2/mods-available/userdir.conf, which contains the actual configuration details for
the userdir module. The contents of the file on our sample system are as follows:
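An abridged and representative version of the file (your distribution's copy may contain additional directives) looks something like this:

    <IfModule mod_userdir.c>
        UserDir public_html
        UserDir disabled root

        <Directory /home/*/public_html>
            AllowOverride FileInfo AuthConfig Limit Indexes
            Options MultiViews Indexes SymLinksIfOwnerMatch IncludesNoExec
        </Directory>
    </IfModule>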
Finally, don’t forget to reload or restart Apache after enabling or disabling a module. This
can be done quickly like so:
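    sudo service apache2 reload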
User
This specifies the user ID with which the web server will answer requests. The server process will
initially start off as the root user but will later downgrade its privileges to those of the user specified
here. The user should have only just enough privileges to access files and directories that are
intended to be visible to the outside world via the web server. Also, the user should not be able to
execute code that is not HTTP- or web-related.
On a Fedora system, the value for this directive is automatically set to the user named “apache.”
In openSUSE Linux, the value is set to the user “wwwrun.” In Debian-like systems such as Ubuntu, the
value is set to the user “www-data.”
Group
This specifies the group name of the Apache HTTP server process. It is the group with which the
server will respond to requests. The default value under the Fedora and RHEL flavors of Linux is
“apache.” In openSUSE Linux, the value is set to the group “www.” In Ubuntu, the default value is
“www-data” (set via the $APACHE_RUN_GROUP variable).
Include
This directive allows Apache to specify and include other configuration files at runtime. It is mostly
useful for organization purposes; you can, for example, elect to store all the configuration directives
for different virtual domains in appropriately named files, and Apache will automatically know to
include them at runtime.
Many of the mainstream Linux distros rely quite heavily on the use of the Include directive to
organize site-specific configuration files and directives for the web server. Often, this file and
directory organization is the sole distinguishing factor in the Apache installation/setup among the
different distros.
UserDir
This directive defines the subdirectory within each user’s home directory, where users can place
personal content that they want to make accessible via the web server. This directory is usually
named public_html and is usually stored under each user’s home directory. This option is, of course,
dependent on the availability of the mod_userdir module in the web server setup.
Here’s a sample usage of this option in the httpd.conf file:
ErrorLog
This defines the location where errors from the web server will be logged.
Quick How-To: Serving HTTP Content from User Directories
After enabling the UserDir option, and assuming the user yyang wants to make some web
content available from within her home directory via the web server, following these steps will
make this happen:
1. While logged into the system as the user yyang, create the public_html folder:
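    mkdir ~/public_html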
2. Set the proper permissions for the parent folder:
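The parent folder here is yyang's home directory, so something like this will make it traversable by the web server:

    chmod a+x ~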
3. Set the proper permissions for the public_html folder:
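    chmod a+rx ~/public_html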
4. Create a sample page named index.html under the public_html folder:
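For example:

    echo "This is yyang's personal web page." > ~/public_html/index.html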
As a result of these commands, files placed in the public_html directory for a particular user
and set to world-readable will be on the Web via the web server.
To access the contents of that folder via HTTP, you would need to point a web browser to
this URL:
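    http://YOUR_HOST_NAME/~yyang/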
where YOUR_HOST_NAME is the web server’s fully qualified domain name or IP address. And
if you are sitting directly on the web server itself, you can simply replace that variable with
localhost.
For the example shown here for the user yyang, the exact URL will be
http://localhost/~yyang. And the IPv6 equivalent is http://[::1]/~yyang.
Note that on a Fedora system with the SELinux subsystem enabled, you may have to do a
little more to get the UserDir directive working. This is because of the default security contexts
of the files stored under each user’s home directory. By default, the context is user_home_t. For
this functionality to work properly, you will have to change the context of all files under
~username/public_html/ to httpd_sys_content_t. This allows Apache to read the files
under the public_html directory. The command to do this is
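For the user yyang, a command along these lines (run as root) should work:

    chcon -R -t httpd_sys_content_t /home/yyang/public_html/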
LogLevel
This option sets the level of verbosity for the messages sent to the error logs. Acceptable log levels
are emerg, alert, crit, error, warn, notice, info, and debug. The default log level is warn.
Alias
The Alias directive allows documents (web content) to be stored in any other location on the file
system that is different from the location specified by the DocumentRoot directive. It also allows you
to create abbreviations (or aliases) for path names that might otherwise be quite long.
ScriptAlias
The ScriptAlias option specifies a target directory or file as containing CGI scripts that are meant
to be processed by the CGI module (mod_cgi).
VirtualHost
One of the most-used features of Apache is its ability to support virtual hosts. This makes it possible
for a single web server to host multiple web sites as if each site had its own dedicated hardware. It
works by allowing the web server to provide different, autonomous content, based on the hostname,
port number, or IP address that is being requested by the client. This is accomplished by the HTTP
1.1 protocol, which specifies the desired site in the HTTP header rather than relying on the server to
learn what site to fetch from its IP address.
This directive is actually made up of two tags: an opening <VirtualHost> tag and a closing
</VirtualHost> tag. It is used to specify the options that pertain to a particular virtual host. Most of
the directives that we discussed previously are valid here, too.
Suppose, for example, that we wanted to set up a virtual host configuration for a host named
www.another-example.org. To do this, we can create a VirtualHost entry in the httpd.conf file (or
use the Include directive to specify a separate file), like this one:
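A minimal sketch of such an entry follows; the DocumentRoot path and e-mail address are placeholders, and on Apache 2.2 you may also need a matching NameVirtualHost *:80 directive for name-based virtual hosting:

    <VirtualHost *:80>
        ServerName www.another-example.org
        ServerAdmin webmaster@another-example.org
        DocumentRoot /var/www/another-example.org
        ErrorLog logs/www.another-example.org-error_log
    </VirtualHost>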
On Debian-like distros, you can use another set of utilities (a2ensite and a2dissite) to enable or
disable virtual hosts and web sites quickly under Apache.
For example, assume we created the previous configuration file, named www.another-example.org, for the virtual web site and stored the file under the /etc/apache2/sites-available/ directory. We can then enable the virtual web site using the following command:
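    sudo a2ensite www.another-example.org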
Similarly, to disable the virtual site, you can run this command:
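    sudo a2dissite www.another-example.org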
After running any of the previous commands (a2ensite or a2dissite), you should make Apache
reload its configuration files by running the following:
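    sudo service apache2 reload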
Finally, don’t forget that it is not enough to configure a virtual host using Apache’s VirtualHost
directive—the value of the ServerName option in the VirtualHost container must be a name that is
resolvable via DNS (or any other means) to the web server machine.
NOTE Apache’s options/directives are too numerous to be covered in this section. But the software
comes with its own extensive online manual, which is written in HTML so that you can access it in a
browser. If you installed the software via RPM, you might find that documentation for Apache has
been packaged into a separate RPM binary, and as a result, you will need to install the proper
package (for example, httpd-manual) to have access to it. If you downloaded and built the software
from source code, you will find the documentation in the manual directory of your installation prefix
(for example, /usr/local/httpd/manual). Depending on the Apache version, the documentation is
available online at the project’s web site at http://httpd.apache.org/docs/.
Troubleshooting Apache
The process of changing various configuration options (or even the initial installation) can sometimes
not work as smoothly as you’d like. Thankfully, Apache does an excellent job at reporting in its error
log file why it failed or what is failing.
The error log file is located in your logs directory. If you are running a stock Fedora or RHEL-type installation, this is in the /var/log/httpd/ directory. If you are running Apache on a stock Debian- or Ubuntu-type distro, this is in the /var/log/apache2/ directory.
If, on the other hand, you installed Apache yourself using the installation method discussed earlier
in this chapter, the logs are in the /usr/local/httpd/logs/ directory. In these directories, you will find
two files: access_log and error_log.
The access_log file is simply that—a log of which files have been accessed by people visiting
your web site(s). It contains information about whether the transfer completed successfully, where the
request originated (IP address), how much data was transferred, and what time the transfer occurred.
This is a powerful way of determining the usage of your site.
The error_log file contains all of the errors that occur in Apache. Note that not all errors that
occur are fatal—some are simply problems with a client connection from which Apache can
automatically recover and continue operation. However, if you started Apache but still cannot visit
your web site, take a look at this log file to see why Apache might not be responding. The easiest way
to see the most recent error messages is by using the tail command, like so:
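For example, on a Fedora-type system (adjust the path for your distribution, as noted earlier):

    tail -n 10 /var/log/httpd/error_log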
If you need to see more log information than that, simply change the number 10 to the number of
lines that you need to see. And if you would like to view the errors or logs in real time as they are
being generated, you should use the -f option for the tail command. This provides a valuable
debugging tool, because you can try things out with the server (such as requesting web pages or
restarting Apache) and view the results of your experiments in a separate virtual terminal window.
The tail command with the -f switch is shown here:
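    tail -f /var/log/httpd/error_log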
This command will constantly tail the logs until you terminate the program (using CTRL-C).
Summary
This chapter covered the process of setting up your own web server using Apache (aka httpd) from
the ground up. This chapter by itself is enough to get you going with a top-level page and a basic
configuration. At a minimum, the material covered here will help you get your web server on the
inter-webs, or Internets—whichever one you prefer!
It is highly recommended that you take some time to page through the relevant and official Apache
manual/documentation (http://httpd.apache.org/docs/). It is well written, concise, and flexible enough
that you can set up just about any configuration imaginable. The text focuses on Apache and Apache
only, so you don’t have to wade through hundreds of pages to find what you need.
CHAPTER 19
SMTP
The Simple Mail Transfer Protocol (SMTP) is the de facto standard for mail transport across the
Internet. Anyone who wants to have a mail server capable of sending and receiving mail across
the Internet must be able to support it. Many internal networks have also taken to using SMTP
for their private mail services because of its platform independence and availability across all
popular operating systems. In this chapter, we’ll discuss the mechanics of SMTP as a protocol and its
relationship to other mail-related protocols, such as Post Office Protocol (POP) and Internet Message
Access Protocol (IMAP). Then we will go over the Postfix SMTP server, one of the easier and more
secure SMTP servers out there.
Understanding SMTP
The SMTP protocol defines the method by which mail is sent from one host to another. That’s it. It
does not define how the mail should be stored. It does not define how the mail should be displayed to
the recipient.
SMTP’s strength is its simplicity, and that is due, in part, to the dynamic nature of networks during
the early 1980s. (The SMTP protocol was originally defined in 1982.) Back in those days, people
were linking networks together with everything short of bubble gum and glue. SMTP was the first
mail standard that was independent of the transport mechanism. This meant people using TCP/IP
networks could use the same format to send a message as someone using two cans and a string—at
least theoretically.
SMTP is also independent of operating systems, which means each system can use its own style
of storing mail without worrying about how the sender of a message stores his mail. You can draw
parallels to how the phone system works: Each phone service provider has its own independent
accounting system. However, they all have agreed upon a standard way to link their networks together
so that calls can go from one network to another transparently.
In the Free Open Source Software (FOSS) world, several software packages provide their own
implementation of SMTP. Two of the most popular SMTP packages that ship with the mainstream
Linux distros are Sendmail and Postfix.
Rudimentary SMTP Details
Ever had a “friend” who sent you an e-mail on behalf of some government agency informing you that
you owe taxes from the previous year, plus additional penalties? Somehow, a message like this ends
up in a lot of people’s mailboxes around April Fool’s Day. We’re going to show you how they did it
and, what’s even more fun, how you can do it yourself. (Not that we would advocate such behavior,
of course.)
The purpose of this example is to show how SMTP sends a message from one host to another.
After all, more important than learning how to forge an e-mail is learning how to troubleshoot mail-related problems. So in this example you are acting as the sending host, and whichever machine you
connect to is the receiving host.
SMTP requires only that a host be able to send straight ASCII text to another host. Typically, this
is done by contacting the SMTP port (port 25) on a mail server. You can do this using the Telnet
program. For example,
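    telnet mailserver 25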
Here, the host mailserver is the recipient’s mail server. The 25 that follows mailserver tells
Telnet that you want to communicate with the server’s port 25 rather than the normal telnet port 23.
(Port 23 is used for remote logins, and port 25 is for the SMTP server.)
The mail server will respond with a greeting message such as this:
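A representative greeting (the hostname and software banner will vary from server to server) might read:

    220 mailserver ESMTP Postfix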
You are now communicating directly with the SMTP server.
Although there are many SMTP commands, four are worth noting:
HELO
MAIL FROM:
RCPT TO:
DATA
The HELO command is used when a client introduces itself to the server. The parameter to HELO is
the hostname that is originating the connection. Of course, most mail servers take this information
with a grain of salt and double-check it themselves. Here’s an example:
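For instance, a client claiming to come from the example.org domain would send:

    HELO example.org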
If you aren’t coming from the example.org domain, many mail servers will respond by telling you that
they know your real IP address, but they may or may not stop the connection from continuing.
The MAIL FROM: command requires the sender’s e-mail address as its argument. This tells the
mail server the e-mail’s origin. Here’s an example:
This means the message is from [email protected]
The RCPT TO: command requires the receiver’s e-mail address as an argument. Here’s an
example:
This means the message is destined to [email protected]
Now that the server knows who the sender and recipient are, it needs to know what message to
send. This is done by using the DATA command. Once it’s issued, the server will expect the entire
message, with relevant header information, followed by one empty line, a period, and then another
empty line. Continuing the example, [email protected] might want to send the following message
to [email protected]:
And that’s all there is to it. To close the connection, enter the QUIT command.
This is the basic technique used by applications that send mail—except, of course, that all the
gory details are masked behind a nice GUI application. The underlying transaction between the client
and the server remains mostly the same.
Security Implications
Sendmail is a popular open source mail server implementation used by many Linux distros and
Internet sites. Like any other server software, its internal structure and design are complex and
require a considerable amount of care during development. In recent years, however, the developers
of Sendmail have taken a paranoid approach to their design to help alleviate some security issues.
The Postfix developers took their mail server implementation one step further and wrote the
server software from scratch with security in mind. Basically, the package ships in a tight security
mode, and it’s up to the individual user to loosen it up as much as is needed for a specific
environment. This means the responsibility falls to us for making sure we keep the software properly
configured (and thus not vulnerable to attacks).
When deploying any mail server, keep the following issues in mind:
When an e-mail is sent to the server, what programs will it trigger?
Are those programs securely designed?
If they cannot be made secure, how can you limit the damage in case of an attack?
Under what permissions do those programs run?
In Postfix’s case, we need to back up and examine its architecture.
Mail service has three distinct components:
Mail user agent (MUA) What the user sees and interacts with, such as the Eudora, Outlook,
Evolution, Thunderbird, and Mutt programs. An MUA is responsible only for reading mail
and allowing users to compose mail.
Mail transport agent (MTA) Handles the process of getting the mail from one site to
another; Sendmail and Postfix are MTAs.
Mail delivery agent (MDA) What takes the message, once received at a site, and gets it to
the appropriate user mailbox.
Many mail systems integrate these components. For example, Microsoft Exchange Server
integrates the MTA and MDA functionalities into a single system. (If you consider the Outlook Web
Access interface to Exchange Server, it is also an MUA.) Lotus Domino also works in a similar
fashion. Postfix, on the other hand, works as an MTA only, passing the task of performing local mail
delivery to another external program. This allows each operating system or site configuration to use
its own custom tool, if necessary, for tasks such as determining mailbox storage mechanisms.
In most straightforward configurations, sites prefer using the Procmail program to perform the
actual mail delivery (MDA). This is because of its advanced filtering mechanism, as well as its
secure design from the ground up. Many older configurations have stayed with their default /bin/mail
program to perform mail delivery.
Installing the Postfix Server
We chose the Postfix mail server in this discussion for its ease of use and because it was written from
the ground up to be simpler than Sendmail. (The author of Postfix also argues that the simplicity has
led to improved security.) Postfix can perform most of the things that the Sendmail program can do—
in fact, the typical installation procedure sets up Postfix as a complete drop-in replacement for the
Sendmail binaries.
In the following sections, we show you how to install Postfix using the built-in package
management (Red Hat’s RPM or Debian’s dpkg) mechanism of the distribution. This is the
recommended method. We also show how to build and install the software from its source code.
Installing Postfix via RPM in Fedora
To install Postfix via RPM on Fedora, CentOS, or RHEL distros, simply use the Yum tool as follows:
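    yum -y install postfix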
Once the command runs to completion, you should have Postfix installed. Since Sendmail is the
default mailer that gets installed in Fedora and RHEL distros, you will need to disable it using the
chkconfig command and then enable Postfix:
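    chkconfig sendmail off
    chkconfig postfix on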
On systemd-enabled distros, the equivalent commands are
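    systemctl disable sendmail.service
    systemctl enable postfix.service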
Finally, you can flip the switch and actually start the Postfix process. With a default configuration,
it won’t do much, but it will confirm whether the installation worked as expected.
On systemd-enabled distros, the equivalent commands are
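    systemctl start postfix.service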
TIP Another way to change the mail subsystem on a Red Hat–based distribution is to use the
system-switch-mail program. This program can be installed using Yum as follows:
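    yum install system-switch-mail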
You can also use the command-line alternatives facility to switch the default MTA provider on the
system:
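On Fedora/RHEL-type distros the alternatives group for the MTA is named mta, so the command would be:

    alternatives --config mta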
Installing Postfix via APT in Ubuntu
Postfix can be installed in Ubuntu by using Advanced Packaging Tool (APT). Ubuntu, unlike other
Linux distributions, does not ship with any MTA software preconfigured and running. You explicitly
need to install and set one up. To install the Postfix MTA in Ubuntu, run this command:
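    sudo apt-get install postfix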
The install process offers a choice of various Postfix configuration options during the install
process:
No configuration This option will leave the current configuration unchanged.
Internet site Mail is sent and received directly using SMTP.
Internet with smarthost Mail is received directly using SMTP or by running a utility such
as fetchmail. Outgoing mail is sent using a smarthost.
Satellite system All mail is sent to another machine, called a smarthost, for delivery.
Local only The only delivered mail is the mail for local users. The system does not need
any sort of network connectivity for this option.
We will use the first option, No configuration, on our sample Ubuntu server. The install process
will also create the necessary user and group accounts that Postfix needs.
Installing Postfix from Source Code
Begin by downloading the Postfix source code from www.postfix.org. As of this writing, the
latest stable version was postfix-2.8.7.tar.gz. Once you have the file downloaded, use the tar
command to unpack the contents:
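    tar xvzf postfix-2.8.7.tar.gz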
Once Postfix is unpacked, change into the postfix-2.8.7 directory and run the make command,
like so:
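    cd postfix-2.8.7
    make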
The complete compilation process will take a few minutes, but it should work without event.
Please note that if the compile step fails with an error about being unable to find “db.h” or
any other kind of “db” reference, there is a good chance your system does not have the Berkeley
DB developer tools installed. Although it is possible to compile the Berkeley DB tools yourself,
it is not recommended, as Postfix will fail if the version of DB being used in Postfix is different
from what other system libraries are using. To fix this, install the db4-devel package. This can
be done using Yum as follows:
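    yum install db4-devel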
Because Postfix might be replacing your current Sendmail program, you’ll want to make a
backup of the Sendmail binaries. This can be done as follows:
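Commands along these lines will set the existing binaries aside (paths may differ slightly on your distribution):

    mv /usr/sbin/sendmail /usr/sbin/sendmail.OFF
    mv /usr/bin/newaliases /usr/bin/newaliases.OFF
    mv /usr/bin/mailq /usr/bin/mailq.OFF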
Now you need to create a user and a group under which Postfix will run. You may find that
some distributions already have these accounts defined. If so, the process of adding a user will
result in an error.
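A typical set of commands looks like this (note that the postfix user should not be made a member of the postdrop group, and needs neither a home directory nor a login shell):

    groupadd postfix
    groupadd postdrop
    useradd -M -g postfix -s /sbin/nologin postfix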
You’re now ready to do the make install step to install the actual software. Postfix includes
an interactive script that prompts for values of where things should go. Stick to the defaults by
simply pressing the ENTER key at each prompt.
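    make install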
With the binaries installed, it’s time to disable Sendmail from the startup scripts. You can do
that via the chkconfig command, like so:
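    chkconfig sendmail off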
The source version of Postfix includes a nice shell script that handles the start-up and
shutdown process for us. For the sake of consistency, you can wrap it into a standard start-up
script that can be managed via chkconfig. Using the techniques learned from Chapter 6, you
create a shell script called /etc/init.d/postfix. You can use the following code listing for the
postfix script:
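The listing below is a minimal sketch of such a wrapper, not a full-featured init script; it simply passes the requested action on to the postfix command installed by the source build:

    #!/bin/sh
    # postfix     Minimal start/stop wrapper for a source-built Postfix
    # chkconfig: 35 99 01
    # description: Postfix Mail Transfer Agent
    case "$1" in
      start)   /usr/sbin/postfix start ;;
      stop)    /usr/sbin/postfix stop ;;
      restart) /usr/sbin/postfix stop; /usr/sbin/postfix start ;;
      reload)  /usr/sbin/postfix reload ;;
      *)       echo "Usage: $0 {start|stop|restart|reload}"; exit 1 ;;
    esac
    exit 0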
With the script in place, double-check that its permissions are correct with a quick chmod:
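    chmod 755 /etc/init.d/postfix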
Then use chkconfig to add it to the appropriate runlevels for startup:
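    chkconfig --add postfix
    chkconfig postfix on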
Configuring the Postfix Server
By following the preceding steps, you have compiled (if you built from source) and installed the
Postfix mail system. After the compilation stage, the make install script will exit and prompt you for
any changes that are wrong, such as forgetting to add the postfix user. Now that you have installed the
Postfix server, you can change directories to /etc/postfix and configure the Postfix server.
You configure the server through the /etc/postfix/main.cf configuration file. It’s obvious from its
name that this configuration file is the main configuration file for Postfix. The other configuration file
of note is the master.cf file. This is the process configuration file for Postfix, which allows you to
change how Postfix processes are run. This can be useful for setting up Postfix on clients so that it
doesn’t accept e-mail and forwards to a central mail hub. (For more information on doing this, see the
documentation at www.postfix.org.)
Now let’s move on to the main.cf configuration file.
The main.cf File
The main.cf file is too large to list all of its options in this chapter, but we will cover the most
important options that will get your mail server up and running. Thankfully, the configuration file is
well documented and clearly explains each option and its function.
The sample options discussed next are enough to help you get a basic Postfix mail server up and
running at a minimum.
myhostname
This parameter is used for specifying the hostname of the mail system. It sets the Internet hostname for
which Postfix will be receiving e-mail. The default format for the hostname is to use the fully qualified domain name (FQDN) of the host. Typical examples of mail server hostnames are
mail.example.com or smtp.example.org. Here’s the syntax:
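    myhostname = mail.example.org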
mydomain
This parameter is the mail domain that you will be servicing, such as example.com, labmanual.org, or
google.com. Here’s the syntax:
myorigin
All e-mail sent from this e-mail server will look as though it came from this parameter. You can set
this to either $myhostname or $mydomain, like so:
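    myorigin = $mydomain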
Notice that you can use the value of other parameters in the configuration file by placing a $ sign
in front of the variable name.
mydestination
This parameter lists the domains that the Postfix server will take as its final destination for incoming
e-mail. Typically, this value is set to the hostname of the server and the domain name, but it can
contain other names, as shown here:
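A typical value (adjust the list to match your own host and domain names) might be:

    mydestination = $myhostname, localhost.$mydomain, localhost, $mydomain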
If your server has more than one name, for example, server.example.org and serverA.another-example.org, you will want to make sure you list both names here.
mail_spool_directory
You can run the Postfix server in two modes of delivery: directly to a user’s mailbox or to a central
spool directory. The typical way is to store the mail in /var/spool/mail. The variable will look like
this in the configuration file:
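    mail_spool_directory = /var/spool/mail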
The result is that mail will be stored for each user under the /var/spool/mail directory, with each
user’s mailbox represented as a file. For example, e-mail sent to [email protected] will be stored
in /var/spool/mail/yyang.
mynetworks
The mynetworks variable is an important configuration option. This lets you configure what servers
can relay through your Postfix server. You will usually want to allow relaying from local client
machines and nothing else. Otherwise, spammers can use your mail server to relay messages. Here’s
an example value of this variable:
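For instance, to allow relaying only from the local host and a hypothetical 192.168.1.0/24 client subnet:

    mynetworks = 192.168.1.0/24, 127.0.0.0/8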
If you define this parameter, it will override the mynetworks_style parameter. The
mynetworks_style parameter allows you to specify any of the keywords class, subnet, or host.
These settings tell the server to trust the networks to which it belongs.
CAUTION If you do not set the $mynetworks variable correctly and spammers begin using your
mail server as a relay, you will quickly find a surge of angry mail administrators e-mailing you about
it. Furthermore, it is a fast way to get your mail server blacklisted by one of the spam control
techniques, such as DNS Blacklist (DNSBL) or Realtime Blackhole Lists (RBL). Once your server is
blacklisted, very few people will be able to receive mail from you, and you will need to jump
through a lot of hoops to get unlisted. Even worse, no one will tell you that you have been blacklisted.
smtpd_banner
This variable allows you to return a custom response when a client connects to your mail server. It is
a good idea to change the banner to something that doesn’t give away what server you are using. This
just adds one more slight hurdle for hackers trying to find faults in your specific software version.
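For example, a deliberately terse banner might be set like this:

    smtpd_banner = $myhostname ESMTP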
inet_protocols
This parameter is used to invoke the Internet Protocol Version 6 (IPv6) capabilities of the Postfix
mail server. It is used to specify the Internet protocol version that Postfix will use when making or
accepting connections. Its default value is ipv4. Setting this value to ipv6 will make Postfix support
IPv6. Here are some example values that this parameter accepts:
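    inet_protocols = ipv4
    inet_protocols = all
    inet_protocols = ipv4, ipv6
    inet_protocols = ipv6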
Tons of other parameters in the Postfix configuration file are not discussed here. You might see
them commented out in the configuration file when you set the preceding options. These other options
will allow you to set security levels and debugging levels, among other things, as required.
Now let’s move on to running the Postfix mail system and maintaining your mail server.
Checking Your Configuration
Postfix includes a nice tool for checking a current configuration and helping you troubleshoot it.
Simply run the following:
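    postfix check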
This will list any errors that the Postfix system finds in the configuration files or with permissions
of any directories that it needs. A quick run on our sample system shows this:
Looks like we made a typo in the configuration file. When going back to fix any errors in the
configuration file, you should be sure to read the error message carefully and use the line number as
guidance, not as absolute. This is because a typo in the file could mean that Postfix detected the error
well after the actual error took place.
In this example, a typo we made on line 76 didn’t get caught until line 91 because of how the
parsing engine works. However, by carefully reading the error message, we knew the problem was
with the “mydomain” parameter, and so it took only a quick search before we found the real line
culprit.
Let’s run the check again:
Groovy! We’re ready to start using Postfix.
Running the Server
Starting the Postfix mail server is easy and straightforward. Just pass the start option to the
postfix run control script:
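For the source install described earlier, that would look like this (on a binary-package install, service postfix start does the same job):

    /etc/init.d/postfix start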
When you make any changes to the configuration files, you need to tell Postfix to reload itself to
make the changes take effect. Do this by using the reload option:
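    /etc/init.d/postfix reload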
On systemd-enabled distros, the equivalent commands are
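    systemctl start postfix.service
    systemctl reload postfix.service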
Checking the Mail Queue
Occasionally, the mail queues on your system will fill up. This can be caused by network failures or
various other failures, such as other mail servers. To check the mail queue on your mail server,
simply type the following command:
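    mailq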
This command will display all of the messages that are in the Postfix mail queue. This is the first
step in testing and verifying that the mail server is working correctly.
Flushing the Mail Queue
Sometimes after an outage, mail will be queued up, and it can take several hours for the messages to
be sent. Use the postfix flush command to flush out any messages that are shown in the queue by
the mailq command.
The newaliases Command
The /etc/aliases file contains a list of e-mail aliases. This is used to create site-wide e-mail lists and
aliases for users. Whenever you make changes to the /etc/aliases file, you need to tell Postfix about it
by running the newaliases command. This command will rebuild the Postfix databases and inform
you of how many names have been added.
Making Sure Everything Works
Once the Postfix mail server is installed and configured, you should test and test again to make sure
that everything is working correctly. The first step in doing this is to use a local mail user agent, such
as pine or mutt, to send e-mail to yourself. If this works, great; you can move on to sending e-mail to a
remote site, using the mailq command to see when the message gets sent. The final step is to make
sure that you can send e-mail to the server from the outside network (that is, from the Internet). If you
can receive e-mail from the outside world, your work is done.
Mail Logs
On Fedora, RHEL, and CentOS systems, by default, mail logs go to /var/log/maillog, as defined by
the rsyslogd configuration file. If you need to change this, you can modify the rsyslogd configuration
file, /etc/rsyslog.conf, by editing the following line:
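On a stock Fedora/RHEL-type system the relevant line typically reads as follows (the leading hyphen tells rsyslogd not to sync the file after every write):

    mail.*                  -/var/log/maillog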
Most sites run their mail logs this way, so if you are having problems, you can search through the
/var/log/maillog file for any messages.
Debian-based systems, such as Ubuntu, store the mail-related logs in the /var/log/mail.log file.
openSUSE and SUSE Linux Enterprise (SLE) store their mail-related logs in the files
/var/log/mail, /var/log/mail.err, /var/log/mail.info, and /var/log/mail.warn.
If Mail Still Won’t Work
If mail still won’t work, don’t worry. SMTP isn’t always easy to set up. If you still have problems,
walk logically through all of the steps and look for errors. The first step is to look at your log
messages, which might show that other mail servers are not responding. If everything seems fine
there, check your Domain Name System (DNS) settings. Can the mail server perform name lookups?
Can it perform Mail Exchanger (MX) lookups? Can other people perform name lookups for your mail
server? It is also possible that e-mails are actually being delivered but are being marked as junk or
spam at the recipient end. Check the junk or spam mail folder at the receiver’s end.
Proper troubleshooting techniques are indispensable for good system administration. A good
resource for troubleshooting is to look at what others have done to fix similar problems. Check the
Postfix web site at www.postfix.org, or check the newsgroups at www.google.com for the problems
or symptoms of what you might be seeing.
Summary
In this chapter, you learned the basics of how SMTP works. You also installed and learned how to
configure a basic Postfix mail server. With this information, you have enough knowledge to set up and
run a production mail server.
If you’re looking for additional information on Postfix, start with the online documentation at
www.postfix.org. The documentation is well written and easy to follow. It offers a wealth of
information on how Postfix can be extended to perform a number of additional functions that are
outside the scope of this chapter.
Another excellent reference on the Postfix system is The Book of Postfix: State-of-the-Art
Message Transport, by Ralf Hildebrandt and Patrick Koetter (No Starch Press, 2005). This book
covers the Postfix system in excellent detail.
As with any other service, don’t forget to keep up with the latest news on Postfix. Security
updates do come out from time to time, and it is important that you update your mail server to reflect
these changes.
CHAPTER 20
POP and IMAP
In Chapter 19, we covered the differences between mail transport agents (MTAs), mail delivery
agents (MDAs), and mail user agents (MUAs). When it comes to the delivery of mail to specific
user mailboxes, we assumed the use of Procmail, which delivers copies of e-mail to users in the
mbox format. The mbox format is a simple text format that can be read by a number of console mail
user agents, such as pine, elm, and mutt, as well as some GUI-based mail clients.
The key to the mbox format, however, is that the client has direct access (at the file system level)
to the mbox file itself. This works well enough in tightly administered environments where the
administrator of the mail server is also the administrator of the client hosts; however, this system of
mail folder administration might not scale well in certain scenarios. The following sample scenarios
might prove to be a bit thorny:
Users are unable to stay reasonably connected to a fast/secure network for file system access
to their mbox file (for example, roaming laptops).
Users demand local copies of e-mail for offline viewing.
Security requirements dictate that users not have direct access to the mail store (for example,
Network File System [NFS]-shared mail spool directories are considered unacceptable).
Mail user agents do not support the mbox file format (typical of Windows-based clients).
To deal with such cases, the Post Office Protocol (POP) was created to allow for network-based
access to mail stores. Many early Windows-based mail clients used POP for access to Internet e-mail, because it allowed users to access UNIX-based mail servers (the dominant type of mail server
on the Internet until the rise of Microsoft Exchange in the late 1990s).
The idea behind POP is simple: A central mail server remains online at all times and can receive
and store mail for all of its users. Mail that is received is queued on the server until a user connects
via POP and downloads the queued mail. The mail on the server itself can be stored in any format
(such as mbox) so long as it adheres to the POP protocol.
When a user wants to send an e-mail, the e-mail client relays it through the central mail server via
Simple Mail Transfer Protocol (SMTP). This allows the client the freedom to disconnect from the
network after passing on its e-mail message to the server. The task/responsibility of forwarding the
message, taking care of retransmissions, handling delays, and so on, is then left to the well-connected
mail server. Figure 20-1 shows this relationship.
Figure 20-1. Sending and receiving mail with SMTP and POP
Early users of POP found certain aspects of the protocol too limiting. Features such as being able
to keep a master copy of a user’s e-mail on the server with only a cached copy on the client were
missing. This led to the development of the Internet Message Access Protocol (IMAP).
The earliest Request for Comments (RFC) documenting the inner workings of IMAPv2 is RFC
1064 dated 1988. After IMAPv2 came IMAP version 4 (IMAPv4) in 1994. Most e-mail clients are
compatible with IMAPv4.
Some design deficiencies inherent in IMAPv4 led to another update in the protocol specifications,
and, thus, IMAPv4 is currently at its first revision—IMAP4rev1 (RFC 3501).
The essence of how IMAP has evolved can be best understood by thinking of mail access as
working in one of three distinct modes: online, offline, and disconnected. The online mode is akin to
having direct file system access to the mail store (for example, having read access to the /var/mail
file system). The offline mode is how POP works, where the client is assumed to be disconnected
from the network except when explicitly pulling down its e-mail. In offline mode, the server normally
does not retain a copy of the mail.
Disconnected mode works by allowing users to retain cached copies of their mail stores. When
the client is connected, any incoming/outgoing e-mail is immediately recognized and synchronized;
however, when the client is disconnected, changes made on the client are kept until reconnection,
when synchronization occurs. Because the client retains only a cached copy, a user can move to a
completely different client and resynchronize his or her e-mail.
By using IMAP, your mail server will support all three modes of access. After all is said and
done, deploying and supporting both POP and IMAP is usually a good idea. It allows users the
freedom to choose whatever mail client and protocol best suits them.
This chapter covers the installation and configuration of the University of Washington (UW)
IMAP server, which includes a POP server hook. This particular mail server has been available for
many years. The installation process is also easy. For a small to medium-sized user base (up to a few
hundred users), it should work well.
If you’re interested in a higher volume mail server for IMAP, consider the Cyrus or Courier
IMAP server. Both offer impressive scaling options; however, they come at the expense of needing a
slightly more complex installation and configuration procedure.
POP and IMAP Basics
Like the other services discussed so far, POP and IMAP each need a server process to handle
requests. The POP and IMAP server processes listen on ports 110 and 143, respectively.
Each request to and response from the server is in clear-text ASCII, which means it’s easy for us
to test the functionality of the server using Telnet. This is especially useful for quickly debugging mail
server connectivity/availability issues. Like an SMTP server, you can interact with a POP or IMAP
server using a short list of commands.
To give you a look at the most common commands, let’s walk through the process of connecting
and logging on to a POP server and an IMAP server. This simple test allows you to verify that the
server does in fact work and is providing valid authentication.
Although there are many POP commands, here are a couple worth mentioning:
USER
PASS
And a few noteworthy IMAP commands are the following:
LOGIN
LIST
STATUS
EXAMINE/SELECT
CREATE/DELETE/RENAME
LOGOUT
Installing the UW-IMAP and POP3 Server
The University of Washington produces a well-regarded IMAP server that is used in many production
sites around the world. It is a well-tested implementation; thus, it is the version of IMAP that we will
install.
Most Linux distributions have prepackaged binaries for UW-IMAP in their package repositories. For
example, UW-IMAP can be installed on Fedora/CentOS/RHEL by using Yum like so:
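For example (the package name uw-imap is an assumption here and may vary by release):
yum install uw-imap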
On Debian-like systems, such as Ubuntu, UW-IMAP can be installed by using Advanced
Packaging Tool (APT) like so:
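For example (the server package has historically been named uw-imapd, with the POP3 daemons in ipopd;
verify the names with apt-cache search):
sudo apt-get install uw-imapd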
Installing UW-IMAP from Source
Begin by downloading the UW-IMAP server to /usr/local/src. The latest version of the server
can be found at ftp://ftp.cac.washington.edu/imap/imap.tar.Z. Once it is downloaded, unpack it
as follows:
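For example (GNU tar's z option also handles the compress-style .Z archive):
cd /usr/local/src
tar xvzf imap.tar.Z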
This will create a new directory under which all of the source code will be present. For the
version we are using, you will see a new directory called imap-2007f created. Change into the
directory as follows:
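That is:
cd imap-2007f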
The defaults that ship with the UW-IMAP server work well for most installations. If you are
interested in tuning the build process, open the makefile (found in the current directory) with an
editor and read through it. The file is well documented and shows what options can be turned on
or off. For the installation we are doing now, you can stick with a simple configuration change
that you can issue on the command line.
In addition to build options, the make command for UW-IMAP requires that you specify the
type of system on which the package is being built. This is in contrast to many other open source
programs that use the ./configure program (also known as Autoconf) to determine the running
environment automatically. The options for Linux are as follows:
Parameter   Environment
ldb         Debian Linux
lnx         Linux with traditional passwords
lnp         Linux with Pluggable Authentication Modules (PAM)
lmd         Mandrake Linux (also known as Mandriva Linux)
lrh         Red Hat Linux 7.2 and later
lr5         Red Hat Enterprise 5 and later (should cover recent Fedora versions)
lsu         SUSE Linux
sl4         Linux with Shadow passwords (requiring an additional library)
sl5         Linux with Shadow passwords (not requiring an additional library)
slx         Linux needing an extra library for password support
A little overwhelmed by the choices? Don’t be. Many of the choices are for old versions
of Linux that are no longer in use. If you have a recent Linux distribution, the only ones
you need to pay attention to are lsu (SUSE), lr5 (RHEL), lmd (Mandrake), slx, and ldb
(Debian).
If you are using openSUSE, RHEL/Fedora/CentOS, Debian, or Mandrake/Mandriva, go
ahead and select the appropriate option. If you aren’t sure, the slx option should work on almost
all Linux-based systems. The only caveat with the slx option is that you may need to edit the
makefile and help it find where some common tool kits, such as OpenSSL, are located.
To keep things simple, we will follow the generic case by enabling OpenSSL and Internet
Protocol version 6 (IPv6) support. To proceed with the build, simply run the following:
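One plausible invocation for the generic target is shown here; SSLTYPE=unix builds in OpenSSL
support, and the exact variables you need may differ (check the makefile comments):
make slx SSLTYPE=unix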
If you get prompted to build the software with IPv6 support, type y (yes) to confirm.
The entire build process should take only a few minutes, even on a slow machine. Once
complete, you will have four executables in the directory: mtest, ipop2d, ipop3d, and imapd.
Copy these to the /usr/local/sbin directory, like so:
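The build typically leaves each binary in its own subdirectory of the source tree, so a copy command
along these lines should work (adjust the paths if your build placed them elsewhere):
cp mtest/mtest ipopd/ipop2d ipopd/ipop3d imapd/imapd /usr/local/sbin/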
Be sure the permissions to the executables are set correctly. Because they need to be run
only by root, it is appropriate to limit nonprivileged access to them accordingly. Simply set the
permissions as follows:
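For example, to restrict them to root:
chmod 700 /usr/local/sbin/mtest /usr/local/sbin/ipop2d /usr/local/sbin/ipop3d /usr/local/sbin/imapd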
That’s it.
TIP UW-IMAP is especially finicky about OpenSSL. You will have to make sure that you have the
OpenSSL development libraries (header files) readily available on the system on which you are
compiling UW-IMAP. For RPM-based distros this requirement is provided by the openssl-devel
package. On Debian-based distros, the requirement is satisfied by the libssl-dev package. Once you
have the necessary OpenSSL headers installed, you might need to edit the SSLINCLUDE variable in
the file ./src/osdep/unix/Makefile to reflect the path to the header files. Setting this path to /usr/include
on our sample Fedora server suffices. Alternatively, you may, of course, simply disable support for
any features that you don’t need for your environment at compile time.
Running UW-IMAP
Most distributions automatically set up UW-IMAP to run under the superdaemon xinetd (for more
information on xinetd, see Chapter 8). Sample configuration files to get the IMAP server and the
POP3 server running under xinetd in Fedora are shown here.
For the IMAP server, the configuration file is /etc/xinetd.d/imap.
For the POP3 server, the configuration file is /etc/xinetd.d/ipop3.
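A minimal sketch of the imap file follows; the attribute values are illustrative rather than exactly
what any given distribution ships, and the server path shown is where a packaged install typically
places the binary. The ipop3 file is analogous, with service pop3 and server ipop3d:
service imap
{
        socket_type     = stream
        wait            = no
        user            = root
        server          = /usr/sbin/imapd
        disable         = yes
}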
TIP You can use the chkconfig utility in Fedora, RHEL, CentOS, and openSUSE to enable and
disable the IMAP and POP services running under xinetd. For example, to enable the IMAP service
under xinetd, simply run
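the following (the service name matches the file under /etc/xinetd.d):
chkconfig imap on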
This will change the disable = yes directive to disable = no in the /etc/xinetd.d/imap file.
Before telling xinetd to reload its configuration, you will want to check that your /etc/services
file has both POP3 and IMAP listed. If /etc/services does not have the protocols listed, simply add
the following two lines:
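The standard entries look like this:
pop3            110/tcp
imap            143/tcp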
TIP If you are working with the UW-IMAP package that was compiled and installed from source,
don’t forget to change the server directive in the xinetd configuration file to reflect the correct path.
In our example, the proper path for the compiled IMAP server binary would be
/usr/local/sbin/imapd.
Finally, tell xinetd to reload its configuration by restarting it. If you are using Fedora, RHEL, or
CentOS, this can be done with the following command:
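For example:
service xinetd restart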
On systemd-enabled distros, you can restart xinetd by using the systemctl command like this:
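That is:
systemctl restart xinetd.service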
If you are using another distribution, you might be able to restart xinetd by passing the restart
argument to xinetd’s run control, like so:
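For example:
/etc/init.d/xinetd restart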
If everything worked, you should have a functional IMAP server and POP3 server. Using the
commands and methods shown in the earlier section “POP and IMAP Basics,” you can connect and
test for basic functionality.
TIP If you get an error message along the way, check the /var/log/messages file for additional
information that might help in troubleshooting.
Checking Basic POP3 Functionality
We begin by using Telnet to connect to the POP3 server (localhost in this example). From a command
prompt, type the following:
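POP3 listens on TCP port 110, so:
telnet localhost 110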
The server is now waiting for you to give it a command. (Don’t worry that you don’t see a
prompt.) Start by submitting your login name as follows:
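For example:
USER yourlogin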
Here, yourlogin is, of course, your login ID. The server responds with this:
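A typical reply (the exact wording varies by server) looks like:
+OK User name accepted, password please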
Now tell the server your password using the PASS command:
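Like so:
PASS yourpassword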
Here, yourpassword is your password. The server responds with this:
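Something along the lines of:
+OK Mailbox open, X messages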
Here, X represents the number of messages in your mailbox. You’re now logged in and can issue
commands to read your mail. Since you are simply validating that the server is working, you can log
out now. Simply type QUIT, and the server will close the connection.
That’s it.
Checking Basic IMAP Functionality
We begin by using Telnet to connect to the IMAP server (localhost in this example). From the
command prompt, type the following:
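IMAP listens on TCP port 143, so:
telnet localhost 143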
The IMAP server will respond with something similar to this:
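For example (the greeting and the advertised capabilities vary with the server version):
* OK [CAPABILITY IMAP4REV1 ...] localhost IMAP4rev1 service ready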
The server is now ready for you to enter commands. Note that like the POP server, the IMAP
server will not issue a prompt.
The format for IMAP commands is shown here:
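In outline, each client command looks like this:
<tag> <command> <arguments>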
Here, tag represents any unique (user-generated) value used to identify (tag) the command. Example
tags are A001, b, box, c, box2, 3, and so on. Commands can be executed asynchronously,
meaning that it is possible for you to enter one command and, while waiting for the response, enter
another command. Because each command is tagged, the output will clearly reflect what output
corresponds to what request.
To log into the IMAP server, simply enter the login command, like so:
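For example, using the tag a001:
a001 login username password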
Here, username is the username you want to test and password is the user’s password. If the
authentication is a success, the server will respond with something like this:
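Something similar to (exact wording varies):
a001 OK LOGIN completed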
That is enough to tell you two things:
The username and password are valid.
The mail server was able to locate and access the user’s mailbox.
With the server validated, you can log out by simply typing the logout command, like so:
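Using a fresh tag:
a002 logout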
The server will reply with something similar to this:
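For example:
* BYE localhost IMAP4rev1 server terminating connection
a002 OK LOGOUT completed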
Other Issues with Mail Services
Thus far, we’ve covered enough material to get you started with a working mail server, but there is
still a lot of room for improvements. In this section, we walk through some of the issues you might
encounter and some common techniques to address them.
SSL Security
The biggest security issue with the POP3 and IMAP servers is that in their simplest configuration,
they do not offer any encryption. Advanced IMAP configurations offer richer password-hashing
schemes, and most modern full-featured e-mail clients support them. Having said this, your best bet is
to encrypt the entire stream using Secure Sockets Layer (SSL) whenever possible.
TIP The binary version of the UW-IMAP package that was installed using the distribution’s package
management system (Yum or APT) supports SSL.
Earlier, on our sample server, we did not configure our instance of UW-IMAP to use SSL. We
did this to keep things simple, and, moreover, it makes for a nice confidence booster for you to be
able to get something working quickly—before we start tinkering too much with it and adding other
layers of complexity.
If you do want to use SSL, you will need to take the following steps:
1. Make sure that your version of UW-IMAP has support for SSL built-in.
2. If necessary, modify the appropriate xinetd configuration files to enable imaps and pop3s.
You can also enable imaps and pop3s with chkconfig, using commands like those shown after these steps.
3. Reload or restart xinetd for good measure (see the commands after these steps).
4. Remember that the imaps service runs on TCP port 993, and pop3s runs on TCP port 995, so
you need to make sure that your firewall(s) are not blocking remote access to those ports on
your server.
5. Install an SSL certificate. You can create a self-signed certificate quite easily using OpenSSL; a sample command is shown after these steps.
This will create a certificate that will last ten years. Place it in your OpenSSL certificates
directory. On RHEL, Fedora, and CentOS, this is the /etc/pki/tls directory.
NOTE Users will receive a warning that the certificate is not properly signed if you use the
previous method of creating a certificate. If you do not want this warning to appear, you will need to
purchase a certificate from a Certificate Authority (CA) such as VeriSign. Depending on your specific
environment, this might or might not be a requirement. However, if all you need is an encrypted tunnel
through which passwords can be sent, a self-signed certificate works fine.
6. Finally, you need to make sure that your clients use SSL when connecting to the IMAP server.
In most of the popular e-mail client programs, such as Thunderbird, Evolution, Outlook, and so
on, the option to enable this may be as simple as a check box in the e-mail account
configuration options.
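Assuming the xinetd service names for the SSL-enabled variants are imaps and pop3s, and that a single
file (imapd.pem) holding both the key and the certificate is acceptable, the commands referred to in
steps 2, 3, and 5 might look like this:
chkconfig imaps on
chkconfig pop3s on
service xinetd restart
openssl req -new -x509 -nodes -days 3650 -keyout imapd.pem -out imapd.pem
The -days 3650 value gives the ten-year lifetime mentioned in step 5; move the resulting imapd.pem
into your distribution’s certificate directory afterward.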
Testing IMAP and POP3 Connectivity over SSL
Once you move to an SSL-based mail server, you might find that your tricks in checking on the mail
server using Telnet don’t work anymore. This is because Telnet assumes no encryption on the line.
Getting past this little hurdle is quite easy; simply use OpenSSL as a client instead of Telnet, like
so:
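For the IMAP service on its SSL port (993):
openssl s_client -connect 127.0.0.1:993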
In this example, we are able to connect to the IMAP server running on 127.0.0.1 via port 993, even
though it is encrypted. Once we have the connection established, we can use the commands that we
went over in the “Checking Basic IMAP Functionality” section earlier in this chapter (login,
logout, and so on).
Similarly, we can test the secure version of the POP3 service by running this command:
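For example, against the pop3s port (995):
openssl s_client -connect 127.0.0.1:995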
Again—once we have the connection established, we can use the standard POP3 commands to
interact with the server—albeit securely this time.
Availability
In managing a mail server, you will quickly find that e-mail qualifies as one of the most visible
resources on your network. When the mail server goes down, everyone will know—they will know
quickly, and worst of all, they will let you (the administrator) know, too. Thus, it is important that you
carefully consider how you will be able to provide 24/7 availability for e-mail services.
The number-one issue that threatens mail servers is “fat fingering” a configuration—in other
words, making an error when performing basic administration tasks. There is no solution to this
problem other than being careful! When you’re dealing with any kind of production server, it is
prudent to take each step carefully and make sure that you meant to type what you’re typing. When at
all possible, work as a normal user rather than root and use sudo for specific commands that need
root permissions.
The second big issue with managing mail servers is hardware availability. Unfortunately, this is
best addressed with money. The more the better!
Make an investment up front in a good server chassis. Adequate cooling and as much redundancy
as you can afford is a good way to make sure that the server doesn’t take a fall over something silly
like a CPU fan going out. Dual-power supplies are another way to help keep mechanical things from
failing on you. Uninterruptible power supplies (UPS) for your servers are almost always a must.
Make sure that the server disks are configured in some kind of RAID fashion. This is all to help
mitigate the risk of hardware failure.
Finally, consider expansion and growth early in your design. Your users will inevitably consume
all of your available disk space. The last thing you will want is to start bouncing mail because the
mail server has run out of disk space! To address this issue, consider using disk volumes that can be
expanded on the fly and RAID systems that allow new disks to be added quickly. This will allow you
to add disks to the volume with minimal downtime and without having to move to a completely new
server.
Log Files
Although we’ve mentioned this earlier in the chapter, watching the /var/log/messages and
/var/log/maillog files is a prudent way to manage and track the activity on your mail server. The
UW-IMAP server provides a rich array of messages to help you understand what is happening with your
server and troubleshoot any peculiar behavior.
A perfect example of the usefulness of log files came in writing this chapter, specifically the SSL
section. After compiling the new version of the server, we forgot to copy the imapd file to
/usr/local/sbin. This led to a puzzling behavior when we tried to connect to the server using
Evolution (a popular open source e-mail client). We tried using the openssl s_client command to
connect, and it gave an unclear error. What was going on?
A quick look at the log files using the tail command revealed the problem:
Well, that more or less spells it out for us. Retracing our steps, we realized that we forgot to copy
the new imapd binary to /usr/local/sbin. A quick run of the cp command, a restart of xinetd, and we
were greeted with success.
In short, when in doubt, take a moment to look through the log files. You’ll probably find a
solution to your problem there.
Summary
This chapter covered some of the theory behind IMAP and POP3, ran through the complete
installation for the UW-IMAP software, and discussed how to test connectivity to each service
manually. With this chapter, you have enough information to run a simple mail server capable of
handling a few hundred users without a problem.
The chapter also covered enabling secure access to your mail server assets via SSL. This method
of security is an easy way to prevent clear-text passwords (embedded in IMAP or POP3 traffic) from
making their way into hands that should not have them. We also touched on some basic human- and
hardware-related concerns, necessities, and precautions with regard to ensuring that your mail
server is available 24/7.
If you find yourself needing to build out a larger mail system, take the time to read up on the
Cyrus, Dovecot, and Courier mail servers. If you find that your environment requires more groupware
functionality (such as provided with Microsoft Exchange Server), you might want to check out other
software, such as Scalix, Open-Xchange, Zimbra, Kolab, Horde Groupware, and eGroupware. They
all provide significant extended capabilities at the expense of additional complexity in configuration.
However, if you need a mail server that has more bells and whistles, you might find the extra
complexity an unavoidable trade-off.
As with any server software that is visible to the outside world, you will want to keep up to date
with the latest releases. Thankfully, the UW-IMAP package has shown sufficient stability and security
so as to minimize the need for frequent updates, but a watchful eye is still nice. Finally, consider
taking a read through the latest IMAP and POP RFCs to understand more about the protocols.
The more familiar you are with the protocols, the easier you’ll find troubleshooting to be.
CHAPTER 21
The Secure Shell (SSH)
One of the side effects of connecting a computer into a public network (such as the Internet) is
that, at one point or another, some folks out there will try to break into the system. This is
obviously not a good thing.
In Chapter 15, we discussed techniques for securing your Linux system, all of which are designed
to limit remote access to your system to the bare essentials. But what if you need to perform system
administrative duties from a remote site? Traditional Telnet is woefully insecure, because it transmits
the entire session (logins, passwords, and all) in cleartext. How can you reap the benefits of a truly
multiuser system if you can’t securely log into it?
NOTE Cleartext means that the data is unencrypted. In any system, when passwords get sent over the
line in cleartext, a packet sniffer could reveal a user’s password. This is especially bad if that user is
root!
To tackle the issue of remote login versus password security, a solution called Secure Shell
(SSH) was developed. SSH is a suite of network communication tools that are collectively based on
an open protocol/standard that is guided by the Internet Engineering Task Force (IETF). It allows
users to connect to a remote server just as they would using Telnet, rlogin, FTP, and so on, except that
the session is 100-percent encrypted. Someone using a packet sniffer merely sees encrypted traffic
going by. Should they capture the encrypted traffic, decrypting it could take a long time.
In this chapter, we’ll take a brief and general look at cryptography concepts. Then we’ll take a
grand tour of SSH, how to get it, how to install it, and how to configure it.
Understanding Public Key Cryptography
A quick disclaimer is probably necessary before proceeding: This chapter is by no means an
authority on the subject of cryptography and, as such, is not the definitive source for cryptography
matters. What you will find here is a general discussion along with some references to good books
that cover the topic more thoroughly.
Secure Shell relies on a technology called public-key cryptography. It works similarly to a safe
deposit box at the bank: You need two keys to open the box, or at least multiple layers of
security checks have to be passed. In the case of public-key cryptography, you need two
mathematical keys: a public one and a private one. Your public key can be published on a public web
page, printed on a T-shirt, or posted on a billboard in the busiest part of town. Anyone who asks for it
can have a copy. On the other hand, your private key must be protected to the best of your ability. It is
this piece of information that makes the data you want to encrypt truly secure. Every public
key/private key combination is unique.
The actual process of encrypting data and sending it from one person to the next requires several
steps. We’ll use the popular “Alice and Bob” analogy and go through the process one step at a time as
they both try to communicate in a secure manner with one another. Figures 21-1 through 21-5
illustrate an oversimplified version of the actual process.
Figure 21-1. Alice fetches Bob’s public key.
Figure 21-2. Alice uses Bob’s public key, along with her private key, to encrypt and sign the data,
respectively.
Figure 21-3. Alice sends the encrypted data to Bob.
Figure 21-4. Bob fetches Alice’s public key.
Figure 21-5. Bob uses Alice’s public key, along with his private key, to verify and decrypt the data,
respectively.
Looking at these steps, you’ll notice that at no point was the secret (private) key sent over the
network. Also notice that once the data was encrypted with Bob’s public key and signed with Alice’s
private key, the only pair of keys that could decrypt and verify it were Bob’s private key and Alice’s
public key. Thus, if someone intercepted the data in the middle of the transmission, he or she wouldn’t
be able to decrypt the data without the proper private keys.
To make things even more interesting, SSH regularly changes its session key. (This is a randomly
generated, symmetric key for encrypting the communication between the SSH client and server. It is
shared by the two parties in a secure manner during SSH connection setup.) In this way, the data
stream gets encrypted differently every few minutes. Thus, even if someone happened to figure out the
key for a transmission, that miracle would be valid for only a few minutes until the keys changed
again.
Key Characteristics
So what exactly is a key? Essentially, a key is a large number that has special mathematical
properties. Whether someone can break an encryption scheme depends on his or her ability to find out
what the key is. Thus, the larger the key is, the harder it will be to discover it.
Low-grade encryption has 56 bits. This means there are 2^56 possible keys. To give you a sense of
scale, 2^32 is equal to 4 billion, 2^48 is equal to 256 trillion, and 2^56 is equal to 65,536 trillion.
Although this seems like a significant number of possibilities, it has been demonstrated that a loose
network of PCs dedicated to iterating through every possibility could conceivably break a low-grade
encryption code in less than a month. In 1998, the Electronic Frontier Foundation (EFF) published
designs for a (then) $250,000 computer capable of cracking 56-bit keys in a few seconds to
demonstrate the need for higher grade encryption. If $250,000 seems like a lot of money to you, think
of the potential for credit card fraud if someone successfully used that computer for that purpose!
NOTE The EFF published the aforementioned designs in an effort to convince the U.S. government
that the laws limiting the export of cryptography software were sorely outdated and hurting the United
States, since so many companies were being forced to work in other countries. This finally paid off in
2000, when the laws were loosened up enough to allow the export of higher grade cryptography.
Unfortunately, most of the companies doing cryptography work had already exported their engineering
to other countries.
For a key to be sufficiently difficult to break, experts suggest no fewer than 128 bits. Because
every extra bit effectively doubles the number of possibilities, 128 bits offers a genuine challenge.
And if you really want to make the encryption solid, a key size of 512 bits or higher is recommended.
SSH can use up to 1024 bits to encrypt your data.
The tradeoff to using higher bit encryption is that it requires more math-processing power for the
computer to churn through and validate a key. This takes time and, therefore, makes the authentication
process a touch slower—but most people think this tradeoff is worthwhile.
NOTE Though unproven, it is believed that even the infamous National Security Agency (NSA) can’t
break codes encrypted with keys higher than 1024 bits.
Cryptography References
SSH supports a variety of encryption algorithms. Public-key encryption happens to be the most
interesting method of performing encryption from site to site and is arguably the most secure. If you
want to learn more about cryptography, here are some good books and other resources to look into:
PGP: Pretty Good Privacy, by Simson Garfinkel, et al. (O’Reilly and Associates, 1994)
Applied Cryptography: Protocols, Algorithms, and Source Code in C, Second Edition, by
Bruce Schneier (John Wiley & Sons, 1996)
Cryptography and Network Security: Principles and Practice, Fifth Edition, by William
Stallings (Prentice Hall, 2010)
“SSH Connection Protocol,” by the Network Working Group, http://tools.ietf.org/id/draft-ietf-secsh-connect-25.txt
“Determining Strengths for Public Keys Used for Exchanging Symmetric Keys,” by the Network
Working Group, www.apps.ietf.org/rfc/rfc3766.html
The groundbreaking PGP book is specific to the PGP program, but it also contains a hefty amount
of history and an excellent collection of general cryptography tutorials. The Applied Cryptography
book might be a bit overwhelming to many, especially nonprogrammers, but it successfully explains
how actual cryptographic algorithms work. (This text is considered a bible among cypherheads.)
Finally, Cryptography and Network Security is heavier on principles than on practice, but it’s useful
if you’re interested in the theoretical aspects of cryptography rather than the code itself.
Understanding SSH Versions
The first version of SSH that was made available by DataFellows (now F-Secure) restricted free use
of SSH to noncommercial activities; commercial activities required that licenses be purchased. But
more significant than the cost of the package is the fact that the source code to the package is
completely open. This is important to cryptographic software, because it allows peers to examine the
source code and make sure there are no holes that might allow hackers to break the security. In other
words, serious cryptographers do not rely on security through obscurity. Since the U.S. government
has relaxed some of its encryption laws, work on the OpenSSH project has increased and it has
become a popular alternative to some of the commercial versions of the SSH protocol.
Because the SSH protocol has become an IETF standard, other developers are also actively
working on SSH clients for other operating systems. There are many Microsoft Windows clients,
Macintosh clients, and even a Palm client, in addition to the standard Linux/UNIX clients. You can
find the version of OpenSSH discussed in this chapter at www.openssh.org.
OpenSSH and OpenBSD
The OpenSSH project was spearheaded by the OpenBSD project. OpenBSD is a version of the
Berkeley Software Distribution (BSD) operating system (another UNIX variant) that strives for the
best security of any operating system available. A quick trip to its web site (www.openbsd.org)
shows that the organization has gone more than a decade with only two remote exploits in its default
installation. Unfortunately, this level of fanaticism about security comes at the expense of not having the
most whiz-bang, feature-rich tools available, since anything added to the distribution must first be
audited for security. The nature and focus of OpenBSD have also made it a popular foundation for
firewalls.
The core of the OpenSSH package is considered part of the OpenBSD project and is thus simple
and specific to the OpenBSD operating system. To make OpenSSH available to other operating
systems, a separate group exists to make OpenSSH portable with each new release issued. Typically,
this happens quickly after the original release.
NOTE Since this book focuses on Linux-based operating systems, you will frequently see versions of
OpenSSH for this platform that are suffixed with the letter p, indicating that they have been ported.
Alternative Vendors for SSH Clients
The SSH client is the client component of the SSH protocol suite. It allows users to interact with the
service(s) provided by an SSH server daemon.
Every day, many people work within heterogeneous environments, and it’s impossible to ignore
all the Windows 98/NT/2000/XP/2003/Vista/7/8 and Mac OS systems out there. To allow these folks
to work with a real operating system (Linux, of course!), there must be a mechanism in place for
logging into such systems remotely. Because Telnet is not secure, SSH provides an alternative.
Virtually all Linux/UNIX systems come with their own built-in SSH clients, and as such, there isn’t
any need to worry about them; however, the non-UNIX operating systems are a different story.
Here is a quick rundown of several SSH clients and other useful SSH resources:
PuTTY for Win32 (www.chiark.greenend.org.uk/~sgtatham/putty) This is probably one
of the oldest and most popular SSH implementations for the Win32 (Microsoft Windows)
platforms. It is extremely lightweight—one binary with no dynamic link libraries (DLLs),
and just one executable. Also on this site are tools such as pscp, which is a Windows
command-line version of Secure Copy (SCP).
OpenSSH for Mac OS X That’s right—OpenSSH is part of the Mac OS X system. When
you open the terminal application, you can simply issue the ssh command. (It also ships with
an OpenSSH SSH server.) Mac OS X is actually a UNIX-based and UNIX-compliant
operating system. One of its main core components—the kernel—is based on the BSD
kernel.
MindTerm, multiplatform (www.cryptzone.com) This program supports versions 1 and 2
of the SSH protocol. Written in 100-percent Java, it works on many UNIX platforms
(including Linux), as well as Windows and Mac OS. See the web page for a complete list of
tested operating systems.
Cygwin (www.cygwin.com) This might be a bit of overkill, but it is well worth the
initial effort involved with getting it set up. It is a collection of tools that provides a Linux
environment for Windows. It provides an environment to run numerous Linux/UNIX
programs without extensive changes to their source code. Under cygwin, you can run all your
favorite GNU/Linux programs, such as bash, grep, find, nmap, gcc, awk, vim, emacs, rsync,
OpenSSH client, OpenSSH server, and so on, as though you were at a traditional GNU/Linux
shell.
The Weakest Link
You’ve probably heard the saying, “Security is only as strong as your weakest link.” This particular
saying has significance in terms of OpenSSH and securing your network: OpenSSH is only as secure
as the weakest connection between the user and the server. This means that if a user uses Telnet from
host A to host B and then uses ssh to host C, the entire connection can be monitored from the link
between host A and host B. The fact that the link between host B and host C is encrypted becomes
irrelevant.
Be sure to explain this to your users when you enable logins via SSH, especially if you’re
disabling Telnet access altogether. Unfortunately, taking the time to tighten down your security in this
manner will be soundly defeated if your users Telnet to a host across the Internet so that they can ssh
into your server. And more often than not, they may not have the slightest idea of why doing that is a
bad idea.
NOTE When you Telnet across the Internet, you are crossing several network boundaries. Each of
those providers has full rights and capabilities to sniff traffic and gather any information they want.
Someone can easily see you reading your e-mail. With SSH, you can rest assured that your connection
is secure.
Installing OpenSSH via RPM in Fedora
This is perhaps the easiest and quickest way to get SSH up and running on any Linux system. It is
almost guaranteed that you will already have the package installed and running on most modern Linux
distributions. Even if you choose a bare-bones installation (that is, the most minimal set of software
packages selected during operating system installation), OpenSSH is usually a part of that minimum.
This is more the norm than the exception.
But, again, just in case you are running a Linux distribution that was developed on the planet
Neptune but at least has Red Hat Package Manager (RPM) installed, you can always download and
install the precompiled RPM package for OpenSSH.
On our sample Fedora system, you can query the RPM database to make sure that OpenSSH is
indeed installed by typing the following:
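For example:
rpm -q openssh-server openssh-clients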
And, if by some freak occurrence, you don’t have it already installed (or you accidentally
uninstalled it), you can quickly install an OpenSSH server using Yum by issuing this command:
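That command is:
yum install openssh-server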
Installing OpenSSH via APT in Ubuntu
The Ubuntu Linux distribution usually comes with the client component of OpenSSH preinstalled, but
you have to install the server component explicitly if you want it. Installing the OpenSSH server using
Advanced Packaging Tool (APT) in Ubuntu is as simple as running this:
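That is:
sudo apt-get install openssh-server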
The install process will also automatically start the SSH daemon for you.
You can confirm that the software is installed by running the following:
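For example:
dpkg -l openssh-server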
Downloading, Compiling, and Installing OpenSSH from Source
As mentioned, virtually all Linux versions ship with OpenSSH; however, you may have a need
to roll your own version from source for whatever reason. This section will cover downloading
the OpenSSH software and the two components it needs: OpenSSL and zlib. Once these are in
place, you can then compile and install the software. If you want to stick with the precompiled
version of OpenSSH that ships with your distribution, you can skip this section and move
straight to the section “Server Start-up and Shutdown.”
We will use OpenSSH version 5.9p1 in this section, but you can still follow the steps using
any current version of OpenSSH available to you (just change the version number). You can
download this from www.openssh.com/portable.html. Select a download site that is closest to
you, and download openssh-5.9p1.tar.gz to a directory with enough free space (/usr/local/src
is a good choice, and we’ll use it in this example).
Once you have downloaded OpenSSH to /usr/local/src, unpack it with the tar command,
like so:
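For example:
cd /usr/local/src
tar xvzf openssh-5.9p1.tar.gz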
This will create a directory called openssh-5.9p1 under /usr/local/src.
Along with OpenSSH, you will need OpenSSL version 1.0.0 or later. As of this writing, the
latest version of OpenSSL was openssl-1.0.0*.tar.gz. You can download that from
www.openssl.org. Once you have downloaded OpenSSL to /usr/local/src, unpack it with the
tar command, like so:
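For example (substitute the exact file name you downloaded):
tar xvzf openssl-1.0.0*.tar.gz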
Finally, the last package you need is the zlib library, which is used to provide compression
and decompression facilities. Most modern Linux distributions have this already, but if you want
the latest version, you need to download it from www.zlib.net. We use zlib version 1.2.5 in our
example. To unpack the package in /usr/local/src after downloading, use tar, like so:
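Like so:
tar xvzf zlib-1.2.5.tar.gz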
The following steps walk through the process of compiling and installing the various
components of OpenSSH and its dependencies; a consolidated command sketch follows the numbered steps.
1. Begin by going into the directory into which zlib was unpacked, like so:
2. Then run configure and make:
This will result in the zlib library being built.
3. Install the zlib library:
The resulting library will be placed in the /usr/local/lib directory.
4. Now you need to compile OpenSSL. Begin by changing to the directory to which the
downloaded OpenSSL was unpacked:
5. Once in the OpenSSL directory, all you need to do is run configure and make.
OpenSSL will take care of figuring out the type of system it is on and configure itself to
work in an optimal fashion. Here are the exact commands:
Note that this step may take a few minutes to complete.
6. Once OpenSSL is done compiling, you can test it by running the following:
7. If all went well, the test should run without problems by spewing a bunch of stuff on the
terminal. If there are any problems, OpenSSL will report them to you. If you do get an
error, you should remove this copy of OpenSSL and try the download/unpack/compile
procedure again.
8. Once you have finished the test, you can install OpenSSL:
This step will install OpenSSL into the /usr/local/ssl directory.
9. You are now ready to begin the actual compile and install of the OpenSSH package.
Change into the OpenSSH package directory, like so:
10. As with the other two packages, you need to begin by running the configure program.
For this package, however, you need to specify some additional parameters. Namely, you
need to tell it where the other two packages got installed. You can always run
./configure with the --help option to see all of the parameters, but you’ll find that the
following ./configure statement will probably work fine:
11. Once OpenSSH is configured, simply run make and make install to put all of the files
into the appropriate /usr/local directories.
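Pulling the preceding steps together, the whole build might look something like the following sketch.
The version numbers are the ones used in this section, and the --with-ssl-dir and --with-zlib flags
simply point at the install locations chosen above; your configure options may differ:
cd /usr/local/src/zlib-1.2.5
./configure
make
make install
cd /usr/local/src/openssl-1.0.0*
./config
make
make test
make install
cd /usr/local/src/openssh-5.9p1
./configure --with-ssl-dir=/usr/local/ssl --with-zlib=/usr/local/lib
make
make install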
That’s it—you are done. This set of commands will install the various OpenSSH binaries
and libraries under the /usr/local directory. The SSH server, for example, will be placed under
the /usr/local/sbin directory, and the various client components will be placed under the
/usr/local/bin/ directory.
Note that even though we just walked through how to compile and install OpenSSH from
source, the rest of this chapter will assume that we are dealing with OpenSSH as it is installed
via RPM or APT (as discussed in previous sections).
Server Start-up and Shutdown
If you want users to be able to log into your system via SSH, you will need to make sure that the
service is running and start it if it is not. You should also make sure that the service gets started
automatically between system reboots.
On our Fedora server, we’ll check the status of the sshd daemon:
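For example (the status line shown is representative, and the PID is made up):
service sshd status
openssh-daemon (pid 1234) is running...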
The sample output shows the service is up and running. On the other hand, if the service is
stopped, issue this command to start it:
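That is:
service sshd start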
On a systemd-enabled distribution, you can alternatively use the systemctl command to start the
sshd service unit by executing this command:
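For example:
systemctl start sshd.service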
If, for some reason, you do need to stop the SSH server, type the following:
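Like so:
service sshd stop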
If you make configuration changes that you want to go into effect, you can restart the daemon at
any time by simply running this:
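For example:
service sshd restart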
TIP On a Debian-based Linux distro such as Ubuntu, you can run control scripts for OpenSSH to
control the daemon. For example, to start it, you would run this:
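For example (note that the service is named ssh, not sshd, on Debian-based systems):
sudo /etc/init.d/ssh start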
To stop the daemon, run this:
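That is:
sudo /etc/init.d/ssh stop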
TIP On an openSUSE distro, the command to check the status of sshd is
And to start it, the command is
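Assuming the rcsshd convenience link that openSUSE traditionally provides, the two commands are:
rcsshd status
rcsshd start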
SSHD Configuration File
Out of the box, most Linux systems already have the OpenSSH server configured and running with
some default settings.
On most RPM-based Linux distributions, such as Fedora, Red Hat Enterprise Linux (RHEL),
openSUSE, or CentOS, the main configuration file for sshd usually resides under the /etc/ssh/
directory and is called sshd_config. Debian-based distros also store the configuration files under the
/etc/ssh/ directory. For the OpenSSH version that we installed from source earlier, the configuration
file is located under the /usr/local/etc/ directory.
Next we’ll discuss some of the configuration options found in the sshd_config file.
AuthorizedKeysFile Specifies the file that contains the public keys that can be used for user
authentication. The default is /<User_Home_Directory>/.ssh/authorized_keys.
Ciphers This is a comma-separated list of ciphers allowed for SSH protocol version 2.
Examples of supported ciphers are 3des-cbc, aes256-cbc, aes256-ctr, arcfour, and blowfish-cbc.
HostKey Defines the file containing a private host key used by SSH. The default is
/etc/ssh/ssh_host_rsa_key or /etc/ssh/ssh_host_dsa_key for protocol version 2.
Port Specifies the port number on which sshd listens. The default value is 22.
Protocol This specifies the protocol versions sshd supports. The possible values are 1 and
2. Note that protocol version 1 is generally considered insecure now.
AllowTcpForwarding Specifies whether Transmission Control Protocol (TCP) forwarding
is permitted. The default is yes.
X11Forwarding Specifies whether X11 forwarding is permitted. The argument must be yes
or no. The default is no.
ListenAddress Specifies the local address on which the SSH daemon listens. By default,
OpenSSH will listen on both Internet Protocol version 4 (IPv4) and Internet Protocol version
6 (IPv6) sockets. But if you need to specify a particular interface address, you can tweak this
directive.
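To tie these directives together, a small illustrative excerpt from an sshd_config file might look
like this (the values are examples, not necessarily your distribution’s defaults):
Port 22
Protocol 2
ListenAddress 0.0.0.0
HostKey /etc/ssh/ssh_host_rsa_key
AuthorizedKeysFile .ssh/authorized_keys
AllowTcpForwarding yes
X11Forwarding no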
NOTE sshd_config is a rather odd configuration file. This is because you will notice that, unlike
other Linux configuration files, comments (#) in the sshd_config file denote the default values of the
options—that is, comments represent already compiled-in defaults.
Using OpenSSH
OpenSSH comes with several useful programs that are covered in this section: the ssh client program,
the Secure Copy (scp) program, and the Secure FTP (sftp) program. The most common application
you will probably use is the ssh client program.
Secure Shell (ssh) Client Program
With the ssh daemon started, you can simply use the ssh client to log into a machine from a remote
location in the same manner that you would with Telnet. The key difference between ssh and Telnet,
of course, is that your SSH session is encrypted, while your Telnet session is not.
The ssh client program will usually assume that you want to log into the remote system
(destination) as the same user with which you are logged into the local system (source). However, if
you need to use a different login (for instance, if you are logged in as root on one host and want to ssh
to another and log in as the user yyang), all you need to do is provide the -l option along with the
desired login. For example, if you want to log into the host server-B as the user yyang from server-A,
you would type
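the following:
ssh -l yyang server-B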
Or you could use the user@host command format, like so:
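That is:
ssh yyang@server-B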
You would then be prompted with a password prompt from server-B for the user yyang’s
password.
But if you just want to log into the remote host without needing to change your login at the remote
end, simply run ssh, like so:
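For example:
ssh server-B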
With this command, you’ll be logged in as the root user at server-B.
Of course, you can always replace the hostname with a valid IP address, like
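the following (the address here is made up for illustration):
ssh 192.168.1.50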
To connect to a remote SSH server that is also listening on an IPv6 address (e.g., 2001:DB8::2),
you could try
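something like this (the -6 flag forces an IPv6 connection):
ssh -6 yyang@2001:DB8::2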
Creating a Secure Tunnel
This section covers what is commonly called the “poor man’s virtual private network” (VPN).
Essentially, you can use SSH to create a tunnel from your local system to a remote system. This is a
handy feature when you need to access an intranet or another system that is not exposed to the outside
world on your intranet. For example, you can ssh to a file server machine that will set up the port
forwarding to the remote web server.
Let’s imagine a scenario like the one described next with the following components: inside,
middle, and outside.
Inside The inside component consists of the entire local area network (LAN) (192.168.1.0
network). It houses various servers and workstations that are accessible only by other hosts
on the inside. Let’s assume that one of the internal servers on the LAN hosts a web-based
accounting application. The internal web server’s hostname is “accounts” with an IP address
of 192.168.1.100.
Middle In the middle, we have our main component—a system with two network interfaces.
The system’s hostname is serverA. One of the interfaces is connected directly to the Internet.
The other interface is connected to the LAN of the company.
On serverA, assume the first interface (the wide area network, or WAN, interface) has a
public/routable-type IP address of 1.1.1.1 and the second interface has a private-type IP
address of 192.168.1.1. The second interface of serverA is connected to the LAN
(192.168.1.0 network), which is completely cut off from the Internet.
The only service that is allowed and running on the WAN interface of serverA is the sshd
daemon. ServerA is said to be “dual-homed,” because it is connected to two different
networks: the LAN and the WAN.
Outside Our remote user, yyang, needs to access the web-based accounting application
running on the internal server (accounts) from home. User yyang’s home workstation
hostname is hostA. Yyang’s home system is considered to be connecting via a hostile public
Internet. hostA has an SSH client program installed.
We already said the entire internal company network (LAN, accounts server, other internal hosts,
and so on) is cut off from the Internet and the home system (hostA) is part of the public Internet, so
what gives? The setup is illustrated in Figure 21-6.
Figure 21-6. Port forwarding with SSH
Enter the poor man’s VPN, aka SSH tunneling. The user yyang will set up an SSH tunnel to the
web server running on “accounts” by following these steps:
1. While sitting in front of her home system—hostA—the user yyang will log into the home
system as herself.
2. Once logged in locally, she will create a tunnel from port 9000 on the local system to port 80
on the system (named accounts) running the web-based accounting software.
3. To do this, yyang will connect via SSH to serverA’s WAN interface (1.1.1.1) by issuing a
port-forwarding command from her system at home (hostA); a sample command is sketched after these steps.
NOTE The complete syntax for the port-forwarding command is ssh -L
local_port:destination_host:destination_port ssh_server, where local_port is the local port you will
connect to after the tunnel is set up, destination_host:destination_port is the host:port pair to which
the tunnel will be directed, and ssh_server is the host that will perform the forwarding to the end host.
4. After yyang successfully authenticates herself to serverA and has logged into her account on
serverA, she can then launch a web browser installed on her workstation (hostA).
5. User yyang will need to use the web browser to access the forwarded port (9000) on the local
system and see if the tunnel is working correctly. For this example, she needs to type the
Uniform Resource Locator (URL) http://localhost:9000 into the address field of the browser.
6. If all goes well, the web content being hosted on the accounting server should show up on
yyang’s web browser—just as if she were accessing the site from within the local office LAN
(that is, the 192.168.1.0 network).
7. To close down the tunnel, she simply closes all windows that are accessing the tunnel and
then ends the SSH connection to serverA by typing exit at the prompt she used to create the
tunnel.
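Using the hosts and ports from this scenario, the command in step 3 might look like the following;
yyang authenticates to serverA’s public address, and the name accounts is resolved on serverA’s side
of the tunnel:
ssh -L 9000:accounts:80 yyang@1.1.1.1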
The secure tunnel affords you secure access to other systems or resources within an intranet or a
remote location. It is a great and inexpensive way to create a virtual private network between your
host and another host. It is not a full-featured VPN solution, since you can’t easily access every host
on the remote network, but it gets the job done.
In this project, we port-forwarded HTTP traffic. You can tunnel almost any protocol, such as
Virtual Network Computing (VNC) or Telnet. Note that this is a way for people inside a firewall or
proxy to bypass the firewall mechanisms and get to computers in the outside world.
OpenSSH Shell Tricks
It is also possible to create a secure tunnel after you have already logged into the remote SSH
server. That is, you don’t have to set up the tunnel when you are setting up the initial SSH
connection. This is especially useful if you have a shell on a remote host and you need to hop
around onto other systems that would otherwise be inaccessible.
SSH has its own nifty little shell that can be used to accomplish this and other neat tricks.
To gain access to the built-in SSH shell, press SHIFT-~-C (that’s a tilde in the middle) on
the keyboard after logging into an SSH server. You will open a prompt similar to this one:
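On current OpenSSH versions the prompt is simply:
ssh>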
To set up a tunnel similar to the one that we set up earlier, type this command at the ssh
prompt/shell:
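Reusing the port-forwarding specification from the earlier example, the request takes the same -L
form used on the ssh command line:
-L 9000:accounts:80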
To leave or quit the SSH shell, press ENTER on your keyboard, and you’ll be back to your
normal login shell on the system.
While logged in remotely to a system via SSH, simultaneously typing the tilde character (~)
and the question mark (?) will display a listing of all the other things you can do at the ssh
prompt.
These are the supported escape sequences:
~.      Terminate connection
~C      Open a command line
~R      Request rekey (SSH protocol 2 only)
~^Z     Suspend SSH
~#      List forwarded connections
~&      Background SSH (when waiting for connections to terminate)
~?      This message
~~      Send the escape character by typing it twice
Note that escapes are recognized only immediately after newlines.
Secure Copy (scp) Program
Secure Copy (scp) is meant as a replacement for the rcp command, which allows you to do remote
copies from one host to another. The most significant problem with the older rcp command is that
users tend to arrange their remote-access settings to allow far too much access into your system. To
help mitigate this, instruct users to use the scp command instead, and then completely disable access
to the insecure rlogin programs. The format of scp is identical to rcp, so users shouldn’t have
problems with this transition.
Suppose user yyang, for example, is logged into her home workstation (client-A) and wants to
copy a file named .bashrc located in the local home directory to her home directory on server-A.
Here’s the command:
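For example:
scp .bashrc yyang@server-A:/home/yyang/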
If she wants to copy the other way—that is, from the remote system server-A to her local system
client-A—the arguments need to be reversed, like so:
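That is:
scp yyang@server-A:/home/yyang/.bashrc .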
Secure FTP (sftp) Program
Secure FTP is a subsystem of the ssh daemon. You access the Secure FTP server by using the sftp
command-line tool. To sftp from a system named client-A to an SFTP server running on server-A as
the user yyang, type this:
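That is:
sftp yyang@server-A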
You will then be asked for your password, just as you are when you use the ssh client. Once you
have been authenticated, you will see a prompt like the following:
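On most versions it is simply:
sftp>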
You can issue various sftp commands while at the sftp shell. For example, to list all the files and
directories under the /tmp folder on the sftp server, you can use the ls command:
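For example:
sftp> ls /tmp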
For a listing of all the commands, just type a question mark (?).
Notice that some of the commands look strikingly familiar to the FTP commands discussed in
Chapter 17. Among other things, sftp is handy if you forget the full name of a file you are looking for,
because you can leisurely browse the remote file system using familiar FTP commands.
Files Used by the OpenSSH Client
The configuration files for the SSH client and SSH server typically reside in the directory /etc/ssh/
on most Linux distributions. (If you have installed SSH from source into /usr/local, the full path will
be /usr/local/etc/ssh/.) If you want to make any system-wide changes to defaults for the SSH client,
you need to modify the ssh_config file.
CAUTION Remember that the sshd_config file is for the server daemon, while the ssh_config file is
for the SSH client! Note the letter “d” for daemon in the configuration file name.
Within a user’s home directory, SSH information is stored in the directory ~username/.ssh/. The
file known_hosts is used to hold host key information, which also helps guard against man-in-the-middle
attacks. SSH will alert you when the host keys change. If the keys have changed for a valid
reason—for instance, if the server was reinstalled—you will need to edit the known_hosts file and
delete the line with the changed server.
Summary
The Secure Shell tool is a superior replacement to Telnet for remote logins. Adopting the OpenSSH
package will put you in the company of many other sites that are disabling Telnet access altogether
and allowing only SSH access through their firewalls. Given the wide-open nature of the Internet, this
change isn’t an unreasonable thing to ask of your users.
Here are the key issues to keep in mind when you consider SSH:
SSH is easy to compile and install.
Replacing Telnet with SSH requires no significant retraining for end users.
SSH exists on many platforms, not just Linux/UNIX.
Using SSH as the access/login method to your systems helps to mitigate potential network
attacks in which crackers can “sniff” passwords off your Internet connection.
In closing, you should understand that using OpenSSH doesn’t make your system secure
immediately. There is no replacement for a set of good security practices. Following the lessons from
Chapter 15, you should disable all unnecessary services on any system that is exposed to untrusted
networks (such as the Internet); allow only those services that are absolutely necessary. And that
means, for example, if you’re running SSH, you should disable Telnet, rlogin, rsh, and others.
PART V
Intranet Services
CHAPTER 22
Network File System (NFS)
Network File System (NFS) is the Linux/UNIX way of sharing files and applications across the
network. The NFS concept is somewhat similar to that of Microsoft Windows File sharing, in
that it allows you to attach to a remote file system (or disk) and work with it as if it were a
local drive—a handy tool for sharing files and large storage space among users.
NFS and Windows File Sharing are a solution to the same problem; however, these solutions are
very different beasts. NFS requires different configurations, management strategies, tools, and
underlying protocols. We will explore these differences as well as show how to deploy NFS in the
course of this chapter.
The Mechanics of NFS
As with most network-based services, NFS follows the usual client and server paradigms—that is, it
has its client-side components and its server-side components.
Chapter 7 covered the process of mounting and unmounting file systems. The same principles
apply to NFS, except you also need to specify the server hosting the share in addition to the other
items (such as mount options) you would normally define. Of course, you also need to make sure the
server is actually configured to permit access to the share!
Let’s look at an example. Assume there exists an NFS server named serverA that needs to share
its local /home partition or directory over the network. In NFS parlance, it is said that the NFS server
is “exporting its /home partition.” Assume there also exists a client system on the network named
clientA that needs access to the contents of the /home partition being exported by the NFS server.
Finally, assume all other requirements are met (permissions, security, compatibility, and so on).
For clientA to access the /home share being exported by serverA, clientA needs to make an NFS
mount request for /home so that it can mount it locally, such that the share appears locally as the
/home directory. The command to issue this mount request can be as simple as this:
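Assuming the share and mount point described above:
mount serverA:/home /home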
Assuming that the command was run from the host named clientA, all the users on clientA would
be able to view the contents of /home as if it were just another directory or local file system. Linux
would take care of making all of the network requests to the server.
Remote procedure calls (RPCs) are responsible for handling the requests between the client and
the server. RPC technology provides a standard mechanism for any RPC client to contact the server
and find out to which service the calls should be directed. Thus, whenever a service wants to make
itself available on a server, it needs to register itself with the RPC service manager, portmap.
Portmap tells the client where the actual service is located on the server.
Versions of NFS
NFS is not a static protocol. Standards committees have helped NFS evolve to take advantage of new
technologies, as well as changes in usage patterns. At the time of this writing, three well-known
versions of the protocol exist: NFS version 2 (NFSv2), NFS version 3 (NFSv3), and NFS version 4
(NFSv4). An NFS version 1 also existed, but it was very much internal to Sun Microsystems and, as
such, never saw the light of day!
NFSv2 is the oldest of the three. NFSv3 is the standard with perhaps the widest use. NFSv4 has
been in development for a while and is the newest standard. NFSv2 should probably be avoided if
possible and should be considered only for legacy reasons. NFSv3 should be considered if stability
and widest range of client support are desired. NFSv4 should be considered if its bleeding-edge
features are needed and probably for very new deployments where backward compatibility is not an
issue.
Perhaps the most important factor in deciding which version of NFS to consider would be the
version that your NFS clients will support.
Here are some of the features of each NFS version:
NFSv2 Mount requests are granted on a per-host basis and not on a per-user basis. This
version uses Transmission Control Protocol (TCP) or User Datagram Protocol (UDP) as its
transport protocol. Version 2 clients have a file size limitation of less than 2 gigabytes (GB)
that they can access.
NFSv3 This version includes a lot of fixes for the bugs in NFSv2. It has more features than
version 2, has performance gains over version 2, and can use either TCP or UDP as its
transport protocol. Depending on the local file system limits of the NFS server itself, clients
can access files larger than 2GB in size. Mount requests are also granted on a per-host basis
and not on a per-user basis.
NFSv4 This version of the protocol uses a stateful protocol such as TCP or Stream Control
Transmission Protocol (SCTP) as its transport. It has improved security features thanks to its
support for Kerberos; for example, client authentication can be conducted on a per-user basis
or a principal basis. It was designed with the Internet in mind, and as a result, this version of
the protocol is firewall-friendly and listens on the well-known port 2049. The services of the
RPC binding protocols (such as rpc.mountd, rpc.lockd, rpc.statd) are no longer required in
this version of NFS because their functionality has been built into the server; in other words,
NFSv4 combines these previously disparate NFS protocols into a single protocol
specification. (The portmap service is no longer necessary.) It includes support for file
access control list (ACL) attributes and can support both version 2 and version 3 clients.
NFSv4 introduces the concept of the pseudo-file system, which allows NFSv4 clients to see
and access the file systems exported on the NFSv4 server as a single file system.
The version of NFS used can be specified at mount time by the client via the use of mount options.
For a Linux client to use NFSv2, the mount option of nfsvers=2 is used. For NFSv3, the mount
option is specified by nfsvers=3. And for NFSv4, the nfsvers option is not supported, but this
version can be used by specifying nfs4 as the file system type.
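For example, with the serverA host and an arbitrary /mnt/home mount point (both names are illustrative), a client could request a specific version like this:

mount -t nfs -o nfsvers=3 serverA:/home /mnt/home    # NFSv3 mount
mount -t nfs4 serverA:/home /mnt/home                # NFSv4 mount (note the nfs4 file system type)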
The rest of this chapter will concentrate mostly on NFSv3 and NFSv4, because they are
considered quite stable in Linux, they are well known, and they also have the widest cross-platform
support.
Security Considerations for NFS
In its default state, NFS is not a secure method for sharing disks. The steps necessary to make NFS
more secure are no different from those for securing any other system. The only catch is that you must
be able to trust the users on the client system, especially the root user. If you’re the root user on both
the client and the server, there is a little less to worry about. The important thing in this case is to
make sure non-root users don’t become root—which is something you should be doing anyway! You
should also strongly consider using NFS mount flags, such as the root_squash flag discussed later
on.
If you cannot fully trust the person with whom you need to share a resource, it will be worth your
time and effort to seek alternative methods of sharing resources (such as read-only sharing of the
resources).
As always, stay up to date on the latest security bulletins from the Computer Emergency Response
Team (www.cert.org), and keep up with all the patches from your distribution vendor.
Mount and Access a Partition
Several steps are involved in a client’s making a request to mount a server’s exported file system or
resource (these steps pertain mostly to NFSv2 and NFSv3):
1. The client contacts the server’s portmapper to find out which network port is assigned as the
NFS mount service.
2. The client contacts the mount service and requests to mount a file system. The mount service
checks to see if the client has permission to mount the requested partition. (Permission for a
client to mount a resource is based on directives/options in the /etc/exports file.) If the client
does have permission, the mount service returns an affirmative.
3. The client contacts the portmapper again—this time to determine on which port the NFS server
is located. (Typically, this is port 2049.)
4. Whenever the client wants to make a request to the NFS server (for example, to read a
directory), an RPC is sent to the NFS server.
5. When the client is done, it updates its own mount tables but doesn’t inform the server.
Notification to the server is unnecessary, because the server doesn’t keep track of all clients that
have mounted its file systems. Because the server doesn’t maintain state information about clients and
the clients don’t maintain state information about the server, clients and servers can’t tell the
difference between a crashed system and a really slow system. Thus, if an NFS server is rebooted,
ideally all clients should automatically resume their operations with the server as soon as the server
is back online.
Enabling NFS in Fedora
Almost all the major Linux distributions ship with support for NFS in one form or another. The only
task left for the administrator is to configure it and enable it. On our sample Fedora system, enabling
NFS is easy.
Because NFS and its ancillary programs are RPC-based, you first need to make sure that the
system portmap (for Ubuntu, Debian, and so on) or rpcbind (for Fedora, Red Hat Enterprise Linux
[RHEL], and so on) service is installed and running.
First make sure that the rpcbind package is installed on the system. On a Fedora distro, type the
following:
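rpm -q rpcbind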
If the output indicates that the software is not installed, you can use Yum to install it by running
this:
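yum install rpcbind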
To check the status of the rpcbind on Fedora, type this:
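service rpcbind status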
If the rpcbind service is stopped, start it like so:
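service rpcbind start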
Alternatively, you can use the systemctl command on systemd-enabled Linux distros like
Fedora to start the rpcbind service by typing:
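systemctl start rpcbind.service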
Before going any further, use the rpcinfo command to view the status of any RPC-based services
that might have registered with portmap:
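rpcinfo -p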
Because we don’t yet have an NFS server running on the sample system, this output does not show
too many RPC services. To start the NFS service, enter this command:
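service nfs start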
Alternatively, you can use the systemctl command on systemd-enabled Linux distros like
Fedora to start the nfs server service by typing:
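systemctl start nfs-server.service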
Running the rpcinfo command again to view the status of RPC programs registered with the
portmapper shows this output:
This output shows that various RPC programs (mountd, nfs, rquotad, and so on) are now running.
To stop the NFS service, enter this command:
[[email protected] ~]# service nfs stop
To have the NFS service automatically start up with the system with the next reboot, use the
chkconfig command. First check the runlevels for which it is currently configured to start:
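chkconfig --list nfs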
From this output, we can deduce that the service is disabled by default on a Fedora system; enable
it to start up automatically by typing this:
[[email protected] ~]# chkconfig nfs on
Alternatively, you can use the systemctl command on systemd-enabled Linux distros like
Fedora to check if the NFS service is enabled to automatically start up when the system boots by
typing:
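systemctl is-enabled nfs-server.service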
From this output, we can deduce that the service is disabled by default on a Fedora system; enable
it to start up automatically by typing this:
[[email protected] ~]# systemctl enable nfs-server.service
Enabling NFS in Ubuntu
Ubuntu and other Debian-like distributions still rely on portmap instead of rpcbind used in the
Fedora distro. Installing and enabling an NFS server in Ubuntu is as easy as installing the following
components: nfs-common, nfs-kernel-server, and portmap.
To install these using Advanced Packaging Tool (APT), run the following command:
[email protected]:~$ sudo apt-get -y install nfs-common \
> nfs-kernel-server portmap
The install process will also automatically start up the NFS server, as well as all its attendant
services for you. You can check this by running the following:
[email protected]:~$ rpcinfo -p
To stop the NFS server in Ubuntu, type this:
[email protected]:~$ sudo /etc/init.d/nfs-kernel-server stop
The Components of NFS
Versions 2 and 3 of the NFS protocol rely heavily on RPCs to handle communications between
clients and servers. RPC services in Linux are managed by the portmap service. As mentioned, this
ancillary service is no longer needed in NFSv4.
The following list shows the various RPC processes that facilitate the NFS service under Linux.
The RPC processes are mostly relevant only in NFS versions 2 and 3, but mention is made wherever
NFSv4 applies.
rpc.statd This process is responsible for sending notifications to NFS clients whenever the
NFS server is restarted without being gracefully shut down. It provides status information
about the server to rpc.lockd when queried. This is done via the Network Status Monitor
(NSM) RPC protocol. It is an optional service that is started automatically by the nfslock
service on a Fedora system. It is not required in NFSv4.
rpc.rquotad As its name suggests, rpc.rquotad supplies the interface between NFS and the
quota manager. NFS users/clients will be held to the same quota restrictions that would
apply to them if they were working on the local file system instead of via NFS. It is not
required in NFSv4.
rpc.mountd When a request to mount a partition is made, the rpc.mountd daemon takes care
of verifying that the client has the appropriate permission to make the request. This
permission is stored in the /etc/exports file. (The upcoming section “The /etc/exports
Configuration File” tells you more about the /etc/exports file.) It is automatically started by
the NFS server init scripts. It is not required in NFSv4.
rpc.nfsd The main component to the NFS system, this is the NFS server/daemon. It works
in conjunction with the Linux kernel either to load or unload the kernel module as necessary.
It is, of course, still relevant in NFSv4.
NOTE You should understand that NFS itself is an RPC-based service, regardless of the version of
the protocol. Therefore, even NFSv4 is inherently RPC-based. The fine point here lies in the fact that
most of the previously used ancillary and stand-alone RPC-based services (such as mountd and
statd) are no longer necessary, because their individual functions have now been folded into the
NFSv4 daemon.
rpc.lockd The rpc.statd daemon uses this daemon to handle lock recovery on crashed
systems. It also allows NFS clients to lock files on the server. The nfslock service is no
longer used in NFSv4.
rpc.idmapd This is the NFSv4 ID name-mapping daemon. It provides this functionality to the
NFSv4 kernel client and server by translating user and group IDs to names and vice versa.
rpc.svcgssd This is the server-side rpcsec_gss daemon. The rpcsec_gss protocol allows the
use of the gss-api generic security API to provide advanced security in NFSv4.
rpc.gssd This provides the client-side transport mechanism for the authentication mechanism
in NFSv4.
Kernel Support for NFS
NFS is implemented in two forms among the various Linux distributions. Most distributions ship with
NFS support enabled in the kernel. A few Linux distributions also ship with support for NFS in the
form of a stand-alone daemon that can be installed via a package.
As far back as Linux 2.2, there has been kernel-based support for NFS, which runs significantly
faster than earlier implementations. As of this writing, kernel-based NFS server support is
considered production-ready. It is not mandatory—if you don’t compile support for it into the kernel,
you will not use it. If you have the opportunity to try kernel support for NFS, however, it is highly
recommended that you do so. If you choose not to use it, don’t worry—the nfsd program that handles
NFS server services is completely self-contained and provides everything necessary to serve NFS.
NOTE On the other hand, clients must have support for NFS in the kernel. This support in the kernel
has been around for a long time and is known to be stable. Almost all present-day Linux distributions
ship with kernel support for NFS enabled.
Configuring an NFS Server
Setting up an NFS server is a two-step process. The first step is to create the /etc/exports file, which
defines which parts of your server’s file system or disk are shared with the rest of your network and
the rules by which they get shared. (For example, is a client allowed only read access to the file
system? Are they allowed to write to the file system?) After defining the exports file, the second step
is to start the NFS server processes that read the /etc/exports file.
The /etc/exports Configuration File
This primary configuration file for the NFS server lists the file systems that are sharable, the hosts
with which they can be shared, and with what permissions as well as other parameters. The file
specifies remote mount points for the NFS mount protocol.
The format for the file is simple. Each line in the file specifies the mount point(s) and export flags
within one local server file system for one or more hosts.
Here is the format of each entry/line in the /etc/exports file:
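/directory/to/export  client(permissions)  ip_network(permissions)  ...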
The different fields are explained here:
/directory/to/export This is the directory you want to share with other users—for example,
/home.
client This refers to the hostname(s) of the NFS client(s).
ip_network This allows the matching of hosts by IP addresses (for example, 172.16.1.1) or
network addresses with a netmask combination (for example, 172.16.0.0/16).
permissions These are the corresponding permissions for each client. Table 22-1 describes
the valid permissions for each client.
Permission Option   Meaning
secure              The port number from which the client requests a mount must be lower than
                    1024. This permission is on by default. To turn it off, specify insecure instead.
ro                  Allows read-only access to the partition. This is the default permission
                    whenever nothing is specified explicitly.
rw                  Allows normal read/write access.
noaccess            The client will be denied access to all directories below /dir/to/mount. This
                    allows you to export the directory /dir to the client and then to specify /dir/to
                    as inaccessible without taking away access to something like /dir/from.
root_squash         This permission prevents remote root users from having superuser (root)
                    privileges on remote NFS-mounted volumes. The squash literally means to
                    squash the power of the remote root user.
no_root_squash      This allows the root user on the NFS client host to access the NFS-mounted
                    directory with the same rights and privileges that the superuser would
                    normally have.
all_squash          Maps all user IDs (UIDs) and group IDs (GIDs) to the anonymous user. The
                    opposite option is no_all_squash, which is the default setting.

Table 22-1. NFS Permissions
Following is an example of a complete NFS /etc/exports file. (Note that line numbers have been
added to the listing to aid readability.)
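1  # Sample /etc/exports file for serverA (the entries shown here are representative;
2  # adjust hostnames, networks, and paths to match your own site)
3  /home       hostA(rw) hostB(rw) clientA(rw,no_root_squash)
4  /usr/local  172.16.0.0/16(ro)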
Lines 1 and 2 are comments and are ignored when the file is read.
Line 3 exports the /home file system to the machines named hostA and hostB, and gives them
read/write (rw) permissions as well as to the machine named clientA, giving it read-write (rw)
access, but allowing the remote root user to have root privileges on the exported file system (/home)
—this last bit is indicated by the no_root_squash option.
Line 4 exports the /usr/local/ directory to all hosts on the 172.16.0.0/16 network. Hosts in the
network range are allowed read-only access.
Telling the NFS Server Process about /etc/exports
Once you have an /etc/exports file written up, use the exportfs command to tell the NFS server
processes to reread the configuration information. The parameters for exportfs are as follows:
exportfs Command Option      Description
-a                           Exports all entries in the /etc/exports file. It can also be used to unexport
                             the exported file systems when used along with the u option (for
                             example, exportfs -ua).
-r                           Re-exports all entries in the /etc/exports file. This synchronizes
                             /var/lib/nfs/xtab with the contents of the /etc/exports file. For example,
                             it deletes entries from /var/lib/nfs/xtab that are no longer in /etc/exports
                             and removes stale entries from the kernel export table.
-u clientA:/dir/to/mount     Unexports the directory /dir/to/mount to the host clientA.
-o options                   Options specified here are the same as described in Table 22-1 for client
                             permissions. These options will apply only to the file system specified on
                             the exportfs command line, not to those in /etc/exports.
-v                           Be verbose.
Following are examples of exportfs command lines.
To export all file systems specified in the /etc/exports file, type this:
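exportfs -a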
To export the directory /usr/local to the host clientA with the read/write and no_root_squash
permissions, type this:
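exportfs -o rw,no_root_squash clientA:/usr/local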
In most instances, you will simply want to use exportfs -r.
Note that Fedora, CentOS, and RHEL distributions have a capable GUI tool (see Figure 22-1)
called system-config-nfs that can be used for creating, modifying, and deleting NFS shares. It can
be launched from the command line by executing the following:
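system-config-nfs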
Figure 22-1. NFS server configuration utility
The showmount Command
When you’re configuring NFS, you’ll find it helpful to use the showmount command to see if
everything is working correctly. The command is used for showing mount information for an NFS
server.
By using the showmount command, you can quickly determine whether you have configured nfsd
correctly.
After you have configured your /etc/exports file and exported all your file systems using
exportfs, you can run showmount -e to see a list of exported file systems on the local NFS server.
The -e option tells showmount to show the NFS server’s export list. Here’s an example:
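showmount -e

The output lists each exported directory along with the hosts or networks permitted to mount it; for the sample /etc/exports file shown earlier, /home and /usr/local would appear in the list.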
If you run the showmount command with no options, it will list clients connected to the server:
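showmount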
You can also run this command on clients by passing the server hostname as the last argument. To
show the exported file systems on the NFS server (serverA) from an NFS client (clientA), you can
issue this command while logged into clientA:
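showmount -e serverA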
Troubleshooting Server-Side NFS Issues
When exporting file systems, you may sometimes find that the server appears to be refusing the client
access, even though the client is listed in the /etc/exports file. Typically, this happens because the
server takes the IP address of the client connecting to it and resolves that address to the fully qualified
domain name (FQDN), and the hostname listed in the /etc/exports file isn’t qualified. (For example,
the server thinks the client hostname is clientA.example.com, but the /etc/exports file lists just
clientA.)
Another common problem is that the server’s perception of the hostname/IP pairing is not correct.
This can occur because of an error in the /etc/hosts file or in the Domain Name System (DNS) tables.
You’ll need to verify that the pairing is correct.
For NFSv2 and NFSv3, the NFS service may fail to start correctly if the other required services,
such as the portmap service, are not already running.
Even when everything seems to be set up correctly on the client side and the server side, you may
find that the firewall on the server side is preventing the mount process from completing. In such
situations, you will notice that the mount command seems to hang without any obvious errors.
Configuring NFS Clients
NFS clients are remarkably easy to configure under Linux, because they don’t require any new or
additional software to be loaded. The only requirement is that the kernel be compiled to support the
NFS file system. Virtually all Linux distributions come with this feature enabled by default in their
stock kernel. Aside from the kernel support, the only other important factor is the options used with
the mount command.
The mount Command
The mount command was originally discussed in Chapter 7. The important parameters to use with the
mount command are the specification of the NFS server name, the local mount point, and the options
specified after the -o on the mount command line.
The following is an example of an NFS mount command line:
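mount -o rw,bg,soft serverA:/home /mnt/home

(The exported /home directory and the local /mnt/home mount point are illustrative; substitute your own.)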
Here, serverA is the NFS server name. The -o options are explained in Table 22-2.
mount -o Command Option    Description
bg                         Background mount. Should the mount initially fail (for instance, if the server is
                           down), the mount process will send itself to background processing and continue
                           trying to execute until it is successful. This is useful for file systems mounted at
                           boot time, because it keeps the system from hanging at the mount command if the
                           server is down.
intr                       Specifies an interruptible mount. If a process has pending I/O on a mounted
                           partition, this option allows the process to be interrupted and the I/O call to be
                           dropped. For more information, see “The Importance of the intr Option,” later in
                           this chapter.
hard                       This is an implicit default option. If an NFS file operation has a major timeout, a
                           “server not responding” message is reported on the console and the client
                           continues retrying indefinitely.
soft                       Enables a soft mount for this partition, allowing the client to time out the
                           connection after a number of retries (specified with the retrans=r option). For
                           more information, see “Soft vs. Hard Mounts,” later in this chapter.
retrans=n                  The value n specifies the maximum number of connection retries for a
                           soft-mounted system.
rsize=n                    The value n is the number of bytes NFS uses when reading files from an NFS
                           server. The default value is dependent on the kernel but is currently 4096 bytes for
                           NFSv4. Throughput can be improved greatly by requesting a higher value (for
                           example, rsize=32768).
wsize=n                    The value n specifies the number of bytes NFS uses when writing files to an NFS
                           server. The default value is dependent on the kernel but is currently something like
                           4096 bytes for NFSv4. Throughput can be greatly improved by asking for a higher
                           value (such as wsize=32768). This value is negotiated with the server.
proto=n                    The value n specifies the network protocol to use to mount the NFS file system.
                           The default value in NFSv2 and NFSv3 is UDP. NFSv4 servers generally support
                           only TCP. Therefore, the valid protocol types are UDP and TCP.
nfsvers=n                  Allows the use of an alternative RPC version number to contact the NFS daemon
                           on the remote host. The default value depends on the kernel, but the possible
                           values are 2 and 3. This option is not recognized in NFSv4, where instead you’d
                           simply state nfs4 as the file system type (-t nfs4).
sec=value                  Sets the security mode for the mount operation to value:
                           sec=sys    Uses local UNIX UIDs and GIDs (AUTH_SYS) to authenticate NFS
                                      operations. This is the default setting.
                           sec=krb5   Uses Kerberos V5 instead of local UIDs and GIDs to authenticate
                                      users.
                           sec=krb5i  Uses Kerberos V5 for user authentication and performs integrity
                                      checking of NFS operations using secure checksums to prevent
                                      data tampering.
                           sec=krb5p  Uses Kerberos V5 for user authentication and integrity checking and
                                      encrypts NFS traffic to prevent traffic sniffing.

Table 22-2. Mount Options for NFS
These mount options can also be used in the /etc/fstab file. This same entry in the /etc/fstab file
would look like this:
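# illustrative paths, matching the mount example above
serverA:/home  /mnt/home  nfs  rw,bg,soft  0 0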
Again, serverA is the NFS server name, and the mount options are rw, bg, and soft, explained in
Table 22-2.
Soft vs. Hard Mounts
By default, NFS operations are hard, which means they continue their attempts to contact the server
indefinitely. This arrangement is not always beneficial, however. It causes a problem if an emergency
shutdown of all systems is performed. If the servers happen to get shut down before the clients, the
clients’ shutdowns will stall while they wait for the servers to come back up. Enabling a soft mount
allows the client to time out the connection after a number of retries (specified with the retrans=r
option).
NOTE There is one exception to the preferred arrangement of having a soft mount with a retrans=r
value specified: Don’t use this arrangement when you have data that must be committed to disk no
matter what and you don’t want to return control to the application until the data has been committed.
(NFS-mounted mail directories are typically mounted this way.)
Cross-Mounting Disks
Cross-mounting is the process of having serverA NFS-mounting serverB’s disks and serverB NFS-mounting serverA’s disks. Although this may appear innocuous at first, there is a subtle danger in
doing this. If both servers crash, and if each server requires mounting the other’s disk in order to boot
correctly, you’ve got a chicken and egg problem. ServerA won’t boot until serverB is done booting,
but serverB won’t boot because serverA isn’t done booting.
To avoid this problem, make sure you don’t get yourself into a situation where this happens.
Ideally, all of your servers should be able to boot completely without needing to mount anyone else’s
disks for anything critical. However, this doesn’t mean you can’t cross-mount at all. There are
legitimate reasons for cross-mounting, such as needing to make home directories available across all
servers.
In these situations, make sure you set your /etc/fstab entries to use the bg mount option. By doing
so, you will allow each server to background the mount process for any failed mounts, thus giving all
of the servers a chance to boot completely and then properly make their NFS-mountable file systems
available.
The Importance of the intr Option
When a process makes a system call, the kernel takes over the action. During the time that the kernel
is handling the system call, the process has no control over itself. In the event of a kernel access
error, the process must continue to wait until the kernel request returns; the process can’t give up and
quit. In normal cases, the kernel’s control isn’t a problem, because typically, kernel requests get
resolved quickly. When there’s an error, however, it can be quite a nuisance. Because of this, NFS
has an option to mount file systems with the interruptible flag (the intr option), which allows a
process that is waiting on an NFS request to give up and move on.
In general, unless you have reason not to use the intr option, it is usually a good idea to do so.
Performance Tuning
The default block size that is transmitted with NFS versions 2 and 3 is 1 kilobyte (for NFSv4, it is
4KB). This is handy, since it fits nicely into one packet, and should any packets get dropped, NFS has
to retransmit just a few packets. The downside to this is that it doesn’t take advantage of the fact that
most networking stacks are fast enough to keep up with segmenting larger blocks of data for transport
and that most networks are reliable enough that it is extremely rare to lose a block of data.
Given these factors, it is often better to optimize for the case of a fast networking stack and a
reliable network, since that’s what you’re going to have 99 percent of the time in production
environments. The easiest way to do this with NFS is to use the wsize (write size) and rsize (read
size) options. A good size to use is 8KB for NFS versions 2 and 3. This is especially good if you
have network cards that support jumbo frames.
An example entry in a NFS client’s /etc/fstab file to tweak the wsize and rsize options is as
follows:
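# illustrative entry; 8KB read and write sizes as discussed above
serverA:/home  /mnt/home  nfs  rw,bg,soft,rsize=8192,wsize=8192  0 0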
Troubleshooting Client-Side NFS Issues
Like any major service, NFS has mechanisms to help it cope with error conditions. In this section, we
discuss some common error cases and how NFS handles them.
Stale File Handles
If a file or directory is in use by one process when another process removes the file or directory, the
first process gets an error message from the server. Typically, this error states something to the effect
of this: “Stale NFS file handle.”
Most often, stale file handle errors can occur when you’re using a system in the X Window
System environment and you have two terminal windows open. For instance, the first terminal
window is in a particular directory—say, /mnt/usr/local/mydir/—and that directory gets deleted
from the second terminal window. The next time you press enter in the first terminal window, you’ll
see the error message.
To fix this problem, simply change your directory to one that you know exists, without using
relative directories (for example, cd /tmp).
Permission Denied
You’re likely to see the “Permission denied” message if you’re logged in as root and are trying to
access a file that is NFS-mounted. Typically, this means that the server on which the file system is
mounted is not acknowledging root’s permissions.
This is usually the result of forgetting that the /etc/exports file will, by default, enable the
root_squash option. So if you are experimenting from a permitted NFS client as the root user, you
might wonder why you are getting access-denied errors even though the remote NFS share seems to
be mounted properly.
The quick way around this problem is to become the user who owns the file you’re trying to
control. For example, if you’re root and you’re trying to access a file owned by the user yyang, use
the su command to become yyang:
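su yyang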
When you’re done working with the file, you can exit out of yyang’s shell and return to root. Note
that this workaround assumes that yyang exists as a user on the system and has the same UID on both
the client and the server.
A similar problem occurs when users obviously have the same usernames on the client and the
server but still get permission-denied errors. This might happen because the actual UIDs associated
with the usernames on both systems are different. For example, suppose the user mmellow has a UID
of 1003 on the host clientA, but a user with the same name, mmellow, on serverA has a UID of 6000.
The simple workaround to this can be to create users with the same UIDs and GIDs across all
systems. The scalable workaround to this may be to implement a central user database infrastructure,
such as LDAP or NIS, so that all users have the same UIDs and GIDs, independent of their local
client systems.
TIP Keep those UIDs in sync! Every NFS client request to an NFS server includes the UID of the
user making the request. This UID is used by the server to verify that the user has permissions to
access the requested file. However, in order for NFS permission-checking to work correctly, the
UIDs of the users must be synchronized between the client and server. (The all_squash option can
circumvent this when used in the /etc/exports file.) Having the same username on both systems
is not enough, however. The numerical equivalent of the usernames (UID) should also be the same. A
Network Information Service (NIS) database or a Lightweight Directory Access Protocol (LDAP)
database can help in this situation. These directory systems help to ensure that UIDs, GIDs, and other
information are in sync by keeping all the information in a central database. NIS and LDAP are
covered extensively in Chapters 25 and 26, respectively.
Sample NFS Client and NFS Server Configuration
In this section you’ll put everything you’ve learned thus far together by walking through the actual
setup of an NFS environment. We will set up and configure the NFS server. Once that is
accomplished, we will set up an NFS client and make sure that the directories get mounted when the
system boots.
In particular, we want to export the /usr/local file system on the host serverA to a particular host
on the network named clientA. We want clientA to have read/write access to the shared volume and
the rest of the world to have read-only access to the share. Our clientA will mount the NFS share at
its /mnt/usr/local mount point. The procedure involves these steps:
1. On the server—serverA—edit the /etc/exports configuration file. You will share /usr/local.
Input this text into the /etc/exports file.
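# /usr/local on serverA: read/write for clientA, read-only for everyone else
/usr/local   clientA(rw) *(ro)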
2. Save your changes to the file when you are done editing. Exit the text editor.
3. On the Fedora server, first check whether the rpcbind is running.
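service rpcbind status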
If it is not running, start it. If it is stopped or inactive, you can start it with this command:
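service rpcbind start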
TIP On an openSUSE system, the equivalent of the preceding commands are rcrpcbind status
and rcrpcbind start. And on other distributions that do not have the service command, you can
try looking under the /etc/init.d/ directory for a file possibly named portmap. You can then manually
execute the file with the status or start option to control the portmap service, for example, by
entering
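/etc/init.d/portmap status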
4. Next, start the NFS service, which will start all the other attendant services it needs.
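service nfs start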
From its output, the nfs startup script will let you know if it started or failed to start up.
Alternatively, you can use the systemctl command on systemd-enabled Linux distros like
Fedora to start the nfs server service by typing:
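systemctl start nfs-server.service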
5. To check whether your exports are configured correctly, run the showmount command:
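showmount -e localhost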
6. If you don’t see the file systems that you put into /etc/exports, check /var/log/messages for
any output that nfsd or mountd might have logged. If you need to make changes to
/etc/exports, run service nfs reload or exportfs -r when you are done, and finally, run
a showmount -e to make sure that the changes took effect.
7. Now that you have the server configured, it is time to set up the client. First, see if the rpc
mechanism is working between the client and the server. You will again use the showmount
command to verify that the client can see the shares. If the client cannot see the shares, you
might have a network problem or a permissions problem to the server. From clientA, issue the
following command:
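showmount -e serverA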
TIP If the showmount command returns an error similar to “clnt_create: RPC: Port mapper failure - Unable to receive: errno 113 (No route to host)” or “clnt_create: RPC: Port mapper failure - RPC: Unable to receive,” you should ensure that a firewall running on the NFS server or between the NFS
server and the client is not blocking the communications.
8. Once you have verified that you can view shares from the client, it is time to see if you can
successfully mount a file system. First create the local /mnt/usr/local/ mount point, and then
use the mount command as follows:
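mkdir -p /mnt/usr/local
mount -o rw,bg,intr,soft serverA:/usr/local /mnt/usr/local   # mount options shown are illustrative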
9. You can use the mount command to view only the NFS-type file systems that are mounted on
clientA. Type this:
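mount -t nfs

(Use -t nfs4 instead if the share was mounted as an NFSv4 file system.)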
10. If these commands succeed, you can add the mount command with its options into the
/etc/fstab file so that they will get the remote file system mounted upon reboot.
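For example, an entry along these lines (matching the illustrative options used earlier) would do the job:

serverA:/usr/local  /mnt/usr/local  nfs  rw,bg,intr,soft  0 0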
Common Uses for NFS
The following ideas are, of course, just ideas. You are likely to have your own reasons for sharing
file systems via NFS.
To host popular programs If you are accustomed to Windows, you’ve probably worked
with applications that refuse to be installed on network shares. For one reason or another,
these programs want each system to have its own copy of the software—a nuisance,
especially if a lot of machines need the software. Linux (and UNIX in general) rarely has
such conditions prohibiting the installation of software on network disks. (The most common
exceptions are high-performance databases.) Thus, many sites install heavily used software
on a special partition that is exported to all hosts in a network.
To hold home directories Another common use for NFS partitions is to hold home
directories. By placing home directories on NFS-mountable partitions, it’s possible to
configure the Automounter and NIS or LDAP so that users can log into any machine in the
network and have their home directory available to them. Heterogeneous sites typically use
this configuration so that users can seamlessly move from one variant of Linux/UNIX to
another without worrying about having to carry their personal data around with them.
For shared mail spools A directory residing on the mail server can be used to store all of
the user mailboxes, and the directory can then be exported via NFS to all hosts on the
network. In this setup, traditional UNIX mail readers can read a user’s e-mail straight from
the spool file stored on the NFS share. In the case of large sites with heavy e-mail traffic,
multiple servers might be used for providing Post Office Protocol version 3 (POP3)
mailboxes, and all the mailboxes can easily reside on a common NFS share that is accessible
to all the servers.
Summary
In this chapter, we discussed the process of setting up an NFS server and client. This requires little
configuration on the server side. The clie