THE FRAGBOOK
By Craig Jensen
Written 1994
TABLE OF CONTENTS
Who Should Read This Book?
What This Book Can Do for You
INTRODUCTION
The Approach
HOW A DISK WORKS
The Physical Parts of a Disk
Intelligent Disk Controller Functions
Disk-Related Software
Disk File Structures
Disk Performance Terminology
Disk Sets
Disk Architecture Evolution
The Original Problem
The SQUEEZE Solution
The Fragmentation Solution
Summary
WHAT IS FRAGMENTATION?
File Fragmentation and Free Space Fragmentation
Internal Fragmentation
Pagefile Fragmentation
Deliberate and Needful Fragmentation
Why Fragmentation Occurs
Seeing It with Your Own Eyes
Multi-Header Files
The Magnitude of the Fragmentation Problem Today
HOW CAN YOU TELL IF YOU'VE GOT IT?
Fragmentation Analysis Tools
WHAT'S WRONG WITH FRAGMENTATION?
What's Happening to Your Disks?
What's Happening to Your Computer?
What's Happening to Your Applications?
What's Happening to Your Users?
WHAT CAN YOU DO ABOUT IT?
Clear off Disks
Copy / Contiguous
Backup and Restore
Initialization Procedures
Disk-to-Disk Copy
Defragmentation Software Products
Prevention
Preventive Measures When Initializing a Disk
Preventive Measures After a Disk Has Been Initialized
When Would You Not Want to Defragment a Disk?
The Cost of Fragmentation
Conclusion
GETTING THE COMPUTER TO CLEAN UP AFTER ITSELF
History
An On-Line Defragmenter
Safety
Performance
Basic Functionality
Red Herrings
The True Goal of Defragmentation
Who Buys a Defragmenter?
Who Does Not Buy a Defragmenter?
What Does the Future Hold for Defragmentation?
Conclusion
THE ULTIMATE SOLUTION TO THE FRAGMENTATION PROBLEM
Design Goals
Special File Handling
The Impact of Moving Files
What Makes DISKEEPER Unique?
Conclusion
APPENDIX A
What Is a Computer?
APPENDIX B
File Fragmentation Cost Analysis
APPENDIX C
How to Get a Company to Spend Money for Something You Need and Want
GLOSSARY
INDEX
PREFACE
Who Should Read This Book?
This book is intended for managers of Digital's VAX and Alpha AXP computer systems. VAX is the brand
name of a computer system made by Digital Equipment Corporation of Maynard, Massachusetts. The
various models of the VAX computer range in capacity from a desktop workstation to a company-wide
mainframe. Alpha AXP is the brand name of another computer system made by Digital which uses the
same operating system as the VAX. A System Manager is the person who is in charge of or responsible
for such a computer system - its maintenance, operation and management. Put bluntly, the System
Manager is the person who gets in trouble when the computer goes down or isn't working as well as it
should.
Naturally, for this book to be of any use to you at all, you have to have a VAX or Alpha AXP computer
system or at least be planning to have one in the future. That VAX or Alpha AXP has to have disks, since
the problems we will discuss occur only on disks. The computer has to be turned on, not shut down for
use as a doorstop somewhere, and it has to be at least occasionally used. If you don't use it, you won't
get the fragmentation disease and you wouldn't care if you did.
This computer of yours has to be running the VMS (or OpenVMS) operating system. Other computers
and other operating systems suffer from fragmentation too, but this book is specifically written with the
OpenVMS system in mind. The fragmentation problem is so important, and so inherent in the OpenVMS
system, it would be a mistake to try and generalize this discussion into something applicable to any
computer or operating system.
VAX and Alpha AXP System Managers, this book is for you! System Managers run into a lot of problems
keeping their users happy with the performance of the computer system. One problem that haunts
virtually every System Manager, and literally plagues those with more than a few disk drives to care for,
is fragmentation. Fragmentation, more fully defined in a later chapter, is a sort of disease that affects
computer systems, causing them to slow down and perform badly, somewhat like an arthritic old man
who these days just can't get around quite as well as he used to. The problem is so widespread and so
damaging and, surprisingly, so poorly understood, that an entire book on the subject is warranted.
So, who should read this book? Any VAX or Alpha AXP System Manager who is interested in fragmentation.
What This Book Can Do For You
• Help you recognize whether there is a fragmentation problem.
• Show you how to find fragmentation.
• Show you how to know it when you see it.
• Help you see where it's coming from.
• Help you understand it and what it is doing to your system.
• Make it easy and simple to understand and deal with.
• Help you handle it. Show you what changes have to be made to handle it and prevent it.
• Tell you what equipment and tools you must have to handle it.
• Help you predict fragmentation problems and solutions.
• Get you started.
• Get you handling fragmentation.
• Help you rid your system of fragmentation forever.
• Show you how to spot and correct errors in fragmentation handling.
• Give you the ability to deal with fragmentation routinely.
• Show you how to keep fragmentation handled all the time.
• Show you how to handle fragmentation without doing anything.
• Show you how to prevent fragmentation from ever coming back.
• Help you improve conditions in the VAX or Alpha AXP management area.
• Show you the way to a life without fragmentation.
• Help you be more in control of your computer system.
INTRODUCTION
To discuss fragmentation intelligently, it is first necessary to establish what we mean when we use the
word computer. The simple definition of computer is: An electronic machine for making calculations,
storing and analyzing information fed into it, and controlling machinery automatically.
There is more to it than that, however. It is important to note as well that a computer does not think, that
a computer has speed and power out of proportion to anything else in our day-to-day lives, and that a
computer can be used to "enormously increase the production and income of an area." I really like the
viewpoint expressed by this last quotation, as it tends to perk up the ears and interest people in the real
potential of this marvelous machine. The quote is from an essay entitled What is a Computer? by L. Ron
Hubbard, which is included in full in Appendix A.
The computers with which we are specifically concerned are those which run on the OpenVMS operating
system, namely the VAX and the Alpha AXP.
VAX is defined as:
A computer made by Digital Equipment Corporation. VAX is a high-performance, multiprogramming
computer system based on a 32-bit architecture. VAX stands for Virtual Address eXtension.
Alpha AXP is defined as:
A computer made by Digital Equipment Corporation. Alpha AXP is a RISC-based computer that uses
a few simple, fast instructions in order to facilitate faster processing speed.
The Alpha AXP computer was introduced by Digital in November 1992. Although Alpha AXP processors
use different sets of instructions than VAX processors, Digital developed a translation of the OpenVMS
operating system for the new Alpha AXP computers. For reasons which will become clear later on in this
book, this had the effect of transferring the very problem about which this book is written from the VAX
computer to the Alpha AXP computer; hence our discussion applies equally to both computer systems.
Since we are going to be talking a lot about disks, let's agree on what a disk is. The maker of both the
VAX and Alpha AXP computers, Digital Equipment Corporation, defines disk as:
A thin, round plate with a magnetic surface coating on which data can be stored by magnetic
recording.
Disks come in many different sizes and architectures. And, even though we will limit our discussion to
fixed (hard, not floppy) disks, there are still a lot of sizes and architectures to consider. Disk architecture
is so fundamental to a discussion of fragmentation that an entire chapter (the first) is devoted to it. The
chapter consists mostly of definitions of terms, but it contains a lot of pictures and is worth reviewing
even if you are already familiar with basic disk concepts. The purpose of a disk is to store information.
This information, or data, is said to be stored in "files."
In computer terminology, any collection of data that is treated as a single unit on a storage medium (such
as a disk) is referred to as a "file." In concept, a computer file is not unlike a manila folder designed to
hold sheets of paper; the files are stored on a disk, with the disk acting as a filing cabinet.
A file can be accessed (found and retrieved), modified (changed in some way) and again stored on the
disk. In this way, thousands upon thousands of pieces of information can be stored on a physically small
disk, much more than can be stored in a regular filing cabinet.
Now we come to the real question: What exactly is fragmentation anyway? The word fragmentation
means "the state of being fragmented." The word fragment means "a detached, isolated or incomplete
part." It is derived from the Latin fragmentum, which in turn is derived from frangere, meaning "break." So
fragmentation means that something is broken into parts that are detached, isolated or incomplete.
There are two types of fragmentation with which we are immediately concerned: file fragmentation and
free space fragmentation. File fragmentation concerns computer disk files that are not whole but rather
are broken into scattered parts, while free space fragmentation means that the empty space on a disk is
broken into scattered parts rather than being collected all in one big empty space. File fragmentation
causes problems with accessing data stored in computer disk files, while free space fragmentation
causes problems creating new data files or extending (adding to) old ones.
In Figure 1, a file consisting of ten records is shown with all ten records contiguous (immediately
adjacent to each other) and again with the records scattered around in three different groups. The first
arrangement is not fragmented. The second arrangement is fragmented.
Figure 1 File Fragmentation
In Figure 2, three files are arranged contiguously and again with the files scattered around in three
different places. In the first arrangement, the free space is not fragmented; it is consolidated into one
large area. In the second arrangement, the free space is fragmented.
Figure 2 Free Space Fragmentation
Taken together, we refer to the two types of fragmentation as disk fragmentation. It is important to note
that, when talking about fragmentation, we are talking about the file as a container for data and not about
the contents (data) of the file itself. People sometimes use the word fragmentation to describe the
condition of a file which has its records (contents) scattered about within the file, separated by numerous
small gaps. This type of fragmentation may be a problem with the application which maintains the file; it
is not inherent in the operating system or disk file structure.
In Figure 3, three records are arranged first contiguously and then again with the records separated by
empty record space. In the first arrangement, the record space is not fragmented; it is consolidated into
one large area. In the second arrangement, the record space is fragmented.
Figure 3 Record Space Fragmentation
Since record space fragmentation is the concern of applications and not of the operating system or file
system, this book does not address the subject any further. The various solutions to fragmentation are
collectively referred to as defragmentation, meaning something that gets rid of the fragmentation
problem.
This book will tell you all about fragmentation and defragmentation, and do it in a way that is clear and
understandable. The material will be presented with care taken to define technical terms. Lots of
diagrams and pictures will be used and the material will be presented on a gradient, bit by bit from the
simplest material to the more complex, so you do not have to be an expert in later areas to understand
earlier areas. For your part, take care to look up the definitions of any words you do not understand or
are uncertain of, even non-technical words. If you are unfamiliar with the physical objects discussed,
arrange to look at or touch them. For intangibles, of which there are many in this subject, try drawing
pictures of them to develop a more solid concept.
Finally, if you are having trouble in an area, go back to where you were last doing well, and check to see
if there were any words you did not fully understand. If so, look up the definition in a good (not
complicated) dictionary. If you do this thoroughly, you will find the material much easier to read.
The Approach
Here is the approach that will be used to present the story of fragmentation: After explaining how a disk
works, and taking a quick look at the OpenVMS file system, fragmentation will be explained in detail.
Then you will learn how to tell whether your system suffers from fragmentation, and if so, how badly. We
will then look at the effects of fragmentation on the VAX or Alpha AXP system - what's wrong with
fragmentation?
After that, we'll see what you can do about it, and how you can get your computer to take care of
fragmentation by itself. This will include a view to the future, anticipating the extinction of fragmentation
for OpenVMS computer systems. The final chapter is devoted to the ultimate solution to the
fragmentation problem. Along the way, we will view how this problem came about, why it wasn't detected
and prevented early on, Digital's viewpoint, some of the controversy over defragmentation, and some of
my personal opinions on the whole subject.
Appendices are included with fill-in-the-blanks solutions for organizational problems that get in the way of
solving the fragmentation problem: how to determine the cost of fragmentation, how to justify the cost of
a solution and how to get your company to agree to spend the money you need to handle it.
The single most important thing to keep in mind is that a computer is used to get work done, so
fragmentation is not a problem unless it interferes with doing useful work. Similarly, any solution must be
one which allows us to get more useful work done with our computer system. Actually, fragmentation
does interfere with our use of the computer to do work. It slows things down - little by little, more and
more. If left unhandled for a year or more, it can bring any computer running the OpenVMS operating
system close to a complete standstill. I have personally experienced a VAX with a system disk so badly
fragmented it took half an hour just to log in to a user account! After defragmenting, it took only a few
seconds to log in - good as new. Now let's start at the beginning and unravel the whole story of
fragmentation.
CHAPTER 1
HOW A DISK WORKS
You can skip this chapter if you already know all about disks. However, you need to understand disks
well to really understand fragmentation. The rest of this book depends heavily upon the terms and
concepts presented in this chapter. Rather than skipping this chapter altogether, I recommend that the
experienced System Manager skim it or read it quickly, if only to ensure that we agree on all the
essential terminology.
There are many different kinds of disks. We can omit some of them from discussion. Read-Only disks,
for example, such as CD-ROM, cannot become fragmented after manufacture and, if they are already
fragmented when created, there is nothing you can do about it. They are unchangeable. RAM disks and
semiconductor disks are really memory, not disks, though they appear as disks to the computer system.
While it is theoretically possible for such disks to become fragmented, little additional overhead is caused
by their fragmentation, since there are no moving parts to cause delays finding data in "distant" parts of
the disk. Floppy disks are usually too small to suffer from fragmentation and they are not ordinarily used
for on-line storage. They are typically used once to store a small amount of data and then stored in a box
on a shelf somewhere. Such usage does not suffer from the effects of fragmentation.
The disks with which we will concern ourselves are hard (not floppy) disks used for storing frequently accessed information in a computer. In an OpenVMS system, the disk is an essential, central part of the
computer itself. Without a hard disk, there could be no Virtual Memory System, which is the VAX
operating system itself.
Throughout this book, when the word disk is used, the above is the type of disk being referred to. Now
let's take a look at the basic parts of a disk and the terminology that describes them:
The Physical Parts Of A Disk
A disk looks like this, conceptually:
Figure 1-1 A Disk
Magnetic Surface
Any disk has, as a critical part, a surface on which to record data. That surface is usually magnetic,
meaning that it is capable of storing a small amount of magnetism. Perhaps the most remarkable aspect
of disk technology is the number of distinct amounts of magnetism that can be stored on a single surface
of a disk. At the time of this writing, commercially available disks selling for a few hundred dollars are
capable of storing hundreds of millions of separate, distinguishable bits of information on each square
inch of surface. These numbers have been multiplying so fast for so long that I dare not even speculate
on the future of disk storage capacities.
The surface of a disk is shaped like a platter, similar to a phonograph record.
Figure 1-2 Disk Surface
Bits
Each single magnetic entity is used by the computer as a binary symbol. Binary means "having two and
only two possible states" such as on or off, true or false, and so on. Each such entity is called a bit,
which is short for binary digit. Binary digits are represented in written communication as zeros and ones.
The use of numbers to represent bits, however, is only a convenience and does not mean that bits have
anything to do with mathematics. They do not. Bits are in the realm of logic, which is a matter of
reasoning and not arithmetic calculation.
Figure 1-3 Bit
Byte
When eight bits are considered together, they are referred to as a byte. A single eight-bit byte is the
amount of computer storage typically used to store a single letter of the alphabet or other symbol of
human communication. The word animal could thus be stored in six bytes of computer storage.
The byte is so common to computers that disks are referred to in terms of how many millions of bytes
can be stored on them. Disks that can hold hundreds or even thousands of millions of bytes are
commonplace today.
Figure 1-4 Byte
Block, Sector
The surface of a disk is divided into sections. This sectioning is not a physical marking on the surface,
but rather it is just an idea that the disk is so divided. These sections are called sectors or blocks. The
term sector is more common to personal computers and VAX or Alpha AXP hardware, while block is
common OpenVMS terminology. In OpenVMS, a block is a collection of 512 bytes. OpenVMS disks are
formatted into blocks of 512 bytes each. With a disk so formatted, it is common to talk about blocks of
disk capacity. A 456 megabyte disk, for example, could also be said to have a capacity of about 890,000
blocks (890,000 x 512 = about 456 MB).
Figure 1-5 Block
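To make the block arithmetic concrete, here is a minimal Python sketch of the conversion between megabytes and 512-byte blocks. The function names are arbitrary, and the figures are approximate because "megabyte" is treated loosely here as one million bytes.

    BYTES_PER_BLOCK = 512  # OpenVMS formats disks into 512-byte blocks

    def blocks_for_megabytes(megabytes, bytes_per_mb=1_000_000):
        """Approximate number of 512-byte blocks on a disk of the given size."""
        return round(megabytes * bytes_per_mb / BYTES_PER_BLOCK)

    def megabytes_for_blocks(blocks, bytes_per_mb=1_000_000):
        """Approximate capacity in megabytes for a given number of blocks."""
        return blocks * BYTES_PER_BLOCK / bytes_per_mb

    # A 456 MB disk works out to roughly 890,000 blocks, as in the text.
    print(blocks_for_megabytes(456))       # about 890,625 blocks
    print(megabytes_for_blocks(890_000))   # about 455.7 MB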
Cluster
With larger disk capacities, it is inefficient for the computer system to deal with millions of individual
blocks one at a time. The operating system's map of the disk's blocks is too big to be useful unless single
bits in the map can represent more than one disk block. Accordingly, disk blocks are grouped into
clusters, which are groups of blocks read and written as a unit. In other words, a cluster is the minimum
allocation quantity for a disk. The cluster size, in terms of number of blocks per cluster, can be varied by
reinitializing the disk.
Figure 1-6 Cluster
Tracks
The blocks and clusters of storage space are arranged in groups referred to as tracks. A single track is
one strip of disk space beginning at one point on the surface and continuing around in a circle ending at
the same point. The tracks are concentric rings, not a spiral like the grooves on a phonograph record.
Each surface has many tracks.
Figure 1-7 Tracks
Platters
A disk may consist of one or more platters, each of which may be recorded on both sides. The platter
spins like a phonograph record on a turntable.
Figure 1-8 Platters
Cylinder
The tracks at the same radius on each platter, taken together, are referred to as a cylinder. If you
visualized these tracks without any other part of the disk, they would form the shape of a hollow cylinder.
Figure 1-9 Cylinder
Head
To detect magnetic information on the recording surface, the disk has one or more heads. A head is a
tiny magnetic device capable of reading or writing magnetic bits on the disk surface. The platter spins
near the head(s), so that a single track of recorded information is continuously passing under the head,
available for reading or writing. The head never touches the surface. Rather, it floats on a cushion of air
so thin that a human hair or even a particle of cigarette smoke cannot pass between the head and the
surface. As foreign particles that small would cause the disk to fail, such disks are sealed in air-tight
containers.
While some disks have had one head hovering over each track, it is far more common to have movable
heads capable of moving from track to track as needed. The movement of a head from one track to
another is called a seek. This term will be important when we are talking about disk speeds. The time it
takes for a head to seek is one of the most critical factors in determining the speed of a disk.
Figure 1-10 Head
Arms
Disk heads are mounted on arms that hold the heads close to the platter surface at precisely the right
point to read or write data. There may be one arm for each head, but on multiple-platter disks a single
arm may support two heads - one for the platter above the arm and one for the platter below. Some disks
mount all the heads on a group of arms that move in unison. Imagine your spread fingers moving
between the fanned pages of a book and you will get the idea of multiple disk arms moving together in
and out of disk platters.
Figure 1-11 Arms
Spindle
A disk platter is attached to a spindle around which it rotates like a wheel on the axle of a car. The
spindle is at the exact center of the platter. The arm moves the head from the outer edge of the platter
toward the spindle at the center and back out again. Though most disks have only one spindle, some
complex disks are made up of two or more single-spindle disks treated as one large disk. These are
called multi-spindle disks. However, no platter ever has more than one spindle.
Figure 1-12 Spindle
Electronics
Electronic circuitry is required to sense and record the magnetism on the surface of the platters and to
move the heads. This circuitry is commonly referred to as the electronics of the disk. The electronics
communicate data between the physical disk and the computer.
Figure 1-13 Electronics
Drive
The combination of one or more spindles, arms, heads, platters and electronics into a single physical
device for storing and retrieving data is known as a disk drive. In this book, the term drive will be used
often to refer to the disk drive. Viewed outside its cabinet, a disk drive is a sealed metal object that looks
something like this:
Figure 1-14 Disk Drive
Cable
The electronics in the disk drive are connected to circuitry in the computer by means of cables, which are
no more than wires with a certain type of connector on each end. Often, the individual wires are color-coded for clarity.
Figure 1-15 Cable
Controller
While the cables attach directly to the electronics of the disk drive on one end, they do not really attach
directly to the computer on the other end. On the computer end, the cables attach to a controller, which
is sometimes referred to as an interface. The controller, which is attached to the computer, decodes
instructions from the computer and issues instructions to the disk drive to do what the computer has
instructed. The controller also receives data and status information from the disk drive, which it passes
on to the computer in a form the computer can understand. A single controller may service more than
one disk drive.
Intelligent Disk Controller Functions
Disk controllers range in complexity from a very simple controller that merely relays instructions and
data, to an intelligent controller that uses its information about the status of the disk to help the computer
process data faster. Two examples of intelligent disk controller functions are seek ordering and data
caching.
Seek Ordering
By keeping track of the exact position of the heads at all times, the controller can determine which one of
multiple requests from the computer can be serviced in the shortest time. Then, instead of servicing the
computer's requests in the order received, the controller can service first the requests for data nearest
the heads and then the requests for data farther away. This is called seek ordering, which simply means
putting the seeks in a better order.
For an over-simplified example, let's say the head is hovering over track 7. The computer is waiting for
data from track 2 and from track 5. To go from track 7 to track 2 to service the first request and then back
to track 5 for the second would require more time than it would to just stop off at track 5 on the way to
track 2. So, the intelligent controller reorders the two requests and services the track 5 request first, then
the track 2 request. The result is faster access to disk data on average.
Figure 1-16 Seek Ordering
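As a rough illustration of the idea, the following Python sketch picks, at each step, the pending request closest to the current head position (a nearest-request-first simplification; real controllers may use more sophisticated orderings). The function name and the list representation are inventions for illustration only.

    def seek_ordered(current_track, pending_tracks):
        """Service pending requests in order of distance from the current head
        position, re-evaluating after each seek (nearest-request-first)."""
        remaining = list(pending_tracks)
        order = []
        while remaining:
            nearest = min(remaining, key=lambda track: abs(track - current_track))
            remaining.remove(nearest)
            order.append(nearest)
            current_track = nearest
        return order

    # Head at track 7, requests waiting for tracks 2 and 5:
    print(seek_ordered(7, [2, 5]))  # [5, 2] -- stop off at track 5 on the way to 2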
Data Caching
Dramatic performance improvements can be gained by placing a significant amount of memory inside
the disk controller. This local memory is called a cache and is used to store data recently retrieved from
the disk by the computer. Then, if the computer should happen to request exactly the same data again,
the controller can service the request from the local cache at memory speed (microseconds) instead of
at disk speed (milliseconds). Of course, these dramatic gains are only available when the same disk
block is retrieved more than once and when that block has been saved by the controller in its local
cache. The amount of performance gain from such a system is wildly variable, being largely application-dependent.
Figure 1-17 Data Caching
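Here is a minimal sketch of the caching idea, assuming a simple least-recently-used replacement policy. The class name, the capacity parameter and the read_from_platter callback are all illustrative inventions, not a description of any particular controller.

    from collections import OrderedDict

    class ControllerCache:
        """Tiny LRU cache of recently read disk blocks (illustrative only)."""
        def __init__(self, capacity, read_from_platter):
            self.capacity = capacity
            self.read_from_platter = read_from_platter  # slow path: actual disk read
            self.blocks = OrderedDict()                 # block number -> data

        def read(self, block_number):
            if block_number in self.blocks:              # cache hit: memory speed
                self.blocks.move_to_end(block_number)
                return self.blocks[block_number]
            data = self.read_from_platter(block_number)  # cache miss: disk speed
            self.blocks[block_number] = data
            if len(self.blocks) > self.capacity:         # evict least recently used
                self.blocks.popitem(last=False)
            return data

    # cache = ControllerCache(capacity=64, read_from_platter=lambda block: b"...")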
Disk-Related Software
Driver
Just as there are many different types of disks, there are also many different types of disk controllers.
Different computer instructions are needed to deal with each different type of controller. The set of
instructions used to manipulate a controller is a software component known as a driver. The driver
resides at the lowest levels of a computer's operating system, where it can interact directly with the
hardware. The driver translates the instruction codes of the disk controller into standardized instructions
recognizable and usable by the more generalized parts of the computer system, and vice versa.
Conversely, the driver enables an application program to issue a generic "get data" instruction, for
example, to the disk without having to concern itself with the peculiarities of that particular disk (the
number of tracks, platters, and so on). A single disk driver may service more than one disk controller.
users
applications software
system services (software)
drivers
hardware
Example 1-1 Drivers in the OpenVMS Hierarchy
Input/Output (I/O) Request Queue
A driver has associated with it a queue for holding input/output (I/O) requests. This queue is merely a
data structure enabling the computer to store an I/O request while it carries on with its work without
having to wait for the I/O processing to complete. Entries can be added to the queue or deleted from it
and, under certain conditions, the entries in the queue can be rearranged.
QIO
The OpenVMS operating system contains a mechanism for queuing (inserting) an I/O request to the
queue of a driver. This mechanism is called the $QIO system service. The dollar sign indicates that this
abbreviation is Digital's. QIO stands for "Queue Input Output," where queue is used as a verb. An
application program or a higher level of the operating system uses the $QIO system service to cause I/O
to occur.
Application
An application is a computer program which controls the computer system to perform some useful work
for the user.
Disk File Structures
OpenVMS deals with disk blocks from two different points of view.
Physical Block
The actual arrangement of information on the surface of a disk platter is referred to as a physical block.
The Physical Block Number (PBN) is an address used for identifying a particular block on the surface of
the disk.
Figure 1-18 Physical Block Numbers
Logical Block
When the blocks on a disk are considered from a programming point of view, they are viewed as logical
blocks. The address of a logical block on a disk is its Logical Block Number (LBN). LBN 0 (zero) is the
first LBN on a disk. Logical blocks correspond one-for-one to physical blocks, but a given logical block
number does not necessarily map to the physical block bearing the same number.
Figure 1-19 Logical Block Numbers
OpenVMS drivers, controllers and some disk electronics are capable of detecting a physical block that is
unreliable for storage of data and replacing it with a spare block from the same disk. When this occurs,
the logical block number does not change, even though the physical block number is now different. In
this book, when discussing blocks or block numbers, we will be referring to logical blocks unless
otherwise specified.
Volume
In OpenVMS, the word volume refers to a structured (formatted) disk. When considering a disk as a
logical (conceptual) unit of storage, rather than a physical unit of storage, it is referred to as a volume.
OpenVMS has the capability of treating one or more physical disk drives as one disk. This capability is
implemented by the use of software and does not involve any additional electronics or cabling. When two
or more disks are so combined, the combination is referred to as a volume set, which is described more
fully later.
Storage Bitmap
For each disk, OpenVMS maintains a map indicating which clusters of logical blocks are in use and
which are free. Each cluster consists of one or more logical blocks. Each bit in the map represents one
cluster. Therefore, a cluster is the minimum amount of disk space that can be allocated to anything. The
map is called the Storage Bitmap. When OpenVMS allocates space on the disk for a file, it can scan the
storage bitmap to find available clusters.
Figure 1-20 Storage Bitmap Relationship To Clusters
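A small Python sketch can illustrate the relationship between bits and clusters: one bit per cluster, and one cluster covering a fixed number of logical blocks. The representation below (a simple list of booleans, with True meaning "free") is chosen for illustration and is not the actual on-disk format.

    def free_clusters(bitmap):
        """Return the indices of free clusters; True means 'free' in this sketch."""
        return [index for index, is_free in enumerate(bitmap) if is_free]

    def cluster_to_lbns(cluster_index, cluster_size):
        """Logical block numbers covered by one cluster."""
        first_lbn = cluster_index * cluster_size
        return range(first_lbn, first_lbn + cluster_size)

    bitmap = [False, False, True, True, False, True]   # one bit per cluster
    print(free_clusters(bitmap))                        # [2, 3, 5]
    print(list(cluster_to_lbns(3, cluster_size=4)))     # LBNs 12 through 15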
Extents
When OpenVMS allocates space on a disk for a file, it is not always possible to allocate all the needed
space contiguously on the disk. Sometimes it is necessary to allocate part of a file in one place and the
remainder in another. Files have been known to be allocated in dozens and even hundreds of pieces
scattered around the disk. Each piece of a file so allocated is called an extent. The concept of an extent
is inherent in a study of file fragmentation, as the allocation of multiple extents for a single file is file
fragmentation. A contiguous file has only one extent.
Figure 1-21 Extents
Extent Cache
The extent cache is a portion of the system's memory that is set aside solely for the use of the OpenVMS
file allocation mechanism. The extent cache stores the addresses of deallocated clusters, making it fast
for OpenVMS to find free disk space by reusing these same clusters. This saves the overhead of
scanning the Storage Bitmap of a disk to find free space. The extent cache, however, cannot store the
logical block numbers of all deleted clusters. Ordinarily, there is room for only 64 LBNs to be stored,
though this number can be changed by the System Manager.
Figure 1-22 Extent Cache
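The following sketch models the extent cache idea described above: a small, bounded list of recently freed clusters that is consulted before falling back to a full storage bitmap scan. The class and method names are invented for illustration and do not reflect OpenVMS internals.

    class ExtentCacheSketch:
        """Illustrative model: remember up to `limit` freed clusters for quick reuse."""
        def __init__(self, limit=64):
            self.limit = limit
            self.freed_clusters = []

        def note_deallocation(self, cluster):
            if len(self.freed_clusters) < self.limit:   # the cache is bounded;
                self.freed_clusters.append(cluster)     # overflow is simply not remembered

        def allocate(self, scan_bitmap):
            """Reuse a cached cluster if possible; otherwise scan the bitmap."""
            if self.freed_clusters:
                return self.freed_clusters.pop()
            return scan_bitmap()                        # slower: full bitmap scan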
File
Information stored on a disk is ordinarily stored in a file. In fact, for any OpenVMS disk using the ODS-2
structure, no information can be retrieved from a disk unless it is contained in a file. A file is "a collection
of related records treated as a unit and usually referenced by a . . . name." While a record is a unit of
data within a file, an extent (see above) is a unit of the file as a container for data.
Figure 1-23 File With One Extent
Index File
The Digital Dictionary defines index file as "The file on a . . . volume that contains the access information
for all files on the volume and enables the operating system to identify and access the volume." The
index file is a catalog of all the files on a particular disk. In fact, the header (identifying information) of a
file resides within the index file. All the information needed to access a file is contained here.
Figure 1-24 Index File Pointing To Extents
File Header
"A block in the index file that describes a file on a . . . disk. Every file residing on the disk has at least one
header, which provides the location of the file's extents." While the header actually contains much more
information than this, this is all we need to know about it for the purposes of this book.
Retrieval Pointers
Within the file header, the information critical to our discussion of file fragmentation is the section headed
Retrieval Pointers. These pointers indicate where the file's data is located on the disk. Each pointer
consists of the LBN of the first data block and a count of how many successive contiguous blocks
contain data for that file. For example, this file has a single retrieval pointer:
Map area
    Retrieval pointers
        Count: 12          LBN: 27990
Example 1-2 File With One Retrieval Pointer
The first data block of this file is at Logical Block Number 27990. The entire file is contained in that block
plus the 11 blocks following it in LBN sequence. This 12-block file is contiguous (not fragmented). It has
only one extent.
In the following example, the file is broken into four fragments. The first consists of 6 blocks starting at
LBN 5; the second is 3 blocks at LBN 297; the third, 3 blocks at LBN 200460; and the fourth, 4104 blocks
at LBN 200760. This file is fragmented. It has four extents.
Map area
    Retrieval pointers
        Count: 6           LBN: 5
        Count: 3           LBN: 297
        Count: 3           LBN: 200460
        Count: 4104        LBN: 200760
Example 1-3 File With Four Retrieval Pointers
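To tie the two examples together, here is a small Python sketch that treats retrieval pointers as (count, LBN) pairs and reports a file's size in blocks, its number of extents, and whether it is contiguous. The representation is mine, chosen purely for illustration.

    def describe_file(retrieval_pointers):
        """retrieval_pointers: list of (count, starting_lbn) pairs from a file header."""
        total_blocks = sum(count for count, _ in retrieval_pointers)
        extents = len(retrieval_pointers)
        contiguous = extents == 1
        return total_blocks, extents, contiguous

    # Example 1-2: one pointer, 12 blocks starting at LBN 27990 -- contiguous.
    print(describe_file([(12, 27990)]))                                   # (12, 1, True)

    # Example 1-3: four pointers -- a fragmented file with four extents.
    print(describe_file([(6, 5), (3, 297), (3, 200460), (4104, 200760)])) # (4116, 4, False)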
Directory
A directory is "a file that briefly catalogs a set of files stored on a disk. . ." From the user's point of view, a
directory is a catalog of the names of files that are grouped in a particular way.
Figure 1-25 Directory
Disk Performance Terminology
Seek Time
As mentioned earlier, the movement of a disk head from one track to another is called a seek. The time it
takes for a head to seek is the most critical factor in determining the speed of a disk. This is known as
the disk's seek time. It consists of three parts: the time to start the head in motion and get it up to speed,
the time to move the head from one track to another, and the time it takes to stop the head. Additional
overhead is required to identify the correct track and, if necessary, reposition the head.
Figure 1-26 Seek Time
Rotational Latency
As a disk platter spins around the spindle, the blocks in a single track of recorded data are brought near
a disk head. The head can only read or write a block when that block is immediately under the head.
Accordingly, the time to access a block of data on the disk varies. It is much quicker to access a block
that is currently or about to be under the head than it is to access a block that has recently passed under
the head and is moving away. The block that has just passed under the head has to wait nearly a full
rotation of the disk for another access opportunity. This delay is known as rotational latency.
A common disk rotational speed is 3600 RPM, meaning that the disk spins completely around 3600
times each minute. At this speed, each revolution of the disk takes 16.67 milliseconds. Naturally,
rotational latency on this disk varies from zero to 16.67 milliseconds, and the average rotational latency
is half the maximum, or 8.33 milliseconds.
Figure 1-27 Rotational Latency
Average Access Time
To compute the time it takes to access any block on a disk, it is necessary to combine the seek time and
the rotational latency. Clearly, if the head is over the proper track and the desired block is immediately
under the head, the access time is approximately zero. If the head is over the proper track but the desired
block has not yet come around, the access time is simply the rotational latency. If the head has to move
from one track to another, however, the rotational latency must be added to the seek time. Averaged over
many accesses, this combined figure is the average access time, which for modern-day disks falls roughly
in the range of 8 to 25 milliseconds.
Figure 1-28 Seek Time + Rotational Latency
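The arithmetic above reduces to a couple of lines. The sketch below uses the simplifying assumption that access time is just an average seek figure plus average rotational latency; the function names are illustrative.

    def average_rotational_latency_ms(rpm):
        """Half of one revolution, in milliseconds (e.g. 3600 RPM -> about 8.33 ms)."""
        revolution_ms = 60_000 / rpm
        return revolution_ms / 2

    def average_access_time_ms(average_seek_ms, rpm):
        """Simplified model: average seek plus average rotational latency."""
        return average_seek_ms + average_rotational_latency_ms(rpm)

    print(round(average_rotational_latency_ms(3600), 2))   # 8.33
    print(round(average_access_time_ms(16, 3600), 2))      # about 24.33 ms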
Disk Sets
Several methods of combining disks into sets are in current use. One reason for combining disks into
sets is to improve performance by reducing the average access time when figured for the set as a whole.
Volume Sets
A volume set is a group of two or more disks combined so as to be treated by OpenVMS as a single disk
equal in capacity to the total capacities of all the disks in the set. In a volume set, a file can span two or
more disks in the set. The primary benefit of a volume set over separate disks is size - the volume set
can accommodate a file larger than any individual disk in the set. The volume set also reduces average
access time by spreading the disk accesses over two (or more) disks instead of one. Theoretically, half
the disk accesses will occur on one disk in a two-volume set, while the other half occur on the other disk.
If the theory were borne out in practice, this would allow twice as many disk operations in the same time,
but actual results are not greatly improved.
Figure 1-29 Volume Set
Shadow Sets
A shadow set is a group of two (or more) identical disks combined so as to be treated by OpenVMS as a
single disk equal in capacity to only one of the disks in the group. Each time data is written to the shadow
set, the same data is written to all the disks in the set. That way, the data can be retrieved even if all but
one of the disks fails. The primary benefit of a shadow set over separate disks is safety. The likelihood of
losing the data on two or more disks at the same time is much more remote than that of losing the data
on just one disk.
A shadow set also alters performance by increasing the time needed to write data (as it has to be written
to all disks in the set) and by reducing the time it takes to read data. The two or more disks in the set are
unlikely to have exactly the same block under their heads at exactly the same time, so OpenVMS will
direct the read to the one which can retrieve the data faster, either because the data is closer to its head
or because there is less work outstanding for that drive to do. Therefore, on average, the read time for
the set is faster than that for either disk.
Figure 1-30 Shadow Set
Stripesets
Two or more disks can also be combined into a stripeset. A stripeset is similar to a volume set in that the
full capacity of all the disks in the set is available for data storage, and the whole set is treated by
OpenVMS as a single large disk. Also, the virtual disk (the disk which is treated by OpenVMS as a single
large disk) is divided into "chunks" rather than clusters.
The main difference is that, while each file extent must reside entirely on a single disk in a volume set, in
a stripeset file extents are deliberately spread across multiple disks. One chunk (which could contain one
or more extents) resides on one disk, the next chunk in sequence resides on the next disk, the next
chunk on the next disk, and so on, starting over at the first disk when the last disk in the set is reached.
The primary benefit of a stripeset is application performance. Data read from a single file will come from
several disks simultaneously, greatly reducing the application wait time. For example, if each disk has a
24 millisecond (ms) access time, the average time to collect a chunk from two disks simultaneously is the
same - 24 ms. But that is 24 ms for two chunks instead of one.
Figure 1-31 Stripeset
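Here is a minimal sketch of the round-robin placement just described, assuming chunks are numbered from zero and distributed in strict rotation. The function is hypothetical and ignores the details of how a real striping driver addresses its member disks.

    def chunk_location(virtual_chunk_number, number_of_disks):
        """Map a chunk of the virtual disk to (disk index, chunk index on that disk)."""
        disk = virtual_chunk_number % number_of_disks         # rotate through the set
        local_chunk = virtual_chunk_number // number_of_disks
        return disk, local_chunk

    # With a three-disk stripeset, consecutive chunks land on consecutive disks:
    for chunk in range(6):
        print(chunk, chunk_location(chunk, number_of_disks=3))
    # 0 -> (0, 0), 1 -> (1, 0), 2 -> (2, 0), 3 -> (0, 1), 4 -> (1, 1), 5 -> (2, 1)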
Disk Architecture Evolution
Perhaps the most interesting question that comes up with respect to fragmentation is, "Why is the
OpenVMS operating system designed so badly as to suffer from fragmentation?" The answer to this
question is that age-old saw, "It's a feature, not a bug!" It's true. Originally, the VAX/VMS operating
system, as it was called then, was deliberately designed to allow fragmentation and to deal with it
efficiently. This design was actually the solution to an older and more serious problem.
The Original Problem
You see, when computers first appeared on the scene, they had no disks at all. Obviously, disk
fragmentation was not a problem. No one had even had the idea yet, so the operating systems of that
age had no mechanism to deal with fragmentation at all. If the computer even had an operating system.
This ancient era extended into the late 1960's, before the VAX and before even the PDP-11, on which
the VAX was based. In the Digital world, these were the days of the PDP-8.
Then, disks happened. The first disks were small - unbelievably small by today's standards. They were
measured in thousands of bytes rather than millions, never mind the billion-byte disks now available.
Nevertheless, this was a fabulous advance in the days when memory sold for a dollar a byte. (At this
writing, memory sells for $40 a megabyte retail - 25,000 times less!)
Later, the early PDP-11 operating system, called RT-11 (Real Time-11), was capable of storing data on
disks in files, and the files were organized in a formal file structure. This file structure, however, required
that all files be contiguous. That is, no file could be split into two or more pieces. It was not a question of
performance; the system simply had no capability to create or access a file split into pieces. A file had a
single location on a disk and that was that.
This requirement for contiguous files meant that a newly-created file had to fit within a single gap on the
disk or not at all. It was not possible to allocate a part of the file to one gap and the rest to another, as
can be done with today's OpenVMS. This was true even when there were no individual gaps large
enough to accommodate the file, in spite of the fact that the total free space on the disk far exceeded the
size of the new file. There had to be a large enough contiguous free space, period.
Naturally, this problem (of not enough contiguous free space to create a file) occurred every time a disk
filled up, and small disks fill up very fast. With frequent file deletions, it was not unusual to have a disk
reach the point where no more files could be created even though the disk was little over half full.
The SQUEEZE Solution
The solution to this problem was the SQUEEZE command. SQUEEZE compacted the disk, rearranging
the files so they were all together near the beginning (LBN 0) of the disk, leaving all the free space in one
large, contiguous area at the end. After a SQUEEZE, any file could be created, provided it would fit in the
total free space remaining.
Figure 1-32 SQUEEZE
Nothing else could be done on that disk while the SQUEEZE was in progress, but this was not a big
drawback, as RT-11 was a single-user system anyway. The only person inconvenienced was the one
who wanted that new file created, so the benefit always outweighed the inconvenience.
The Fragmentation Solution
Then a wondrous new operating system came along, one that allowed multiple simultaneous users of the
same PDP-11 computer - RSX-11. Now the inconvenience of a SQUEEZE would certainly outweigh the
benefit, as all the users would have to stop working, not just the one who wanted to create the new file.
Clearly, SQUEEZE was no longer a viable solution.
The designers of RSX-11 cleverly created a file structure that included the revolutionary capability to
locate parts of a file in different places on the disk. Each file had a header that gave the location and size
of each piece of the file, so the file could be in pieces scattered around the disk. Now a file could be
created anytime there was sufficient free space anywhere on the disk; the space did not have to be
contiguous.
Nor was there any drawback to this mechanism whatsoever. It really was "a feature, not a bug."
Performance losses due to fragmentation, even when taken to extremes, caused very little difficulty for
anyone. You must realize that at this time, in the early 1970's, disks were very small. Let's take a look at
a real example:
The RK05 disk, which was in common use at the time, held 2½ megabytes (5,000 blocks). Suppose the
disk was totally and utterly fragmented, that is, no two consecutive data blocks were contiguous. Every
single disk access likely required moving the head and waiting for rotational latency. Even so, the whole
disk could be read in 250 seconds (50 ms times 5,000 blocks). That's a little over four minutes - worst
case. The same action on today's 700MB disk, even with a 16 ms access time, takes over 6 hours.
During the same period of time, disk capacities have increased to over 500 times that of the RK05, and
CPU speeds have increased to over 400 times that of the original PDP-11. Even though disk speeds
have increased by a factor of three, they have not kept pace with the rest of computer technology. This
makes the speed of the disk a major bottleneck in the computer system. This point is critical to an
understanding of the fragmentation problem.
Clearly, today's larger disks brought with them more than higher capacities and speeds. They brought a
susceptibility to a new kind of computer disease - fragmentation. Here is a table showing the time
required to access every block on a disk:
                        AVERAGE ACCESS TIME IN MILLISECONDS
 SIZE
 IN MB      70         50         40         33         25         16         10          8
 -------------------------------------------------------------------------------------------
 0.032    4.48 s     3.20 s     2.56 s     2.11 s     1.60 s     1.02 s     0.64 s     0.51 s
 0.5     70.00 s    50.00 s    40.00 s    33.00 s    25.00 s    16.00 s    10.00 s     8.00 s
 1        2.33 m   100.00 s    80.00 s    66.00 s    50.00 s    32.00 s    20.00 s    16.00 s
 2        4.67 m     3.33 m     2.67 m     2.20 m   100.00 s    64.00 s    40.00 s    32.00 s
 5       11.67 m     8.33 m     6.67 m     5.50 m     4.17 m     2.67 m   100.00 s    80.00 s
 10      23.33 m    16.67 m    13.33 m    11.00 m     8.33 m     5.33 m     3.33 m     2.67 m
 20      46.67 m    33.33 m    26.67 m    22.00 m    16.67 m    10.67 m     6.67 m     5.33 m
 40      93.33 m    66.67 m    53.33 m    44.00 m    33.33 m    21.33 m    13.33 m    10.67 m
 100      3.89 h     2.78 h     2.22 h   110.00 m    83.33 m    53.33 m    33.33 m    26.67 m
 206      8.01 h     5.72 h     4.58 h     3.78 h     2.86 h   109.87 m    68.67 m    54.93 m
 456     17.73 h    12.67 h    10.13 h     8.36 h     6.33 h     4.05 h     2.53 h     2.03 h
 700     27.22 h    19.44 h    15.56 h    12.83 h     9.72 h     6.22 h     3.89 h     3.11 h
 1200    46.67 h    33.33 h    26.67 h    22.00 h    16.67 h    10.67 h     6.67 h     5.33 h

 Legend: s = seconds, m = minutes, h = hours

Table 1-1 Time Required To Access Every Block On A Disk
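The table appears to follow a simple model: roughly 2,000 blocks per megabyte, with every block costing one full average access. The Python sketch below reproduces representative entries under that assumption; the 2,000-blocks-per-megabyte figure is inferred from the table itself, not stated by it.

    BLOCKS_PER_MB = 2000  # assumption that matches the table's figures

    def time_to_read_every_block(size_mb, access_ms):
        """Return (seconds, minutes, hours) to access every block, one access each."""
        total_ms = size_mb * BLOCKS_PER_MB * access_ms
        seconds = total_ms / 1000
        return seconds, seconds / 60, seconds / 3600

    print(time_to_read_every_block(2.5, 50)[0])   # RK05 example: 250 seconds
    print(time_to_read_every_block(700, 16)[2])   # about 6.2 hours
    print(time_to_read_every_block(456, 70)[2])   # about 17.7 hours, as in the table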
So the ability to deal with fragmented files, which was carried over from the RSX-11 operating system to
the OpenVMS operating system, was a solution to an earlier problem that failed to anticipate the
enormous capacities to which disks would grow. There is no end to this growth in sight. Deliberate
fragmentation is no longer only a feature; it is now a problem, too.
Summary
In this chapter, the inner workings of a disk have been explained, with care taken to define the terms
needed to truly understand fragmentation. To be sure, there is a lot more to know about disks, but our
concern here is not to learn disk design, construction or maintenance. Our purpose is to understand
enough about disks so that fragmentation and its cure make sense. To this end, the first chapter has also
devoted time to the basic concepts of files and file structure. With these concepts and terms well
understood, we are ready to tackle the real problem - the fragmentation disease.
CHAPTER 2
WHAT IS FRAGMENTATION?
File Fragmentation and Free Space Fragmentation
Fragmentation means two things:
1) a condition in which individual files on a disk are not contiguous but are broken up in pieces scattered
around the disk; and
2) a condition in which the free space on a disk consists of little bits of free space here and there rather
than only one or a few free spaces.
Condition 1 is referred to as file fragmentation, while Condition 2 is referred to as disk fragmentation or,
more precisely, free space fragmentation. File fragmentation causes performance problems when
reading files, while free space fragmentation causes performance problems when creating and extending
files. Neither condition has anything to do with the contents of a file. We are concerned only with the
files as containers for data and with the arrangement of these containers on the disk.
Internal Fragmentation
The term fragmentation is sometimes applied to the contents of a file. This type of fragmentation will be
explained here only to differentiate it from our real subjects, file and free space fragmentation.
Files consist of records. Each record is a collection of fields considered as a unit. There are three basic
kinds of files, each affected by file content fragmentation differently:
Sequential: In a sequential file, every record except the first falls immediately after the preceding record.
There are no gaps. An illustration of a sequential file is a music cassette. You cannot get to any selection
without searching through the tape. Accordingly, sequential files are not subject to internal fragmentation.
The situation simply cannot exist.
Random: In a random access or direct access file, every record is the same size. Because of this,
records can be deleted and replaced with new ones easily. An illustration of a direct access file is a
bookshelf full of books which are all the same size. You can go directly to any book desired and withdraw
it from the shelf. You can also replace it anywhere there is a space on the shelf. Fragmentation of the
contents of such a file causes virtually no performance problems, as the file is designed to be accessed
in random order and any new record is guaranteed to fit precisely within any free space in the file.
Indexed: Indexed files, however, do suffer from internal fragmentation. An illustration of an indexed file is
a floor of offices in a building. The directory in the lobby tells you what floor the office is on, but you still
have to search the floor to find the right office. Such files have an index that contains pointers to
organized data records elsewhere in the file. In such a file, variable length data records are stored in
buckets of a certain number of blocks each. If a record will not fit in a bucket (because the bucket is
already full of other records), the bucket is split into two buckets to accommodate all the records. An
indexed file with numerous split buckets is said to be fragmented. This type of fragmentation affects
performance of only those applications accessing the affected file (unless such activity is so intense that
it degrades the performance of the entire system). It is cured by reorganizing the data records within the
file, usually by creating a better-organized copy of the file to supersede the fragmented one. This
reorganization can be done safely only when access to the file has been suspended.
This internal file fragmentation is not the type of fragmentation with which this book is concerned.
Pagefile Fragmentation
Another type of fragmentation which occurs on OpenVMS systems but is beyond the scope of this book
is pagefile fragmentation.
As information is added to the pagefile and deleted from it, the space in the pagefile can become
fragmented, leaving no single space large enough to hold more information. This type of fragmentation
causes severe performance degradation and can even cause the system to become unusable. It is cured
by rebooting the system, and is prevented by increasing the size of the pagefile or adding secondary
pagefile(s) to the system.
Figure 2-1 Pagefile Fragmentation
Note: The type of defragmentation described in this book does nothing to remedy pagefile
fragmentation.
Deliberate and Needful Fragmentation
It sometimes happens that a file is deliberately created in a fragmented state. The best example of this is
a standard OpenVMS file, needed for every OpenVMS disk volume, called INDEXF.SYS. This file
contains the headers for all the files on that volume. It also contains certain information critical to the
system's ability to access data on that disk volume, like the location of the INDEXF.SYS file itself. This
information is so important, it is separated into four pieces and stored in four different places on the disk;
minimizing the risk of losing all four pieces at once and maximizing the ability to recover data from a
damaged disk. As these four copies are part of the INDEXF.SYS file, the file must be fragmented at all
times, but only to the degree described here. The part of the file containing file headers can be made
contiguous and kept so.
Why Fragmentation Occurs
The OpenVMS File Allocation Strategy
When OpenVMS allocates disk space for a file, it looks in the storage bitmap to find what clusters are
available. In so looking, it always begins its scan of the storage bitmap from the beginning (LBN 0) when
the disk has been recently mounted. Thus there is a tendency on the part of OpenVMS to group files
near the logical beginning of a disk, leaving the higher LBNs free. This tendency is modified (for better or
for worse) by the Extent Cache (see Extent Cache section later in this chapter for a more complete
explanation), but it is worth understanding clearly to grasp one of the primary causes of file and free
space fragmentation on an OpenVMS disk.
Starting with a completely empty disk, allocating space by choosing the first available clusters in the
storage bitmap is a reasonable approach. At least it is until some files are deleted. Until file deletions
begin, you would see the storage bitmap bits changing steadily from "free" to "allocated," from beginning
to end, like mercury in a thermometer rising from the bulb to the boiling point. The state of the disk is
clear: every cluster before a certain point is allocated to one file or another, while every cluster after that
same point is free, waiting to be allocated to a new file. Additionally, every file is contiguous - the ideal
state for maximum disk I/O performance under most circumstances.
Figure 2-2 Contiguous Files On A Disk
Once even a single file is deleted, however, the OpenVMS scan-from-the-beginning allocation strategy
begins to trip over itself. When the file is deleted, naturally, its clusters are marked "free" in the storage
bitmap. Our elegant thermometer is now broken, having a gap in the mercury somewhere between the
bulb and the mercury's highest point.
Figure 2-3 Fragmented Files On A Disk
The scan-from-the-beginning allocation strategy is going to find that gap on the next allocation scan and
allocate the space to the new file. This is fine, presenting no performance problem or fragmentation
susceptibility, provided the new file fits entirely within the gap vacated by the deleted file.
Figure 2-4 New File Allocation
But what if it doesn't fit? What if the new file is larger than the one deleted? OpenVMS will allocate the
entire gap (or what is left of it if part has been used already) to the new file and then continue its scan of
the storage bitmap to find more space to allocate. With only a single gap in the storage bitmap, this
continued scan will take us all the way to the end of the allocated portion of the storage bitmap and there
we will find the space to allocate for the remainder of the file. Not so bad. The file has only two extents
(fragments). And OpenVMS, as we have seen, was specifically designed to deal with files broken into
multiple fragments. This two-fragment file is not a serious problem for OpenVMS, causing only a slight
degradation of performance. But what happens when more than a few files are deleted? What happens
when dozens, hundreds or even thousands of files are deleted, as is the typical case for an interactive
time-sharing system like OpenVMS? What happens is that the mercury in our thermometer becomes
shattered into a zillion pieces, with a zillion gaps into which file fragments can be allocated. In fact, even
with a maximally fragmented storage bitmap, in which precisely every other cluster is allocated, with the
intervening clusters free, OpenVMS continues to merrily allocate disk space on a first-come-first-served,
scan-from-the-beginning basis. Space for a 100 block file allocated under these circumstances on a disk
with a one-block cluster size would be allocated in 100 separate pieces, giving you a file requiring 100
separate disk I/O operations to service, where a single I/O operation would serve for the same file
existing in only one piece.
Why? Well, scanning the storage bitmap takes precious time. Ending the scan at the first available
cluster makes for shorter scans and saves time. At least it saves scanning time. But what about the 100
times greater overhead required to access fragmented files?
The Extent Cache
As we have seen in Chapter 1, a decade ago there were good reasons for this now seemingly awful
blunder. Before inspecting its true impact, however, we have to take into consideration the extent cache.
The extent cache is a portion of the system's memory that is set aside for the use of the OpenVMS file
allocation mechanism. The extent cache stores the LBNs of released clusters, making it easy for
OpenVMS to reuse these same clusters without the overhead of a storage bitmap scan.
Figure 2-5 Extent Cache
Some argue that the extent cache completely overcomes the drawbacks of the scan-from-the-beginning
allocation strategy, claiming that the majority of deleted files (the ones whose clusters will be loaded into
the extent cache) tend to reside in the higher LBNs of a disk. While this may be true in a contrived
laboratory environment, it is not the case in a typical production environment. In a production
environment, with lots of users running programs that create and delete files willy-nilly, the deleted files
tend to occur randomly over the entire range of LBNs on a disk.
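If you are curious about what the extent cache on your own system is doing, you can look at the file system caches without any special tools. The exact display varies from one version of OpenVMS to another, but a command along these lines shows the caching statistics, extent cache included:
$ SHOW MEMORY /CACHE /FULL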
Seeing It With Your Own Eyes
The above description of the OpenVMS file allocation strategy may seem beyond belief or exaggerated.
If you lean towards skepticism, here is a way to demonstrate the matter for yourself.
You need a disk that can be initialized. Of course, this means all files on the disk will be lost, so don't go
initializing a disk containing data you need. Use a floppy disk, if you have one on your system, or use a
spare disk. If you have neither, use a data disk only after you have backed it up carefully.
You will need two test files: one very small (1 to 4 blocks) called TEST_SMALL.DAT and one somewhat
larger (about 100 blocks) called TEST_BIG.DAT. It does not matter what is in these files. Pick any two
files you have on hand that are about the right size and copy them, or create a new file using the DCL
CREATE command, a text editor or some other procedure of your choice.
Then create the following command procedure, using your own disk device designation in place of
DUA1:
$! Demonstrate Fragmentation Occurring
$
$ initialize /index=end DUA1: DEMO
$ mount DUA1: DEMO
$ count = 0
$ LOOP:
$ copy /contiguous TEST_SMALL.DAT DUA1:[000000]TEST_SMALL.TMP
$ copy /contiguous TEST_SMALL.DAT DUA1:[000000]
$ count = count + 1
$ if count .LT. 10 then goto LOOP
$ delete /nolog DUA1:[000000]TEST_SMALL.TMP;*
$ copy /log TEST_BIG.DAT DUA1:[000000]
$ dismount DUA1:
$ mount DUA1: DEMO
$ dump /header /blocks=END:0 DUA1:[000000]TEST_BIG.DAT
$ exit
This command procedure initializes the scratch disk so it consists entirely of one big contiguous free
space, less the necessary system files. It then creates ten pairs of small files and deletes every other
one, leaving ten small files separated by ten small gaps. Next, it copies one large file onto the disk. This
large file is invariably broken up by OpenVMS into ten small pieces (occupying the ten small gaps) and
one large piece (the remainder). In other words, the file is created by OpenVMS in a badly fragmented
condition even though there is plenty of free space further along on the disk in which the file could have
been created contiguously.
In the display resulting from the DUMP /HEADER command at the end of the command procedure, file
fragments are represented by the Retrieval Pointers. If there is more than one pointer, the file is
fragmented. In this example, you should see eleven retrieval pointers. In the dump, the counts and
number of map area words are not important for our purposes; it is the number of pointers that you
should pay attention to. As you can see from the dump of the header, the file TEST_BIG.DAT is split into
many fragments even though far more than 100 free blocks remain on the disk in a single contiguous
free space.
When you consider the long-term effects of this allocation strategy on a disk in continuous use, you can
see readily that fragmentation can become extreme.
Multi-Header Files
Fragmentation at its worst comes in the form of the multi-header file. As its name implies, this is a file
with more than one header or, to be more precise, with a header containing so many retrieval pointers
they won't fit into a single one-block header. OpenVMS, therefore, allocates a second (or third or fourth!)
block in the INDEXF.SYS file to accommodate storage of the extra retrieval pointers. Just for the record,
the first block of a file header will hold all the information there is to know about a file, plus approximately
70 retrieval pointers. A full header block, therefore, can accommodate a file fragmented into as many as
70 pieces. This is pretty miserable, as fragmentation goes, but it can get worse - much worse.
A second header block can be allocated to hold approximately another 102 retrieval pointers. This gets
us up to the positively gross level of 172 fragments in a single file. Not wanting to underestimate the
depths to which disk management can fall, the VMS developers provided for even more additional
header blocks - each one holding another 102 pointers or so. I don't want to take this line of discussion
any further, though. Fragmentation to the tune of hundreds of fragments per file borders on outright
sabotage.
The Magnitude Of The Fragmentation Problem Today
How widespread is the fragmentation disease? Pandemic is the word doctors use to describe a disease
when virtually everyone has it. Fragmentation is unquestionably pandemic. It occurs on every computer
running the OpenVMS system, except:
• a diskless system
• a system with pre-formatted, Read-Only disks (like a CD-ROM server)
• a system that is turned off (expensive doorstop)
If you have a computer system that you don't use very often, its fragmentation problem will be slight. But
if you don't use it, who cares?
That leaves us with all the other systems - the vast majority by far. These systems are typically running
24 hours a day, used interactively by the users from somewhere around 8:00 AM to the vicinity of 5:00
PM, with peaks of usage around 10:00 AM and 2:30 PM, and a real dead spot at lunch time. Such
systems typically have sporadic usage in the evening, then slam to 100% utilization at midnight when a
barrage of batch jobs kick off and run for several hours; usage then tapers off to nearly nothing until the
users arrive again in the morning.
Such a system typically has several disk drives dedicated to user applications. These disks get a lot of
use, with hundreds of files being created and deleted every day. Naturally, more are created than are
deleted, so the disk tends to fill up every few months and stay that way (nearly full) until the System
Manager forces users to delete excess files.
Under these circumstances, a disk will fragment badly. You can expect to see a 10% to 20% increase in
fragmentation each week. That is, if you had 10,000 files, all contiguous at the beginning of the week, by
the same time the next week, you could expect those same 10,000 files to consist of 11,000 pieces or
more. A week later, there would be over 12,000 pieces, then 13,000 and so on. After a month, the
fragmentation level would exceed 40% with over 14,000 pieces. In three months, the level multiplies to
over 240%, with over 34,000 pieces. After a year, the problem would theoretically reach astronomical
proportions, with those same 10,000 files fragmented into some 1.4 million pieces. But it doesn't really,
as there aren't enough disk blocks to hold that many pieces (on this "typical" disk) and the performance
degradation is so bad that users aren't able to use the system enough to keep up the fragmentation rate.
It is true, however, that a poorly managed disk, with nothing done about its fragmentation, will over time
degrade so badly that it becomes for all practical purposes unusable, simply because each file is in so
many pieces that the time to access all the files a user needs is just not worth the effort.
CHAPTER 3
HOW CAN YOU TELL IF YOU'VE GOT IT?
Like someone first hearing about a terrible new disease, you may find yourself wondering, "Do I have
fragmentation? How do I know if I've got it?"
Some System Managers just know they've got it, while others wait for outraged users to complain (tar,
feathers and shotguns loaded with peppercorn and rock salt) of rotten performance and use that as a
measure of fragmentation. Professional System Managers actually look into their VAX or Alpha AXP
computer systems.
It is not difficult to find out whether you have fragmentation or not, though it is easier if you have the right
tools.
Fragmentation Analysis Tools
The DUMP Utility
The simplest and most direct method for determining whether a file is fragmented is the DUMP
command, used as follows:
$ DUMP /HEADER /BLOCKS=END:0 filespec
This command allows you to examine the header of a specific file for multiple retrieval pointers. Each
pointer represents one fragment (extent) in the file. So, if there is only one pointer, the file is contiguous
(not fragmented at all). If there are two pointers, the file is fragmented into two pieces. Three pointers
means three fragments, and so on.
The drawbacks of the DUMP command are that it can only be used on one file at a time, requiring an
unacceptable amount of time to examine all the files on a disk, and it gives you a lot of irrelevant
information, with the map pointers appearing at the end of all the information displayed.
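If you want to check more than a handful of files without typing the command over and over, a few lines of DCL will at least do the typing for you. This is only a rough sketch - the directory specification is an example - and the output still has to be read by eye, one header at a time:
$! Dump the header of every file under a directory tree.
$ LOOP:
$     file = F$SEARCH("DUA1:[USERS...]*.*;0")
$     IF file .EQS. "" THEN EXIT
$     WRITE SYS$OUTPUT "***** ''file'"
$     DUMP /HEADER /BLOCKS=END:0 'file'
$     GOTO LOOP
It runs no faster than doing it by hand, of course; it just saves your fingers.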
The MONITOR Utility
To determine whether fragmented files are causing your computer to do excessive disk I/O, you can use
Digital's MONITOR utility, which comes free with OpenVMS. Type this command:
$ MONITOR IO
You should see a screen that looks something like this:
                          OpenVMS Monitor Utility
                           I/O SYSTEM STATISTICS
                               on node CURLY
                           6-Sep-1993 11:26:20

                                       CUR        AVE        MIN        MAX
    Direct I/O Rate                   9.50       6.42       0.00      11.58
    Buffered I/O Rate                 0.66       0.89       0.00       9.27
    Mailbox Write Rate                0.71       0.12       0.00       0.96
    Split Transfer Rate               1.00       0.81       0.00       6.33
    Log Name Translation Rate         0.00       0.22       0.00       0.99
    File Open Rate                    0.20       0.27       0.00       0.40
    Page Fault Rate                   3.42       3.08       0.95      20.86
    Page Read Rate                    3.01       2.21       0.00       6.29
    Page Read I/O Rate                1.46       1.03       0.00       3.99
    Page Write Rate                   0.41       0.87       0.00       0.89
    Page Write I/O Rate               0.19       0.63       0.00       0.50
    Inswap Rate                       0.00       0.00       0.00       0.00
    Free List Size                48048.00   48042.65   48042.00   48049.00
    Modified List Size              252.00     247.31       0.00     252.00
Example 3-1 MONITOR IO Display
In example 3-1, the word "CUR" indicates the current rate of I/O transfers per second. "AVE" means
average, "MIN" means minimum and "MAX" means maximum.
The fourth line in the table is the Split Transfer Rate. This line tells you how many times the computer is
having to do two or more I/O transfers when one would serve. A split transfer is the result of
fragmentation. If there are any split transfers at all, you are suffering performance losses due to
fragmentation.
Now enter this command:
$ MONITOR FCP
(FCP stands for File Control Primitive)
You should see a screen that looks something like this:
                          OpenVMS Monitor Utility
                         FILE PRIMITIVE STATISTICS
                                on node MOE
                           6-SEP-1993 13:28:53

                                       CUR        AVE        MIN        MAX
    FCP Call Rate
    Allocation Rate
    Create Rate
    Disk Read Rate
    Disk Write Rate
    Volume Lock Wait Rate
    CPU Tick Rate
    File Sys Page Fault Rate
    Window Turn Rate
    File Lookup Rate
    File Open Rate
    Erase Rate
Example 3-2 MONITOR FCP Display
The critical entry in this table is Window Turn Rate, which tells you how many times OpenVMS had to
load new retrieval pointers from a file's header to gain access to the desired portion of a file. The term
window, as it is used here, means the set of retrieval pointers the system keeps in memory to access the
file. If the file is contiguous, only one pointer is needed to access the file, so a window turn would never
occur. A window typically holds seven pointers, so a file can be fragmented into as many as seven pieces
and still be accessed without causing a window turn. When there are eight or more pieces, however, one
or more pointers have to be flushed and new pointers loaded into the window in memory to get at the
later parts of the file. If a file is fragmented into many pieces, window turns can become a major
performance bottleneck.
Split transfers and window turns are not the only consequences of fragmentation, but they are the only
ones you can detect with the OpenVMS MONITOR utility. If you do not have the Disk Analysis Utility,
Software Performance Monitor or VAX Performance Advisor, MONITOR will have to do.
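One thing worth knowing is that MONITOR does not have to be watched live; it can write its samples away for later study, which makes it easy to cover a whole working day. A hedged example - the interval, ending time and file name are all illustrative:
$ MONITOR IO, FCP /INTERVAL=60 /NODISPLAY /SUMMARY=SYS$MANAGER:FRAG_CHECK.SUM /ENDING=17:00
When the collection ends, the summary file is an ordinary text file; TYPE or PRINT it and look at the Split Transfer Rate and Window Turn Rate lines.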
Digital's Performance Utilities
Digital Equipment Corporation offers two system utilities that include some fragmentation analysis
capability: the Software Performance Monitor (SPM) and VAX Performance Advisor (VPA). Both will tell
you a little about fragmentation on your disk, but neither will tell you a lot, as they were designed
primarily for other things. And they are a bit pricey, especially when compared to Executive Software's
Disk Analysis Utility.
Software Performance Monitor (SPM)
Digital's Software Performance Monitor, or SPM for short, produces two reports that can be used to
detect performance problems due to fragmentation. One reports system performance problems in terms
of Window Turns and Split I/Os and the other reports information about fragmentation of files and free
space on a disk. Unfortunately, the Window Turns and Split I/Os information is only reported on a
system-wide basis, so the performance problem cannot be traced to a particular disk. Without knowing
which disk is the cause of the problem, there is nothing you can do about it except "fix everything" or use
other tools to narrow the search.
We'll look first at the report on Window Turns and Split I/Os. Before you can generate the report, you
have to collect data for a while. The DCL command to invoke SPM is "PERFORMANCE" and the
parameters for this DCL command are the commands acted upon by SPM.
First, collect the necessary data:
$ PERFORMANCE COLLECT=TUNE /CLASS=ALL /OUTPUT=filename
In this command, filename is the name you want given to the output file. When you have finished
collecting your sample of performance data, use this command to stop SPM:
$ PERFORMANCE COLLECT=TUNE /STOP
Then, generate the report using the same file name as above with this command:
$ PERFORMANCE REPORT=LOG_FILE filename
This produces an eleven-page report. Only the second page is of interest here, and even that one
contains far more data than we need. This shows a sample from page 2 of the SPM SYSTEM report:
+------ File I/O Rates (per second) -----+
!                                        !
!  Window  Window   Split   Erase  File  !
!   Hits   Turns    I/Os    I/Os   Opens !
!  ------  ------  ------  ------  ----- !
!   25.5     0.4     0.2     0.0     3.7 !
+----------------------------------------+
Example 3-3 SPM SYSTEM Report
This box contains the two numbers that interest us most: Window Turns and Split I/Os. Use these
numbers to determine the impact of fragmentation on the performance of your VAX. Any Window Turns
or Split I/Os at all means fragmentation is probably slowing down your VAX or Alpha AXP.
Next we'll look at the report on disk files and space. This time you do not have to collect data before you
can generate the report. The command to generate the report is:
$ PERFORMANCE REPORT=DISK_SPACE diskname
This produces a report of three pages. There is a lot of useful information in this report, but the key
information is Mean no. extents/file on Page 2 and the listing of Files with extension headers on Page 3.
Here is a sample of the SPM DISK_SPACE report:
11-NOV-1993 09:43:32.27              SPM V3.3-03                  VAX    Page 1

         ****  Detailed volume analysis for _HSC000$DUA5:  *****

  Items preceded by 'I','M' or 'S' are controlled by Initialize, Mount or Sysgen.

  (I  ) Volume name is 'USERDISK1   '.
  (I  ) Serial number is 0.
  (I  ) Creation date was 4-NOV-1987 15:22:23.77.
  (I  ) Volume owner is 'CAMPBELL    '.
  (IM ) Owner uic is [SYSTEM].
  (I  ) Format type is 'DECFILE11B  '.
  (IM ) Volume protection is [RWED,RWED,RWED,RWED].
  (IMS) Default data checking is NOREAD-CHECK, NOWRITE-CHECK.
  (I  ) Structure level is 2, version 1.
  (I  ) Allocation cluster size is 3 blocks.
  (I  ) Index file bitmap is located at LBN 445614.
  (IM ) Default file extension is 5 blocks.
  (IM ) Default window size is 10 retrieval pointers.
  (I  ) Maximum number of files allowed is 111384.
  (IMS) Default number of cached directories is 3.

  Volume size is 891072 blocks with 51 blocks/track, 14 tracks/cylinder,
  1248 cylinders/volume.

         *****  Summary of FREE STORAGE for _HSC000$DUA5:  *****

  Free Storage Extent Sizes    No. Extents    Cum % Space
  --------------------------   -----------    -----------
      (size buckets from = 3 through = 150000 blocks)

  Total free blocks     =   68736.
  No. of extents        =     101.
  Mean blocks/extent    =     681.
  Smallest extent       =       3.
  Largest extent        =    3957.

11-NOV-1993 09:43:32.27              SPM V3.3-03                  VAX    Page 2

         ****  Summary of ALLOCATED SPACE for _HSC000$DUA5:  *****

  Space Allocated per Header   No. Headers    Cum % Headers
  --------------------------   -----------    -------------
      (size buckets from = 3 through = 150000 blocks)

  Minimum allocated extent  =       3.
  Maximum allocated extent  =   47262.
  Total allocated blocks    =  820965  ( 92.1% of volume).
  Total used blocks         =  801995  ( 97.7% of allocated).
  No. extents allocated     =   13682.
  Mean alloc blocks/extent  =      60.
  Total no. of files        =   13688.
  Mean alloc blocks/file    =      60.
  Mean no. extents/file     =       1.
  No. extension headers     =       4.
  No. multi-volume files    =       0.
  No. directories           =     468.

11-NOV-1993 09:43:32.27              SPM V3.3-03                  VAX    Page 3

         ****  Files with extension headers for _HSC000$DUA5:  *****

  File name                                                     Ext. headers
  ------------------------------------------------------------  ------------
  [CRANDALL.PJT]M4WC_26SEP.DAT;1                                       1
  [FAIR.CMS.EOF.33]EXEC.LIS;1                                          3
Example 3-4 SPM Disk Space Report
VAX Performance Advisor (VPA)
Before you can produce any report, graph or analysis from VPA, data must be collected from the system
over a period of time. You can tell VPA to start collecting the data immediately:
VPA> ADVISE /COLLECT /START
or you can have it start collecting the data at a later specified time:
VPA> ADVISE /COLLECT /BEGINNING="dd-mmm-yyyy hh:mm:ss.cc"
Then you have to tell VPA when to stop collecting data. Again, you can tell it to stop immediately:
VPA> ADVISE /COLLECT /STOP
or at a later specified time:
VPA> ADVISE /COLLECT /ENDING="dd-mmm-yyyy hh:mm:ss.cc"
Note: You can generate VPA reports while the data capture is running. You do not have to stop VPA to
request graphs, reports or an analysis.
Window turns can be reported as a graph. To display a graph of window turns, type the following:
VPA> ADVISE /GRAPH /TYPE=CUSTOM=SYSTEM=(WINDOW)
Here is a sample of the VPA output showing the Window Turn Rate:
                              VPA CUSTOM GRAPH
Node: BOOT                                      Date: 16-SEP-1993 00:00-10:09
LEGEND:  1 = WINDOW_TURN                        (Metric Values are Stacked)
Y-Units: Window Turn Rate (0.000 to 0.300)
Time axis: hours 0 through 10
VPA Command: ADVISE/GRAPH/NODE=BOOT/TYPE=CUSTOM=SYSTEM=(WINDOW)
Example 3-5 VPA Output Showing Window Turns
Example 3-5 shows the Window Turn Rate over a period of ten hours (midnight to 10:09 AM) running
generally at about 0.015 window turns per second. Around 2:00 AM, however, there is a burst of activity,
with the Window Turn Rate peaking at 0.270 window turns per second.
What does this mean in terms meaningful to fragmentation analysis? Any window turns at all indicate a
fragmentation problem severe enough to degrade the performance of your system. The more window
turns you have, the worse the performance degradation is. The only acceptable number here is zero.
Another useful report you can get from VPA is one showing Split I/Os. To produce this report, enter the
following command:
VPA> ADVISE /REPORT=PERFORMANCE_EVALUATION=(NOALL_STATISTICS,DISK_STATISTICS) /NODE=BOOT
Here is a sample of such a report:
Reporting on Node1 CLUSTER        VPA V2.0 Performance Evaluation         Page 1
                                  Monday 16 SEP 00:00 to 10:16
+------------------------------------------------------------------------+
| The following table gives the summary of all disk activity as seen     |
| by the indicated node. An "*" for service node indicates that more     |
| than one was detected.                                                  |
+------------------------------------------------------------------------+
Disk            Avg I/O   Avg   Avg    IOsz   Source Service   %    % IO  % IO          # of
Volume          per Sec  Queue  Kb/sec in pgs  Node   Node    Busy  Read  Split  Type  Samples
--------------  -------  -----  ------ ------ ------ -------  ----  ----  -----  ----  -------
BOOTDISK
(BOOT$DUA0)        0.21   0.01     0.6    5.6  NODE1  NODE1   0.75    86      0  RD54      309
WORK1
(WORK1$DKA0)       0.00   0.00     0.0    0.0  NODE1  NODE2   0.00     0      0   UNK      309
CORPDISK
(BOOT$DUC0)        2.49   0.12     6.7    5.3  NODE1                  84      0  RA81      309
                   0.04   0.12     0.1    5.3  NODE1          8.91    84      0            309
DATASHADOW
(USER1$DKB200)     0.00   0.00     0.0    0.0  NODE1  NODE4   0.00     0      0   UNK      309
WORK2
(WORK2$DUB0)       0.00   0.00     0.0    0.0  NODE1  NODE3   0.00     0      0  RD54      309
FINDISK
(USER1$DKB400)     0.00   0.00     0.0    0.0  NODE1  NODE5   0.00     0      0  RD54      309
USERDISK2
(USER2$DKB0)       0.29   0.03     1.3    8.6  NODE1  NODE4   1.07    88      5   UNK      309

Totals             3.20   0.16    8.66   5.41                         82      0

User Command:  ADVISE/REPORT=PERFORMANCE_EVALUATION=(NOALL,DISK_STAT)/NODE=BOOT
Example 3-6 VPA Report Showing Split I/Os
The important information in this report, for purposes of fragmentation analysis, is the third column from
the right, headed % IO Split. This indicates the percentage of I/O requests that resulted in Split I/Os. In
this example, you can see that only one disk, USERDISK2, is suffering from Split I/Os and that 5% of the
I/O requests to that disk result in Split I/Os. Therefore, you would conclude from this report that about 5%
of the USERDISK2 I/O load represents resources wasted due to fragmentation.
Disk Analysis Utility
The best tool, in my opinion, is the Disk Analysis Utility from Executive Software. This is a program I
designed and wrote originally in 1986 with my associate Rick Cadruvi. He has since reworked it and
enhanced it to the point where it probably qualifies as a total rewrite now, but it remains the only tool
designed specifically and solely to determine the extent of fragmentation on an OpenVMS disk.
The Disk Analysis Utility can provide anything from a quick summary to a very detailed analysis. It can
be run at any time on any Digital-supported disk and it is not necessary to allocate the disk, dismount it,
or stop users from accessing it.
The Disk Analysis Utility is invoked with the command:
$ DAU disk_name /qualifiers
where disk_name is the device name of the disk to be analyzed. If disk_name is omitted, the Disk
Analysis Utility will prompt for the disk name. The optional /qualifiers represents Disk Analysis Utility
qualifiers, which you can find described in its manual.
Sample Disk Analysis Utility Output
The following example was produced using the command:
$ DAU BOOT$DUA0: /FULL
                             Disk Analysis Utility
                              DISK_ANALYSIS V6.0
   Copyright (c) 1993 Executive Software International, All Rights Reserved.

_BOOT$DUA0:                                            1-AUG-1993 12:16:39.58
Number of Usable Blocks: 311100         Cluster Size           :     3

<<<<<<<<<<<<<<<<<<<<<<<<<<<< Free Space Summary >>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Total Free Space Size  : 78260          Smallest Free Space    :   676
Number of Free Spaces  :    16          Largest Free Space     :  9366
#Spaces = 82% of Total :    10          Mean Size of Free Space:  4891

<<<<<<<<<<<<<<<<<<<<<<<<<< 16 Largest Free Spaces >>>>>>>>>>>>>>>>>>>>>>>>>>>
Start LBN      Size   Start LBN      Size   Start LBN      Size   Start LBN      Size
---------  --------   ---------  --------   ---------  --------   ---------  --------
   204828      9366      196610      6760       68198      4852       73886      3166
    98304      8232      190860      5748       78336      4126       57390      1516
   118748      7576       51854      5524       36978      3982      165152      1036
   139196      6774       59472      5302       90274      3624       47986       676

<<<<<<<<<<<<<<<<<<<<<<<<< Free Spaces Distribution >>>>>>>>>>>>>>>>>>>>>>>>>>
Size Range             # Spaces        Size Range             # Spaces
--------------------   --------        --------------------   --------
     1 to          2          0           511 to       1022          1
     3 to          6          0          1023 to       2046          2
     7 to         14          0          2047 to       4094          3
    15 to         30          0          4095 to       8190          8
    31 to         62          0          8191 to      16382          2
    63 to        126          0         16383 to      32766          0
   127 to        254          0         32767 to      65534          0
   255 to        510          0         65535 to     311198          0

<<<<<<<<<<<<<<<<<<<<<<<<<<<< Free Space Detail >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Start LBN      Size   Start LBN      Size   Start LBN      Size   Start LBN      Size
---------  --------   ---------  --------   ---------  --------   ---------  --------
    36978      3982       59472      5302       90274      3624      165152      1036
    47986       676       68198      4852       98304      8232      190860      5748
    51854      5524       73886      3166      118748      7576      196610      6760
    57390      1516       78336      4126      139196      6774      204828      9366

<<<<<<<<<<<<<<<<<<<<<<<<<<<< Special Case Files >>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Reserved File    : (1,1,0)     [000000]INDEXF.SYS;1
Reserved File    : (2,2,0)     [000000]BITMAP.SYS;1
Zero Length File : (3,3,0)     [000000]BADBLK.SYS;1
Reserved File    : (3,3,0)     [000000]BADBLK.SYS;1
Reserved File    : (4,4,0)     [000000]000000.DIR;1
Zero Length File : (5,5,0)     [000000]CORIMG.SYS;1
Reserved File    : (5,5,0)     [000000]CORIMG.SYS;1
Zero Length File : (6,6,0)     [000000]VOLSET.SYS;1
Reserved File    : (6,6,0)     [000000]VOLSET.SYS;1
Zero Length File : (7,7,0)     [000000]CONTIN.SYS;1
Reserved File    : (7,7,0)     [000000]CONTIN.SYS;1
Zero Length File : (8,8,0)     [000000]BACKUP.SYS;1
Reserved File    : (8,8,0)     [000000]BACKUP.SYS;1
Zero Length File : (9,9,0)     [000000]BADLOG.SYS;1
Reserved File    : (9,9,0)     [000000]BADLOG.SYS;1
Zero Length File : (36,31,0)   [SYS6.SYSMGR]ACCOUNTNG.DAT;1
Zero Length File : (581,15,0)  [SYS6.SYSMGR]VMSIMAGES.DAT;14
Placed File      : (1132,20,0) [SYSE.V4COMMON.SYSEXE]JBCSYSQUE.DAT;3
Zero Length File : (1478,1,0)  [SYSE.V4COMMON.SYSEXE]NOTICE.TXT;2

<<<<<<<<<<<<<<<<<<<<<<<<<< Most Fragmented Files >>>>>>>>>>>>>>>>>>>>>>>>>>>>
77: (866,2,0)    [SYSE.V4COMMON.SYSLIB]BASIC$STARLET.TLB;1
73: (86,24,0)    [SYS1.SYSERR]ERRLOG.SYS;1
34: (116,7,0)    [SYS3.SYSERR]ERRLOG.SYS;1
19: (26,73,0)    [SYS0.SYSERR]ERRLOG.SYS;1
10: (81,68,0)    [SYS2.SYSERR]ERRLOG.SYS;1
 9: (1132,20,0)  [SYSE.V4COMMON.SYSEXE]JBCSYSQUE.DAT;3
 7: (1,1,0)      [000000]INDEXF.SYS;1
 6: (24,1,0)     [SYS0.SYSEXE]PAGEFILE.SYS;1
 4: (34,6,0)     [SYS2.SYSEXE]PAGEFILE.SYS;1
 4: (1051,3,0)   [SYS3.SYSEXE]PAGEFILE.SYS;1
 3: (1009,8,0)   [SYS0.EASYLINK]INMAIL.LOG;1
 3: (1206,24,0)  [SYS6.SYSEXE]PAGEFILE.SYS;1
 3: (1225,20,0)  [SYS3.SYSMGR]OPERATOR.LOG;70
 2: (230,2,0)    [SYS5.SYSEXE]PAGEFILE.SYS;1
 2: (316,11,0)   [SYS0.EASYLINK]EMC.LOG;72
 2: (618,8,0)    [SYS1.SYSMGR]OPERATOR.LOG;92
 2: (1112,15,0)  [SYS2.SYSMGR]OPERATOR.LOG;68

<<<<<<<<<<<<<<<<<<<<<<<<<<<<< File Information >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Maximum Number of Files: 51866          # Reserved Files       :     9
Total Number of Files  :  1545          # Placed Files         :     1
Total Size of all Files: 230607         # Multi-Header Files   :     2
Smallest File Size     :     1          # Multi-Volume Files   :     0
Largest File Size      : 50000          # Directory Files      :   125
Mean Size of all Files :   150          # Zero Length Files    :     9
# Extent Headers       :     1          # Files with Frags >= 2:    17
Total File Fragments   :  1763          Lost Blks/Extent Cache :  2333
Mean Fragments per File:  1.10          Total Split I/Os       :  1291

<<<<<<<<<<<<<<<<<<<<<<<<<<<< File Sizes Summary >>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Size Range              # Files        Size Range              # Files
--------------------   --------        --------------------   --------
     1 to          2        335           511 to       1022         11
     3 to          6        288          1023 to       2046          8
     7 to         14        179          2047 to       4094          4
    15 to         30        267          4095 to       8190          1
    31 to         62        187          8191 to      16382          4
    63 to        126        134         16383 to      32766          2
   127 to        254         81         32767 to      65534          1
   255 to        510         34         65535 to     311198          0

<<<<<<<<<<<<<<<<<<<<<<<<<< File Fragments Summary >>>>>>>>>>>>>>>>>>>>>>>>>>>
Fragments    # Files     Fragments    # Files     Fragments    # Files
---------   --------     ---------   --------     ---------   --------
        0          9             6          1            12          0
        1       1518             7          1            13          0
        2          5             8          0            14          0
        3          3             9          1            15          0
        4          2            10          1            16          0
        5          0            11          0           17+          4
Example 3-7 DAU Output
How to Read the Disk Analysis Display
The heading includes the title, the version of the program used, a copyright notice, the physical device
name of the disk analyzed and a date/time stamp indicating when the report was generated. The
physical device name of the disk is provided to ensure there is no mistake about which disk is being
analyzed, because the utility accepts logical as well as physical disk names as input.
Number of Usable Blocks is the total number of usable blocks on the disk, whether allocated or free.
Cluster Size is the minimum allocation quantity for that disk. It is the number of blocks represented by
each bit in the storage bitmap. The cluster size is set by the System Manager at the time the disk is
initialized.
The Free Space Summary section gives information about free space on the disk so you can determine
the degree of free space fragmentation.
Total Free Space Size is the number of disk blocks that are unoccupied, including the number of blocks
in the extent cache for this disk on the local node.
Number of Free Spaces is the total number of free spaces (of any size) on the disk.
#Spaces = nn% of Total indicates how many free spaces combined represent 80% or more of the free
space on the disk. In the example, ten free spaces constitute 82% of the free space on the disk.
Smallest Free Space is the size in blocks of the smallest free space on the disk.
Largest Free Space is the size in blocks of the largest free space on the disk.
Mean Size of Free Space is the average size of a free space, calculated by dividing the total free space
size in blocks by the number of free spaces.
The table headed 16 Largest Free Spaces shows the location and size in blocks of the 16 largest
contiguous free spaces on the disk. Space held in system caches is considered only if the user has the
privileges required to access such information.
The table headed Free Spaces Distribution shows the number of free spaces on the disk categorized by
size in blocks. In the example, there is one free space from 511 to 1022 blocks in size, two more in the
range 1023 to 2046, three in the range 2047 to 4094, eight in the range 4095 to 8190 and two between
8191 and 16382.
The table headed Free Space Detail shows the size and location of unallocated free spaces on the disk,
in order from LBN 0 to the end of the disk. Again, space held in system caches is considered only if the
user has the privileges required to access such information.
The table headed Special Case Files shows files that may be of special interest to the System Manager.
The list includes files reserved for use by the file system, directory files, zero length files, multi-header
files, placed files and any multi-volume files if the disk is part of a volume set. The files are listed both by
File ID and by name.
The table headed Most Fragmented Files shows the files that have the most fragments, in order from
most fragmented to least. Only files that have two or more fragments are listed, but the size of the list (64
files by default) can be limited by the use of the /MOST_FRAGMENTED_FILES=n qualifier. The number
of fragments in each file is shown as well. Multi-header files appear several times, as these are listed
once for each header.
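For instance, if you only care about the handful of worst offenders on the example disk, you might run something like:
$ DAU BOOT$DUA0: /FULL /MOST_FRAGMENTED_FILES=10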
The File Information table is a collection of statistical information about the files and fragments on the
disk.
Maximum Number of Files is the total number of files that can be created on the disk.
Total Number of Files is the total number of files (headers) currently on the disk, including zero-length
files.
Total Size of all Files is the number of disk blocks that are occupied. Blocks allocated to a file but not yet
written are included in the total as well.
Smallest File Size is the size in blocks of the smallest file on the disk.
Largest File Size is the size in blocks of the largest file on the disk.
Mean Size of all Files is the average size of all files, calculated by dividing the total size of all files by the
total number of files, excluding zero-length files.
# Extent Headers is the number of extra (extension) file headers. This does not include the primary
header required for each file.
Total File Fragments is the total number of pieces of files on the disk. Note that it is possible for this
number to differ from, and even be less than, the Total Number of Files figure, because zero-length files
are included in the Total Number of Files, but such files consist of zero fragments.
Mean Fragments per File shows how many fragments there are in the typical file on the disk (total
number of fragments divided by total number of non-zero-length files). This is an index of how
fragmented the files on the disk are. If the mean fragments per file is 1.00, the files are all contiguous. If
the figure is 1.10, then 10% of the files, on average, are in two pieces. 1.20 means 20%, 1.30 means
30%, etc. A figure of 2.00 means the files average two fragments each. 1.00 is the best figure attainable,
indicating that all files or nearly all files are contiguous.
# Reserved Files is the number of files reserved for use only by the file system.
# Placed Files indicates how many files on the disk are fixed at their current location using OpenVMS
placement control.
# Multi-Header Files indicates the total number of headers associated with files that have more than one
header.
# Multi-Volume Files indicates how many files on the disk span two or more disks in a volume set.
# Directory Files indicates how many directory files there are on the disk.
# Zero Length Files indicates how many files have no blocks allocated to them.
# Files with Frags >= 2 indicates how many files have two or more fragments, meaning these files are
fragmented (not contiguous).
Lost Blks/Extent Cache is the result of the following calculation:
Number of Usable Blocks - (Total Free Space Size + Total Size of all Files)
In other words, this value is derived from two arithmetic operations. The number of blocks known to be
free is added to the number of blocks known to be in use. This sum is subtracted from the size of the
disk to arrive at Lost Blks/Extent Cache. Therefore, this value represents all the blocks that are not
accounted for. They are not in use and they are not available for use. Their status is unknown. On an
active disk, this is not alarming. These blocks may be any combination of:
- blocks marked allocated that are not yet actually part of a file (due to space being allocated in anticipation of writing data);
- blocks held in another node's extent cache for this disk; or
- blocks that were added to this node's extent cache while the Disk Analysis Utility was running.
Total Split I/Os indicates how many I/O operations since system boot time have been split into two or
more I/O operations to retrieve data split across two or more fragments of a file. This is the most
meaningful indicator of the actual cost of fragmentation on your system. It is important to note, however,
that this value is the total split I/Os for the entire system, not just for one disk. Therefore, defragmenting
only one disk on a multi-disk system may only cause this value to decrease slightly. It may be necessary
to defragment all the disks on the system to see this value reduced to zero (following the next reboot).
The table headed File Sizes Summary shows the number of files on the disk categorized by size in
blocks. In the example, there are 335 one-to-two block files, 288 files from 3 to 6 blocks in size, and so
on.
The table headed File Fragments Summary shows the number of files on the disk categorized by number
of fragments. In the example, there are 9 zero-length files, 1518 files with one fragment each
(contiguous), five files with two fragments each, and so on.
Note: The figures provided by the Disk Analysis Utility constitute a rolling snapshot of the state
of the disk, so the figures can appear inconsistent at times.
Customized Analysis
Another, more direct approach, is this: Run some tests on your system to get some idea of how good or
bad its performance is. If you don't have tools to do this, you can make some up. Time the system
response. How long is it from the time a user hits the RETURN key after a command until the first
characters are displayed on the screen in response to that command? How many users can log in before
the system starts to bog down? How long does a major batch job take to complete? These are tests you
can run on your own system that tell you how well your system is performing under your conditions at
your site. Be sure to run these tests when the system is in use by everyday users. Otherwise, you get a
distorted picture that reflects a laboratory environment instead of your real environment.
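Even if you have no timing tools at all, a few lines of DCL can capture before-and-after timestamps around a representative piece of work. A rough sketch, in which the command procedure name is only a stand-in for whatever job you choose to measure:
$! Note the wall-clock time around a representative job.
$ before = F$TIME()
$ @REPRESENTATIVE_JOB.COM
$ after = F$TIME()
$ WRITE SYS$OUTPUT "Started:  ", before
$ WRITE SYS$OUTPUT "Finished: ", after
The arithmetic on the difference can be done by hand; what matters is taking the same measurement before and after.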
Then, once you have run these tests and documented the results, back up your disks. You probably do
this once a week or so anyway, so it doesn't cost you any extra work if you time your testing to occur
shortly before the backup.
Then, initialize each disk and restore the complete disk from backup. (Be very, very sure you have a
valid, usable backup before initializing the disk because, once you do, the data on that disk is gone. The
/VERIFY qualifier on the BACKUP command is useful for this purpose.) When you initialize the disk, be
sure to use sensible values for such qualifiers as /CLUSTER_SIZE=, /DIRECTORIES= and
/HEADERS=. Low values for these will cause your INDEXF.SYS file to be excessively fragmented and
may contribute to fragmentation in other ways. (See Chapter 5 for recommendations on minimizing
fragmentation by proper initialization of disks.)
After initializing each disk, restore the data from backup, making sure you use the /NOINITIALIZE
qualifier on the BACKUP command to prevent BACKUP from overriding your initialization parameters.
This results in disks that are, for the moment at least, free of fragmentation. Now run your tests again,
under conditions as similar to the first test as possible. Compare the results and see just how much
fragmentation was costing your own site in terms of performance.
CHAPTER 4
WHAT'S WRONG WITH FRAGMENTATION?
When you find out you have fragmentation, your next concern might be, "How bad is it?" If the Disk
Analysis Utility reveals a Mean Fragments Per File (fragmentation rating) of 1.2 (badly fragmented) or
more, you may be in trouble. You had better do something about that fast, before the system stops
altogether.
If you think I am exaggerating, consider this: One site, with a combination system/user disk with 4.9
fragments per file required nearly half an hour for each user to log on. This dropped to a few seconds
once the main disk was defragmented. Another system, with an incredible 18.7 fragments per file, was
literally unusable until defragmented.
A fragmentation rating of 1.2 means there are 20% more pieces of files on the disk than there are files,
indicating perhaps 20% extra computer work needed. It should be pointed out that these numbers are
merely indicators. If only a few files are badly fragmented while the rest are contiguous, and those few
fragmented files are never accessed, the fragmentation may have no performance impact at all. On the
other hand, if your applications are accessing the fragmented files heavily, the performance impact could
be much greater than 20%. You have to look further to be sure. For example, if there were 1,000 files
and only one of those files is ever used, but that one is fragmented into 200 pieces (20% of the total
fragments on the disk), you have a serious problem, much worse than the 20% figure would indicate. In
other words, it is not the fact that a file is fragmented that causes performance problems, it is the
computer's attempts to access the file that degrade performance.
To explain this properly, it is first necessary to examine how files are accessed and what is going on
inside the computer when files are fragmented.
What's Happening to Your Disks?
Here's a diagram of a disk:
Figure 4-1 Disk
This diagram represents one side of a single platter. The circles represent tracks, though in reality there
would be far more tracks on one side of a platter. Within one track is a shaded strip representing a file.
Imagine a head on an arm, not much different from the needle on the tone arm of a phonograph, moving
from file to file as the platter spins. The contents of the file can be scanned from the disk in one
continuous sweep merely by positioning the head over the right track and then detecting the file data as
the platter spins the track past the head.
Now here is a diagram of a disk with one file broken into two parts:
Figure 4-2 Disk With File In Two Parts
In this case, the file is fragmented into two parts on the same track. Thus, to access this file, the head
has to move into position as described above, scan the first part of the file, then suspend scanning briefly
while waiting for the second part of the file to move under the head. Then the head is reactivated and the
remainder of the file is scanned.
As you can see, the time needed to read the fragmented file is longer than the time needed to read the
unfragmented (contiguous) file. The exact time needed is the time to rotate the entire file under the head,
plus the time needed to rotate the gap under the head. A gap such as this might add a few milliseconds
to the time needed to access a file. Multiple gaps would, of course, multiply the time added. The gap
portion of the rotation is wasted time due solely to the fragmentation disease. Then, on top of that, you
have to add all the extra operating system overhead required to process the extra I/Os.
Now let's look at another disk:
Figure 4-3 Two File Extents On Different Tracks
In this case, the file is again fragmented into two parts. But this time the two parts are on two different
tracks. So, in addition to the delay added by the rotation of the disk past the gap, we have to add time for
movement of the head from one track to another. This track-to-track motion is usually much more time-consuming than rotational delay, costing tens of milliseconds per movement. Further, this form of
fragmentation is much more common than the gap form.
To make matters worse, the relatively long time it takes to move the head from the track containing the
first fragment to the track containing the second fragment can cause the head to miss the beginning of
the second fragment, necessitating a delay for nearly one complete rotation of the disk, waiting for the
second fragment to come around again to be read.
But the really grim news is this: files don't always fragment into just two pieces. You might have three or
four, or ten or a hundred fragments in a single file. Imagine the gymnastic maneuvers your disk heads
are going through trying to collect up all the pieces of a file fragmented into 100 pieces!
Figure 4-4 File In Many Fragments
When it takes more than one I/O to obtain the data contained in one (fragmented) file, this is known as a
split transfer or split I/O. When a file is fragmented into more than the seven pieces that can be
accommodated by a single file window, and the eighth or later fragment is accessed, one or more
retrieval pointers are flushed from the window and it is reloaded with seven more retrieval pointers. This
is called a window turn. When more than 70 pointers are required to map (indicate the location of) a file
in its header, a second (or third, or fourth) file header is required. The name for that is a multi-header file.
Each of these fragmentation symptoms costs overhead, and each one described costs much more than
the one before.
For every split transfer, the overhead of a second (or third, or fourth, etc.) disk I/O transfer is added. For
every window turn, the overhead of reloading the window, in addition to the I/O required just to access
the fragment is added. For every multi-header file accessed, add to each I/O the overhead of reading a
second (or third, or fourth, etc.) file header from the INDEXF.SYS file.
On top of all that, extra I/O requests due to split I/Os and window turns are added to the I/O request
queue along with ordinary and needful I/O requests. The more I/O requests there are in the I/O request
queue, the longer user applications have to wait for I/O to be processed. This means that fragmentation
causes everyone on the system to wait longer for I/O, not just the user accessing the fragmented file.
Fragmentation overhead certainly mounts up. Imagine what it is like when there are 300 users on the
system, all incurring similar amounts of excess overhead.
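To put even rough numbers on it, suppose - purely for the sake of illustration - that each extra fragment costs one head movement of about ten milliseconds plus eight milliseconds or so of extra rotational delay. Then a 100-fragment file costs on the order of:
$! Illustrative arithmetic only; the millisecond figures are assumptions.
$ fragments = 100
$ ms_per_fragment = 10 + 8
$ extra_ms = fragments * ms_per_fragment
$ WRITE SYS$OUTPUT "Roughly ", extra_ms, " extra milliseconds per access"
Nearly two seconds of disk work to read a file that could have been read in a single sweep.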
What's Happening to Your Computer?
Now let's take a look at what these excess motions and file access delays are doing to the computer.
OpenVMS is a complicated operating system. It is complex because it has a great deal of functionality
built in to the system, saving you and your programmers the trouble of building that functionality into your
application programs. One of those functions is the service of providing an application with file data
without the application having to locate every bit and byte of data physically on the disk. OpenVMS will
do that for you.
When a file is fragmented, OpenVMS does not trouble your program with the fact, it just rounds up all the
data requested and passes it along. This sounds fine, and it is a helpful feature, but there is a cost.
OpenVMS, in directing the disk heads to all the right tracks and LBNs within each track, consumes
system time to do so. That's system time that would otherwise be available to your applications. Such
time, not directly used for running your program, is called overhead.
You can see overhead depicted graphically on your system by using the MONITOR utility. Type this
command:
$ MONITOR MODES
You should see a display that looks something like this:
Example 4-1 MONITOR MODES Display
The critical line of this display is the User Mode line. That's the one that tells you how much of the VAX's
computing capacity is being used to run application programs. Everything else is mostly overhead,
unless you are running PDP-11 programs in compatibility mode, in which case that would have to be
counted as productive (!) time as well.
Idle time, of course, is unused computer time, but that is a type of overhead, isn't it? When you look at
this display, you really want to compare the User Mode value to the total of the values above it. The
modes higher in this table show you how much of your computer's time is being spent doing work on
your behalf, other than running the application program itself. In my experience as a System Manager, I
have been fairly satisfied to see these values split in a 2-to-1 ratio. That is, I expect to see two-thirds of
the system being used directly for running applications in user mode, and one-third being consumed by
overhead. If you see more than one-third of the system spent on overhead, as in the example above, you
have a performance problem, and fragmentation is a good place to look for the cause.
If there is fragmentation and system overhead is high, as indicated by large values for Interrupt Stack
and Kernel Mode, you probably have a situation in which OpenVMS is spending a lot of extra time
processing I/O requests because two or three or more actual I/O transfers have to be done to collect up
all the pieces of fragmented files. This adds up to a performance problem.
What's Happening to Your Applications?
What's happening to your applications while all this overhead is going on? Simple: Nothing. They wait.
What's Happening to Your Users?
Oh yes, the users. . . .
The users wait, too, but they do not often wait without complaining, as computers do. They get upset, as
you may have noticed.
The users wait for their programs to complete, while excess fragments of files are chased up around the
disk. They wait for keyboard response while the computer is busy chasing up fragments for other
programs that run between the user's keyboard commands. They wait for new files to be created, while
the operating system searches for enough free space on the disk and allocates a fragment here, a
fragment there, and so on. They wait for batch jobs to complete that used to get done faster on the same
computer with the same user load, before fragmentation robbed them of their machine time. They even
wait to log in, as the operating system wades through fragmented command procedures and data
needed by startup programs. Even backup takes longer - a lot longer - and the users suffer while backup
is hogging the machine for more and more of "their" time.
All the users know is this: The system is slow; you're in charge of the system; it's all your fault. And
they're right. If you are the System Manager, you are responsible for the computer system and its
performance.
If management and finance people are aware of the problem, they view it as paying for 100% of a
computer system, but getting something less for their money. The users are not only upset, they're
getting less work done and producing less income for the company. That's bad, and it's your
responsibility. Something had better be done about it, and quickly.
CHAPTER 5
WHAT CAN YOU DO ABOUT IT?
What can you do about fragmentation? Get rid of it, of course. How? There are several ways, all of
which will be explained here. It's not hopeless. Something can be done about it.
Clear off Disks
First off, you could keep your disks half empty. This discipline, enacted as a matter of policy, would keep
enough free space so files would not fragment so badly on creation. It is at the moment of file creation
that the fragmentation problem begins. When a disk is nearly full, the free space on the disk tends to
fragment badly. This greatly increases the likelihood that a newly created file will be created in many
small fragments. When a disk is half empty, the free space tends to occur in larger pools (because there
is more of it), increasing the chances that newly created files will be created in a single contiguous piece
or, at worst, in only a few larger fragments. So a policy of keeping disks half empty reduces the
fragmentation problem by prevention.
Of course, this solution carries the drawback of having to have twice as much disk space as you really
need. Nice if you have the budget.
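Whichever way you choose to enforce such a policy, checking how full a disk actually is takes only a couple of lines of DCL. A small sketch - the device name is an example:
$! Report how full a disk is, using F$GETDVI.
$ disk = "DUA1:"
$ free = F$GETDVI(disk, "FREEBLOCKS")
$ total = F$GETDVI(disk, "MAXBLOCK")
$ pct = (free * 100) / total
$ WRITE SYS$OUTPUT disk, " is ", pct, "% free (", free, " of ", total, " blocks)"
Put something like that in a nightly batch job and you will at least know when a disk is drifting past the halfway mark.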
Copy / Contiguous
A second solution is to use the DCL command DUMP /HEADER to examine files that are known to be in
heavy use and, when a fragmented file is found, use the DCL command COPY /CONTIGUOUS to
defragment the file, purging the old copy once the new one is made. This is a simple and inexpensive
solution, but tedious, to say the least. It has the additional drawback of changing the creation and backup
dates of each file copied, which means your incremental backups are going to swell with files that have
not materially changed. It also changes the File I.D. for the file, which may cause problems for your batch
and print queues. Further, you must be very sure not to attempt this at the same time an application is
accessing the file. At best, the application could be locked out and may abort processing. At worst, the
application could update the old copy of the file during your copy-making and the updates would be lost
in the purge. Another major drawback of this technique is that it marks the file as having to be
contiguous. This causes OpenVMS to recopy the file to another area of the disk whenever the file is
extended if there is not enough contiguous free space immediately following the file. Still another major
problem is alias file names. OpenVMS allows a file to have two or more different names, called aliases.
The file must not be open when this method of defragmentation is used, but it is possible, if the file has
an alias, that the file could be open under the alias and so this technique could fail.
The exact commands for this technique are:
$ DUMP /HEADER /BLOCKS=END:0 filespec
$ OPEN /ERROR=FILE_OPEN INPUT_FILE filespec
$ CLOSE INPUT_FILE
$ COPY /CONTIGUOUS filespec filespec
$ PURGE filespec
$ EXIT
$
$FILE_OPEN:
$ WRITE SYS$OUTPUT "File open, try again later"
$ EXIT
Backup and Restore
From the time Digital's users first started complaining about fragmentation, the officially recommended
remedy was something called "backup and restore." The "and restore" part of this is the critical part.
Omitted from the phrase "backup and restore" is the critical middle step - initialize.
"Backup" is easy. You already do that anyway, so it doesn't take any more time or effort than you already
expend just to make sure your files are safely backed up. Backing up a disk, however, does absolutely
nothing for fragmentation. To cure the fragmentation, it is necessary to then reinitialize the disk after the
backup and then restore all the files to the disk.
Initializing the disk, of course, effectively deletes every file from the disk. The data can then be restored
from the backup tape (or other media), and the data is restored in a clean unfragmented, contiguous
fashion.
There are drawbacks to this solution, too. Not the least of these is the time it takes to restore the
information to the disk. This takes just about as long as the backup process itself, which is not exactly
quick.
Another drawback is the absolute requirement for a backup that it be precisely accurate in every respect.
If the tape is badly flawed or the drive has malfunctioned, your data is lost. You simply cannot get that
data back. So you are strongly encouraged to verify your backup tapes before initializing the disk. The
verify pass, of course, also takes quite a long time.
Perhaps the most aggravating drawback of backup and restore is that it has to be done after hours. You
can't very well erase all the users' files while they are using them, and people get really upset when you
take away access to their files during the workday. So you stay late at night or come in on the weekend
to handle this chore.
Now, in my experience as a System Manager, my friends weren't working nights and weekends. They
used these times for sensible activities like having fun with their families or nights out on the town. And
there is nothing like long hours of tedious, boring backup and restore to remind you of that fact. They're
out having fun and you're not.
To compound the aggravation, it is nearly impossible to get any other useful work done while doing a
backup. If you're using a 9-track tape drive, you have to jump up every ten minutes or so to change the
tape. There isn't much useful work you can do that can be interrupted every ten minutes, so you are
reduced to an awful lot of busy work or just plain sitting around. And this goes on for hours.
I know at least one System Manager who would have been divorced by his wife if he hadn't solved this
after-hours backup and restore problem.
To look at this from the dollars and cents viewpoint, if your system doesn't sit idle every night, what does
it cost your organization to shut the thing down for a night or two to defragment the disks? It's not cheap.
Even a small system costs enough to make you want to avoid downtime like the plague.
For those who don't mind these difficulties or who have no better solution, the commands for doing a
backup and restore operation are:
$ BACKUP /IMAGE /VERIFY disk-device tape-device
$ INITIALIZE disk-device label_name
$ BACKUP /IMAGE /NOINITIALIZE /VERIFY tape-device/SAVE_SET disk-device
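For the sake of illustration, here is one way the whole sequence might look, using example device names (MUA0: for a tape drive, DUA1: for the disk) and an illustrative save set name. Mount requirements and qualifiers vary somewhat by site and OpenVMS version, so treat this as a sketch rather than a recipe:
$ MOUNT /FOREIGN MUA0:                  ! the tape must be mounted foreign for BACKUP
$ BACKUP /IMAGE /VERIFY DUA1: MUA0:FULLDISK.BCK /SAVE_SET
$ DISMOUNT DUA1:                        ! take the disk out of service
$ INITIALIZE DUA1: USERDISK             ! see Initialization Procedures, below
$ MOUNT /FOREIGN DUA1:                  ! an image restore writes to a foreign-mounted disk
$ BACKUP /IMAGE /NOINITIALIZE /VERIFY MUA0:FULLDISK.BCK /SAVE_SET DUA1:
$ DISMOUNT MUA0:
$ DISMOUNT DUA1:
$ MOUNT /SYSTEM DUA1: USERDISK          ! return the disk to service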
Initialization Procedures
As long as you are initializing the disk anyway, there are several things you can do at the same time to
make that disk less susceptible to fragmentation and better performing, too. These are explained in
detail in the Prevention section, later in this chapter.
Disk-to-Disk Copy
An abbreviated form of the backup and restore technique that is much faster is the disk-to-disk copy.
This technique requires a spare disk drive exactly like the one you want to defragment.
What you do is make an image backup (a logical copy) of the fragmented disk onto the spare disk drive.
The BACKUP utility automatically initializes the new disk unless you initialize it yourself and suppress
BACKUP's initialization with its /NOINITIALIZE qualifier. BACKUP then copies the files contiguously to
the new disk, leaving all the free space in two large areas. Then you change the disk drive number
physically so OpenVMS will know where the data is. Unfortunately, you also have to power down and
restart the disk drives for the plug swap to take effect.
This technique is very fast - as fast as you can copy one entire disk onto another. The obvious drawback
is the expense: it requires having a spare disk drive. Another drawback is that you still have to backup
the data separately, unless you can afford to keep the spare disk drive tied up as a backup to the
defragmented disk. Yet another drawback is that, to ensure the data doesn't change in the middle of the
copying, you have to block access to both disks, depriving the users of access to their files for the
duration of the process.
The commands for this technique are:
$ INITIALIZE disk-2 label_name
$ BACKUP /IMAGE /NOINITIALIZE /VERIFY disk-1 disk-2
The initialization advice given for the backup and restore technique earlier in this chapter applies equally
to this method.
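In practice the spare disk also has to be mounted before BACKUP can write to it. With DUA1: as the fragmented disk and DUA2: as the spare (example names and label only), the sequence might look like this:
$ INITIALIZE DUA2: USERDISK
$ MOUNT /FOREIGN DUA2:                  ! an image copy writes to a foreign-mounted disk
$ BACKUP /IMAGE /NOINITIALIZE /VERIFY DUA1: DUA2:
$ DISMOUNT DUA2: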
Defragmentation Software Products
There are software products available that you can use to defragment disks. These are referred to as
defragmenters. They come in two forms: off-line defragmenters and on-line defragmenters. We'll
examine each separately.
Off-Line Defragmenters
An off-line defragmenter is a computer program used to defragment a disk. It is differentiated from
on-line defragmenters in that you have to take a disk out of service (off-line) to use the defragmenter on it.
Why? This type of defragmenter analyzes a disk to determine the state of fragmentation and then maps
out a rearrangement of the files on the disk that will reduce or eliminate the fragmentation. After mapping
out where the files should go, it rearranges them. This type of defragmentation has to be done off-line to
accommodate the drawbacks inherent in such a method:
1. Having a separate analysis pass and then the actual file rearrangement pass presents the biggest
problem. If, after calculating the ideal position for each file on the disk, some user application comes
along and deletes a file, adds a new file or extends an existing file, the analysis is instantly obsolete
and the planned rearrangement is unlikely to provide ideal results. In fact, rearranging files with an
obsolete analysis is downright dangerous. If the defragmenter were to write data into an area it thinks
is free but that has become occupied since the analysis, user data could be lost. By taking the disk
out of service, so no user application can access any file on the disk, this danger is eliminated.
2. This type of defragmentation is like throwing all the files up in the air and slipping them into the right
slots as they come back down. What if something happens while the files are up in the air? I am not
talking about adding, changing or deleting a file. I am talking about a disaster. Suppose the system
goes down or the disk fails? What happens to the data files that are "up in the air?" Most likely, they
are lost.
The remedy for this is a logging facility that keeps track of what files are "up in the air" at any given
moment, keeping copies of the files in a scratch space so the file can be reconstructed following a
catastrophic interruption. Logging and reconstruction such as this is extremely complicated in a
constantly changing environment, so such a defragmenter must be run off-line in an unchanging,
laboratory-like environment.
3. Since many sites tend to keep their disks very nearly full, there may not be enough room for the
defragmenter to make a copy of a file to be defragmented, particularly a large file. For this reason,
the off-line type of defragmenter often uses a scratch area on a second disk for copies of files being
defragmented. This may require taking two disks out of service - the one being defragmented and the
one with the scratch area. It certainly requires that the defragmentation be done off-line to reduce the
risk of data loss. Even so, a power failure may leave you with an important file (is there any chance it
would be an unimportant file?) stranded out in the scratch area, with recovery dependent upon a
special procedure you need to run to get the file back. But what if the power failure is in the middle of
the night when you're not around? And what if the stranded file is the program image containing the
recovery procedure?
Of course, the ultimate drawback for this off-line method of defragmentation is taking the disk out of
service. Taking the disk out of service, obviously, means no one can use it. The disk, if not the entire
system, is "down" for the duration of the defragmentation activity. The users' data is inaccessible. Like
backup and restore, this solution carries a heavy penalty.
Now take a look at this: Let's say it takes two hours to do that defragmentation job (and I have seen
reports of off-line defragmentation efforts taking ten times that long). That's two hours of lost computer
time. How much performance increase does your defragmenter have to achieve to make up for two
hours of complete downtime? You're right, it's a lot. It seems to me that the cure is worse than the
disease.
Because of these serious drawbacks, because of the outrageous cost of shutting down a computer
system for the duration of the defragmentation process and because a much better solution arrived,
off-line defragmenters have all but disappeared from the market.
On-Line Defragmenter
An on-line defragmenter is one that processes disks while user jobs are active, even while user
applications are accessing files on the same disk that is being defragmented. It is not necessary to take
the disk off-line or allocate it to the defragmenter process.
An on-line defragmenter eliminates the drawbacks inherent in off-line defragmenters. There is no
analysis pass to become obsolete when users add, change or delete files. Rather, each file is analyzed
individually as the defragmenter turns its attention to defragmenting that particular file. The files are not
"thrown up in the air" and juggled. Instead, each file is copied into a new location and, once safely there,
removed from the old location. It doesn't use a scratch area in which files can get lost. The file being
defragmented is kept intact in its original position while the new, contiguous copy is created elsewhere
on the same disk.
But the real advantage of an on-line defragmenter is that of keeping the disk in service while the
defragmenting is done. No more hours of downtime; no more downtime at all. Only with this type of
defragmenting can the system performance improve without sacrificing an equal or greater amount of
system resources to do so.
How long should a defragmenter take to do its job? Less than the time and resources being lost to
fragmentation. If your system loses 20% of its resources to fragmentation, a defragmenter that
consumed even 19% would be worthwhile (though not much). Clearly, the fewer resources that are
consumed, the more worthwhile that defragmenter would be. The point is that some defragmenters
consume 21% or even more of your system's resources. So this cost of defragmentation must be
weighed against the cost of performance losses due to fragmentation.
Another major factor to consider is the amount of time and effort spent by you, the System Manager, in
managing the defragmenter. The ideal on-line defragmenter would be one of the "set it and forget it"
variety. You just install it on your system and it takes care of everything from then on.
Prevention
There are several major performance problems that should be handled before addressing the quality of
defragmentation. These are multi-header files, user files on the system disk, and unnecessary data
checking. It is also well worth your while to invest a little system management time in organizing the
system so fragmentation occurs less often and is a little slower to creep in. Techniques for doing this are
discussed later in this section.
Like the old saying, "An ounce of prevention is worth a pound of cure," a little care taken before
fragmentation becomes a problem can save a lot of time, effort and headaches cleaning up the mess
later.
Clearing unneeded files off the disks is by far the most effective means of preventing fragmentation. The
more space you can keep free on the disks, the less likely files are to become fragmented. It should be
noted, however, that reducing disk space in use below 50% does not improve things much - not enough
to warrant the cost of maintaining so much unutilized disk capacity. On the other hand, even if you
cannot keep the disks half empty, keeping them 40% empty, or 30% or even 20% helps a lot. In my own
experience, I have observed that VAX disk I/O performance is great when the disk is 50% empty or
more. Performance worsens slightly as the disk fills to 80% of capacity. Above 80%, performance really
goes to the dogs quite rapidly and, if you are running above 90% full, you have a built-in performance
problem of severe proportions.
When I see a disk above 90% full, I lose all interest in fragmentation and get busy clearing off some
space on that disk. The performance gains from doing so are clearly noticeable.
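If you want a quick way to keep an eye on how full a disk is getting, the F$GETDVI lexical function will report it; DUA1: here is just an example device name:
$ free = F$GETDVI("DUA1:","FREEBLOCKS")    ! free blocks remaining on the volume
$ total = F$GETDVI("DUA1:","MAXBLOCK")     ! total blocks on the volume
$ pct_used = ((total - free) * 100) / total
$ WRITE SYS$OUTPUT "DUA1: is ''pct_used'% full"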
It is not as difficult to free up 10% of a disk as you might think. I have personally accomplished this feat
numerous times by the simple expedient of issuing a notice to all users (via a NOTICE file in the login
procedure). The notice says, "The disks on our computer system are too full. Delete all unnecessary files
from the system within 24 hours. If sufficient space is not freed up by then, the System Manager will
delete files from the directories of the worst offenders until sufficient space is available. Files not needed
now which might be needed later can be stored in archives until they are needed."
This notice always gets good results. Sometimes, it brings 20% or more free space with no further action
than that. Of course, you have to actually enforce it once in a while, but a check of each user's total file
sizes usually reveals a user or two who is abusing the system wholesale, with dozens of versions of old,
old files that haven't been touched in months. This is the guy who saves everything and consumes as
much space as half the other users combined.
To detect this varmint, merely enable disk quotas briefly and run off a report of how many disk blocks
each user is consuming. Then you can turn off disk quotas if you don't really need them on. (Disk quotas
cost overhead).
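Here is a sketch of how that check might go. The user name SMITH, the device DUA1: and the SYSMAN DISKQUOTA subcommands shown are illustrative only - quota handling differs a little between OpenVMS versions, so check the on-line HELP before relying on this:
$ DIRECTORY /SIZE=ALL /GRAND_TOTAL DUA1:[SMITH...]*.*;*
$ MCR SYSMAN
SYSMAN> DISKQUOTA CREATE /DEVICE=DUA1:
SYSMAN> DISKQUOTA ENABLE /DEVICE=DUA1:
SYSMAN> DISKQUOTA REBUILD /DEVICE=DUA1:
SYSMAN> DISKQUOTA SHOW [*,*] /DEVICE=DUA1:
SYSMAN> DISKQUOTA DISABLE /DEVICE=DUA1:
SYSMAN> EXIT
The DIRECTORY command totals the blocks charged to one suspect's directory tree. The DISKQUOTA CREATE step is needed only the first time (it builds the quota file), REBUILD tallies the blocks currently in use by each UIC, and SHOW prints the report that exposes the worst offenders.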
Another source of wasted space is users who have left the organization. Archive their files and delete
them from the system.
Time spent in clearing off disks to 20% or more free space will be the best investment you can make in
improving disk I/O performance and preventing fragmentation.
Preventive Measures When Initializing a Disk
Disk Cluster Size
Keep in mind that initializing a disk erases everything on that disk. Therefore, it is advisable to use the
INITIALIZE command only after first doing the BACKUP/VERIFY step, to ensure that you have a backup
of the disk and that its data integrity has been verified.
Make sure you choose the correct cluster size for the intended use of the disk. Disks larger than 50,000
blocks default to a cluster size of three when initialized, but this may not be the best value for your
intended use. A cluster size of one incurs the maximum possible overhead in disk I/O, but assures the
availability of every last block on the disk. A cluster size of three reduces the size of the storage bitmap
on the disk by a factor of three and speeds file allocation, but one or two disk blocks are wasted for every
file that is not a multiple of three blocks in size. If your average file size is one block, this could be a
tremendous waste - two-thirds of the disk!
Here is a table displaying the amount of disk space wasted for various cluster sizes when the file sizes
vary randomly:
Cluster    Avg Blocks        Max Files on    Blocks Wasted
Size       Wasted per File   456MB Disk      at 80% Full
  1              0              712,858              0
  2              0.5            356,429        178,214
  3              1.0            237,619        237,619
  4              1.5            178,214        267,322
  5              2.0            142,572        285,143
  6              2.5            118,810        297,024
  7              3.0            101,837        305,510
  8              3.5             89,107        311,875
  9              4.0             79,206        316,826
 10              4.5             71,286        320,786
 11              5.0             64,805        324,026
 12              5.5             59,405        326,726
 13              6.0             54,835        329,011
 14              6.5             50,918        330,970
 15              7.0             47,524        332,667
 16              7.5             44,554        334,152
Table 5-1 Wasted Space Due To Cluster Size Setting
Check the current cluster size for your disks using this DCL command:
$ SHOW DEVICES /FULL
When choosing the cluster size, consider first what is most important with respect to that disk: speed of
access (large cluster size) or maximum utilization of the available space (small cluster size). Then
consider what the average size of a file will be on that disk. If most files will be small and saving space is
important, use a small cluster size - perhaps half the size of an average file. If most files will be large and
speed is more important than saving space, use a large cluster size - perhaps 16. The maximum is
1/100th the size of the disk. Research shows that the typical disk has an average file size of eight blocks.
The command for setting the cluster size when initializing a disk is:
$ INITIALIZE /CLUSTER_SIZE=n diskname label_name
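One rough way to find the average file size on an existing disk is to let DIRECTORY total it up (the device name is an example):
$ DIRECTORY /SIZE=ALL /GRAND_TOTAL DUA1:[000000...]*.*;*
The "Grand total" line reports how many files there are and how many blocks they occupy; dividing blocks by files gives the average file size for that disk.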
Data Checking
When initializing a disk, do not use the /DATA_CHECK qualifier unless you really need the extra level of
safety it affords. /DATA_CHECK increases disk I/O by causing read-after-write operation and, optionally,
read-after-read to verify data integrity. That is to say, every time an application reads from or writes to the
disk, a follow-up read is performed and the data compared for accuracy. Data checking need not be
turned on for this feature to be used. Critical applications can use it any time, whether it is turned on or
not, by specifying data checking in their I/O routines. Having data checking turned on causes this feature
to be in effect for every I/O to that disk. If this is not what you want, make sure it is turned off. The default
is /NODATA_CHECK, so chances are you have not been using this feature anyway.
The command for turning off data checking when initializing a disk is:
$ INITIALIZE /NODATA_CHECK diskname label_name
Directory File Pre-allocation
The DCL INITIALIZE command allows you to pre-allocate space for directories. Unfortunately, it defaults
to only sixteen directories, so most disks require additional directory file space to be allocated. The
additional directories are created smack in the middle of production processing, disrupting application I/O
and scattering the newly-created directory files all around the disk. When you initialize a disk, estimate
how many directories will be created on it, and specify a slightly larger number with the INITIALIZE
/DIRECTORIES=n qualifier.
The command for preallocating directory space when initializing a disk is:
$ INITIALIZE /DIRECTORIES=n diskname label_name
File Header Pre-allocation
The DCL INITIALIZE command allows you to pre-allocate space for file headers in INDEXF.SYS.
Unfortunately, like INITIALIZE /DIRECTORIES, it defaults to only sixteen files, so most disks require
additional space to be allocated to INDEXF.SYS after the disk has been initialized. The extra space
allocated is often not contiguous to INDEXF.SYS, so this all-important file becomes fragmented right
from the start. When you initialize a disk, it is very important for consolidation of free space on your disk
that you estimate how many files will be created on it and specify a slightly larger number with the
INITIALIZE /HEADERS=n qualifier.
The command for preallocating space for file headers when initializing a disk is:
$ INITIALIZE /HEADERS=n diskname label_name
Index File Location
When initializing a disk, the disk's index file can be forced to the beginning of the disk (toward LBN 0),
the middle of the disk, the end of the disk, or to any specific block desired. I recommend that the index
file be placed at the end of the disk using the INITIALIZE /INDEX=END qualifier. This frees up the
maximum amount of space near the beginning of the disk, where OpenVMS will be allocating new files.
Having few or no files near the beginning of the disk guarantees the fastest possible file creation times
and increases the likelihood that new files will be contiguous when created.
The command for locating the index file when initializing a disk is:
$ INITIALIZE /INDEX=END diskname label_name
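Pulling the preventive measures of this section together, an initialization command might look something like the following. The device name, label and numbers are purely illustrative and should be sized for your own disk and expected workload:
$ INITIALIZE /CLUSTER_SIZE=3 /DIRECTORIES=500 /HEADERS=20000 /INDEX=END /NODATA_CHECK DUA1: USERDISK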
Preventive Measures After a Disk Has Been Initialized
Even if you cannot reinitialize your disks to obtain better performance, you can modify volume
characteristics to improve the situation somewhat. The following commands can be used after a volume
has been initialized and should be considered for use on all your disks.
Turn Off Data Checking
Use the DCL command SHOW DEVICES /FULL to find out whether a disk has data checking enabled for
read, write or both. If it does, you will see in the SHOW DEVICES display heading the words, "data
check on reads," "data check on writes" or both. Data checking increases disk I/O by causing read-after-write operations and, optionally, read-after-read to verify data integrity. If you have data checking
enabled and you do not really need the extra level of safety it affords, disable data checking with the
following DCL command:
$ SET VOLUME /DATA_CHECK=(NOREAD,NOWRITE) diskname
This is the default condition for a newly initialized disk.
Turn Off Erase On Delete
Use the DCL command SHOW DEVICES /FULL to find whether a disk has erase on delete enabled. If it
does, you will see in the SHOW DEVICES display footing the words, "erase on delete". Erase on delete
increases disk I/O by causing a system-specified pattern to be written into a file area when the file is
deleted. The pattern makes it harder to figure out what was in the file before it was deleted. Some sites,
particularly in the defense industry, require this for security purposes. If you have erase on delete
enabled and you do not really need the security it affords, disable it with the following DCL command:
$ SET VOLUME /NOERASE_ON_DELETE diskname
Increase Extend Quantities
Use the DCL command SHOW RMS_DEFAULT to find out what the RMS extend quantity is. The RMS
extend quantity determines how many blocks are allocated each time a file is extended. This should be
raised to a large value, such as 100. If a file, such as a log file, is extended many times, and the extend
quantity is small, the file is likely to become extremely fragmented because the (small) extents are
created in different places on the disk. If the extend quantity were large, the file would be less
fragmented because the pieces of the file would be larger. There is little adverse impact to this action, as
excess allocation is truncated when the file is closed. Use this DCL command to set the RMS extend
quantity:
$ SET RMS_DEFAULT /SYSTEM /EXTEND=100
Note that changing the volume extension quantity with the DCL command SET VOLUME
/EXTENSION=100 is overridden by the RMS_DEFAULT value.
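For reference, the related commands named in this section look like this (DUA1: is an example device name):
$ SHOW RMS_DEFAULT                      ! displays the current extend quantity
$ SET RMS_DEFAULT /SYSTEM /EXTEND=100
$ SET VOLUME /EXTENSION=100 DUA1:       ! the volume-level value discussed above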
Other Hints
With all these preventive measures, it is important to bear in mind that the whole purpose for
defragmenting is to speed system performance and responsiveness. While fragmentation is guaranteed
to ruin system performance, it is not the only thing that causes performance problems. If one of the
things covered in the two sections above is amiss, the performance degradation that results may be as bad
as fragmentation-induced problems or worse. Needless to say, if your purpose is to improve
performance, these remedies should be used in addition to defragmenting to get all the gains you can
get.
Unneeded data checking, for example, can double the number of I/O operations on a disk. If the extra
safety of data checking is not needed on your system, enormous performance gains can be had by the
one simple measure of disabling data checking.
Along these same lines, you should know that system disks are the worst place to store user files.
System disks work pretty hard for OpenVMS and storing user files on the system disk causes excess
disk I/O whenever a user is accessing those files. Worse, that user's disk I/O, since it is I/O to the system
disk, affects the performance of the entire system. The situation is compounded for a common system
disk, as two or more OpenVMS systems are working from the same system disk. Give yourself a big
performance boost by moving all user files off the system disk.
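One way the move might be done is sketched below; SMITH, DUA1: and the [USERS] tree stand in for your own names, and the AUTHORIZE change simply points the account at its new home:
$ BACKUP SYS$SYSDEVICE:[USERS...] DUA1:[USERS...]
$ MCR AUTHORIZE
UAF> MODIFY SMITH /DEVICE=DUA1: /DIRECTORY=[USERS.SMITH]
UAF> EXIT
Once you have verified the copies (and updated any logical names that point at the old locations), delete the originals from the system disk.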
One last recommendation I have is that you reduce subdirectory levels. Each time a file is accessed, the
file itself is not the only disk access required. The Master File Directory must be accessed, the main-level
directory file, and any subdirectory files as well. Each access requires a disk read, unless the directory is
already in the directory cache. Fortunately, directories are usually in the cache (often 90% of the time),
so this is a minor problem. Nonetheless, a definite performance improvement can be obtained by
reducing the number of subdirectory levels on a disk.
When Would You Not Want to Defragment a Disk?
The simple answer to this question is that the only time you would ever want to not defragment a disk is
when it is already defragmented. It is hard to imagine a circumstance when you would want to leave a
disk fragmented. But I'll do my best. . . .
The INDEXF.SYS file, as noted earlier, is deliberately fragmented into four pieces. Leave it that way if
you want your disk to be accessible after the next reboot. If the index file is fragmented into more than
four pieces, however, you can improve free space consolidation quite a bit by defragmenting it. The only
way this can be done is by BACKUP and RESTORE, but when you reinitialize the disk between
BACKUP and RESTORE, you must pre-allocate sufficient file headers (see the section File Header Pre-allocation earlier in this chapter).
OpenVMS contains support for locking a file at a specific LBN. Such a file is called a placed file.
Theoretically, in a realtime environment, you could lay out a disk with files placed at specific, known
points around the surface of the disk and thus guarantee the fastest possible transition of the disk head
from one file to the next. I say theoretically because I have never heard of anyone actually doing this and
it certainly would be a meaningless exercise for a disk in an on-line, interactive user environment. The
random motions of the head seeking data for multiple user applications would utterly defeat any pre-planned movement of the head amongst placed files.
Finally, I return to the same old argument: you would not want to defragment a disk when the cost of
doing so exceeds the return you would get in terms of performance. So after each defragmentation pass,
some little time must pass and some amount of performance degradation must occur for the cost of the
next defragmentation pass to be justified. Let me tell you, we are really getting down to the picky details
now. All I am saying is that you would not want to attempt to defragment a disk immediately after you had
just defragmented it.
The Cost of Fragmentation
To determine whether a defragmentation pass is worth it or even whether the purchase of a
defragmenter is worthwhile, you need to know the cost of fragmentation. There is no flat answer to this. It
is different for every system. A little-used system in a non-critical environment has a lower cost for
performance losses than one in a round-the-clock, mission critical application.
To sort this out for your system, this book includes an appendix (Appendix B) devoted to a step-by-step
calculation of the costs of fragmentation and justification of the cost of handling it.
Conclusion
The conclusion is, inescapably, that while you can do something to prevent fragmentation, the prevention
is incomplete and temporary. And, while there are solutions built in to OpenVMS, these solutions are
incomplete and tedious in the extreme. A defragmenter is the only solution that directly addresses and is
specifically designed to solve the fragmentation problem. Accordingly, the next chapter is devoted to a
detailed discussion of defragmentation by means of a defragmenter.
CHAPTER 6
GETTING THE COMPUTER TO CLEAN UP AFTER ITSELF
This chapter is devoted entirely to the defragmenter as a solution to the fragmentation problem.
History
Defragmenters are not new, as things go in the computer industry. The first one became available for
VAX/VMS in 1986. The concept was a simple one: get the computer to clean up after itself, saving the
System Manager from the drudgery of doing it himself. I hasten to point out that this simplicity and clarity
of purpose was not as obvious at first as it seems in looking back. In fact, the first defragmenter for
VAX/VMS took longer to run and had a higher risk of data corruption than backing up the entire disk to
tape and restoring it. It had the sole distinction of being the first software product designed specifically for
defragmenting disks and disk files. It achieved this distinction by virtue of being a conversion of a
defragmenter from another computer and operating system to VAX/VMS. To be first, it paid the price of
insufferably slow performance and severe risk of data loss. Needless to say, this product and the
company that produced it have long since vanished from the market.
Nevertheless, it was a start, and a crack in the walls of Digital Equipment Corporation's system software
fortress. Before 1986, Digital had a virtual monopoly on system software for the VAX. But, when it came
to defragmentation, Digital's official view was that on-line defragmentation was "not an easy problem."
Each year, a large group of Digital's customers, members of the Digital Equipment Computer Users
Society (DECUS), surveys its members to determine the things that most need improvement in the
OpenVMS operating system. This survey is called the System Improvement Request (SIR) ballot. In the
1985 SIR ballots, on-line disk compression (another word for defragmentation) placed first in the survey
by a large margin in the United States and second in Europe. Digital customarily responds to the top ten
items on the SIR ballot. In Europe, Digital's response took the form of a talk by Andy Goldstein, a top
guru for Digital's revered VMS Central Engineering group, in which he said:
We agree that's a fine idea. . . We certainly intend to take a good hard look at this problem. I have
some experience with it. I actually worked for RSX back in the old days when we had a gadget called
DCU which some of you may remember. I think DCU was a very good demonstration to us all that
this is not an easy problem. One of the interesting problems, other than simply interpreting the file
structure correctly, is coordinating the compression with ongoing file activity, and of course clusters
add a unique twist to that. Also, the performance problem is quite difficult because in-place disk
compression is to the best of my understanding an n-squared problem and so with some very large
disks you run into some very serious performance problems. Obviously you can do intelligent things
to make it run faster, and we are going to have to spend some considerable time and attention in
getting reasonable performance out of it.
--DECUS, Cannes, France, Sept 1985
In the U.S., Digital's official response appeared in the February 1986 DECUS newsletter:
While the importance of on-line compression is obvious in retrospect, this is a new request for us in
that it has never appeared so prominently in past SIR ballots. We do not have any current plans to
build such a facility. However, we understand its importance and we understand that it will increase
as time goes on.
There are a number of difficult problems to deal with, including coordinating ongoing file activity with
the compression process, and solving the performance problem on large disks (since disk
reorganization is inherently an n-squared order problem).
We will investigate this for future VMS development.
--PAGESWAPPER, Feb 1986
These statements by Digital were alarming. Here we had a very large group of customers on two
continents declaring this to be their single biggest problem with Digital's flagship product and Digital's
official response was that they had no current plans to deal with it!
It would appear that the roadblock in Digital's path was the fixed idea that on-line defragmentation was
an "n-squared order problem." This idea is basically a declaration of the impossibility of solving the
problem in an acceptable way. Fortunately for the customers, there is nothing like a public declaration of
impossibility from an expert to prompt entrepreneurs into action.
The second quote, in fact, was a driving factor in convincing more than one software company to throw
its hat into the defragmenter ring. It seemed to be assurance that Digital would not be coming along
anytime soon to steamroller the market with its own solution in the form of an "official" Digital
defragmenter, even in the face of huge customer demand. True enough, it was not until more than five
years later, on December 9, 1991, that Digital announced a disk defragmentation product. During those
five years, Digital repeatedly declared its policy on the subject of defragmenters to be that the possibility
of obtaining performance gains by use of a disk defragmenting utility was a "misconception." The official
Digital solution was:
Digital recommends that disk volumes be saved and restored on a regular basis, using the VMS
Backup Utility. When the volume is restored, all the files will be allocated contiguously, and the free
space on the disk will be collapsed into one area. Plus, a backup copy of the disk will exist.
--VMS Systems Dispatch, August 1989, Digital Equipment Corporation
The policy statement went on to speculate on the horrors that might befall a defragmenter user if
something went wrong.
This policy was based on unsound advice. The policy statement ignores the risk inherent in backup and
restore: that of unreadable tape. This is not an insignificant risk. What happens if you backup a disk to
tape, reinitialize it and then find that the tape is unreadable? You're up the river without a paddle, as they
say. That data is gone and it's gone for good. Not so obvious, but a serious concern is that the backup
and restore process is so tedious and so time-consuming that one-quarter of the System Managers
surveyed recently said they have never bothered to do it. How good can a solution be if it is never used
at all?
On top of that, Digital's policy statement lacked any serious testing to determine whether the facts on
which it was based were true. I know, because I did serious testing. The best that could be said for
Digital would be that Digital had lumped all defragmenters together and assumed that what is true for
one must be true for the rest. Don't get the idea that I think badly of Digital. I admire the company in
many ways. But Digital has over 95,000 employees and sometimes some of these employees don't
communicate very well with each other.
So, first Digital said it couldn't be done. Then, when it was done by others, Digital said its usefulness was
a misconception. Now Digital is fielding its own defragmenter product. You draw your own conclusion.
In any event, in 1986 the defragmenter happened. Now there was a tool for the specific purpose of
defragmenting disks. Best of all, an on-line variety of defragmenter was developed and did everything
itself, automatically.
An On-Line Defragmenter
Laying aside the off-line variety as an unsatisfactory solution, let's take a look at what an on-line
defragmenter is, exactly, and how it works.
An on-line defragmenter is distinguished from an off-line defragmenter by the fact that you do not have to
shut down the system, kick the users off or take a disk out of service to use the defragmenter. An
automatic on-line defragmenter goes a step further and includes a mechanism for determining when to
defragment. The non-automatic version is manual in this regard - it requires the System Manager to
decide when to defragment. The automatic on-line defragmenter has some mechanism for determining
when to defragment - whether by sensing the state of fragmentation or by measuring the degree of
fragmentation between defragmentation passes or just by waiting a certain time and then cleaning things
up again.
Ideally, the on-line defragmenter keeps a record of how badly fragmented the disk is every time it makes
a defragmentation run and, based on that information, increases or decreases the time intervals between
runs.
This is not a simple problem. Perhaps the single greatest contributing factor to the problem is the fact
that OpenVMS includes no mechanism for determining how often a particular file is accessed. If you
could find out how often a file is accessed, you would know how critical that file is to system performance
and thus how much attention the defragmenter should give to keeping that particular file contiguous.
Another important factor is the lack of any mechanism to determine whether a file has been accessed at
all. Without either mechanism, we are reduced to looking mostly at indirect measures of the impact of
fragmentation on the system. These measures are described in Chapter 3, What's Wrong With Fragmentation?
Once it has been determined, by whatever mechanism, that it is time to defragment a disk, the
defragmenter has to determine whether defragmentation can be done safely at that time or whether it
should wait for a better time to do it. It also has to determine whether the defragmentation activity would
degrade system performance unacceptably at that time. It even has to check whether defragmentation
will do any good; maybe the disk is in such good shape that defragmentation would be a waste of
resources. In this case it would just wait for a while longer and check again later on.
When the defragmenter has determined that the time is right, the next question is, "What files should be
defragmented?" On even the most badly fragmented disk, some files are contiguous. Attempting to
defragment these would be a waste of time and resources, so some means is needed of detecting them
and causing the defragmenter to skip processing those files - all with a minimum of overhead, of course.
Other files (such as INDEXF.SYS) must not be defragmented because to do so would interfere with the
proper operation of the operating system.
Then, amongst the files that should be defragmented, some determination must be made as to which
should go first. Should the files be processed in order of appearance in their directory files? Should the
most fragmented files be processed first? Or is some other order best? Does it matter at all?
When a file is selected for defragmenting, the defragmenter has to determine where on the disk to put
the newly-created defragmented file. The wrong location might do nothing to improve the condition of
free space fragmentation or might even make it worse. Some intelligence is required to choose the best
place on the disk for the defragmented file, keeping in mind the gaps that will be created when the
fragmented version is deleted.
Then there is the question of what if no single place on the disk is suitable? Could the file be split into
two or more pieces and still be better off than in its present fragmented state? Maybe. Maybe not.
Suppose the disk is so messed up that it is going to take two or more defragmentation passes to get the
disk into good shape. How does this affect decisions on where to put a defragmented file? Could it
profitably be put into a worse position on the first pass, anticipating movement to its ideal position in a
subsequent pass?
When a new location is selected, how exactly should the copying be done? Should the file be copied
directly into the new location or should it be copied first into a temporary location and then moved to its
final destination? And how do you deal with user attempts to access that file while it is in the middle of
being relocated? What if someone attempts to modify the original file after the copy has been made, but
before the new file formally takes the place of the old one? And, not the least of our worries, what if the
system goes down right in the middle of all this?
An automatic on-line defragmenter also has a quality control problem. How can it be sure the file was
copied correctly and users can now access every bit of the new file exactly as they did the old?
These are the obvious problems in the simple case. No mention has been made of problems unique to
multi-header files, files that span disk volumes, or files that are held open by the system on a complete
bypass of the usual procedures so there is no way to tell whether a file is in use or not.
OK, gentle reader, the scary part is over. I hope you are still reading so you can receive the news that
there are solutions to all these problems. My purpose in rattling off these problems is to show you that
defragmentation is a complicated undertaking, that it has been thought through and we defragmenter
vendors are not playing around with your irreplaceable data with our heads stuck in the sand. The
important thing to know is that the computer system itself has the answers to these questions within it or,
at least, it has the data from which answers can be formulated.
An automatic on-line defragmenter, then, is one which uses the data already available within the
computer, without the need for operator intervention, to determine when to defragment disks and disk
files, which files to defragment, the order in which to defragment them, where to put the new, contiguous
files, and whether to defragment a particular file completely or only partially - and which does all this
without interfering with user access to the disk and disk files, with absolute, 100% guaranteed data integrity.
Yes, this can be done and, at this writing, is being done on tens of thousands of OpenVMS systems
around the world.
Done right, this solution to fragmentation requires no attention from the System Manager or operations
staff at all. It is a complete elimination of fragmentation as a problem for that OpenVMS system. Such an
automatic solution to a problem inherent in the operating system is called an operating system
enhancement, as opposed to the manual, tool-variety solution, which is called a utility.
Safety
A good on-line defragmenter does not just provide a means for recovery of user data in the event of a
system crash during the defragmentation process; it actually processes files in such a way that no data
can be lost. It is possible and practical to create the new, defragmented version of the file, verify its
accuracy and replace the old version with the new in between user accesses to the file, all the while
guaranteeing that directory and INDEXF.SYS file information refers to a flawless copy of the file. With
such a method, there is no window for error in which a file of user data can be lost, even when the
system crashes at the worst possible moment.
Apparently, it was concern about this potential window for error and uncertainty about its handling that
kept Digital out of the defragmenter business until 1991. In that year Digital incorporated a mechanism
into OpenVMS version 5.5 to relocate a file without any window for error. The mechanism adopted by
Digital, called MOVEFILE, is similar to the mechanism a leading defragmenter had been using since
1986. When MOVEFILE appeared, Digital's public concerns about the safety of on-line defragmenters
ceased, at least for those defragmenters that used the official Digital mechanism for moving files!
The solution is easily explained. Relocating a file on the disk for purposes of defragmenting is a multi-step process. Doing some of the steps without doing the rest can result in a file that is confused,
damaged or even lost. The solution is to isolate the critical steps that must be all completely done or
none done at all and treat these as a single step. Such a group of steps treated as a unit is called a
primitive. In version 5.5 of OpenVMS, this operation is called the movefile primitive. It moves a file from
one location on a disk to another, guaranteeing that all the steps necessary to move the file will be fully
complete, or none of the steps will be done at all. Thus, you can be sure that the file is either fully moved
intact or remains fully intact at its old location. No in-between state can exist. Therefore, even if the
system crashes, you can be confident that all your data still exists without corruption of any kind.
Unmovable Files
Some files are unmovable, either because the OpenVMS operating system depends upon the file being
at a fixed location or because an application program has fixed the file at a certain location and requires
it to remain there. The defragmenter must take care to detect such files and leave them where they are.
It is worth noting, however, that programmer errors in applications and even in certain OpenVMS utilities,
such as the MAIL facility, sometimes result in a file having the "fixed placement" attribute when there is
no reason or intention for that file to remain fixed in place. So how do you know whether a file is
supposed to be placed or not? The rule I use as a System Manager is that no one has any business
fixing files in place except the System Manager; at least not without his or her knowledge and
permission. So, if the file is not an OpenVMS system file and you, as the System Manager, did not put it
there, the file is erroneously marked "placed" and you can feel free to move it.
This is not a very safe rule for a defragmenter, however, and any defragmenter should honor file
placement control just in case someone really does need that file to be in that exact place.
System Disks and Common System Disks
System disks present special problems for a defragmenter and common system disks are even harder to
deal with. Until recently, there were exactly nine files on an OpenVMS system disk that could never be
defragmented while the system was up. Now there are many more, though the number varies from
system to system. The only ways to process these untouchable files are to shut down the system and
use Stand-Alone Backup to backup and restore the system disk, or to arrange for the system disk to be
accessible from a VAX or Alpha AXP as a user disk and defragment the disk off-line.
These are not usually viable options, so system disks were originally excluded from processing by
defragmenters. Some defragmenters skirted the issue by excluding all files in a system root directory,
processing only files that reside on the system disk outside those directories reserved for the OpenVMS
operating system. The remarkable aspect of this is that user files residing on the system disk probably
cost more performance than moderate levels of fragmentation. The System Manager could get a bigger
performance boost by moving those files off the system disk than by defragmenting them, bigger
perhaps than defragmenting all the fragmented user files on user disks.
A good defragmenter knows exactly which files can be moved and which cannot, so it can defragment a
system disk just as freely as any user disk without risk to the system. The same is true for a common
system disk, where the system files of two or more different systems reside, though extra care must be
taken by the defragmenter to ensure that another system's unmovable files are left in place even though
they will not appear unmovable to the system on which the defragmenter is running.
Quorum Disks
A quorum disk is one which substitutes for a VAX or Alpha AXP computer, acting as a node in a cluster.
The reasons for this are complex and not important to defragmentation. The important thing is that, as on
a system disk, certain files on a quorum disk must not be moved. The defragmenter has to take this into
account.
Performance
The main reason, if not the only reason, for defragmenting a disk is performance. If you know nothing of
fragmentation, you can become quite perplexed watching a system's performance become worse and
worse, week after week, month after month, with exactly the same hardware configuration, the same
software applications and the same user load. It's almost as if the machine left the factory with planned
obsolescence built into it. How can the performance of the exact same system be so much worse if
nothing has changed? Well something did change: the files and disk free space fragmented with use.
The proof? Defragment the disks, and the system performs like new.
The sinister side of this is that fragmentation occurs so gradually that you might not notice the creeping
degradation. If system response worsens by only a fraction of a second each day, no one is likely to
notice from one day to the next. Then, weeks or months later, you realize that system response is
intolerable. What happened? Your system has caught the fragmentation disease.
Disk Access
The first rule of performance management is that the cure must not be worse than the disease. This is
the rule that killed off-line defragmenters. Here's how it works.
Let's say, for the sake of argument, that your system is losing 10% of its performance to fragmentation.
That is to say, jobs take 10% longer to run than they should or, put another way, only 90% of a day's
work can get done in a day. Ten percent of a 24-hour day is 2.4 hours. The solution to your
fragmentation problem has to consume less than 2.4 hours per day or it just isn't worth it.
Seems simple, doesn't it? Well, shutting down the system or taking a disk out of service to defragment it
is a 100% degradation of performance. Performance just doesn't get any worse than "the system is
down." So an off-line defragmenter that costs you three or four hours of computer time a day is more
costly than the losses to fragmentation. The cure is worse than the disease.
The computer resources consumed by a defragmenter must be less, much less, than the performance
losses due to fragmentation. The best way to violate this rule is to defragment using a method that
requires taking the disk out of service. So a good defragmenter works on a disk while the system is up
and while the disk being defragmented is being accessed by user applications. After safety, this is the
most important feature of a defragmenter.
File Availability
A secondary aspect of this same disk access feature is that the files on the disk must be available to
user applications. It is not enough to allow access only to the free space on the disk for the creation of
new files. User applications must be able to access existing files as well. And while the defragmenter
may be accessing only a single file out of perhaps 10,000 on a disk, guess which file some user's
application is most likely to want to read? Yes, it is the very file that happens to be undergoing
defragmentation at that moment. Murphy's law strikes again.
So an on-line defragmenter must assume that there will be contention for access to the files being
defragmented. Other programs will want to get at those files and will want to get at them at the same
time as the defragmenter. The defragmenter, therefore, must have some means of detecting such an
access conflict and responding in such a way that user access is not denied. The defragmenter has to
give way. Technologically, this is tricky, but it can be done and is done by a good defragmenter.
Locating Files
Another aspect of defragmenter performance is the amount of time and resources consumed in finding a
file to defragment. Scanning through some 10,000 files by looking up file names in directories and
subdirectories is out of the question. The time it takes to do this is a blatant violation of Rule One - it
outweighs the performance gains likely to be obtained by defragmenting.
A much better way to rapidly find files for defragmenting is by reading the INDEXF.SYS file directly. The
index file contains the file headers for all the files on the disk and within each file header is contained all
the information a defragmenter needs to know about the state of fragmentation of a file. Specifically, the
header tells how many fragments there are, where each is located on the disk and how big each one is.
So a defragmenter can zip through the index file, picking out those files that need defragmenting,
consuming typically only one disk access per file checked. Better yet, by reading several headers at
once, multiple files can be checked for fragmentation with each disk access. A good defragmenter uses
the index file to find files to process.
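You can see this header information for any single file yourself with the DUMP command used earlier in this book; the retrieval pointers in the map area are the file's fragments (the file name here is only an example):
$ DUMP /HEADER /BLOCKS=END:0 DUA1:[DATA]BIG_FILE.DAT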
The Defragmentation Process
After a file has been selected for defragmentation, the overhead involved in the defragmentation process
itself can be significant. If the file is large, it can be very significant. After all, it is usually necessary to
copy the file in its entirety to make it contiguous. As many as 200 disk accesses may be required to copy
a 100-block file (100 reads and 100 writes). These two hundred disk accesses at 25 milliseconds apiece
would consume five seconds. With this kind of overhead, processing even a fraction of 10,000 files on a
disk holding hundreds of megabytes can be a time-consuming activity. Fortunately, the OpenVMS file
system is more efficient than these figures would imply. Only the smallest disks, for example, have a
cluster size of 1, so disk reads and writes generally move 2 or 3 or more blocks at once. Further, regular
defragmentation holds down the amount of activity required. It is worth noting that performance worsens
geometrically as the degree of fragmentation increases, so catching fragmentation early and
defragmenting often requires fewer resources overall than occasional massive defragmentation.
The defragmenter itself can do a lot to lessen the impact of defragmentation overhead. A throttling
mechanism, for example, can reduce defragmentation I/O during times of intense disk activity and
increase it during slack times. This mechanism gives the appearance of greatly reduced overhead by
scheduling the overhead at a time when the resource is not needed anyway. Using idle time in this way
can make the defragmenter invisible to users of the system.
Perhaps the worst source of excess overhead for a disk defragmenter is the attempt to analyze an entire
disk before defragmenting and plan a "perfect" defragmentation pass based on the results of this
analysis. The idea is that a defragmenter can calculate the ideal position for each file, then move each
file to the best position on the disk. This is a holdover from the off-line defragmenter days and, besides
carrying the risks described in Chapter 5, it is enormously expensive in terms of overhead. Such an
analysis requires examining literally every file on the disk. On top of that, the analysis becomes obsolete
instantly if there is any activity on the disk other than the defragmenter.
A good defragmenter, then, should approach the process one file at a time and not require the overhead
of analyzing every file on a disk in order to defragment only a few files.
Basic Functionality
After safety and performance, you should look for basic functionality in a defragmenter. The most basic
functionality is the decision of what to defragment and what to leave alone. Not in all cases is it desirable
to defragment everything. Some selectivity is required.
A defragmenter has to exclude from its processing certain system files, like INDEXF.SYS. It should be
wary of placed files and files that have allocation errors. It should also have the capability of excluding a
list of files provided by the System Manager. You might also look for the ability to include certain files in
processing (or exclude "all files except ______") and possibly the ability to force immediate
defragmentation of a particular file or group of files.
Disk Integrity Checks
Perhaps the most important basic functionality of a defragmenter is determining whether a disk is safe to
defragment or not. It is possible, even commonplace, for a disk to get into a state where the data on the
disk is not exactly where the file headers in the index file indicate it should be. When this occurs, it is
extremely important for the matter to be corrected before any file involved is deleted, as deleting a file
(using the erroneous information in the header from the index file) might cause the wrong data on the
disk to be deleted! A good defragmenter must detect this condition and alert the System Manager to it so
it can be corrected before defragmentation begins.
It is also possible for a good defragmenter to detect and isolate certain types of problems on a disk and
avoid those areas while continuing to safely defragment the rest of the disk.
Frequency
How often should you defragment a disk? "Often enough so performance does not suffer noticeably," is
the simple answer. Of course, by the time your system's performance is suffering "noticeably," it's too
late, so this is not a workable answer.
To answer this question with a numeric quantity, like "every week" or "every two weeks," you have to
know how long it takes for fragmentation to build up to a level where performance suffers noticeably. You
can use a disk analysis utility or a performance monitor to measure the level of fragmentation on your
system periodically, perhaps daily. Then, when performance begins to suffer noticeably, you can take
note of what level of fragmentation you have. Let's say this happens when fragmentation reaches an
average of 1.1 fragments per file (10% fragmentation). Thereafter, you can periodically measure
fragmentation and when it gets to, say, 1.05, defragment the disk.
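If your analysis tool reports the total number of files and fragments, the threshold check is simple enough to put in a small DCL command procedure. Here is a minimal sketch; the symbol names and the figures assigned to them are illustrative stand-ins for whatever your tool reports:

$ total_frags = 1050    ! total file fragments reported by your analysis tool
$ total_files = 1000    ! total files on the disk
$ pct = ((total_frags * 100) / total_files) - 100
$ WRITE SYS$OUTPUT "Fragmentation is at ''pct' percent"
$ IF pct .GE. 5 THEN WRITE SYS$OUTPUT "Time to schedule a defragmentation pass"

With these numbers, 1,050 fragments spread over 1,000 files works out to 1.05 fragments per file, or 5 percent - the trigger level suggested above.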
An automatic on-line defragmenter includes a mechanism to measure fragmentation and schedule
defragmentation passes accordingly. The ideal automatic on-line defragmenter would detect
performance drains attributable to fragmentation and eliminate the causes before the drains became
noticeable.
Full Disks
It is one thing to ramble on about the workings of a defragmenter in an ideal, laboratory environment, but
it is quite another thing to see one working in the real world. One of the tricks played on us System
Managers in the real world is full disks. Somehow, despite our best efforts, disks that really ought to
remain 80% full or less drift up to 98%, 99% or even 100% full. Sure, you can spot this and take
appropriate steps to handle it, but what happens to your defragmenter during that critical window of time
between the disk going 100% full and your clearing off some space? Look for a defragmenter that
survives this circumstance intact and leaves every bit of your user data equally intact. A defragmenter
can't do much in the way of defragmenting with zero free space on the disk. The point is, if it can't
defragment, it shouldn't consume overhead either. So a good defragmenter should do nothing at all
when there is nothing useful it can do.
Large Files
Another side of the same coin is the fragmented file that is larger than the largest free space on the disk.
Suppose, for example, you have 10,000 blocks of free space, all in one place, but there is a 12,000 block
fragmented file. How does the defragmenter deal with that?
Older defragmenters used to rely on scratch space on a second disk to handle this problem, but that
proved so unreliable that it has disappeared as a practice. Some defragmenters don't deal with the
problem at all; they just ignore the file. A good defragmenter will partially defragment the file, giving you
the best result it can within the constraints of space available, and then return to the file later for
complete defragmenting when sufficient space has been freed up.
Always-Open Files
Another one of those real world tricks that doesn't show up in the test lab is the file that is held open all
the time, leaving no "downtime" for that file in which to defragment it. Database files are prime
candidates for this trick, particularly large database files. And why not? That big database probably
contains most of the data that justifies the computer's existence. It ought to be in use around the clock. A
defragmenter needs to take such files into account and provide a means of dealing with them safely.
Besides the always-open file, there is also the one file that a user application happens to need at the
instant the defragmenter is working on it. What happens in that case? Does the defragmenter give way?
Does it even notice? Or does the application program trip, fail and abort with a file access error?
The minimum proper action is for the defragmenter to 1) notice that an attempt is being made to access
the file, 2) abort its own operation safely and quickly, and 3) try again later. The ideal defragmenter
would process files in such a way that no user application could ever falter from or even detect an
access conflict. In other words, the defragmenter should have enough control over system operation to
move the file at a time when no user is attempting access and in such a way that no attempted access
by an application would ever fail.
File Creation and Modification Dates
Another simple but important piece of basic functionality is the preservation of file creation and
modification dates. You can defragment a file quite easily by simply using the DCL command COPY
/CONTIGUOUS. If there is a free space of sufficient size available, DCL will make a contiguous copy of
the file for you. The problem with this method is that it gives the copy a new file creation date. You might
not care whether the date is changed or not, but the VMS BACKUP utility will. The next time you do an
incremental backup, the copied file will be saved even though it was saved on an earlier backup. The
reason is the new date given to the file by the COPY command. For a single file, this may be no big deal,
but clearly a defragmenter cannot go around changing file creation dates wholesale. Nor can the file's
modification date or date of last backup be changed. Either action would cause your incremental
backups to explode from relatively small savesets to ones rivaling full backups in size.
A good defragmenter should not change file creation dates, file modification dates, file backup dates or
any other information in the file header except the size and location of the extents (fragments) that make
up the file.
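You can see the date problem for yourself with a couple of DCL commands; the file names are illustrative:

$ DIRECTORY /FULL REPORT.DAT            ! note the creation and revision dates
$ COPY /CONTIGUOUS REPORT.DAT NEWREPORT.DAT
$ DIRECTORY /FULL NEWREPORT.DAT         ! the copy carries a brand-new creation date

The next incremental BACKUP will save NEWREPORT.DAT again solely because of that new date.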
Directory Files
Directory files never become fragmented, but they are otherwise just like any other file. Directory files do
fragment the disk's free space, however. Directory files present a special problem for a defragmenter in
that while a defragmenter has a directory file locked or held open for relocation, not only is that directory
file inaccessible to users, so is every file in that directory and every file in every subdirectory below it. To
access a file by name, a user application must go through the directory or directories containing that file's
name. If the directory is locked, the user application gets an access conflict error. If the user application
is not expecting an access conflict error or is not designed to deal with such errors, the application may
abort.
A good defragmenter is designed with this problem in mind and moves directory files without any
restrictions whatsoever on user access to files in that directory or its subdirectories. It is no solution to
just ignore directory files, as this leaves your free space badly fragmented.
Red Herrings
"Optimization" by File Placement
"Optimization" of disk access by deliberate placement of files in certain locations is a red herring - an
attempt to draw your attention away from the real issues of defragmentation and onto something else
entirely.
First of all, optimization has nothing to do with defragmentation. Defragmentation is the solution to the
problem created by fragmented files and disk free space. Your system is slow. The real reason it is slow
is that files and disk free space are fragmented. The solution is to make the files contiguous (not
fragmented) and group the free space together. That's it.
Where does optimization come in? Well, this is a different subject altogether. The concept of disk
optimization supposedly accelerates file access even when all the files are contiguous and all the free
space is grouped together. Disk optimization is an attempt to speed up file access by forcing certain files
to be permanently located in certain positions on the disk. The theory goes that if you put the
INDEXF.SYS file in the middle of the disk and group the most frequently accessed files around it, the
disk heads will generally have to travel a shorter distance than if these files were located randomly
around the disk.
There are some major holes in this theory. In fact, the holes are so major that I think the "optimization"
proponents either don't fully understand the OpenVMS system or are just using optimization as a
marketing gimmick. There are simply too many holes in the theory; let's look at them one at a time.
Hole number one: There is no standard, supported way on an OpenVMS system to tell which files are
most frequently accessed. In fact, there is no way to tell which files are frequently accessed or even
which files have ever been accessed. You can tell which files have been written and when they were last
written, but not when they were read. The only thing that comes close to providing this information is the
enabling of volume retention dates, but enabling this feature consumes more overhead than you are
likely to get back by "optimizing" file placement. The cure is worse than the disease.
Hole number two: Extensive analysis of real-world computer sites shows that it is not commonplace for
entire files to be accessed all at once. It is far more common for only a few blocks of a file to be
accessed at a time. Consider a database application, for example. User applications rarely, if ever,
search or update the entire database. They access only the particular records desired. Thus locating the
entire database in the middle of a disk is wasteful at best and possibly destructive as far as performance
is concerned.
Hole number three: File placement capability in OpenVMS was designed for the realtime laboratory
environment in which a single process has continuous control of the computer system. In such a system,
the time consumed by head movement from one particular file to another particular file can be critical to
the success of the process. The system designer can minimize that critical time lag by calculating the
ideal location for the second file in relation to the first and forcing the two files to exact locations. Then,
when the process has completed reading the first file, access to the second is effected with minimal
delay.
By comparison, consider the typical interactive user environment. Dozens or even hundreds of
interactive users might be logged on and active at any moment, running who knows what applications,
accessing innumerable files willy-nilly in every conceivable part of a disk. How can one even hope to
guess where the disk's read-write head might be at any given time? With this extremely random mode of
operation, how can a disk optimizer state flatly that positioning such-and-such a file at such-and-such an
exact location will reduce disk access times? It seems to me that such a statement is foolish and such
file positioning is as likely to worsen system performance as to improve it. Even if the two
conditions balance out at zero, the overhead involved gives you a net loss.
Hole number four: When you force a file to a specific position on the disk by specifying exact LBNs,
how do you know where it really is? You have to take into account the difference between logical block
numbers (LBNs) and physical block numbers (PBNs). These two are not the same thing. LBNs are
assigned to PBNs by the disk's controller. Disks supplied by Digital Equipment Corporation often have as
many as 10% more physical blocks than logical blocks. The LBNs are assigned to most of the physical
blocks and the remainder are used as spares and for maintenance purposes. You see, magnetic disks
are far from perfect and blocks sometimes "go bad." In fact, it is a rarity for a magnetic disk to leave the
manufacturer without some bad blocks. When the disk is formatted by Digital or by the customer, the bad
blocks are detected and "revectored" to spares. Revectored means that the LBN assigned to that
physical block is reassigned to some other physical block. This revectoring can also be done on the fly
while your disk is in use. The new block after revectoring might be on the same track and physically
close to the original, but then again it might not. Thus, all LBNs do not correspond to the physical block
of the same number and two consecutive LBNs may actually be widely separated on the disk.
So I ask again, "When you force a file to a specific position on the disk, how do you know where it really
is?" You may be playing probabilities and perhaps you should think twice before gambling with user data
and system performance.
Hole number five: Where is the "middle" of a disk? Obviously, no one is suggesting that the geometric
center of the round disk platter, like the hole in a phonograph record, is the "middle." Of course not. We
are talking about data storage. The middle is the point halfway between LBN zero (the "beginning" of the
disk) and the highest LBN on that disk volume (the "end" of the disk). Right?
Well, maybe not. We have already seen that LBNs do not necessarily correspond to the physical disk
block of the same number. But what about a multi-spindle disk (one with two or more sets of platters
rotating on separate spindles)? There are several different types of multi-spindle disks. Besides the
common volume sets and stripesets, there are also disks that use multiple spindles for speed and
reliability yet appear to OpenVMS as a single disk drive. Where is the "middle" of such a disk? I think you
will agree that, while the location of the apparent middle can be calculated, the point accessed in the
shortest average time is certainly not the point halfway between LBN zero and the last LBN. This halfway
point would be on the outermost track of one platter or on the innermost track of another - not on the
middle track of either one. Such disk volumes actually have several "middles" when speaking in terms of
access times.
There are even disks that have no performance middle at all. I am thinking of electronic (semiconductor)
disks, which have no heads and thus no head movement. With an electronic disk, all overhead
associated with "optimizing" file placement is wasted time and lost performance.
Hole number six: With regular defragmentation, a defragmenter needs to relocate only a tiny
percentage of the files on a disk; perhaps even less than one percent. "Optimization" requires moving
virtually all the files on the disk, every time you optimize. Moving 100 times as many files gives you 100
times the opportunity for error and 100 times the overhead. Is the result worth the risk and the cost?
Hole number seven: What exactly is the cost of optimizing a disk and what do you get for it? The costs
of fragmentation are enormous. A file fragmented into two pieces can take twice as long to access as a
contiguous file. A three-piece file can take three times as long, and so on. Some files fragment into
hundreds of pieces in a few days' use. Imagine the performance cost of 100 disk accesses where only
one would do! Defragmentation can return a very substantial portion of your system to productive use.
Now consider optimization. Suppose, for the sake of argument, that disk data block sequencing really did
correspond to physical block locations and you really could determine which files are accessed most
frequently and you really knew the exact sequence of head movement from file to file. By carefully
analyzing the entire disk and rearranging all the files on the disk, you could theoretically reduce the head
travel time. The theoretical maximum reduction in average travel time is one-quarter the average head
movement time, after subtracting the time it takes to start and stop the head. If the average access time
is 32 milliseconds (for an RA82 model disk) and 24 milliseconds of this is head travel time, the best you
can hope for is a 6 millisecond reduction for each file that is optimized. On a faster disk, such as the
RA71 (12.5 milliseconds), the potential for reduction is proportionately less - about 2 milliseconds. Taking
rotational latency into account, your savings may be even less.
Each defragmented file, on the other hand, saves potentially one disk access (32 milliseconds) per
fragment. That's over five times the optimization savings, even with the bare minimum level of
fragmentation. With badly fragmented files, the difference is astounding.
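For what it is worth, the "over five times" figure follows directly from the numbers quoted above; here is the arithmetic as a few lines of DCL, using those same figures:

$ head_travel = 24                     ! milliseconds of head travel in a 32 ms access
$ optimize_saving = head_travel / 4    ! theoretical best case: 6 ms per access
$ defrag_saving = 32                   ! one full access saved per fragment eliminated
$ ratio = defrag_saving / optimize_saving
$ WRITE SYS$OUTPUT "Defragmentation saves ''ratio' times as much per access"

DCL's integer division truncates the 5.3 to 5; either way, defragmentation comes out well ahead before optimization's own overhead is even counted.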
On top of all that, what do you suppose it costs your system to analyze and reposition every file on your
disk? When you subtract that from the theoretical optimization savings, it is probably costing you
performance to "optimize" the files.
The fact is that it takes only a tiny amount of fragmentation, perhaps only one day's normal use of your
system, to undo the theoretical benefits of optimizing file locations. While "optimization" is an elegant
concept to the uninitiated, it is no substitute for defragmentation, it is unlikely to improve the performance
of your system at all, and it is more than likely to actually worsen performance in a large number of
cases.
In summary, file placement for purposes of optimizing disk performance is a red herring. It is not
technologically difficult to do. It is just a waste of time.
The "Perfect" Disk
What should the end result of defragmentation be? What, exactly, is the product of a defragmenter's
efforts?
How about a perfect disk? Wouldn't that be reassuring, to know that your disks are "perfect"? A perfect
disk, in terms of fragmentation, is a thing of beauty. It is a disk which has each and every file in a
perfectly contiguous state, with every bit of free space all collected together in one spot, preferably at the
beginning (near LBN 0) of the physical disk.
This seems straightforward and well-grounded in sensible reasoning, yet there is quite a controversy
over the matter. Why? Well, there are other factors that need to be taken into consideration.
Some say that free space on the disk should not be organized at the beginning of the disk; that putting
the free space there does no good because new files are allocated from blocks pointed to by the extent
cache (blocks recently freed up by file deletions) instead of from repeated scanning of the storage
bitmap.
This may be true, but it is also true that the extent cache is loaded first from the storage bitmap and then
added to as files are deleted. It is also true that the location of blocks freed up by deletions is relatively
random across the disk. A defragmentation strategy that groups files near the beginning will merely
reinforce the random nature of the extent cache because holes caused by deletions will appear near the
beginning (as well as everywhere else) and the extent cache will be loaded with lots of little pieces. On
the other hand, if the defragmentation strategy grouped free space near the beginning, the extent cache
would be loaded initially with a large amount of contiguous free space. This would then result in newly
created files being more likely to be created contiguously in the first place and reduce the need for
defragmentation. In other words, performance would degrade less (remain high) through prevention.
I must admit that when we are talking about where to consolidate free space, we are splitting hairs. The
performance to be gained from consolidating free space in a particular area of the disk is slight, even
under the best of circumstances. Moreover, consolidation of free space is overrated. While badly
fragmented free space is likely to cause performance problems indirectly by forcing new files to be
created in a fragmented state, slightly fragmented free space does not affect performance at all. In the
absence of an absolute requirement for large contiguous files, there is no performance benefit
whatsoever to a single large contiguous free space over as many as a few hundred smaller free spaces.
Any resources expended consolidating a few free spaces into one large one are likely to be wasted. The
important number to look at is the percentage of free space that is consolidated into a few large spaces.
Some say that free space on a disk should be grouped around the index file in the middle of the disk.
Their argument is that, by placing free space near the index file, the disk head will have less distance to
travel to and from the INDEXF.SYS file when creating new files. By keeping the head near the middle
tracks of the disk, the greatest overhead factor in disk performance, head movement, is reduced.
This is certainly true under certain specific circumstances, but it is decidedly not true under others. For
example, while this technique is sensible for creating new files, what about accessing the existing files?
By relegating these files to the outside edges of the disk (the lowest and highest LBNs), the distance the
head must travel to access these files is increased. Should we assume that any file created before the
defragmentation will never be accessed again? Or that such "old" files are not accessed often enough to
matter? Surely somebody accesses data files that are more than a day or two old. Under a scheme such
as this, those data file accesses are going to be slowed, not speeded.
There is also the question of where the INDEXF.SYS file is located. This scheme assumes that it is
located in the middle of the disk. But OpenVMS allows for the INDEXF.SYS file to be located at the
beginning, at the end, or at any specific block location the System Manager might choose. What
happens to your system's performance if the INDEXF.SYS file is positioned at the beginning of the disk
and the defragmenter groups all the free space in the middle? Performance gets clobbered, that's what
happens.
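For reference, the position of INDEXF.SYS is set when the volume is initialized, and OpenVMS lets the System Manager choose it explicitly; the device name and label here are illustrative:

$ INITIALIZE /INDEX=BEGINNING DUA2: USERDISK     ! index file at the start of the volume
$ INITIALIZE /INDEX=MIDDLE DUA2: USERDISK        ! the default placement
$ INITIALIZE /INDEX=BLOCK:150000 DUA2: USERDISK  ! at a specific logical block

A free-space strategy that silently assumes one of these placements will behave very differently under the others.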
The problem with schemes like this is that they are based on theoretical assumptions rather than on real-world observation of disks in use on live computer systems. The worst assumption of all is that disks
have a "beginning," an "end" and a "middle." As noted earlier, we talk of disk locations in terms of LBNs,
or Logical Block Numbers. While a logical block ordinarily corresponds to a physical block, this is not
always so. The OpenVMS disk architecture allows for logical block numbers to be reassigned, if
necessary, to any physical disk block at all. This allows a disk to continue to be used even though one or
more physical disk blocks are unusable. The corresponding LBNs are merely reassigned to other
physical blocks. It also allows for widely varying physical disk architectures that can be treated identically
by the I/O subsystem of OpenVMS.
Take a multi-spindle disk as an example. A disk with a single platter on a single spindle, with only one
side used, is easy to discuss in terms of "beginning," "middle" and "end." Here is a diagram showing
such a disk:
Figure 6-1 Single Disk
Now here is a diagram of a two-spindle disk with LBN numbers and arrows indicating the "beginning,"
"middle" and "end."
Figure 6-2 Two-Spindle Disk
Note that the "middle" of this disk spans the innermost tracks of one platter and the outermost tracks of
the other. What will happen to head movement if the INDEXF.SYS file is placed at the "middle" of this
multi-spindle disk and the free space is all grouped on either side of it? It will be a disaster performance-wise.
Your disk might not look like this. But it also might not look like the idealized conceptual scheme used by
designers of such a system. The point is that, to be useful, the design of a defragmentation strategy must
be based on the logical architecture of the disk and must work well regardless of different physical
structures. A strategy that assumes one particular physical architecture may be devastating for other
types of disks.
Grouping free space near the logical beginning of a disk (LBN 0) is guaranteed to reduce time spent
scanning the storage bitmap for free clusters. It is also guaranteed to maximize the probability that newly
created files will be created contiguously or, at least, minimize the number of fragments in newly created
files, regardless of the physical architecture of the disk involved.
The final and worst problem with a "perfect" disk is that its perfection doesn't last. Moments after
achieving that exalted state, some user application has created, deleted or extended a file and the
perfection is history. You now have an imperfect disk that is well on its way to becoming less and less
perfect.
So just how valuable is that fleeting moment of "perfection"?
The True Goal Of Defragmentation
I submit that a perfect disk is the wrong goal. Elegance, beauty and perfection in disk file arrangement
are not the ideals to which a System Manager aspires. You cannot take them to management and show
them off as a demonstration of your value to the organization. You cannot even use them to keep
yourself employed. How many System Manager resumes include a section for "beauty, elegance and
perfection" of disk file arrangement? None.
The true goal, of course, is performance. Now there's something you can take to the bank. A system that
is so slow you can't log on this week is a system that will get you fired. A system that performs even
better than the expectations of users and management might just get you a raise or a promotion or both.
So let's talk about better performance of the system.
The mistake made by the "perfect disk" advocates is that of viewing a disk and even a VAX (or Alpha
AXP) computer system as a static, unchanging thing. Static, unchanging VAX systems don't need
defragmenters. They need a lift to the nearest scrap yard.
A real VAX or Alpha AXP system is dynamic, in motion, changing continuously - almost alive with
activity. Real perfection, in my mind, would be having all this activity streamlined for efficiency and
directed at the specific production goals of the organization and the individuals who make it up. Let's see
a computer system that can be pointed at a computational problem and vaporize that problem in a flash.
The culprit we are targeting is slowness, sluggishness, the "can't-help-you-now-I'm-too-busy-doing-something-else" of a mismanaged computer system. The ideal defragmenter would vaporize that
problem and give you the laser-fast precision instrument you expect.
Let us take a careful look, then, at exactly what factors are important to deliver this capability. What really
counts when it comes to defragmentation results?
Factor Number One: Safety
Your defragmenter must be safe. Above all other requirements, features and benefits, safety of user data
is the most important. One trashed disk of user data, one crashed system, even one lost user file can
outweigh all the performance benefits in the world. What good does it do to talk of performance
increases when the system is down and user data forever lost?
There must be no compromises with safety, either. "99% safe" is not a feature. "Able to recover most lost
data" does not go over too well with users and management. What you want is absolute, 100%
guaranteed safety; no chance whatsoever of any user data being lost at all.
Even the ability to recover lost data in the event of a catastrophe is dicey. The minute you hear that data
is lost, even temporarily, your confidence is weakened. The thread of recovery is too tenuous for you to
sit back, relax and coolly await the recovery of the lost data by the very tool that trashed it moments ago.
OK, maybe it's better than a poke in the eye with a sharp stick, but it's still not a very comforting situation.
Factor Number Two: Low Overhead
Earlier, we covered the troublesome defragmentation techniques that consumed more system resources
than they gave back - the "cure" that is worse than the disease. We answered the question, "How long
should a defragmenter take to do its job?" with "Less than the time and resources being lost to
fragmentation."
While it is a definite improvement for a defragmenter to spend 19% of the system's resources to get a
20% improvement in performance, it is far better for a defragmenter to spend only 2%
of the system's resources to get a 20% improvement in performance. As a rule of thumb, your
defragmenter should consume not more than two percent (2%) of the system. You can measure this by
using the DCL command SHOW PROCESS or looking in ACCOUNTING records, then dividing the
resources consumed (CPU time, for example) by the total resources available during the same time
period. For example, if the defragmenter consumed two hours of CPU time out of every 100, that would
be 2 divided by 100, or 2% of the available CPU time.
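The same arithmetic can be expressed as a few lines of DCL; the figures assigned here are the ones from the example above, and in practice you would substitute the CPU time shown by SHOW PROCESS /ACCOUNTING or the ACCOUNTING utility:

$ defrag_cpu_hours = 2     ! CPU time charged to the defragmenter's process
$ total_cpu_hours = 100    ! CPU time available over the same period
$ overhead_pct = (defrag_cpu_hours * 100) / total_cpu_hours
$ WRITE SYS$OUTPUT "Defragmenter overhead: ''overhead_pct' percent of available CPU"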
The ideal defragmenter would not only consume less than 2% of the system resources, it would take its
2% from resources that are otherwise idle, such as CPU idle time and unused disk I/O bandwidth.
Factor Number Three: Worst Problems Handled First
Sometimes, in a highly technical subject, it is easy to get so caught up in the details that one overlooks
things of major importance. Like the disk that is 99% full. Defragmenting such a disk is a complete waste
of time. Except in extreme cases, the performance benefits you might get from defragmenting such a
disk are insignificant when compared to the performance benefits available from simply clearing off 10%
or 20% of the disk. When a disk is very nearly full, OpenVMS spends so much time looking for free
space, waiting for the disk to spin around to bring a tiny free space under the head, and allocating files in
lots of little pieces all over the place, it is a wonder that any useful work gets done at all.
Similarly, as mentioned earlier, the defragmenter that works hard to consolidate every bit of free space
into a single contiguous area is wasting its time. Rigorous testing shows clearly that an OpenVMS disk
with up to 63 free spaces performs just as well as a disk with one large free space. So all the overhead
expended by the defragmenter consolidating free space into less than 63 areas is wasted. And it is your
computer resources that are being wasted.
A defragmenter should address itself to real, immediate problems instead of theoretical ones and deliver real solutions to them. The single most important piece of advice I can give on the subject of defragmenters is to distinguish between the idealized results obtained in a perfect laboratory environment and what can be expected in your system under real-world conditions.
Who Buys a Defragmenter?
According to Computer Intelligence Corporation, a respected source of information about the VAX
market, only 1% of all VAX sites had a defragmenter installed in October 1987. By October 1988, the
number had grown to 4%. By October 1989, it had shot to 11%. The survey was not done in 1990, but in
October 1991, market research showed 18% of all sites running a disk defragmenter.
Graph 6-1 Percentage Of VAX Sites With A Defragmenter
For comparison, the next graph shows the increase in disk capacity over the same time period (these
figures are also from Computer Intelligence):
Graph 6-2 Average Disk Capacity Per System (Megabytes)
Experienced System Managers: The average System Manager has three years' experience. Fully two-thirds of the defragmenters in existence are in the hands of the more experienced half of the System Managers. A much higher percentage of System Managers with eight or more years of experience use a defragmenter than System Managers with less experience.
Sites with many users: The average VAX site has 102 users. Two-thirds of all defragmenters are
installed at sites with 50 or more users. Twenty-six percent of sites with 200 or more users have a
defragmenter. That's substantially more than the 18% of all sites that have a defragmenter.
Sites with many disks: The average VAX site has 7.5 disks. (Don't you love statistics that imply that
someone has half a disk?) Sixty percent of all defragmenters are installed at sites with six or more disks.
Twenty-four percent of sites with 10 or more disks use a defragmenter. Again, substantially more than
average.
One more interesting statistic is that 62.5% of System Managers who have attended training classes at
Digital Equipment Corporation use defragmenters.
In looking at the distribution of defragmenters amongst VAX sites, it can be seen that defragmenters are
not distributed evenly across all sites. More sites than you would expect from an even distribution have a
defragmenter when:
• They have a VAX 6000, 8000 or 9000 series computer (the larger VAXes);
• They have over 500 users (the sites with heaviest usage);
• They have three or more VAXes (the larger sites); and
• They have six or more disks (the larger sites).
Specifically, survey results from Computer Intelligence show that the sites running the big VAXes have
26% to 33% more defragmenters than a random distribution would predict; sites with 500 or more users
are 36% more likely to have a defragmenter, and sites with three or more VAXes are 131% more likely to
be running a defragmenter.
Also, as you might expect, the survey results show that System Managers with the most experience and
training are substantially more likely to use a defragmenter than inexperienced, untrained System
Managers.
The conclusion is plain: the people who buy defragmenters are the more experienced and trained
System Managers, the ones with many users, many disks and many VAXes, and particularly those with
the big machines.
In surveying defragmenter buyers as to why they made the purchase, the overwhelming response is that
they had to handle the fragmentation problem, but backing up to tape and restoring each disk was far too
tedious and time-consuming. The defragmenter frees the System Manager from this unpleasant chore
and saves time so he or she can get more done.
Finally, it should be noted that ninety percent of defragmenter buyers place the safety of the product
above performance in importance, but three-quarters of them expect a performance improvement as a
result of defragmenting.
Who Does Not Buy a Defragmenter?
The people who don't buy defragmenters are the sites that have more disk space, by far, than they really
need. Files tend to be created contiguously in the first place, as there is plenty of free space to do so. By
keeping lots of free disk space available, these sites suffer very little from fragmentation.
Naturally, these tend to be very small sites. At a larger site, the cost of a defragmenter is a very small
portion of the computer budget. As we have seen from the Computer Intelligence data, inexperienced
System Managers don't buy defragmenters. These folks either don't understand the cause of their
system's slower and slower performance or they lack the expertise to demonstrate the problem and the
need for a solution to those who hold the purse strings. This book is the answer to both problems, the
former in the main body of the book and the latter in the appendices.
What Does the Future Hold for Defragmentation?
We have seen, much earlier in this book, that fragmentation did not just happen. Rather, it was
deliberately introduced as a solution to an earlier problem. The file structure for the OpenVMS operating
system and its predecessor, RSX-11, was purposefully designed to allow fragmentation so users would
not have the more serious problem of running out of file space prematurely. Then, as disk capacities
grew to proportions previously unimagined, fragmentation came to be a problem in its own right.
Reverting to a file system that allows only contiguous files is no solution for the nineties. The automatic
on-line defragmenter is the ideal solution for now. But what does the future hold? Can we envision what
disks might be like ten years from now? What new and even more capacious forms of storage might
come along and how would these impact the problem of fragmentation and its defragmenter solution?
We have already seen the introduction of another deliberate form of fragmentation: disk striping. With
striping, files are deliberately fragmented across two or more disks in "stripes" of data that can be
retrieved from the multiple disks much faster than they could be retrieved from any one disk. Extensions
of this technology to large arrays of disks could dramatically counter the fragmentation problem.
Electronic "disks" have made their debut and, with falling prices for semiconductor memory chips, could
become a viable form of mass storage. Even before seeing electronic storage broadly in use, however,
we will see more of the hybrids, which combine electronic storage with magnetic. All that is needed is
some mechanism for sorting out the performance-critical data from the non-critical and this system
becomes very cost effective. We are seeing this type of system now with data caching, particularly as
caches are built into disk controllers to provide better performance.
This path can be extrapolated to intelligent disk subsystems that determine for themselves where best to
store data and have the data all ready to return promptly when needed.
We can also envision a new file allocation strategy that is not sensitive to fragmentation. Thinking of files
as "flat" one- or two-dimensional objects leads us to think of the parts of files as being "close together" or
"separate." A collection of data can also be thought of as a three- or more-dimensional pile of associated
records that can be accessed in any old way. More elaborate indexing methods give us faster access to
the data. Larger and faster storage devices allow for more elaborate indexing methods.
This all culminates in a vision of data storage as a completely automatic mechanism without any files at
all. You put data in and you get information out. Don't ask me how it works, but that's where I think we
are headed. Naturally, without files, there can be no file fragmentation, so the problem is gone
altogether, no doubt to be replaced by some new, even more perplexing problem that will keep us
system programmers in business for a long time to come.
Conclusion
It should be clear by now that fragmentation is well understood and a good solution is available in the
form of an automatic on-line disk defragmenter. It should also be clear that a defragmenter designed for
a static laboratory environment won't cut the mustard in the real world where things are changing
continuously. It is vital both for performance and for the safety of your data that the defragmenter be able
to deal with a disk that is organized differently from defragmentation pass to defragmentation pass, with
files that appear and disappear and with heavy user loads at unexpected times. In choosing a
defragmenter, look for one that addresses known, demonstrable performance issues and not just
theoretical problems demonstrable only in a laboratory environment. I have taken a great deal of time in
this chapter to explain the ins and outs of defragmentation so that you will be well-informed and better
able to judge your needs and the best solution for you. I believe that the more you know about
fragmentation and defragmentation, the easier your job will be. How much do you need to know?
Enough to handle the problem once and for all and get on with more important and interesting things.
CHAPTER 7
THE ULTIMATE SOLUTION TO THE FRAGMENTATION PROBLEM
Disclaimer: This chapter is an unabashed advertisement for DISKEEPER, the defragmenter sold by
Executive Software. Rick Cadruvi and I conceived of, designed and developed DISKEEPER together
in 1986. We took great pains to create exactly what System Managers wanted. We were both
performing VAX management functions at the time and had years of experience managing VAXes.
More importantly, we did a lot of research to find out what System Managers thought about
fragmentation, how they dealt with it and how they wanted to deal with it. We designed our product to
be exactly what they wanted.
Through Executive Software, I have continued to stay closely in touch with System Managers - tens
of thousands of them. Therefore, I believe what I say to you now and I have good reason to believe it.
If you are offended by the fact that I own the company that sells the product I recommend in this
chapter, please don't read it. If possible, I request that you take note of my relationship to the
company and to the product, accept my offer to help and read on.
As noted at the beginning of the previous chapter, the first defragmenter became available for VMS in
1986. Within a year, there were seven on the market. During the next few years, defragmenter
competition and a skeptical marketplace weeded out the weakest of the products and one defragmenter,
DISKEEPER, from Executive Software, rose to dominate the field, achieving an installed base roughly
double that of all others combined. Obviously, this defragmenter had something going for it.
I believe that the reason for the success of DISKEEPER is that careful market research was done to find
out what System Managers needed and wanted most in a defragmenter, careful technical research was
done to determine whether that could be provided, and we then delivered to the System Managers all
they had asked for and more. Since that time, we have always tried to outdo ourselves, enhancing the
product to fit customer needs and take advantage of new technology, always striving for the seemingly
unobtainable goal of the utter elimination of fragmentation as a System Manager headache.
We have come very close, and we are not finished yet.
I have nothing to say about other defragmenters. Most of them have disappeared from the market, for a
variety of reasons. The ones that remain are offered by well-intentioned people who really believe their
product is the best for you. I know that it is the customer who decides and, in a free market, he who
serves the customer best wins. I want to win, but only if you win, too. By giving you everything you want
and more, at a fair price, with first-class service, we all win.
Design Goals
The driving requirement for DISKEEPER was that it run automatically, in the background, safely
reorganizing files as needed to keep a disk performing optimally, while users continued accessing files
on the same disk.
DISKEEPER was designed with the following goals in mind:
1. The product must be completely safe to use.
2. It must make OpenVMS operate more efficiently.
3. It should process any OpenVMS supported ODS-2 disk.
4. It should process live disks without interfering with user access to files on that disk.
5. It should operate while OpenVMS is running without affecting performance.
6. It should process system disks as well as user disks.
7. It should run without operator intervention.
The implementation of each of these design goals is discussed in detail below.
Goal 1: Safe to Use
The foremost design goal was to make sure that no data is ever lost - neither user data nor VMS internal
information. To accomplish this, the DISKEEPER proprietary method for relocating files was developed.
It uses the following criteria for accessing files:
• The contents of data files are never modified under any circumstances.
• Only one file is processed at a time, not the whole disk.
• Each processing pass is independent of the other passes.
• No information is stored on any other device or in a "scratch space".
• A file currently in use is not processed.
• DISKEEPER accesses a file in such a way that no user access can conflict with DISKEEPER during the critical portion of the relocation process.
• Read and write checks are used in all I/O to verify that the relocated file is a bit-for-bit duplicate of the original.
• File relocation is aborted if any error is encountered, leaving the file in its original state.
• File structure integrity is verified before any files are processed.
The program was designed to err on the side of caution. In other words, the program only moves file
information on the disk when it is absolutely certain that no data will be lost, including file attributes. The
only change to file attribute information is the physical location of the file on the disk. None of the file
dates are changed and no reserved fields in the header are used to store DISKEEPER information.
Placement control is not changed unless DISKEEPER is explicitly instructed to do so by the System
Manager.
If your system crashes while DISKEEPER is running, or if DISKEEPER aborts abnormally, the worst that
can happen is that some empty disk blocks may end up marked allocated when they are not part of any
file. DISKEEPER properly deallocates any such blocks resulting from DISKEEPER interruption.
With OpenVMS V5.5, Digital introduced a mechanism for moving files that is guaranteed safe by Digital.
This mechanism, called the MOVEFILE primitive, only moves a file if the following conditions are met:
• The program has write access to the file.
• The file is closed.
• MOVEFILE operations are not disabled for the file by the DCL command SET FILE /NOMOVE.
• The operation is not interrupted.
• The source and target locations are on the same disk.
When DISKEEPER is run on OpenVMS V5.5 or higher, you may select either of these methods of
relocating files - proprietary or MOVEFILE.
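If there are files you never want relocated by the MOVEFILE mechanism - a carefully placed database file, for instance - the DCL command mentioned in the list above can mark them; the file name is illustrative:

$ SET FILE /NOMOVE DUA1:[DATA]ORDERS.DAT    ! MOVEFILE requests on this file will be refused
$ SET FILE /MOVE DUA1:[DATA]ORDERS.DAT      ! permit relocation again later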
Goal 2: Make OpenVMS More Efficient
When a file is moved by DISKEEPER, it is made contiguous or, at the very least, less fragmented. If it is
already contiguous, the file is not moved unless moving it would markedly improve the arrangement of
free space on the disk.
With plenty of contiguous free space, file creations are faster and new files tend to be created
contiguously, or nearly so. To demonstrate this, try copying a large file on a non-DISKEEPER disk (use
the DCL COPY command), then do the same on a disk processed by DISKEEPER (or run DISKEEPER
on the same disk and COPY the same file). Use the DCL DUMP /HEADER command to examine the file
headers of the copied files. You should see fewer map pointers for the file created on the DISKEEPER
disk than on the other.
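Concretely, the comparison looks like this; the device and file names are illustrative:

$ COPY DUA0:[DATA]BIG.DAT DUA0:[DATA]BIGCOPY.DAT
$ DUMP /HEADER /BLOCKS=COUNT:0 DUA0:[DATA]BIGCOPY.DAT
$ ! Count the retrieval pointers in the map area of the header dump, then
$ ! repeat the COPY and DUMP on the DISKEEPER-processed disk and compare.

(The /BLOCKS=COUNT:0 qualifier simply suppresses the dump of the file's data blocks so that only the header is displayed.)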
All this adds up to better performance because files are created faster and files can be accessed more
quickly because they are contiguous.
Note that the goal was not "to make every file contiguous" or "to combine all free spaces into one large
contiguous free space." Disk perfection is not a requirement to get better performance from OpenVMS.
In fact, a perfect disk will perform no better than a nearly perfect disk. While a single giant contiguous
free space will allow the creation of a single giant contiguous file, it does no more for performance than a
small number of relatively large contiguous free spaces. It is not the difference between one 100,000
block space and four 25,000 block spaces that makes a difference in performance; it is the 30,000 three-block spaces that really hurt.
Nonetheless, DISKEEPER will do an excellent job of consolidating free space on your disks. But do not
use this as a yardstick for measuring defragmentation benefits; it is the number of fragments into which
your files are broken that really impacts disk I/O performance.
How much better will performance be? That depends on your particular circumstances. If your system is
not I/O bound, the gains may be slight. If it is, the gains should be dramatic. It is not unreasonable to
expect a 20% improvement in CPU utilization and disk I/O from even a well-managed system. Some
sites may achieve a much greater improvement.
Goal 3: Process any OpenVMS ODS-2 Disk
This design goal was accomplished by using OpenVMS itself to do the "diskeeping" wherever possible.
DISKEEPER supports the entire range of OpenVMS ODS-2 disk types: system disks, common system
disks, quorum disks, user disks, volume sets, stripesets and shadow sets. DISKEEPER supports fixed,
removable, and floppy disks. It works in clusters whether the disk is on a local controller, an HSC, MSCP
served, or LAVC-style MSCP served. It can deal with empty or full disks and anything in between.
DISKEEPER works with all Digital and third-party disk controllers.
DISKEEPER is designed for any Digital-supported configuration.
Note that system disks and common system disks really are processed. DISKEEPER does not merely
exclude all files in system-rooted directories. DISKEEPER actually processes all files on a system disk
except open files and a few reserved files that cannot be moved while OpenVMS is running from that
disk. The same applies to common system disks.
Goal 4: Process Live Disks Without Interfering With User Access To Files
As covered earlier, it is not acceptable to force users off the disk while defragmenting it. To do so would
be a case of the cure being worse than the disease. Access to fragmented files is better than no access
at all.
The only acceptable solution is to defragment on-line with users active on the same disk. DISKEEPER
was designed with this in mind, and accomplishes the task without compromise, primarily due to the
following features:
No File Access Conflict
During most of the time DISKEEPER is processing a file, it shares the file with any other users that may
access the same file. The last step of processing the file, however, involves locking the file for a very
brief period, the duration of two QIO operations, a matter of milliseconds. If another user requests a file
that DISKEEPER has locked, that request is suspended for the brief period until DISKEEPER releases
the file. Then the request is serviced. There is never an interruption of either process as a result of this
delay.
I/O Throttling
DISKEEPER limits its own I/O to the equivalent of disk I/O "idle time." This feature, especially important
for the MicroVAX RQDXn disk controller, makes the impact of DISKEEPER on the load of your VAX or
Alpha AXP virtually unnoticeable, even during peak levels of activity. This feature is particularly important
on any system where I/O to the disks is usually at or close to the maximum possible throughput.
Suspending defragmentation activity when users most need access to their data assures maximum
system performance.
Exclusion List
DISKEEPER gives the System Manager the option of excluding certain files from processing. The
Exclusion List is evaluated at the start of each set of multiple passes and the files specified (in the list)
are skipped over by DISKEEPER.
On-Line Directory Moves
DISKEEPER moves directory files, provided the directory is not open. This allows larger contiguous free
spaces to be made which, in turn, allows larger files to be defragmented by DISKEEPER, or created
contiguously by the user.
Caches Updated
DISKEEPER does take into account the file header cache, and makes sure that the file headers are
correctly updated so that no data is lost. The extent cache is not changed.
Open Files Ignored
Files that are always held open are not processed by DISKEEPER. These files can be made contiguous
safely only by DCL COPY /CONTIGUOUS, by backup and restore, or by closing the files so DISKEEPER
can process them. As long as the files remain open, they will be untouched by DISKEEPER.
Goal 5: Operate While OpenVMS Is Running Without Affecting Performance
Three steps were taken to assure that DISKEEPER overhead had the lowest possible impact on system
performance:
First, DISKEEPER is designed to be run as a detached process running at priority 2. With the typical
OpenVMS system running user jobs at priority 4 and batch jobs at priority 3, DISKEEPER will use only
CPU time that would otherwise be idle. Priority 1 remains available for even lower priority jobs that you
do not want to interfere with DISKEEPER.
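By way of illustration, a detached process at priority 2 can be created with ordinary DCL; this is only a sketch of the priority arrangement, not the actual DISKEEPER startup procedure, and the image and process names are hypothetical:

$ RUN /DETACHED /PRIORITY=2 /PROCESS_NAME="DEFRAG" -
      /OUTPUT=SYS$MANAGER:DEFRAG.LOG SYS$SYSTEM:DEFRAG.EXE

At base priority 2, such a process is given the CPU only when the priority 4 interactive jobs and priority 3 batch jobs have nothing to run.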
Second, advanced system programming techniques were used to write DISKEEPER, to assure the
highest possible performance. It uses QIOs for I/O instead of high-overhead RMS services, and it copies
a file only once - directly from its original location to the new location. No intermediate copies are made,
so no scratch space or second device is required.
Third, DISKEEPER includes a built-in I/O throttling capability. DISKEEPER monitors I/O on the disk
being processed and adjusts its own I/O accordingly. If the I/O rate increases, DISKEEPER reduces its
own I/O. If the I/O rate decreases, DISKEEPER raises its I/O level. This mechanism effectively limits
DISKEEPER I/O to the equivalent of disk "idle time."
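The principle behind the throttle can be illustrated with the per-device operation count that OpenVMS maintains; this is only a sketch of the idea, not DISKEEPER's internal mechanism, and the device name is illustrative:

$ before = F$GETDVI("DUA1:", "OPCNT")    ! I/O operations completed on the disk so far
$ WAIT 00:00:10                          ! sample over a ten-second interval
$ after = F$GETDVI("DUA1:", "OPCNT")
$ user_io_rate = (after - before) / 10   ! operations per second during the sample
$ ! A throttled defragmenter cuts back its own I/O as this rate rises and
$ ! raises it again as the rate falls back toward idle.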
As proof of its efficiency, DISKEEPER typically requires only a few minutes of CPU time per day to keep
an active 456MB RA81 disk defragmented. This constitutes overhead of a small fraction of 1%.
Goal 6: Process System Disks As Well As User Disks
A system disk by itself has little need for defragmentation because few files are ever created on the
system disk. The only files ordinarily created on the system disk are log files. These do not particularly
affect performance because they are rarely, if ever, read. Some sites, however, put user files on the
system disk, and small systems such as MicroVAXes sometimes have only one disk for both system and
user files. DISKEEPER can be run on such a shared system/data disk without having to shut the system
down and without making the system unusable during the processing.
DISKEEPER processes are automatically prevented from moving all system files that OpenVMS will be
expecting to find in a particular location on the disk. There are three different ways in which this is done.
First, any file that is currently open is not moved. In addition to open user files, this includes
INDEXF.SYS on every disk and such files as PAGEFILE.SYS and all SYS$MANAGER:*.LOG files
currently in use on a system disk. This includes installed images that are installed with the /OPEN
qualifier, such as License Management, Cluster Server, Audit Server, Logical Name Server, and many
other operating system components.
Finally, some files are excluded from DISKEEPER processing by file specification. Wild card file
specifications are used to look up and identify the specific files on each disk to be excluded in this
manner.
One system file is too critical to trust to exclusion by file specification. That is the boot image, VMB.EXE.
Because it is possible for the boot image to have a different file name, DISKEEPER identifies the file by
way of the boot block in the INDEXF.SYS file, rather than by file name, then excludes that file from
DISKEEPER processing. This assures that the boot image is 100% safe, regardless of its file name.
DISKEEPER, running on any CPU in a cluster with separate or common system disks, can process all
disks accessible by that node, including system disks.
Goal 7: Run Without Operator Intervention
Regardless of how much a defragmenter increases system performance, the System Manager has no
need or desire for the added problem of figuring out how to run the defragmenter and taking the time to
baby-sit it. System Managers need less work, not more. Accordingly, one of the primary design goals of
DISKEEPER was for it to do its job with little or no intervention by a human operator.
We accomplished that in our design so well that a System Manager can literally install the software, start
it up and just forget about fragmentation (and DISKEEPER) thereafter. DISKEEPER cleans up the
existing fragmentation and then prevents it from returning.
I remember calling up one of our customers to see how he liked the product. I was calling specifically to
find out how his life had changed now that he had had DISKEEPER on his three VAXes for six months. I
was particularly interested in this fellow because he was the System Manager for a Computer Aided
Design facility that depended so heavily on contiguous files that he had to back up his disks to tape and
restore them every night. I thought, if anyone would love my product, this would be the guy.
When I asked him about DISKEEPER, he at first didn't know what I was talking about! Then he
remembered and burst out laughing. "You know," he said, "I haven't spent even one evening in the office
since DISKEEPER took over the defragmentation chores." DISKEEPER is so automatic, he had
forgotten it was there.
How does DISKEEPER determine when to defragment a disk? It uses a heuristic formula, which means
a formula based on feedback from the real world. Each time DISKEEPER defragments a disk, it waits a
while and runs again. It compares the two passes and determines whether it had to work harder or not
as hard the second time. If it had to work harder the second time, then clearly it waited too long, so it
adjusts itself to wait a little less and work a little more often. If it had less work to do the second time, it
adjusts itself to wait a little longer between passes. The waiting between passes saves DISKEEPER
from incurring unnecessary overhead. This automatic mechanism keeps DISKEEPER running at just the
right frequency to keep your disks optimally defragmented all the time with the smallest amount of
system overhead.
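To make the idea concrete, here is a minimal sketch of that kind of pass-to-pass feedback. It is not DISKEEPER's actual code; the starting interval and the 25% adjustment factor are arbitrary figures chosen only for illustration.

    # Minimal sketch of the pass-to-pass feedback described above.
    # The interval and the 25% adjustment factor are assumptions for
    # illustration only; they are not DISKEEPER's actual values.

    def next_interval(hours_between_passes, fragments_moved_last_pass,
                      fragments_moved_this_pass):
        """Return the waiting period before the next defragmentation pass."""
        if fragments_moved_this_pass > fragments_moved_last_pass:
            # Worked harder this time: we waited too long, so wait less.
            return hours_between_passes * 0.75
        # Less work this time: we can afford to wait a little longer.
        return hours_between_passes * 1.25

    # Example: the second pass moved fewer fragments than the first,
    # so the interval is stretched from 4 hours to 5.
    print(next_interval(4.0, 120, 80))   # -> 5.0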
Special File Handling
Certain file types are processed by DISKEEPER differently from others. These include partial files, multi-header files, multi-volume files, placed files, directory files, INDEXF.SYS, page files and swap files. The
differences are explained below.
Partial Files
If a fragmented file cannot be made contiguous, DISKEEPER can make the file less fragmented by
partially defragmenting it. It uses the largest free spaces on the disk and moves successive file
fragments into these spaces. This feature allows DISKEEPER to process a file even when the file is
bigger than the largest free space on the disk. DISKEEPER uses this mechanism to process a file to
obtain the minimum number of fragments that can be achieved within free space constraints.
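As a rough picture of this "largest free spaces first" idea, the sketch below shows how a file bigger than any single free space can still end up in only a few pieces. It is a hypothetical greedy placement, not DISKEEPER's actual algorithm; all sizes are made-up block counts.

    # Hypothetical sketch: place a file into the largest free extents first,
    # producing the fewest pieces the free space allows. Sizes are in blocks.

    def partial_defragment(file_size, free_extents):
        """Return the sizes of the pieces, using the largest free spaces first."""
        pieces = []
        remaining = file_size
        for extent in sorted(free_extents, reverse=True):
            if remaining == 0:
                break
            used = min(extent, remaining)
            pieces.append(used)
            remaining -= used
        if remaining > 0:
            raise ValueError("not enough free space on the disk")
        return pieces

    # A 10,000-block file on a disk whose largest free space is 6,000 blocks
    # ends up in two pieces instead of dozens.
    print(partial_defragment(10000, [6000, 500, 4500, 250]))   # [6000, 4000]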
Multi-Header Files
Sometimes file fragmentation can become so bad that all the pointers to the pieces of a badly
fragmented file will not fit in a single Files-11 file header. When this occurs, OpenVMS allocates a
second file header for the same file and the file becomes known as a multi-header file.
When DISKEEPER encounters a multi-header file, it defragments the file segments that are associated
with each of the file's headers. Having done that, it cannot accomplish further defragmentation of a multi-header file because it cannot safely consolidate the segments of the file mapped to different file headers.
To consolidate two or more file headers would mean having to do multiple I/Os to the disk to complete
the header consolidation. DISKEEPER accomplishes all defragmentation using only atomic
(uninterruptable, indivisible) operations for critical actions such as updating file headers. This is not
possible with a multi-header file.
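The arithmetic behind multi-header files can be sketched as follows. The number of retrieval pointers a single header can hold is an assumed, illustrative figure here; the real Files-11 limit depends on the header's map area and pointer formats.

    # Rough sketch of why a badly fragmented file needs extra headers.
    # POINTERS_PER_HEADER is an assumed figure for illustration only.

    POINTERS_PER_HEADER = 70

    def headers_needed(fragment_count):
        """Each fragment needs one retrieval pointer; count the headers."""
        return max(1, -(-fragment_count // POINTERS_PER_HEADER))   # ceiling

    print(headers_needed(45))    # 1 header is enough
    print(headers_needed(200))   # 3 headers: the file is multi-header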
There are two manual methods by which you can consolidate multi-header files. The file can be copied
with either the COPY command or the BACKUP utility. Among the drawbacks of these two approaches
are:
• A new version of the file is created and mapped to another file header. This changes the File ID, creation date and other identifying attributes of the file. This could also cause interference with OpenVMS batch and print queues because they reference files by File ID.
• The disk must have enough contiguous free space for an extra copy of the file, at least temporarily.
• Either of these approaches may be very time-consuming, especially for large files.
DISKEEPER includes a Multi-Header Consolidate Utility (MHC). With MHC, the System Manager has a third and better method available for consolidating multiple header files. MHC protects the files from the risks of automatic consolidation in the following ways:
• The temporary situation of blocks multiply allocated to two file headers is very brief, and under the direct observation of the System Manager. Under normal circumstances, it is eliminated as MHC successfully completes consolidation of the file. In the rare instance where MHC is interrupted while this condition exists, the System Manager knows about it and has the means to correct it easily.
• Files that use Access Control Lists (ACLs) are safeguarded. On VMS versions prior to V5.2 such files are not consolidated by MHC, so their contents and controls are never jeopardized.
MHC allows the System Manager to consolidate all eligible multi-header files on a disk, one by one,
without the drawbacks of using COPY or BACKUP. This is true because:
• No new version of the file is created. All of the file's identifying attributes, including File ID and creation date, remain unchanged. There is no risk of interference with OpenVMS queues.
• No disk free space is used by this operation, because the data in the file is never copied or relocated by MHC.
• MHC handles each file in a fraction of the time of COPY or BACKUP. This is most noticeable with large files.
• Access Control Lists are not jeopardized.
Multi-Volume Files
DISKEEPER does not process a volume set as a set. Each disk in the volume set is processed
separately and defragmented as an individual disk. Files are not relocated from one volume in the set to
another.
A single file that spans two or more disk drives in a volume set, however, presents a particularly delicate
problem for relocation. Often, the spanning is deliberately planned because of the unusually large size of
the file. In this case, relocating the entire file to one disk may actually worsen performance. For this
reason, DISKEEPER compresses each component of the multi-volume file separately and retains the
component on its original disk volume. In other words, a multi-volume file remains a multi-volume file
after processing by DISKEEPER, but the portion of the file on each volume is normally made contiguous.
Placed Files
Placed files are files that are deliberately located at a particular place on the disk by the System
Manager. Usually, this is only done in a real time environment where file placement is critical for
maximum speed. On an interactive OpenVMS system, placement control is not beneficial and can even
worsen performance.
DISKEEPER leaves placed files where they are unless it is told to move them. Its Disk Analysis Utility
can be used to list the placed files on your disk, if any exist. Then DISKEEPER can be used to remove
the placement control from the files and relocate them as needed.
Directory Files
DISKEEPER moves directory files, unless forbidden by an override. As with any other files, directory files
are moved only if moving them would improve the arrangement of free space on the disk.
Some people believe that placing directory files near the physical middle of a disk enhances
performance. While this is true for some other operating systems, OpenVMS caches directories. If
properly tuned, the directory cache hit rate should be at least 90%, meaning that directories are
accessed from memory, not from disk. Therefore, the physical location of directory files on the disk is
irrelevant for optimizing directory lookup time.
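The arithmetic behind that claim is easy to see. In the sketch below, the 100-microsecond memory access and 30-millisecond disk access are assumed round numbers, not measured OpenVMS values; only the 90% hit rate comes from the discussion above.

    # Illustrative arithmetic only; memory and disk times are assumed figures.

    def effective_lookup_ms(hit_rate, memory_ms=0.1, disk_ms=30.0):
        """Average directory lookup time for a given cache hit rate."""
        return hit_rate * memory_ms + (1.0 - hit_rate) * disk_ms

    print(effective_lookup_ms(0.90))   # ~3.09 ms: disk position barely matters
    print(effective_lookup_ms(0.50))   # ~15.05 ms: now the disk dominates

With a 90% hit rate, shaving a few milliseconds of seek time off the one access in ten that goes to disk changes the average very little, which is why the physical placement of directories is not worth optimizing.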
If directory files are not moved, it is more difficult for DISKEEPER to make a large contiguous free space.
The free space tends to be broken up by immovable directory files.
INDEXF.SYS
INDEXF.SYS is used by OpenVMS not only for file headers but also as a container file for the OpenVMS
home blocks. These blocks are scattered in physically distant locations to maximize the probability that
one of them will be intact following a physical failure of the disk. Accordingly, it is neither possible nor
desirable to make the INDEXF.SYS file contiguous and DISKEEPER does not do so, nor does any other
means of defragmentation, such as backup and restore.
DISKEEPER holds INDEXF.SYS open for the duration of each defragmentation pass.
Page Files and Swap Files
PAGEFILE.SYS and SWAPFILE.SYS are not defragmented when DISKEEPER is run on-line. These two
files and their alternates should be created contiguously initially and should remain so.
Alternate page and swap files can be processed by DISKEEPER when they are not installed. When they
are installed, DISKEEPER detects them as unprocessable and skips over them, whether they are on the
system disk or any other disk.
Note: Fragmentation of PAGEFILE.SYS should not be confused with fragmentation of the record space
within that file. This latter form of fragmentation is reported by OpenVMS with the message "PAGEFRAG, pagefile badly fragmented, system continuing." The condition warned about by this message cannot be
resolved by defragmenting the page file, as it indicates that the page file is probably too small. The
condition can be temporarily alleviated merely by rebooting the system, which causes the page file to be
flushed and reloaded. To correct the condition permanently, it is necessary to extend the page file or
create a new one with sufficient space. DISKEEPER can be used effectively to create a contiguous free
space large enough for a page file of sufficient size.
The Impact of Moving Files
When DISKEEPER relocates a file on a disk, only the mapping pointers in the file header are changed.
The mapping pointers tell OpenVMS where the file is located on the disk. The file ID is not changed; the
creation, modification, expiration and backup dates are not changed; and no data in the file is ever
changed.
No reserved fields in the file header are used by DISKEEPER. The file in its new location is bit-for-bit the
same as before the move. No change is made to the file's allocation, either. Even if excess blocks are
allocated to the file, DISKEEPER leaves the allocation the same.
Only with this hands-off approach can you be confident that your data is safe.
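The point can be pictured with a small model. The structure below is a hypothetical stand-in for a file header, not the real Files-11 layout; it is meant only to show that relocating a file rewrites the (LBN, count) retrieval pointers while every other attribute, and every data block, stays exactly the same.

    # Hypothetical model of a file header; not the real Files-11 layout.
    # Relocating the file changes only the retrieval pointers.

    from dataclasses import dataclass, field

    @dataclass
    class FileHeader:
        file_id: int
        creation_date: str
        allocation: int                                 # blocks allocated; unchanged by a move
        pointers: list = field(default_factory=list)    # (start_lbn, block_count) pairs

    header = FileHeader(file_id=4711, creation_date="12-JUL-1994",
                        allocation=300,
                        pointers=[(1000, 100), (52000, 100), (8000, 100)])

    # "Moving" the file to a contiguous area: only the pointers change.
    header.pointers = [(200000, 300)]

    print(header.file_id, header.creation_date, header.allocation)   # unchanged
    print(header.pointers)                                           # [(200000, 300)]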
What Makes DISKEEPER Unique?
DISKEEPER is rich with features, but lives up to its reputation for being "elegant in its simplicity." By
using a simple approach, useful features are incorporated, yet system overhead is kept to a minimum.
Nearly twice as many VAXes are defragmented with DISKEEPER as with all other defragmenters combined.
DISKEEPER was designed with the basic assumption that the files on a disk are constantly changing. In
a typical OpenVMS timesharing environment, new files are being created, and existing files are being
accessed, updated, and extended by a large number of diversified users. DISKEEPER was designed to
operate under these conditions, without adversely affecting performance of the applications on the
system.
DISKEEPER is designed to run as a detached process, alternating between brief periods of
defragmenting the disk and long periods of inactivity. It automatically determines the frequency of
defragmentation periods, based on the file activity experienced on each disk.
DISKEEPER can keep a reasonably active 456 MB RA81 disk, for example, defragmented in just a few
minutes of CPU time per day. If it took an hour of CPU time to defragment a disk, that hour might cost more than the performance benefit of defragmenting is worth, so the cure would be worse than the disease.
DISKEEPER does not waste valuable system resources by attempting to "optimize" a disk.
DISKEEPER adjusts the level of its own direct I/O to assure that it does not interfere with the I/O
requirements of application processes on the system. It typically runs as a detached process at priority 2,
so it uses what would otherwise be CPU "idle" time.
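As a purely illustrative sketch of this kind of self-throttling (not DISKEEPER's actual mechanism; the 20% idle threshold and step sizes are assumptions), the idea is simply to issue more defragmentation I/O while the system has idle capacity to spare and to back off as soon as application activity picks up:

    # Illustrative throttle only; the threshold and step sizes are assumptions.

    def throttle_io(current_io_limit, cpu_idle_fraction,
                    min_limit=1, max_limit=16, idle_threshold=0.20):
        """Raise the defragmenter's I/O limit when the CPU is mostly idle,
        lower it when applications need the machine."""
        if cpu_idle_fraction >= idle_threshold:
            return min(max_limit, current_io_limit + 1)
        return max(min_limit, current_io_limit - 1)

    limit = 4
    for idle in (0.80, 0.60, 0.05, 0.02):      # sampled idle time per interval
        limit = throttle_io(limit, idle)
        print(idle, limit)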
DISKEEPER defragments one file at a time, choosing a new location for the file that also best consolidates the disk's free space. In the course of defragmenting that file, the file is never in an unsafe state. The data in the file is accessible from application programs, without risk, at all times.
DISKEEPER has a unique method for checking the integrity of data blocks on a disk. This feature
provides the System Manager with an early warning system for detecting potential problems. It does this
by indicating to the System Manager the presence of invalid data blocks on a disk. The DISKEEPER
validation procedure checks for:
1. Multiply allocated blocks. These are blocks allocated to more than one file at the same time.
2. Blocks that are allocated to a file but appear to be free according to the storage bitmap.
Based on the information it finds in the validation procedure, DISKEEPER decides whether or not to run,
and lets the System Manager know exactly where the problem blocks are located and in which files, so
that the System Manager can take steps to handle the situation.
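The two checks can be pictured with a short sketch. The structures below are simplified stand-ins for the real file headers and storage bitmap, not the on-disk formats: each file is reduced to the set of blocks it claims, and the bitmap to the set of blocks marked allocated.

    # Simplified stand-ins for the real structures, for illustration only.

    def validate(files, allocated_blocks):
        """Return (multiply_allocated, allocated_but_marked_free)."""
        seen = {}                      # block -> first file that claimed it
        multiply_allocated = []
        marked_free = []
        for name, blocks in files.items():
            for block in blocks:
                if block in seen:
                    multiply_allocated.append((block, seen[block], name))
                else:
                    seen[block] = name
                if block not in allocated_blocks:
                    marked_free.append((block, name))
        return multiply_allocated, marked_free

    files = {"A.DAT": {100, 101, 102}, "B.DAT": {102, 200}}
    bitmap = {100, 101, 102, 200}      # block 102 is claimed by two files
    print(validate(files, bitmap))     # ([(102, 'A.DAT', 'B.DAT')], [])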
DISKEEPER includes an interactive utility for safely consolidating multiple header files on a disk, without
risk to file attributes or Access Control List (ACL) data.
DISKEEPER includes a 100% full satisfaction money-back guarantee.
DISKEEPER technical support is available 24 hours a day, 7 days a week.
DISKEEPER is the ultimate answer to your fragmentation problem. From the day you install it, you will
never have to concern yourself with fragmentation again.
Unless you already have DISKEEPER installed and running on your OpenVMS disks, fragmentation is
costing you time, money and performance every day. If you follow the advice in Appendix B on the cost
of fragmentation, you will see that the cost is substantial - certainly more so than the price of
DISKEEPER.
System Managers sometimes see clearly the need for and benefits of DISKEEPER, but they have a hard
time communicating these effectively to management. It seems like others view this important product as
a nice-to-have. If this sounds familiar to you, see Appendix C on justifying the purchase of a
defragmenter to management.
Conclusion
In this book, I have explained the ins and outs and the terminology of disks. I have explained
fragmentation in considerable detail. I have shown you how to detect fragmentation and explained what
is wrong with it. I've shown you what you can do about it and how to get the computer to clean up after
itself. I have even included methods for calculating the cost of fragmentation and justifying the cost to
management. In short, I have told you just about everything I know about fragmentation and
defragmentation.
I have done what I can do. The rest is up to you.
My purpose in doing this has been to educate. I believe that the more you know about fragmentation,
about defragmentation and about System Management, the better off you will be.
If I have missed something, if you have any questions, or if you just want to communicate, write to me at:
Executive Software
701 North Brand Boulevard, 6th Floor
P.O. Box 29077
Glendale, California 91209-9077
APPENDIX A
What Is A Computer?
COMPUTER — An electronic machine for making calculations, storing and analyzing information fed into
it, and controlling machinery automatically. (Oxford American Dictionary)
That, simply stated, is what a computer is.
A computer is capable of feats just so long as one does not make the mistake that it is thinking. It isn't
thinking. But it can sure be made to look like something that thinks. In actual fact, the ones doing the
thinking are the computer system designer, the programmer and the user.
I should mention that what I mean by "user" is not a typist or someone who simply feeds in data. The
user is somebody who knows how to get data into the computer and out of it. He knows that the
computer can be made to do work, and he is running that computer for blood.
Operating a computer is not operating a calculator. A computer is not something which "eases the work"
or "saves time" or "permits staff to do other things." That comes under the heading of wasting a
computer. Used right, they can dig up and generate income by the steam shovel-full, and boost
efficiency and production to the sky. They are a tool with mammoth capabilities. The state of mind to
assume in using a computer is "Now how can I use this thing to enormously increase the production and
income of an area?"
What's happened on this planet, obviously, is that they think the computer will think — when it can't —
and so they don't do enough thinking for the computer in terms of developing uses for it and putting
these into action.
One point should be mentioned which is very valuable: and that is the speed of operation which can be
attained using a computer. The computer can contribute enormously to operational speed in its ability to
rapidly relay information over long distances, its ability to keep constant and accurate track of thousands
of individual data and actions, and its capacity for rapid data collection and evaluation for action.
The datum here is that power is proportional to the speed of particle flow. This is the real secret behind
the prosperity which can arise in connection with a computer operation.
Given good ideas, a good heart, a worthwhile project and the addition of near instantaneous computer
particle flow, the power of an organization becomes almost unlimited.
L. RON HUBBARD
APPENDIX B
File Fragmentation Cost Analysis
Fragmentation is a costly affair. Fragmentation can slow a computer down within a week. The speed with
which performance degrades is dependent upon the number of users, the type of applications being run
and the frequency of adding, changing and deleting files. Typically, a system experiences a noticeable
slow-down in performance within the first week of normal operation after a complete backup and restore
or other defragmentation. This performance drain averages 10% to 15%.
How much does this performance drain cost? There are no set figures for all sites, obviously, so you
have to calculate the cost of fragmentation based on conditions at your own site. The following
worksheet is designed to give you a good ballpark estimate of the cost of fragmentation on your system.
The figures I have given in this worksheet are typical, based on normal industry practice. If you know
what your figures are and they differ from mine, by all means use yours.
The worksheet is also available in Microsoft Excel® format. Using Excel (v5.0 or higher), you can view and complete your Fragmentation Cost Analysis on your own system. If you don't have Excel, a free Microsoft Excel Viewer is available from Microsoft.
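For readers without the spreadsheet, the sketch below reproduces the general shape of the calculation. Every figure in it is a placeholder to be replaced with your own site's numbers; the 12.5% drain is simply the middle of the 10% to 15% range mentioned above, and the user count, salary and terminal time are invented for the example.

    # All figures below are placeholders; substitute your own site's numbers.

    def annual_cost_of_fragmentation(users, loaded_salary_per_year,
                                     fraction_of_day_at_terminal=0.5,
                                     performance_drain=0.125):
        """Rough annual cost: staff time lost waiting on a slowed system."""
        time_lost = fraction_of_day_at_terminal * performance_drain
        return users * loaded_salary_per_year * time_lost

    # 40 users, $50,000 loaded cost each, half their day on the system,
    # 12.5% average performance drain:
    print(annual_cost_of_fragmentation(40, 50000))   # -> 125000.0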
APPENDIX C
How To Get a Company to Spend Money for Something You Need and Want
Many System Managers are in the awkward position of having a great deal of responsibility for one of
their company's most valuable assets - its information - yet having very little authority to act to preserve
that information or care for it, particularly when it comes to purchasing authority. The reason for this is
entirely within your control as the System Manager.
Get the idea that there are two hats you wear as System Manager: One is the responsibility for ensuring
that the computer system produces the products and services required of it for the company. The other is
the responsibility to act in the best interests of the company in all things. The first is the System Manager
hat and the second is the staff member hat. Doing what is best for the system and its users is the
System Manager hat. Doing what is best for the company is the staff member hat.
These two hats are usually in sync; what is best for one is best for the other. But there are occasions
when the two seem to conflict. Nowhere is this more evident than in purchasing. For example, the CPU
is saturated and really needs to be upgraded. Clearly, a bigger CPU is in the best interests of the
computer and its users. But how does the (substantial) cost of that CPU affect the company's interests?
If the CPU upgrade costs $100,000, you may find considerable resistance to your spending that much
money, particularly if your spending authorization limit is $500. No one doubts that you know what is best
for the computer and its users. What they are concerned about is that you do not know what is best for
the company. Perhaps that $100,000 could be spent much more profitably elsewhere in the company.
How would a System Manager know? He knows nothing of profit and loss, return on investment, or tax
and balance sheets, right? Wrong. You know all you need to know of these things already. I'll show you.
If you have read Appendices A and B, in this book, and worked through the worksheet in Appendix B,
then you already have a clear idea of the terrific value of a computer system to the company. You know
that it can enormously increase the production and income of an area. You know that it can be a great
burden financially if it is not cared for and managed properly. And, with the data from your Appendix B
worksheet, you even have the dollar figures to quantify the importance of the computer system in terms
that can be well understood by management: money.
All you need now is a foolproof means of communicating your information to management. To do this,
you need to be able to adopt the management viewpoint, the investment viewpoint.
Most people understand investing. You put money into a savings account and a year later you can take
that money out with a little extra added to it in the form of interest. That interest is the return on your
investment. It is what comes back to you (returns) at the end of the investment cycle. This is perhaps the
simplest way there is to make a profit. The only problem is that the profit is so small.
Most people understand that companies run on profits. When they make a profit they survive. When they
don't, they don't. But few people realize that a company, any company, must make a very dramatic return
on its investment to survive. Our stock market mentality has people thinking of companies as
commodities that are bought and sold like pork bellies. This viewpoint is OK for capitalists, stock brokers
and hostile takeover artists, but it is not the viewpoint of management. To management, a company is an
income producing machine. Management has to make a company produce income. If they can do that
well, they survive and earn bonuses. If they can't, they are replaced.
For you to communicate with management, you have to be able to adopt their viewpoint. Try it. What do
you think management wants to hear from you about your proposed CPU upgrade? They want to know
how much income it will produce, how much profit it will generate, how much return they will get on their
$100,000 investment.
Now this is really important: Management is not anti-spending. They are not dead set against spending
anything at all. Just look at how much your company spends every week; they spend virtually all that
they make. The trick is not to avoid spending. The trick is to spend well.
Your job, from your hat as a staff member, is to communicate to management how your proposed
$100,000 expenditure is in the best interests of the company because it will result in more than $100,000
in income. Of course, you also have to show that it will do so quickly.
How quickly? Very quickly.
With a savings account, you can put some money into the account one year and take it out the next, with
interest. All during that year, your money is right there in the account, safe and sound and earning
interest. You can get it back any time.
With a business, each week you spend your money. All of it, or very nearly. When you spend it, it's gone.
It's not there any more. It's not sitting in a "safe" savings account where you can get it back. It has been
spent on such things as rent, payroll (your pay, for example) and equipment (a CPU upgrade, for
example).
So the money the company made that week is gone. Yet the company has to bring in more money the
next week to be able to cover the next week's rent, payroll and equipment, and so on, week after week.
Where does the next week's income come from? It comes from the paid staff using the purchased
equipment in the rented space, who create products or services that are exchanged with customers for
income.
Are you with me so far? If not, read over the previous few paragraphs until it is clear.
Here is the heart of managing a company: each week, the production of the staff and equipment has to
produce enough to exchange for the income needed to pay the staff, buy the equipment and rent the
space, plus a little extra. And it has to do so in time to make the next round of payments. Every week, the
company is reinvesting virtually everything it has to raise the money to make it through another week. If
your $100,000 expense will derail that plan, you won't get the money. If your $100,000 expense won't
derail the plan but might erase all the profits, you probably won't get the money, either. But, if your
$100,000 expense will bring in $100,000 in income, plus a tidy profit, your expense stands a very good
chance of getting approved.
The worksheet in Appendix B shows you how to calculate the value of a defragmenter to your company.
Collect up the data you need to fill out the worksheet, do the calculations, and then write a memo to
accompany your purchase request. Lay out the memo in three sections:
First Section: The exact situation your purchase is intended to handle. The situation, in this case, is the
horrid expense fragmentation is incurring for the company. (To you, it's a performance problem; to
management, it's an expense. Get the idea?)
Second Section: All the data needed to demonstrate the truth of your assertions about this horrid
expense, and to support your proposal that purchasing a defragmenter will resolve the whole matter and
pay for itself and bring in a tidy profit to boot. Hint: If the data is lengthy, put it in attachments, keeping
the cover memo to one or two pages.
Third Section: The solution, which should be entirely self-evident at this point, which is to approve your
purchase.
Do this, and you will get your truly needful purchases approved every time.
GLOSSARY OF TERMS
A
Access: To store on and retrieve data from a disk or other peripheral device. The term access is used as
a verb in the computer world, and it means to retrieve a file or part of a file from disk so that the data (in
the file) can be looked at, modified, and so on. Users access files constantly.
Access Control List (ACL): In OpenVMS, a list that defines the kinds of access to be granted or denied
to users of objects such as files, devices and mailboxes. Each access control list consists of one or more
entries known as "access control list entries."
Address: The location on disk or in memory where information is stored. Just as addresses are used to
distinguish houses on a street, the computer uses addresses to distinguish the exact locations of all the
pieces of information stored either in memory or on the disk.
Allocate: To reserve a resource such as memory or disk.
Allocated: Reserved for use. Before data can be stored on the disk, space for the data must be
allocated from whatever remains available. One cluster is the minimum amount that can be allocated.
Application: An application is a computer program which controls the computer system to perform some
useful work for the user.
Architecture: The basic design of something, the way in which it is built; e.g., disk architecture, software
architecture.
Arm: Disk heads are mounted on arms that hold the heads close to the platter surface at precisely the
right point to read or write data. There may be one arm for each head, but on multiple-platter disks a
single arm may support two heads; one for the platter above the arm and one for the platter below.
Average Access Time: This is not exactly equal to half the sum of the rotational latency plus the seek
time, as the rotation and the seek occur at the same time. The average access time, therefore, is
somewhat less than half the total.
B
Bandwidth: The measurement of the transmission capacity of a communication channel, usually a set of
wires within a cable. The higher the bandwidth, the more information can be transmitted on these wires.
The bandwidth is measured in bits per second. The more bits (of information) per second that can be
transmitted, the better, as this speeds up the flow of communication between computers in a network, for
example.
Batch: 1. Group, or collection, of items. 2. A method of scheduling and executing programs in which the
programs run with no programmer interaction. (The Digital Dictionary)
Binary: Binary means "having two and only two possible states" such as on or off, true or false, and so
on. Taken from the Latin "bini," meaning two by two and "ary," meaning of, or pertaining to. Therefore,
binary describes a numbering system for counting, where only two digits (0 and 1) are used.
Bit: BInary digiT. The smallest part of information in a computer. Groups of bits make up storage units in
the computer. For example, it takes eight bits to make up a byte, which is equivalent to one character,
whether it be a number, letter, punctuation mark, or even a space. In written communication, bits are
represented as zeros and ones.
Bit Map, Bitmap: A binary representation (literally, a map) in which each bit or set of bits corresponds to
a particular condition (such as "empty," "not empty") or object (such as a type of font or an image). A
bitmap is a table or series of bits which indicate whether something is, or is not in a certain condition. An
example would be a table of 1s and 0s, pre-determined to mean "yes" and "no," representing the ratio of
girls to boys sitting in a classroom, where "yes" meant "girls," and "no" meant "boys." By looking at the
bitmap, you could tell where the girls were sitting in relation to the boys. In the context of VAX
computers, a bitmap is located at the beginning of the disk, and is a set of bits, one for each cluster (of
blocks) on a disk, indicating whether or not those blocks are available for use.
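A minimal sketch of that last idea, with an assumed cluster size of four blocks and the convention (assumed here for illustration) that a 1 bit marks a free cluster:

    # Minimal sketch of a storage bitmap: one bit per cluster, 1 = free.
    # The four-block cluster size and the 1-means-free convention are
    # assumptions for illustration only.

    CLUSTER_SIZE = 4
    bitmap = [1, 1, 0, 0, 1, 0, 1, 1]          # eight clusters on a tiny "disk"

    def cluster_is_free(block_number):
        return bitmap[block_number // CLUSTER_SIZE] == 1

    print(cluster_is_free(0))    # True:  block 0 lies in a free cluster
    print(cluster_is_free(10))   # False: block 10 lies in an allocated cluster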
Block: In OpenVMS, a block is a collection of 512 bytes. OpenVMS disks are formatted into blocks of
512 bytes each. A block is a unit of data storage space on a disk that equals approximately one third of a
page of text. A block is 512 contiguous bytes of information (or empty disk space), and is the smallest
addressable unit of data that can be transferred in an I/O operation on a VAX computer. A block of space
on a disk is equivalent to a page of space in memory.
Boot: To cause the computer to start executing instructions. The term comes from "bootstrap": just as bootstraps help you get your boots on, booting the computer helps it get its first instructions.
Boot Image: The program that boots the system. The instructions that tell the computer how to start
itself up. See boot and image, definition 2.
Bottleneck: A slow down due to too much activity being requested when inadequate facilities for the
requests exist. In relation to a computer system, this refers to a situation that occurs when many
processes are slowed down because there are too many I/Os waiting to be handled by the disk.
Bucket: A storage structure of 1 to 32 disk blocks that is used to store and transfer data in files. Unlike a
block, a bucket can contain only entire records.
Bug: Programming error occurring in software. The term was coined in the 1940's when a moth was
found inside an early computer prototype, shorting its circuits.
Byte: When eight bits are considered together, they are referred to as a byte. A single eight-bit byte is
the amount of computer storage typically used to store a single letter of the alphabet or other symbol of
human communication. The word "animal" could thus be stored in six bytes of computer storage.
C
Cable: The electronics in the disk drive are connected to circuitry in the computer by means of cables,
which are no more than wires with a certain type of connector on each end. Often, the individual wires
are color-coded for clarity.
Cache: 1. Dramatic performance improvements can be gained by placing a significant amount of
memory inside the disk controller. This local memory is called a cache and is used to store data recently
retrieved from the disk by the computer. Then, if the computer should happen to request exactly the
same data again, the controller can service the request from the local cache at memory speed
(microseconds) instead of at disk speed (milliseconds). 2. A very fast memory that can be used in
combination with slower, large capacity memories, such as disks.
Cache Hit Rate: The rate at which I/Os are satisfied from the cache, rather than from disk. Each time an
I/O is satisfied from the cache, it is referred to as a hit.
CD-ROM: Compact Disc Read-Only Memory. A compact disc format used to hold text, graphics, and
high-fidelity stereo sound. As suggested by its name, the end user cannot write data to a CD-ROM, but
can only read what is already on the CD.
Chip: A very small component that contains a large amount of electronic circuitry. Chips are the building
blocks of a computer and perform various functions, such as serving the computer's memory or
controlling other chips.
Chunk: Due to the way OpenVMS disks are structured, disks in a stripeset are divided into "chunks"
rather than clusters, chunks consisting (usually) of more blocks than clusters.
Cluster: 1. Disk blocks are grouped into clusters, which are groups of blocks read and written as a unit.
In other words, a cluster is the minimum allocation quantity for a disk. The cluster size, in terms of
number of blocks per cluster, can be varied only by reinitializing the disk. 2. See VAXcluster.
Common System Disk: A System Disk is the disk which holds the operating system and all the files
necessary to start the computer. A Common system disk is the system disk in the VAXcluster from which
all the other nodes in the cluster boot.
Compatibility Mode: Some VAXes have the ability to run PDP-11 programs as if the VAX were a PDP-11 computer running the RSX-11M operating system. When operating in this way, the VAX is said to be
in compatibility mode.
Computer: The simple definition of computer is: An electronic machine for making calculations, storing
and analyzing information fed into it, and controlling machinery automatically.
Contiguous: Adjacent; placed one after the other.
Controller: Sometimes referred to as an interface, or storage controller. The controller, which is attached
to the computer, decodes instructions from the computer and issues instructions to the disk drive to do
what the computer has instructed. The controller also receives data and status information from the disk
drive, which it passes on to the computer in a form the computer can understand. A single controller may
service more than one disk drive. See also HSC.
CPU: Central Processing Unit. The computing part of the computer. In general use, the term CPU implies
memory as well, since data must be stored in memory in order to be processed.
Cylinder: The tracks at the same radius on each platter, taken together, are referred to as a cylinder. If
you visualized these tracks without any other part of the disk, they would form the shape of a hollow
cylinder.
D
Data Structure: A design for the way data is laid out in a file for access by a program. Not unlike the
concept of basic building blocks, there are several different standard data structure designs which apply
to different uses. Word processing documents, spreadsheets and database files are all examples of data
structures.
DAU: Disk Analysis Utility. The DAU can provide anything from a quick summary to a very detailed
analysis of the condition of a disk. It can be run at any time on any Digital-supported disk and it is not
necessary to allocate the disk, dismount it, or stop users from accessing it.
DCU: Disk Compression Utility. DCU was an off-line style defragmenter for RSX systems.
DECUS: The Digital Equipment Computer Users Society. A world-wide organization of users of Digital
products, including thousands of System Managers. Provides technical information exchange among its
members.
Delete: To remove an item of data from a file or to remove a file from the disk.
Desktop Workstation: A high-performance, single user microcomputer or minicomputer.
Detached Process: A process which runs without further intervention from the user. The user can set it
up, and then the process will go ahead and run without the user having to give it any more instructions.
Digital: The preferred name (by Digital) for Digital Equipment Corporation. Also known as DEC.
Direct I/O: An I/O that goes to disk to read or write data without going through a cache first.
Directory: A directory is "a file that briefly catalogs a set of files stored on a disk . . . " (The Digital
Dictionary). From the user's point of view, a directory is a catalog of the names of files that are grouped
in a particular way.
Disk: A thin, round plate with a magnetic surface coating on which data can be stored by magnetic
recording. (The Digital Dictionary)
Disk Cartridge: A removable disk unit which consists of a case containing a hard disk or a floppy disk.
Disk Compression: Another word for defragmentation.
Disk Device Designation: In OpenVMS, a disk is referred to by a unique code consisting of letters and
numbers that indicate exactly which disk is meant. In the example DUA0, "D" means it is a disk device, "U" indicates the type of disk, "A" indicates that it is the first controller for disks of that type on this computer (the second would be "B", the third "C" and so on), and "0" indicates that it is the first disk drive associated with that controller (the second would be "1").
Disk Drive: The combination of one or more spindles, arms, heads, platters and electronics into a single
physical device for storing and retrieving data is known as a disk drive. The term drive is often used to
refer to the disk drive.
Disk File Structure: The standard form used for arranging files on a disk.
Disk I/O Bandwidth: The bandwidth is the transmission capacity of a computer channel or
communications line of any kind. In this case, the disk I/O bandwidth refers to the amount of disk I/Os
that can be transmitted; i.e., the number of I/Os that can be transferred to and from the disk.
Disk Structure: The way data is organized on a disk.
Drive: See disk drive.
Driver: The set of instructions used to manipulate a controller is known as a driver. The driver resides at
the lowest levels of a computer's operating system, where it can interact directly with the hardware. The
driver interprets instruction codes of the disk controller into standardized instructions recognizable and
usable by the more generalized parts of the computer system, and vice versa. The driver enables an
application program to issue a generic "get data" instruction, for example, to the disk without having to
concern itself with the peculiarities of that particular disk (the number of tracks, platters, and so on). A
single disk driver may service more than one disk controller.
Dump: (Noun) A printable file showing the contents of memory, disk or tape, often occurring in response
to an error condition. Often preceded by the word "crash" (as in crash dump). Useful for the analysis of
what was taking place when a computer system crashes, for example. The computer literally "dumps"
the data onto paper.
Dump/Header Command: A command used to analyze the contents of a file header with the OpenVMS
Dump Analysis Utility, which, as the name suggests, is a utility used for analyzing dumps.
E
Electronics: Electronic circuitry is required to sense and record the magnetism on the surface of the
platters and to move the heads. This circuitry is commonly referred to as the electronics of the disk. The
electronics communicate data between the physical disk and the computer.
Electronic "Disk": Another name for "Solid State" or "Semiconductor" disks. Memory chips (in a
separate box) set up to emulate a hard disk.
.EXE: A file extension commonly used to show that the file is an executable file.
Executable: A program file in a language which the computer can understand that is ready to run. Any
commands contained in that file can be executed by the computer. A non-executable file is a file that
holds data, such as a user data file.
Extent: When OpenVMS allocates space on a disk for a file, it is not always possible to allocate all the
needed space contiguously on the disk. Sometimes it is necessary to allocate part of a file in one place
and the remainder in another. Files have been known to be allocated in dozens and even hundreds of
pieces scattered around the disk. Each piece of a file so allocated is called an extent.
Extent Cache: A portion of the system's memory that is set aside solely for the use of the OpenVMS file
allocation mechanism. The extent cache stores the addresses of deallocated (i.e. free) block clusters,
making it fast for OpenVMS to find free disk space by reusing these same clusters. This saves the
overhead of scanning the storage bitmap of a disk to find free space.
Extent Header: A header that contains the additional map area pointers of a file. A file header contains
(in addition to other data) map area pointers and it can hold only so many of these. As a file becomes
more and more fragmented it has to have more and more map area pointers in order for all the
fragments of the file to be located. Therefore, at a certain point an extra file header must be created to
hold these, hence the term (extent header).
F
FCP: File Control Primitive. The FCP is the set of instructions that are used to control files.
Field: An area reserved for data. The smallest unit normally manipulated by a database management
system. For example, in a personnel file the person's age might be one field and their zip code another.
File: Information stored on a disk is ordinarily stored in a file. In fact, for any OpenVMS disk using the
ODS-2 structure, no information can be retrieved from a disk unless it is contained in a file. A file is "a
collection of related records treated as a unit and usually referenced by a . . . name." (The Digital
Dictionary)
File Fragmentation: File fragmentation refers to computer disk files that are not whole but rather are
broken into scattered parts, while free space fragmentation means that the empty space on a disk is
broken into scattered parts rather than being collected all in one big empty space.
File Header: "A block in the index file that describes a file on a . . . disk. Every file residing on the disk
has at least one header, which provides the location of the file's extents." (The Digital Dictionary). All the
information necessary to identify, secure, and access a file is contained in the header. The header is
NOT stored as part of that file; it is stored, along with all other headers, in a special file on each disk,
called INDEXF.SYS.
Filespec: File Specification. Reference to the location of a file on a disk, which includes disk drive,
directory name, filename, and file type (or extension).
Files-11: The name of the disk structure used by the OpenVMS operating system.
File System: A method of recording, cataloging, and accessing files on a disk.
Fixed Disk: See hard disk.
Floppy Disk: A removable storage medium. It consists of a single, round disk of flexible, tape-like
material housed in a square envelope or cartridge.
Formatted: The term formatted is used to mean that the disk has certain marks that trigger reading and
writing of data in particular spots, allowing storage and retrieval of data in groups of a particular size. In
the case of a Files-11 disk, the size of the data groups is 512-byte blocks.
Fragmentation: The word fragmentation means "the state of being fragmented." The word fragment
means "a detached, isolated or incomplete part." It is derived from the Latin fragmentum, which in turn is
derived from frangere, meaning "break." So fragmentation means that something is broken into parts that
are detached, isolated or incomplete.
Free: Not allocated.
Free Space Fragmentation: The condition in which free space on a disk is broken up into small pieces
or areas and scattered around the disk.
G
Goes Down: Computer industry slang for "stops running."
H
Hard Disk: A metal disk covered with a magnetic recording material. It comes in removable and fixed
varieties that hold from five to several thousands of megabytes of information.
Head: A tiny magnetic device capable of reading or writing magnetic bits of information on the disk
surface. The platter spins near the head(s), so that a single track of recorded information is continuously
passing under the head, available for reading or writing. The head never touches the surface. Rather, it
floats on a cushion of air so thin that a human hair or even a particle of cigarette smoke cannot pass
between the head and the surface. As foreign particles that small would cause the disk to fail, such disks
are sealed in air-tight containers.
Header: See file header.
Hit: When an I/O request is satisfied from the cache, it is referred to as a hit.
Home Block: The block contained in the INDEXF.SYS file on a disk or tape that contains information
such as what type of disk or tape it is, and establishes the characteristics of that storage medium.
HSC: Hierarchical Storage Controller. A controller which controls, organizes, and prioritizes the activities
of the storage devices connected to the system.
I
Image: 1. Duplicate information or data copied from one medium to another. 2. A program file of which
an image (or copy) is made and placed in memory to run.
Incremental Backup: Backing up only files that have been changed since the last backup, rather than
backing up everything.
Index File: The Digital Dictionary defines index file as "The file on a . . . volume that contains the access
information for all files on the volume and enables the operating system to identify and access the
volume." The index file is a catalog of all the files on a particular disk. In fact, the header of a file resides
within the index file. All the information needed to access a file is contained here.
Indexed Files: Indexed files have an index that contains pointers to organized data records elsewhere in
the file. In such a file, variable length data records are stored in buckets of a certain number of blocks
each.
INDEXF.SYS: This is the name of the file which contains the headers for all the files on that disk. It also
contains certain information critical to the system's ability to access data on that disk volume, like the
location of the INDEXF.SYS file itself. INDEXF stands for index file. The file type, .SYS, indicates that
this is a system file, reserved for use by the OpenVMS operating system. In a volume set, there is an
INDEXF.SYS file on each disk in the set.
Initialize: The process by which the computer prepares a disk for handling user information. This
process erases any information that was on the disk.
I/O (Input/Output): Refers to the transfer of data between the CPU and a peripheral device, such as a
disk drive, or a printer. An I/O is a read or write transaction. A user or application either needs to read
data from the disk (output) or write data to the disk (input).
I/O Bound: A condition in which it takes an excessive amount of time to get data in and out of the
computer, as compared to the time it takes to process the data. This condition adversely affects the
performance of the computer by slowing it down.
Input/Output (I/O) Request Queue: A driver has associated with it a queue for holding I/O requests.
This queue is merely a data structure enabling the computer to store an I/O request while it carries on
with its work without it having to wait for the I/O processing to complete.
Install: To place software in a computer and get it up and running.
Installed Image: An executable program that is kept in an "always ready" state. Part of the program is
kept in memory and the program image file is kept open in order that the program can be started faster
than it could be otherwise.
Intelligent Disk Subsystems: A disk subsystem which handles some of the work involved in accessing
and managing files which would otherwise be done in the CPU. An intelligent disk subsystem is created
by moving some of these functions (such as determining file placement on the disk) from the software
into the disk controller.
Interrupt Stack: When an external event must be serviced by the computer operating system, an
"interrupt" occurs. During this time, all processes are temporarily halted, while the operating system
works to service the event. Then all the processes resume their operations. During the interruption,
process information is stored on the "interrupt stack."
I/O Bottleneck: Refers to a situation that occurs when many processes are slowed down because there
are too many I/Os waiting to be handled by the disk.
I/O Routine: A set of instructions that perform tasks related to I/O.
I/O Subsystem: See I/O and subsystem.
I/O Throttling: Refers to the act of evaluating the amount of CPU idle time and reducing the number of
I/Os to the disk.
I/O Transfer: The actual carrying out of an I/O request.
K
Kernel Mode: Kernel mode is the deepest operating mode in a computer system. So called because it is
the central part, or core, of the computer. Device drivers operate at this level.
L
LAVC: Local Area VAXcluster. A VAXcluster is a group of OpenVMS computer systems which can share
disk-related resources. The computers also share certain other resources which ease the task of
managing users and their system usage.
LBN: Logical Block Number. When the blocks on a disk are considered from a programming point of
view, they are viewed as logical blocks. The address of a logical block on a disk is its Logical Block
Number (LBN). LBN 0 (zero) is the first LBN on a disk. Logical blocks correspond one-for-one to physical
blocks, but the logical block number might not correspond directly to the same physical block numbers.
Local: Directly connected to, as in local disk or local node.
Local Memory: Memory used by a single CPU, or allocated to one single program function.
Logical: In computer terms, logical means "conceptual." For example, a disk may be given a logical
name, such as USERDISK, so that users do not need to know the hardware (or physical) name.
Logical Block Number: See LBN.
M
Machine: A slang term for computer.
Magnetic: When something is said to be magnetic, it means that it is capable of storing a small amount
of magnetism.
Mainframe: A large computer.
Map: 1. (Noun) A picture (or list of addresses) of the location of data or objects. 2. (Verb) To map an
object (such as a file, for example) is to locate it physically on the disk by indicating the individual blocks
in which it is stored. When an LBN is mapped to a PBN, the LBN can be located and any data stored on
it can be retrieved.
Megabyte: One million bytes.
Memory: The computer's working storage that is physically a collection of chips. All program and data
processing takes place in memory. The program instructions are copied into memory from a disk or tape.
Memory can be viewed as an electronic checker board with each square on the board holding one byte
of data or instruction.
Memory Chip: A chip that holds programs and data either temporarily or permanently.
Microsecond: One one-millionth of a second.
Millisecond: One one-thousandth of a second.
Monitor: A high-resolution display screen for output from a computer, camera, VCR, etc.
MOVEFILE Primitive: Relocating a file on the disk for purposes of defragmenting is a multi-step
process. Doing some of the steps without doing the rest can result in a file that is confused, damaged or
even lost. The solution is to isolate the critical steps that must be all completely done or none done at all
and treat these as a single step. Such a group of steps treated as a unit is called a primitive. Beginning
with version 5.5 of OpenVMS, this operation is called the MOVEFILE primitive.
MSCP: Mass Storage Control Protocol. A software program which acts as a server to make all locally
connected disks available to all nodes in the cluster.
Multi-Header File: As its name implies, this is a file with more than one header or, to be more precise, a file with so many retrieval pointers that they will not fit into a single one-block header.
OpenVMS, therefore, allocates a second (or third or fourth) block in the INDEXF.SYS file to
accommodate storage of the extra retrieval pointers.
Multiprogramming: The capability of running two or more programs at the same time without
interference.
N
Node: A single computer in a network or cluster of computers.
"(An) n-squared order problem": A problem that squares in difficulty for each increment in size of the
problem. For example, a problem of size 2 has a difficulty of 2 x 2 = 4, while a problem of size 3 has a
difficulty of 3 x 3 = 9. As the numbers grow large, the difficulty of the problem snowballs. A problem of
size 20, though only ten times as large as the 2 problem, has a difficulty of 20 x 20 = 400, or 100 times
the difficulty of the 2 problem.
O
ODS-2: On-Disk Structure Level 2. The second generation disk file structure supported by OpenVMS.
Off-line: Not connected to or not installed in the computer. If a terminal, printer, or other device is
physically connected to the computer, but is not turned on or in ready mode, it is still considered off-line.
On-line: 1. A peripheral device (terminal, printer, etc.) that is ready to operate. 2. An on-line computer
system refers to a system with terminals and users.
OpenVMS: An operating system used on VAX and Alpha AXP computer systems. The VMS stands for
Virtual Memory System. Open was added to the original VMS name in 1993 to illustrate the fact that
VMS now runs on more than one type of computer.
OpenVMS Home Block: A block in the index file on a disk that contains information about the disk as a
whole.
Operating System: A collection of programs that controls the execution of computer programs and
performs system functions. It is the software which organizes a CPU and peripheral devices into an
active unit for the development and execution of programs.
Overhead: Amount of processing time used by system software, such as the operating system, or
database manager.
Override: (Noun) One action which is set up to override (i.e. change, or prevent from occurring) another
action.
P
Page: 1. (Noun) In OpenVMS, one page is 512 bytes, the same size as a block. In other systems, a
page may be more or less than 512 bytes. A page is to memory as a block is to disk. 2. (Verb) The
process of putting a page of memory on disk. See pagefile.
Page file, Pagefile: An area of a disk that is set aside to hold information intended to reside in the
computer's memory. The information can be brought into memory from this file, a few pages at a time, for
use whenever needed. This mechanism requires a much smaller amount of physical memory than would
be required if the entire program were to be loaded into memory at once.
PAGEFILE.SYS: The name of the pagefile. The extension .SYS indicates that this is a system file.
PBN: Physical Block Number. The actual arrangement of information on the surface of a disk platter is
referred to as a physical block. The physical block number (PBN) is an address used for identifying a
particular block on the surface of the disk.
PDP-8: Programmable Data Processor-8. A 12-bit computer preceding the PDP-11. The PDP-8 is widely
regarded as the first minicomputer, which, at that time, was simply defined as a computer which cost less
than $100,000. Considering inflation, the cost of such a computer would be closer to $1 million today.
PDP-11: Programmable Data Processor-11. A 16-bit computer on which the VAX was based. The PDP-11 introduced virtual memory to the Digital world, although early versions of the PDP-11 lacked this
capability.
Peripheral device: Any hardware device connected to a computer, such as a terminal, printer, disk drive
etc.
Physical Block: The actual arrangement of information on the surface of a disk platter is referred to as a
physical block. The physical block number (PBN) is an address used for identifying a particular block on
the surface of the disk.
Physical Disk: The actual, real, touchable disk.
Platter: A disk may consist of one or more platters, each of which may be recorded on both sides. The
platter spins like a phonograph record on a turntable.
Primitive: Relocating a file on the disk for the purpose of defragmenting it is a multi-step process. Doing
some of the steps without doing the rest can result in a file that is confused, damaged or even lost. The
solution is to isolate the critical steps that must be done either completely or not at all, and treat them
as a single step. Such a group of steps treated as a unit is called a primitive.
Process: 1. (Verb) To manipulate data in the computer, regardless of what is being done, is referred to
as processing. It could refer to updating data in a file, or displaying the data on the terminal screen. 2.
(Noun) One individual complete manipulation of data, such as updating data in a file, logging on, etc. is
referred to as a process.
Program: A collection of instructions that tell the computer what to do. Synonymous with "software."
Program Image File: See image file.
Q
QIO: Queue Input Output. The OpenVMS operating system contains a mechanism for queuing (inserting)
an I/O request into the queue of a driver. This mechanism is called the $QIO system service. The dollar
sign indicates that this abbreviation is Digital's. QIO stands for "Queue Input Output," where queue is
used as a verb.
QIO Operation: A single act of queuing an I/O request.
Queue: A sequence of items waiting for service, like people in line at the checkout counter in a store.
Quorum: A quorum disk is one which substitutes for a VAX, acting as a node in a VAXcluster.
R
RAM: Random Access Memory. The computer's primary working memory in which program instructions
and data are stored so that they are accessible directly to the central processing unit (CPU).
RA81 Disk: A type (or model) of disk made by Digital.
Random: In a random access or direct access file, every record is the same size. Because of this,
records can be deleted and replaced with new ones easily. An example of a direct access file is a
bookshelf full of books. You can go directly to any book desired and withdraw it from the shelf.
Fragmentation of the contents of such a file causes virtually no performance problems, as the file is
designed to be accessed in random order and any new record is guaranteed to fit precisely within any
free space in the file.
Realtime: Immediate response. Processing time that corresponds to the real world. Realtime video
transmission produces a live broadcast.
Reboot: To reload the operating system and restart the computer.
Record: A collection of related data items treated as a unit. A record contains one or more fields. (The
Digital Dictionary)
Removable Disk: A type of disk that is inserted into a disk drive for reading and writing and removed
when not in use. Floppy disks and disk cartridges are examples of removable disks.
Reserved Field: A field which is reserved by Digital for possible future development. A segment of a
data record that is set aside for special functions and cannot be used for anything else.
Retrieval Pointers: Sometimes referred to as Map Area Retrieval Pointers, or Map Pointers. Within the
file header, the information critical to a discussion of file fragmentation is the section headed Retrieval
Pointers. These pointers indicate where the file's data is located on the disk. Each pointer consists of the
LBN of the first data block and a count of how many successive contiguous blocks contain data for that
file.
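As an illustration only (a simplified model, not the actual on-disk header layout), each retrieval pointer can
be thought of as a starting LBN plus a count of contiguous blocks, and the sum of the counts gives the file's
size in blocks:

    #include <stdio.h>

    /* Simplified model of a retrieval pointer: just the two facts each
       pointer records, not Digital's actual header format. */
    struct retrieval_pointer {
        unsigned long start_lbn;  /* LBN of the first block of this extent */
        unsigned long count;      /* contiguous blocks in this extent */
    };

    int main(void)
    {
        /* A hypothetical file stored in three fragments */
        struct retrieval_pointer map[] = {
            { 1000, 50 },  /* 50 blocks starting at LBN 1000 */
            { 4200, 25 },  /* 25 blocks starting at LBN 4200 */
            { 9064, 10 }   /* 10 blocks starting at LBN 9064 */
        };
        int i, pointers = sizeof map / sizeof map[0];
        unsigned long total = 0;

        for (i = 0; i < pointers; i++)
            total += map[i].count;

        printf("%d retrieval pointers, %lu blocks in all\n", pointers, total);
        return 0;
    }

A contiguous file needs just one such pointer; a badly fragmented file needs many.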
Revectored: A logical block is said to be revectored when its LBN is reassigned from the original physical
block to some other physical block. This revectoring can also be done on the fly while your disk is in use.
The new block after revectoring might be on the same track and physically close to the original, but then
again it might not.
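Conceptually (a sketch only, not the controller's actual data structures), revectoring amounts to consulting
a small translation table before each access; the block numbers below are invented:

    #include <stdio.h>

    /* A tiny revector table mapping an LBN whose original physical block has
       gone bad to a replacement physical block. Illustration only. */
    struct revector_entry {
        unsigned long lbn;              /* logical block number */
        unsigned long replacement_pbn;  /* new physical block for that LBN */
    };

    static struct revector_entry table[] = {
        { 10234, 999001 },
        { 55120, 999002 }
    };

    static unsigned long physical_block(unsigned long lbn)
    {
        int i, n = sizeof table / sizeof table[0];
        for (i = 0; i < n; i++)
            if (table[i].lbn == lbn)
                return table[i].replacement_pbn;  /* revectored */
        return lbn;  /* assume a one-to-one mapping when not revectored */
    }

    int main(void)
    {
        printf("LBN 10234 -> PBN %lu\n", physical_block(10234));
        printf("LBN 20000 -> PBN %lu\n", physical_block(20000));
        return 0;
    }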
RISC: Reduced Instruction Set Computer. By using simple sets of instructions, processing speed can be
increased considerably. Digital's Alpha AXP computer uses RISC technology, as do other computers.
RK05: The RK05 disk, an older model disk, which held 2½ megabytes (5,000 blocks).
RMS: Record Management Services. The VAX file management system.
Rotational Latency: As a disk platter spins around the spindle, the blocks in a single track of recorded
data are brought near a disk head. The head can only read or write a block when that block is
immediately under the head. Accordingly, the time to access a block of data on the disk varies. It is much
quicker to access a block that is currently or about to be under the head than it is to access a block that
has recently passed under the head and is moving away. The block that has just passed under the head
has to wait nearly a full rotation of the disk for another access opportunity. This delay is known as
rotational latency.
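On average, the block you want is half a revolution away from the head, so rotational latency can be
estimated from the spindle speed alone. The figures below are purely illustrative:

    #include <stdio.h>

    int main(void)
    {
        double rpm = 3600.0;                  /* example spindle speed */
        double ms_per_rev = 60000.0 / rpm;    /* one full revolution, in milliseconds */
        double average = ms_per_rev / 2.0;    /* on average, half a revolution away */
        double worst = ms_per_rev;            /* the block just passed under the head */

        printf("At %.0f RPM, one revolution takes %.2f ms.\n", rpm, ms_per_rev);
        printf("Average rotational latency: %.2f ms; worst case: about %.2f ms.\n",
               average, worst);
        return 0;
    }

At 3600 RPM, for example, the average delay works out to roughly 8.3 milliseconds.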
RPM: Revolutions Per Minute.
RQDXn Disk Controller: A type of controller.
RSX-11: Resource Sharing eXecutive-11. An operating system used with PDP-11 computers which
allowed several users to share the computer simultaneously.
RT-11: Real Time-11. An operating system used with PDP-11 computers.
RZ23: A model of disk drive made by Digital. In size, it is about 2 inches by 4 inches by 6 inches, and it
has a storage capacity of 104MB (megabytes).
S
Saveset: A container file holding the data saved during a backup operation.
Sectors: The surface of a disk is sectioned into parts. This sectioning is not a physical marking on the
surface, but rather it is just an idea that the disk is so divided. These sections are called sectors or
blocks. The term sector is more common to personal computers and VAX and Alpha AXP hardware,
while block is common OpenVMS terminology.
Seek: The movement of a disk head from one track to another is called a seek. The time it takes for a
head to seek is one of the most critical factors in determining the speed of a disk.
Seek Ordering: By keeping track of the exact position of the heads at all times, the controller can
determine which one of multiple requests from the computer can be serviced in the shortest time. Then,
instead of servicing the computer's requests in the order received, the controller can service first the
requests for data nearest the heads and then the requests for data farther away. This is called seek
ordering, which simply means putting the seeks in a better order.
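The sketch below shows the idea in its simplest form, assuming a plain shortest-seek-first policy; real
controllers use more sophisticated algorithms, and the track numbers here are invented:

    #include <stdio.h>
    #include <stdlib.h>

    /* Pick the pending request whose track is closest to the current head
       position: "shortest seek first" in its simplest form. A sketch only. */
    static int nearest_request(int head_track, const int tracks[], int n)
    {
        int best = 0, i;
        for (i = 1; i < n; i++)
            if (abs(tracks[i] - head_track) < abs(tracks[best] - head_track))
                best = i;
        return best;
    }

    int main(void)
    {
        int pending[] = { 740, 12, 388, 741, 95 };  /* requested tracks (invented) */
        int n = sizeof pending / sizeof pending[0];
        int head = 400;                             /* current head position */
        int i;

        /* Service the requests in seek order rather than arrival order. */
        for (i = 0; i < n; i++) {
            int pick = nearest_request(head, pending, n - i);
            printf("service track %d (head was at %d)\n", pending[pick], head);
            head = pending[pick];
            pending[pick] = pending[n - i - 1];     /* drop the serviced request */
        }
        return 0;
    }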
Seek Time: The time it takes for a head to seek. Seek time is one of the most critical factors in
determining the speed of a disk.
Semiconductor Disk: See electronic disk.
Sequential: In a sequential file, every record except the first falls immediately after the preceding record.
There are no gaps. An example of a sequential file is a music cassette. You cannot get to any selection
without searching through the tape. Accordingly, sequential files are not subject to internal fragmentation.
The situation simply cannot exist.
Server: A specialized part of the computer system. It takes over one specific function of the computer to
provide better service to the other parts of the system and to the people using it. For example,
"Printserver" is a computer in a network that controls one or more printers. It stores the print image
output from the users of the system and feeds it to the printer one job at a time. See HSC and MSCP.
Shadow Set: A group of two (or more) identical disks combined so as to be treated by OpenVMS as a
single disk equal in capacity to only one of the disks in the group. Each time a block is written to the
shadow set, the same block is written to all the disks in the set.
SIR: System Improvement Request. Each year, DECUS surveys its members to determine the things
that most need improvement in the OpenVMS operating system. This survey is called the System
Improvement Request (SIR) ballot.
Spindle: A disk platter is attached to a spindle around which it rotates like a wheel on the axle of a car.
The spindle is at the exact center of the platter. The arm moves the head from the outer edge of the
platter toward the spindle at the center and back out again.
Split Transfer, or Split I/Os: A case in which the VAX has to do two or more I/O transfers when one
would serve. A split transfer is the result of fragmentation.
SPM: Software Performance Monitor. A performance analysis tool from Digital Equipment Corporation.
SPM provides limited fragmentation analysis capability.
SQUEEZE: Under the RT-11 operating system, the solution to the problem of not enough contiguous
free space to create a file was the SQUEEZE command. SQUEEZE compacted the disk, rearranging the
files so they were all together near the beginning (LBN 0) of the disk, leaving all the free space in one
large, contiguous area at the end.
Stack: A reserved amount of memory used for arithmetic calculations and for keeping track of internal
operations. Called a stack because it handles things in a "first in, last out" sequence, like a stack of
papers to be read, or plates to be washed.
Standalone: A single computer that is not part of a VAXcluster.
Stand-Alone Backup: A version of the OpenVMS BACKUP utility that runs by itself without OpenVMS
running on the computer. Used to make backup copies of the system disk.
Storage Bitmap: Literally a map made up of bits, each of which represents an object or condition. In the
case of a storage bitmap, the map represents which blocks or clusters are free, and which are
allocated.
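The idea can be sketched with one bit per cluster, as below; this illustrates the principle only and is not
the format of the actual OpenVMS storage bitmap file:

    #include <stdio.h>
    #include <string.h>

    #define CLUSTERS 1024

    /* One bit per cluster: 1 = allocated, 0 = free. Illustration only. */
    static unsigned char bitmap[CLUSTERS / 8];

    static int is_allocated(unsigned long cluster)
    {
        return (bitmap[cluster / 8] >> (cluster % 8)) & 1;
    }

    static void allocate(unsigned long cluster)
    {
        bitmap[cluster / 8] |= (unsigned char)(1 << (cluster % 8));
    }

    int main(void)
    {
        memset(bitmap, 0, sizeof bitmap);  /* all clusters start out free */
        allocate(42);
        printf("cluster 42 is %s, cluster 43 is %s\n",
               is_allocated(42) ? "allocated" : "free",
               is_allocated(43) ? "allocated" : "free");
        return 0;
    }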
Stripeset: Two or more disks can be combined into a stripeset. A stripeset is similar to a volume set in that
the full capacity of all the disks in the set is available for data storage, and in that the whole set is treated
by OpenVMS as a single large disk. The main difference is that, while each file extent must reside
entirely on a single disk in a volume set, in a stripeset each extent of the file is deliberately spread across
multiple disks. One chunk resides on one disk, the next chunk in sequence resides on the next disk, the
next chunk on the next disk, and so on, starting over at the first disk when the last disk in the set is
reached.
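The chunk-to-disk mapping can be sketched with simple arithmetic, as below; the chunk size and number
of disks are invented figures, and real striping drivers differ in detail:

    #include <stdio.h>

    int main(void)
    {
        int disks = 3;            /* member disks in the stripeset (example) */
        int chunk_blocks = 8;     /* blocks per chunk (invented figure) */
        unsigned long chunk;

        /* Chunks rotate around the member disks; the position on each disk
           advances once per full pass through the set. */
        for (chunk = 0; chunk < 7; chunk++) {
            int disk = (int)(chunk % disks);
            unsigned long offset = (chunk / disks) * chunk_blocks;
            printf("chunk %lu -> disk %d, starting at block %lu\n",
                   chunk, disk, offset);
        }
        return 0;
    }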
Subdirectory: A directory within a directory. Seven levels of subdirectories are allowed in OpenVMS.
Subsystem: In OpenVMS, a set of instructions that implement a particular facility in the system such as
the file subsystem, batch subsystem, I/O subsystem and so on.
SWAPFILE.SYS: The file that contains the data pertaining to files which have been written to disk to
make extra room in memory.
SYS$MANAGER:*.LOG File: SYS$MANAGER: is a directory name. *.LOG refers to all the log files in
that directory.
System Disk: The disk containing the program images and main data structures for the OpenVMS
operating system, from which the operating system is loaded into memory at boot time.
System Root Directory: The top-level directory that contains system files and subdirectories. All files
used by the operating system are in this directory or subdirectories under it. System root directories are
named SYS0, SYS1, SYS2, etc. Those directories are reserved for the OpenVMS operating system.
System Software: Programs used to control the computer and run application programs.
T
Third-Party: Software products made by companies other than Digital that are compatible with Digital's
products.
32-bit Architecture: A computer system with the capability of processing 32 bits of information
simultaneously in each cycle of the computer. Like a 32-lane highway, the path on which data flows
through the computer carries 32 bits of data at once.
Throttling: Regulating the flow or speed (of I/Os, for example).
Throughput: Speed with which a computer processes data.
Track: A single track is one strip of disk space beginning at one point on the surface and continuing
around in a circle ending at the same point. The tracks are concentric rings, not a spiral like the grooves
on a phonograph record. Each surface has many tracks.
V
VAX: The brand name of a computer system made by Digital Equipment Corporation of Maynard,
Massachusetts. VAX is a high-performance, multiprogramming computer system based on a 32-bit
architecture. VAX stands for Virtual Address eXtension (The Digital Dictionary). The name indicates the
ability of the hardware to give the user a limited amount of computer memory to work with while simulating
much more. See also Virtual Memory.
VAXcluster: A group of OpenVMS computer systems which can share resources such as disk and tape
storage.
VMS: Virtual Memory System. See OpenVMS.
Virtual Address: The address of a location in a conceptual memory space that may or may not
correspond to a location in the computer's physical memory, but which is translated by the computer in
such a way as to make it appear that it does so correspond.
Virtual Memory: In the OpenVMS operating system, to give the appearance of having more memory
than it does, the operating system software places some of the user's programs and data in peripheral
storage (usually on a disk). When they are needed, the system "swaps" one or more blocks of data from
the disk with one or more pages of data from memory which are not currently in use. This gives the
appearance of having much more memory than there is in actuality.
VMB.EXE: The main boot program for the OpenVMS operating system.
Volume: In OpenVMS, the word volume refers to a structured (initialized) disk. When considering a disk
as a logical (conceptual) unit of storage, rather than a physical unit of storage, it is referred to as a
volume.
Volume Retention Dates: Volume retention dates are used by OpenVMS to determine when a file on
the volume "expires." When a file is created, its expiration date is set to the current date and time plus
the specified maximum retention time. Each time the file is accessed, the minimum retention time is
added to the current time. If the sum is greater than the expiration date, the expiration date is recomputed. This
allows the System Manager to determine which files are not being used and so might be candidates for
archiving.
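A sketch of that bookkeeping follows; the retention times are invented, and the way the expiration date is
recomputed here (current time plus the maximum retention time) is an assumption, not a statement of the
exact OpenVMS rule:

    #include <stdio.h>
    #include <time.h>

    #define MAX_RETENTION (30L * 24 * 3600)  /* e.g. 30 days, in seconds */
    #define MIN_RETENTION ( 7L * 24 * 3600)  /* e.g. 7 days, in seconds */

    static time_t expiration;

    static void on_create(time_t now)
    {
        expiration = now + MAX_RETENTION;      /* creation time plus maximum */
    }

    static void on_access(time_t now)
    {
        if (now + MIN_RETENTION > expiration)  /* current time plus minimum */
            expiration = now + MAX_RETENTION;  /* assumed recomputation */
    }

    int main(void)
    {
        time_t now = time(NULL);
        on_create(now);
        on_access(now + 25L * 24 * 3600);      /* an access 25 days later */
        printf("file would expire at %s", ctime(&expiration));
        return 0;
    }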
Volume Set: OpenVMS has the capability of treating one or more physical disk drives as one disk. This
capability is implemented in software and does not involve any additional electronics or cabling. When
two or more disks are so combined, the combination is referred to as a volume set. A volume set is a
group of two or more disks combined so as to be treated by OpenVMS as a single disk equal in capacity
to the total capacities of all the disks in the set.
VPA: VAX Performance Advisor. A performance analysis software tool from Digital Equipment
Corporation. VPA provides limited fragmentation analysis capability.
W
Window: The term window, as it relates to this book, means the set of retrieval pointers the system
keeps in memory to access the file. If the file is contiguous, only one pointer is needed to access the file.
A window typically holds seven pointers.
Window Turn: The situation in which OpenVMS has to load new retrieval pointers from a file's header to
gain access to the desired portion of a file. If the file is contiguous, only one pointer is needed to access
the file. A window typically holds seven pointers, so a file can be fragmented into up to seven pieces and
still be accessed with no further action on the part of the operating system. When there are eight or
more pieces, however, one or more pointers have to be flushed out of the window and new pointers
loaded to locate the later parts of the file. This is known as a window turn. If a file is fragmented into
many pieces, window turns can become a major performance bottleneck.
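As a rough, simplified model (assuming a seven-pointer window, a straight sequential read, and a full
window of new pointers loaded on each turn), the number of window turns grows with the number of
fragments:

    #include <stdio.h>

    int main(void)
    {
        int window_size = 7;  /* retrieval pointers held in a window (typical figure) */
        int fragments;

        /* The first window covers seven extents; each further group of
           seven extents is assumed to cost one window turn. */
        for (fragments = 1; fragments <= 50; fragments += 7) {
            int turns = (fragments - 1) / window_size;
            printf("%2d fragments -> about %d window turns\n", fragments, turns);
        }
        return 0;
    }

A contiguous file, needing only one pointer, never causes a window turn at all.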