The Definitive Guide to Windows 2003 Storage Resource Management

Evan Morris
Introduction to Realtimepublishers
by Sean Daily, Series Editor
The book you are about to enjoy represents an entirely new modality of publishing and a major
first in the industry. The founding concept behind Realtimepublishers.com is the idea of
providing readers with high-quality books about today’s most critical technology topics—at no
cost to the reader. Although this feat may sound difficult to achieve, it is made possible through
the vision and generosity of a corporate sponsor who agrees to bear the book’s production
expenses and host the book on its Web site for the benefit of its Web site visitors.
It should be pointed out that the free nature of these publications does not in any way diminish
their quality. Without reservation, I can tell you that the book that you’re now reading is the
equivalent of any similar printed book you might find at your local bookstore—with the notable
exception that it won’t cost you $30 to $80. The Realtimepublishers publishing model also
provides other significant benefits. For example, the electronic nature of this book makes
activities such as chapter updates and additions or the release of a new edition possible in a far
shorter timeframe than is the case with conventional printed books. Because we publish our titles
in “real-time”—that is, as chapters are written or revised by the author—you benefit from
receiving the information immediately rather than having to wait months or years to receive a
complete product.
Finally, I’d like to note that our books are by no means paid advertisements for the sponsor.
Realtimepublishers is an independent publishing company and maintains, by written agreement
with the sponsor, 100 percent editorial control over the content of our titles. It is my opinion that
this system of content delivery not only is of immeasurable value to readers but also will hold a
significant place in the future of publishing.
As the founder of Realtimepublishers, my raison d’être is to create “dream team” projects—that
is, to locate and work only with the industry’s leading authors and sponsors, and publish books
that help readers do their everyday jobs. To that end, I encourage and welcome your feedback on
this or any other book in the Realtimepublishers.com series. If you would like to submit a
comment, question, or suggestion, please send an email to [email protected],
leave feedback on our Web site at http://www.realtimepublishers.com, or call us at 800-509-0532 ext. 110.
Thanks for reading, and enjoy!
Sean Daily
Founder & Series Editor
Realtimepublishers.com, Inc.
Table of Contents
Introduction......................................................................................................................................i
Chapter 1: Introduction to Windows Server 2003 Storage Resource Management ........................1
Overview..........................................................................................................................................1
Plan for Success ...................................................................................................................3
SRM .....................................................................................................................................3
The Ever-Growing Need for Storage...............................................................................................4
The Storage Benefits of WS2K3..........................................................................................4
Overview of Windows Storage........................................................................................................6
Frequently Asked Questions ................................................................................................7
Hardware Foundation...............................................................................................7
OSs and Storage.......................................................................................................8
Windows Storage Basics..........................................................................................8
Windows Server Advanced Storage ........................................................................8
Windows Server Third-Party Enhancements...........................................................8
Storage Management ...............................................................................................8
The Future of Windows Storage Technologies .......................................................9
Windows Server Storage Architecture.............................................................................................9
Application Layer ..............................................................................................................10
Filter Drivers......................................................................................................................10
Storage Managers...............................................................................................................11
Device Drivers ...................................................................................................................11
Windows Server Storage Features .................................................................................................12
Core OS Functionality .......................................................................................................12
Dynamic Volume Management .............................................................................12
Distributed-Link Tracking .....................................................................................13
Volume Shadow Copy Service ...................................13
System File Protection ...........................................................................................14
Indexing Service ....................................................................................................14
Administrative Benefits .....................................................................................................16
Windows Server Backup........................................................................................18
LDM.......................................................................................................................20
Disk Quotas............................................................................................................21
HSM.......................................................................................................................23
End-User Benefits..............................................................................................................23
Information at Your Fingertips ..............................................................................23
File Encryption.......................................................................................................24
What Is Missing? ...............................................................................................................24
Information Storage Options..........................................................................................................24
SRM Products ....................................................................................................................25
Summary ........................................................................................................................................25
Chapter 2: Analyzing Your Storage...............................................................................................26
Phase 1: Analyzing Storage Requirements ....................................................................................26
Storage Analysis Activities................................................................................................27
Storage Analysis Goals ......................................................................................................28
Levels of Auditing .............................................................................................................28
Types of Audit Information ...............................................................................................30
Auditing File and Folder Access........................................................................................30
Storage Tools .................................................................................................................................32
Native Windows Server Tools ...........................................................................................32
File Server Management MMC .............................................................................32
Performance Monitor .............................................................................................35
Storage-Management Utilities ...........................................................................................41
DiskPart..................................................................................................................41
Driverquery ............................................................................................................44
WMIC ....................................................................................................................45
Cleanmgr................................................................................................................48
Defrag ....................................................................................................................49
Event Utilities ........................................................................................................51
Forfiles ...................................................................................................................51
Freedisk..................................................................................................................51
Fsutil ......................................................................................................................52
Openfiles ................................................................................................................52
RSS ........................................................................................................................53
Systeminfo .............................................................................................................53
TakeOwn................................................................................................................53
Additional Windows Server Resources .............................................................................53
Administration Pack Tools ....................................................................................53
Support Tools.........................................................................................................54
WS2K3 Resource Kit Tools...................................................................................54
WS2K3 Feature Packs ...........................................................................................56
Win2K Server Resource Kit ..................................................................................56
Windows Server Resource Kit Security Tools ......................................................58
Analyzing Storage Usage Tools.....................................................................................................59
Summary ........................................................................................................................................59
Chapter 3: Analyzing and Planning Storage..................................................................................60
Phase 2: Planning SRM .................................................................................................................60
Outcome of Storage Analysis and Planning ..................................................................................61
Trend Analysis and Capacity Planning..............................................................................62
Storage Management Decision Points ...............................................................................65
SRM Product and Process..................................................................................................67
SRM Product Evaluation ...................................................................................................67
StorageCentral Installation.....................................................................................67
Auditing with StorageCentral ................................................................................68
Tips for Selecting Objects......................................................................................73
Working with Report Data.....................................................................................76
Customizing Reports..............................................................................................80
Preparing for the Next Phase .........................................................................................................81
Eliminating Duplicate Files ...................................................................................82
Eliminating Unused Files.......................................................................................83
Eliminating Wasted Space .....................................................................................84
Reducing Consumption..........................................................................................85
Enforcement Policies .........................................................................................................85
Support of Storage Applications....................................................................................................87
Summary ........................................................................................................................................89
Chapter 4: Developing the Storage Resource Management Solution............................................90
Phase 3: Developing the SRM Solution.........................................................................................90
SRM Project Team Roles...................................................................................................91
General Project Guidelines ................................................................................................92
Project Tools ..........................................................................................................92
Deployment Template............................................................................................92
Communication Plan..............................................................................................94
Minimizing Disruptions .........................................................................................95
Deployment Topology ...........................................................................................95
Delegating Administrative Functions ................................................................................95
Developing an Organizational Storage Policy...............................................................................96
Product Support and Escalation Procedures ......................................................................97
SRM Goals.....................................................................................................................................97
SRM Tools .........................................................................................................................97
Windows Server Disk Quotas............................................................................................97
Creating Additional Storage Capacity .........................................................................................103
Windows Server DFS ......................................................................................................104
Windows Server RSS.......................................................................................................104
Supported Storage Systems..........................................................................................................105
Networked Storage...........................................................................................................105
Virtualization ...................................................................................................................107
Storage Solutions .........................................................................................................................108
Storage Service Provider..................................................................................................108
Storage Devices ...............................................................................................................108
Disk Drives ......................................................................................................................110
Storage Density....................................................................................................110
SCSI versus IDE ..................................................................................................111
RAID................................................................................................................................113
RAID Controllers.................................................................................................114
Performance Design.....................................................................................................................115
Performance Design: Exchange Server Example ............................................................117
Summary ......................................................................................................................................118
Chapter 5: Piloting and Revising the SRM Plan..........................................................................120
Installing and Testing the SRM Solution.....................................................................................122
Documenting Installation Procedures ..............................................................................122
Unattended Installation ........................................................................................123
Service Accounts .................................................................................................123
Pre-Deployment Testing the SRM Solution ................................................................................125
Testing System Stability ..................................................................................................125
SRM Application Monitoring ..............................................................................127
NT 4.0 to Windows Server Upgrade and Application Compatibility..................127
DFS ......................................................................................................................127
MSCS Clusters.....................................................................................................127
Additional Client-Side Impact .............................................................................128
Offline Files .........................................................................................................129
Interaction with Third-Party Utilities...................................................................129
Antivirus ..............................................................................................................129
Backup Applications............................................................................................130
Other Disk Utilities ..............................................................................................130
File Blocking and Quarantining...........................................................................131
Performance Impact .........................................................................................................131
SRM Filter Drivers ..............................................................................................132
Performance Monitor Objects..............................................................................134
Performance Testing Results ...............................................................................135
NetBench..............................................................................................................136
NetBench Test Conclusions.................................................................................139
Iometer .................................................................................................................140
Performance Testing Conclusions .......................................................................142
Inter-System Interactions .................................................................................................142
Testing Quotas .....................................................................................................142
Communication and Education Plan............................................................................................143
Pilot Deployment .........................................................................................................................145
Assess Effectiveness of Implemented Solution ...............................................................145
Preparing for Deployment................................................................................................145
Assigning SRM Support Boundaries and Roles ..................................................145
Summary ......................................................................................................................................146
Chapter 6: Deploying the SRM Solution .....................................................................................147
The Deployment Phase ................................................................................................................148
SRM Goals and Components...........................................................................................149
Return on Investment.......................................................................................................150
Storage Management .......................................................................................................150
Organizational View of Storage Management.....................................................151
Storage Management Strategies...................................................................................................152
Product Selection Criteria ....................................................................................153
Policy-Based Object Management...................................................................................153
Device Configuration and Management ..........................................................................154
Enterprise Storage Management ......................................................................................155
Application-Centered Storage Management....................................................................155
Fibre-Channel SAN Approach to Storage Management .................................................155
Project Management ....................................................................................................................156
Critical Path .....................................................................................................................156
Milestones ........................................................................................................................156
Risk Analysis ...................................................................................................................156
Risk Mitigation Roles ..........................................................................................159
Mitigation Techniques .........................................................................................159
Identifying Project Issues.....................................................................................159
Technical Issues ...................................................................................................160
People Issues........................................................................................................161
Resource Constraints .......................................................................................................161
Change Control ....................................................................................................162
Extending AD ......................................................................................................162
Best Practices ...............................................................................................................................165
Hardware Standardization................................................................................................166
Server Naming Standards ................................................................................................166
Server Consolidation........................................................................................................167
SAN Best Practices ..........................................................................................................168
Success Measurement Criteria.........................................................................................168
Summary ......................................................................................................................................169
Chapter 7: Manage and Maintain the SRM Solution...................................................................170
Project Management Aspects.......................................................................................................170
Security Issues .............................................................................................................................171
Microsoft Baseline Security Analyzer .............................................................................171
Systems Management and Monitoring ........................................................................................172
Anticipating Changes.......................................................................................................172
OS Monitoring .................................................................................................................173
Storage Event Monitoring................................................................................................173
Storage Application Monitoring ......................................................................................174
MOM....................................................................................................................175
Integration with Other Applications ....................................................................180
Improving the System ......................................................................................................180
Maintaining Availability......................................................................................180
Improving MTBF.................................................................................................181
Improving MTTR.................................................................................................181
File Share Security...........................................................................................................181
Creating a Secure Drop Directory........................................................................182
Ongoing Process of Storage Management.......................................................................189
Summary ......................................................................................................................................192
Chapter 8: SRM and Storage Futures ..........................................................................................193
Immediate Future .........................................................................................................................193
The Future of SRM ..........................................................................................................193
Storage-Management Utilities .............................................................................194
DiskPart................................................................................................................194
Fsutil ....................................................................................................................196
Enhanced Device Support....................................................................................197
GPT Disks............................................................................................................197
Cluster Support .....................................................197
SAN Boot.............................................................................................................199
Multipath I/O .......................................................................................................199
Volume Mounting................................................................................................199
DAS vs. SAN vs. NAS ....................................................................................................200
Interoperability.................................................................................................................201
SAN Management API ....................................................................................................202
Hardware Technology’s Future ...................................................................................................203
Speeds and Feeds .............................................................................................................203
2Gbps Fibre Channel and Beyond .......................................................................203
Fibre-Channel Topology......................................................................................203
10Gb Ethernet .....................................................204
Volume Management.......................................................................................................204
The Role of HBAs ...........................................................................................................204
Virtualization ...................................................................................................................204
In-Band ................................................................................................................205
Out-of-Band .........................................................................................................205
Distance Mirroring...........................................................................................................205
Native Fibre Channel to Disk ..........................................................................................206
SAN Boot.........................................................................................................................206
New Device Classes.........................................................................................................206
Bus Architecture ..............................................................................................................207
Software Technology Futures ......................................................................................................207
WinFS ..................................................................................................................207
DEN Enhancements .............................................................................................208
Dynamic Volume Management ...........................................................................209
Multipath I/O .......................................................................................................209
Security ................................................................................................................209
Shared File Systems.............................................................................................209
Storage Protocols .............................................................................................................210
FCIP .....................................................................................................................210
Storage over IP and iSCSI ...................................................................................210
The Direct Access File System ............................................................................210
Storage Management ...................................................................................................................211
SAN Devices....................................................................................................................211
Policy-Based Management ..............................................................................................211
Operations and Procedural Futures..............................................................................................211
Storage Certifications.......................................................................................................212
Enterprise Backup Strategies ...........................................................................................212
BCVs....................................................................................................................212
Serverless Backup................................................................................................213
Summary ......................................................................................................................................214
Appendix A: SRM Software and Hardware Vendors..................................................................215
Appendix B: SRM and Storage Web Sites, Portals, and Mailing Lists.......................................217
Copyright Statement
© 2004 Realtimepublishers.com, Inc. All rights reserved. This site contains materials that
have been created, developed, or commissioned by, and published with the permission
of, Realtimepublishers.com, Inc. (the “Materials”) and this site and any such Materials are
protected by international copyright and trademark laws.
THE MATERIALS ARE PROVIDED “AS IS” WITHOUT WARRANTY OF ANY KIND,
EITHER EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE,
TITLE AND NON-INFRINGEMENT. The Materials are subject to change without notice
and do not represent a commitment on the part of Realtimepublishers.com, Inc or its web
site sponsors. In no event shall Realtimepublishers.com, Inc. or its web site sponsors be
held liable for technical or editorial errors or omissions contained in the Materials,
including without limitation, for any direct, indirect, incidental, special, exemplary or
consequential damages whatsoever resulting from the use of any information contained
in the Materials.
The Materials (including but not limited to the text, images, audio, and/or video) may not
be copied, reproduced, republished, uploaded, posted, transmitted, or distributed in any
way, in whole or in part, except that one copy may be downloaded for your personal, noncommercial use on a single computer. In connection with such use, you may not modify
or obscure any copyright or other proprietary notice.
The Materials may contain trademarks, services marks and logos that are the property of
third parties. You are not permitted to use these trademarks, services marks or logos
without prior written consent of such third parties.
Realtimepublishers.com and the Realtimepublishers logo are registered in the US Patent
& Trademark Office. All other product or service names are the property of their
respective owners.
If you have any questions about these terms, or if you would like information about
licensing materials from Realtimepublishers.com, please contact us via e-mail at
[email protected]
[Editor's Note: This eBook was downloaded from Realtime Nexus—The Digital Library. All
leading technology guides from Realtimepublishers can be found at
http://nexus.realtimepublishers.com.]
Chapter 1: Introduction to Windows Server 2003 Storage
Resource Management
Arguably, storage is the most important area of Information Technology (IT). Certainly network
infrastructure is required to provide access, and server operations are necessary to process and
share information, but storage is the heart and soul of an organization’s information system.
When storage is endangered or stored information is lost, the other pieces of the system become
irrelevant—the system becomes technology without the information. This guide takes you on a (hopefully enjoyable) journey through the Windows approach to storage and how best to make use of it.
Windows Server 2003 (WS2K3) provides exciting new storage-related features (especially
compared with the offerings of Windows NT and Windows 2000—Win2K). However, this guide
isn’t simply an opportunity for lavish praise of Windows Server and storage resource
management (SRM). The reality is that storage and SRM are necessary evils.
Perhaps you have worked for one of those companies whose approach to storage management was simply to keep adding storage. When the dot-com boom went bust, however, most companies began slashing budgets, which put an end to this storage management methodology. Systems and network administrators have been left with the job of cleaning up and managing the resulting storage nightmare. Compounding the problem, users still expect that simply adding more storage is the solution. To clean up this mess and ensure the future IT health of your organization, you need to employ a thorough SRM methodology.
Overview
I have organized this book in the form of a deployment methodology—the information is
structured to logically flow in a manner to help you with your storage deployment and ongoing
management. As a fledgling consultant, I was taught to organize my projects in a methodology
such as the one that I’ve used to organize this book. I’ll break out each of the subsequent
chapters into the following action phases: Analyze, Plan, Develop, Pilot, Deploy, and Maintain.
If you’re familiar with the Microsoft Solutions Framework (MSF), these phases correspond,
respectively, to the Vision/Scope, Planning, Development, Proof of Concept and Pilot,
Deployment, and Post Implementation Review phases in the Microsoft Windows 2000 Enterprise
Project Planning Workbook. The following list provides a brief overview of each phase of the
process:
•	Analyze—Gather usage information. Determine how storage is being used: which types of information are being stored and where. Decide whether storage space and performance are adequate, and prepare to use storage reporting tools to gather information. In addition, gather business information. Understand the need for storage at the business level (storage affects not only file sharing but also collaboration on business functions). A quick capacity-snapshot sketch appears just after this overview.
•	Plan—Define the problems and prioritize solutions. Ask yourself, Can we find a better way? Lay out the organizational policies in the following areas:
	•	Supported storage systems
	•	Supported storage applications
	•	Storage quotas or allocations
	•	Restricted file types
	•	Enforcement policies
	•	Support and escalation plan
	•	Backup and disaster recovery procedures
•	Develop—Build the solution for testing. Use storage reporting tools and prepare to use them in real-time production scenarios.
•	Pilot—Test the solution and revise it based on feedback. Install and test the SRM product(s), communicate policies, educate end users, and assess the effectiveness of the implemented solution.
•	Deploy—Deliver the solution to the target audience. Bring the tested pilot systems into production and gather feedback.
•	Maintain—Continue to support the solution and prepare to improve it as needed. Monitor disk usage and add storage as necessary (hopefully only for performance upgrades or to replace defective hardware).
I have structured this book around this methodology so that you can get started on the initial
phases while you’re waiting to read the next chapters.
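If you want a head start on the Analyze phase, even WS2K3's built-in tools can produce a rough capacity snapshot. The following batch sketch uses WMIC (covered in Chapter 2); the server names and the report path are placeholders for illustration, not part of any prescribed SRM toolset:

	@echo off
	rem Capture per-volume capacity figures for a set of file servers.
	rem FILESRV1, FILESRV2, and C:\reports are illustrative names only.
	for %%S in (FILESRV1 FILESRV2) do (
	    wmic /node:%%S logicaldisk where drivetype=3 get systemname,caption,size,freespace /format:csv >> C:\reports\capacity.csv
	)

Even a crude comma-separated snapshot such as this, collected on a schedule, provides the raw trend data that the analysis activities in Chapter 2 build on.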
Solution Acceleration
For a simplified project methodology, consider the one used by Microsoft for the company’s Solution
Accelerators. Check out http://www.microsoft.com/technet/itsolutions/default.asp for the full list of solution
accelerators. Microsoft provides accelerators for solutions such as messaging and e-commerce, each of
which has a storage management component; however, there is no overall storage management solution.
The simplified project methodology consists of three phases: Plan, Deploy (also known as Build), and
Operate (PDO). Although this simplified version is useful from the standpoint of the documentation
provided by Microsoft for the solution accelerators, the importance of the analysis and pilot phases of an
SRM project cannot be overstated. The more time and effort spent on these phases of the project, the
more likely the solution will succeed.
Microsoft’s Solution Accelerators were formerly called the Microsoft Systems Architecture (MSA). At
http://www.microsoft.com/technet/itsolutions/msa/Default.asp, you can see the Plan, Build, and Operate
phases. Follow the download links to access the Service Blueprints, particularly Chapter 4 “Storage
Devices”, Chapter 9 “File & Print Services”, and Chapter 13 “Backup & Recovery Services”.
Plan for Success
Although many of us “techie types” despise project methodology, it can make the difference
between success and failure. The greatest threat of project failure comes from attempting to deploy a solution that hasn't been tested and is the result of an incomplete analysis—essentially, attempting to solve the wrong problem.
Perhaps there is pressure on your IT department to expand the quality of service (QoS) that you
deliver (for example, tighter service level agreements—SLAs—and more available storage),
while your budget has been cut or frozen. In this situation, your most expensive option is to
let these pressures drive your day-to-day operations instead of developing an SRM project.
Another risk to the project is from scope creep—introducing, in the middle of the project, new
requirements or problems that the solution must solve. The problem is that the success of the
project is measured by these new requirements, yet it is constrained by the original resource
inputs. A thorough risk analysis is highly recommended; however, this analysis is unique to each
environment, so it isn’t included in this guide.
Over the course of this book, we’ll explore the entire process of creating storage that is useful to
the end user and effective for the organization. We’ll look at the processes of creating shares,
installing quota management, defining policies for file storage and share management standards,
and developing the all-important support, disaster recovery, and escalation policy.
SRM
SRM is not a new concept; it has been around for years, but the timing is ripe in the storage
world for SRM to gain front-page attention. SRM is a subset of storage management, which
encompasses a broader scope—everything from monitoring storage area network (SAN)
connections to proactively notifying of disk read errors in advance of spindle failure.
SRM focuses on how well storage resources are serving their intended function—the business
needs of the end user, either on an individual or group level. SRM ensures that storage devices
such as disks are kept available for business functions and meet service level objectives (SLOs,
which are part of SLAs) and involves quite a bit of planning. You must plan even detailed
choices such as the appropriate file systems and storage media (hard disks, tapes, optical/CD-ROMs, and so on) and storage systems (disk and tape libraries) as well as the ongoing, daily
storage management activities such as file maintenance, backup, and recovery operations.
The following list outlines the goals of an ideally managed storage or SRM environment, which
we’ll pursue throughout the course of this book:
•	Efficiently uses existing resources—Reports whether storage is being wasted or over-utilized (the simplest native report of this kind appears in the sketch after this list).
•	Secures resources—Ensures that information is kept secure, including protection from viral attacks.
•	Manages resources in a cost-effective manner—Provides the ability to manage storage resources effectively, thus reducing administrative intervention. Ideally, the storage will be self-provisioning, responding to a set of policies that the administrators have defined or at least approved.
•	Uses cost-effective storage resource types—Provides the ability to place information on the appropriate storage type, which is the principle behind Hierarchical Storage Management (HSM) and Remote Storage Services (RSS).
•	Employs improved storage-management functionality—Takes advantage of new features such as the ability to restrict unwanted file types.
•	Provides increased access to the information stored on storage resources—Makes finding and accessing files over the network easier for end users, including integration with directory services.
•	Provides administrative detail about storage resources—Gives administrators access to desired information at a detailed level, such as per file share, or aggregated by volume, server, department, group, and organizational levels.
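As a down payment on the first of these goals, even the built-in Fsutil utility (covered in Chapter 2) reports raw allocation figures per volume. A minimal sketch—the drive letters are examples, and fsutil requires administrative rights:

	rem Show total, free, and available bytes for each volume of interest.
	fsutil volume diskfree C:
	fsutil volume diskfree D:

Fsutil stops at raw byte counts; turning those counts into per-share, per-user, or per-department detail is exactly the gap that the SRM reporting tools discussed later fill.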
The Ever-Growing Need for Storage
If you aren’t already convinced by the backlog of storage improvement requests in your inbox or
voice mailbox, there are many market studies that show growth in storage usage and predict
future requirements for capacity planning, such as those available from the International Data
Corporation (IDC) and GartnerGroup. Suffice it to say that demand for storage will continue to grow at an astounding rate while the cost to deliver capacity continues to decrease at a frenetic pace and the ability to manage storage tags along behind.
With Windows Server, as more storage space is needed, you can easily create larger storage
arrays or even build another server and make the new storage available. However, the result of
these actions is a management headache—each server requires separate administrative efforts,
and locating files can be difficult. WS2K3 takes steps toward solving the management headache,
and we’ll discuss how to take advantage of these features.
The Storage Benefits of WS2K3
I was reminded of a favorite phrase from Postcards from the Edge at a meeting to discuss storage
management: “Instant gratification takes too long.” Windows Server is truly the instant
gratification of the operating system (OS) world. The OS certainly provides the easiest way to set
up a network server, whether for applications or file and print services. The ease of deployment
and operations has made Windows Server the choice for organizations small and large, and it is
currently displacing the leaders in networked storage and applications. But instant gratification is
a mixed blessing and can lead to storage chaos.
Many Windows Server alternatives offer access to storage, including embedded OSs that do
nothing but drive storage devices. However, Windows Server provides a wide variety of storage
access methods, and later we’ll look at Windows Server as a customized OS for such devices.
The first thing you might notice when you install and log on to a new WS2K3 server is the
Manage Your Server tool, which Figure 1.1 shows. Resist the initial temptation to select the
Don’t display this page at logon check box (if you do so, you can restart the tool later by
selecting it from the Start menu, Administrative Tools); instead, explore the tool a bit before you
dismiss it. We’ll actually run through it later to see how it configures Windows Servers in
various roles.
Figure 1.1: The Manage Your Server tool in WS2K3.
If you’re familiar with NT and are researching the advantages of moving from NT to WS2K3,
you’ll discover that many are inherent to WS2K3 such as the ability to add and remove storage
devices without rebooting. You can perform online disk management without shutting down the
system and interrupting users. You can add disks and extend or mirror a volume, and the changes
are available immediately without rebooting. In addition, you can use wizards or the Disk
Management console to perform remote management of storage volumes across the network.
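For example, extending a volume into newly added disk space is a short DiskPart session (DiskPart is covered in depth in Chapter 2). The volume number here is purely illustrative—run list volume first to identify your own:

	C:\>diskpart
	DISKPART> list volume
	DISKPART> select volume 2
	DISKPART> extend
	DISKPART> exit

The extension takes effect immediately; users attached to shares on that volume never see an interruption.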
Additional features reside at the application level and take some configuration and getting used
to, such as the Distributed file system (Dfs), which Figure 1.2 shows, which lets you spread file
shares over multiple servers.
Figure 1.2: Windows Server’s DFS.
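You can build the namespace from the GUI that Figure 1.2 shows or from the command line with the Dfscmd utility. In this sketch, the root \\company.com\public and the server and share names are placeholders:

	rem Map a DFS link to a share on a member server (names are illustrative).
	dfscmd /map \\company.com\public\projects \\FILESRV1\projects "Project files"
	rem Display the resulting link structure of the root.
	dfscmd /view \\company.com\public /full

Because users browse to \\company.com\public\projects rather than to a physical server, you can later move the underlying share without touching a single client mapping.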
Overview of Windows Storage
This book will cover the following Windows Server storage topics:
•	The different types of physical storage supported by Windows Server: optical, tape, and magnetic media
•	The different storage architectures available in Windows Server: direct attached storage (DAS), removable storage, SAN, and Network Attached Storage (NAS)
•	Means of increasing fault tolerance through Redundant Array of Inexpensive Disks (RAID) and other forms of redundancy
•	How to optimize design and improve performance through bandwidth, latency, and data-transfer rate tweaks
•	Windows Server’s storage implementation, including file systems (NT file system—NTFS, file allocation table—FAT, FAT32, CD-ROM file system—CDFS, and the Universal Disk Format file system—UDFS), installable file systems, device drivers, and filter drivers
•	NTFS’s new features: file encryption, reparse points, directory junctions, volume mount points, sparse files, the change journal, and distributed-link tracking
•	Storage feature enhancements—such as hot plug, hot swap, and extension and expansion—that third-party applications provide
•	How to make the most of Removable Storage Manager (RSM), RSS, HSM in Windows Server, Dfs, quota management, File Replication Service (FRS), and the Indexing Service
•	The various options in Offline Files and how best to use them
•	Microsoft applications that can impact your storage resource planning: Microsoft Cluster Server (MSCS) and Exchange Server 2003 with its Web Storage System (WSS) and Exchange installable file system (ExIFS)
•	Third-party vendor applications that offer backup and file recovery, defragmentation, volume replication, availability and load balancing, antivirus for storage, and HSM solutions and augmentations; included in this list are email archival and compliance products, which are seeing increased interest
•	Whether you need third-party storage-management tools, including advanced file system management that provides monitoring and reporting on network storage usage (from the viewpoint of both application-centric and storage-centric utilities), quota management, and file system screening
•	The future of storage technology, including new software applications for storage and new hardware such as new system bus and SAN architectures and protocols (small computer system interface—SCSI, Fibre Channel—FC, Internet SCSI—iSCSI, and so on)
Frequently Asked Questions
More specifically, the following list provides frequently asked questions (FAQs) that I’ll answer in this book, broken down by topic:
Hardware Foundation
•	What are the different types of storage technology available, and why would I want to choose one over another?
•	At the highest level, what are the overall storage architectures that I should consider?
•	When should I use DAS, NAS, or SAN?
•	Which types of interconnect would I use in each circumstance—what are the relative merits or performance capabilities of each type?
•	Hard drives are getting smaller and faster, but capacities are getting bigger; how can this be?
•	Should I consider upgrading my old drives?
•	Which type of drive is better?
•	Given two drives at the same price, which one should I choose?
•	How do hard drives work, and what are the factors that determine hard drive performance?
•	I’ll need some type of backup device to protect my data; what are my options, and why would I want to choose one over another?
•	What am I missing out on in terms of brand-new drive technology?
OSs and Storage
•	What can I do to make storage useful to the end user?
•	What does Windows Server offer compared with other OSs—what types of usability, data protection, and administrative manageability features does this OS provide?
•	When is Windows Server a good choice for storage, and when is it not?
•	How robust is the OS at the file-system level—how sure am I that my bits on disk will always translate to useful information?
•	To be useful, an OS should provide some set of disk utilities; which types of utilities are needed to make this set functional?
Windows Storage Basics
•	What are the features that most people will be using, even on their Windows XP computers?
•	What are the features that I might take for granted—what would I use without even knowing it?
•	How do I take advantage of some of the new features?
Windows Server Advanced Storage
•	What features does the WS2K3 product line (Web, Standard, Enterprise, Datacenter, and even the 64-bit version) include that Win2K doesn’t offer?
•	What are the features most likely to be used in my server room or data center?
Windows Server Third-Party Enhancements
•	Where does Windows Server fall short, and what type of application should I consider purchasing to fill this functionality hole?
•	How do some of the available third-party applications compare?
•	Which applications are worth evaluating, and how should I go about doing so?
Storage Management
•	How much storage is available and where?
•	How can I get more information about how my storage is being used—what types of files are placed where, how often are they being used, and by whom?
•	Can I get a big picture of my storage topology as well as drill down into individual systems to see how they’re configured?
•	Can I get detailed performance information about individual RAID subsystems to see how they’re performing and whether a configuration change would provide some benefit?
The Future of Windows Storage Technologies
•	What are the new technologies on the horizon that I’ll actually be buying in a year or two?
•	How likely is it that I’ll be using any high-density re-writeable optical media at any reasonable cost soon?
•	What types of storage will be available at varying cost levels, and how can I match them to the priority of my data?
•	How can I consolidate data from isolated servers and SAN islands into better-managed storage resources?
Windows Server Storage Architecture
Let’s start by looking at what type of storage infrastructure you’ll want to provide for your end
users based on the type of information that they need to store. From there, you can begin to make
a determination about which type of storage architecture (DAS, NAS, SAN, or other) and which
of the Windows Server features you can take advantage of (NTFS, quotas, Dfs, and so on). Then,
when we reach the limitations of the built-in feature set, we’ll look at extending your capabilities
with third-party products.
Windows Server provides a layered approach to storage. The advantage of an abstracted or
layered driver model is that you can replace or swap out underlying layers without drastically
affecting the other layers. What this functionality means to end users is that they can read from
or write to a volume without knowing the underlying file system or even what type of logical
volume is created on what type of disk or array of disks. Let’s consider the idea of Dfs. In this
case, the volumes are created from a pool of disks that may be located across several servers. In
the NT world, we tend to associate volumes with disks. However, in other OSs, such as UNIX,
the view is more file-system based, and volumes are created and mounted independent of or
across several disks.
As Figure 1.3 illustrates, starting at the top, Windows Server’s storage architecture has the
applications that access storage, from your third-party disk utilities to Windows Explorer. These
operate in user mode, typically in the logon context. The filter drivers and file systems exist in
kernel mode, thus they run when the system boots and don’t require a logon context. Underneath
that layer are the volume managers: Logical Disk Manager (LDM) and RSM. These may give
the impression of being application-layer utilities because they’re both managed via the
Microsoft Management Console (MMC); however, they both load as services regardless of the
user logon. Because they are modular, they can be stopped and the OS will still work, albeit at
reduced functionality. Or they can be replaced, as is the case when upgrading the LDM to the
full VERITAS Volume Manager product.
Figure 1.3: Windows Server architecture and layered approach to storage.
Application Layer
At the very top of the Windows Server layered model are the applications that you use as an
administrator. These include backup and removable storage applications or even applications that
run as a background service providing you with benefits that you may take for granted. We’ll
look at each in more detail in later chapters.
Filter Drivers
Filter drivers operate in the Windows Server OS kernel mode. Third-party independent software
vendors (ISVs) write filter drivers to add functionality, such as antivirus and undelete features, to
Windows Server.
Storage Managers
Below the filter driver layer is the volume or storage-management layer. This layer controls the
interaction and layout or configuration of the physical devices attached to your system. The
LDM handles disk systems on your computer and lets you configure partitions and choose the
file system when formatting. The RSM (see Figure 1.4) is designed to manage the devices that
come and go—tape libraries and CD-ROM or optical jukeboxes. Another storage manager
example, which we’ll look at later, is the Windows Server HSM, which also has a filter driver
component. The big benefit of this layer to you as a storage administrator is that you can make
changes to the OS environment—such as adding and removing devices or changing partitions—
without rebooting the server (in the right circumstances, though not in all).
Figure 1.4: The Removable Storage interface in the Computer Management console.
Device Drivers
At the level just above the physical media are the device drivers, provided either by the drive
vendor (even when bundled into the Windows Server installation source) or by Microsoft. Also
provided is a network redirector, which handles requests for storage that is part of a Dfs.
Windows Server Storage Features
If you’re a current Windows Server user, you know that there are many storage enhancements to
the product. But you might not yet be aware of many of the product’s benefits. Some of them are
not apparent from the initial installation and use of the product; discovering them takes some
study and experience with large-scale deployments. I organize the Windows Server storage
features into the following categories:
• Core OS functionality—Services and drivers that affect the OS performance and provide competitive ability or features
• Administrative benefits—Storage-management features that provide an administrative benefit in making the OS easier to support
• End-user benefits—Directory services and storage integration that assist in sharing and locating information
Core OS Functionality
The most integral improvement in Windows Server’s core OS functionality is the file system
known as NTFS. To distinguish between the original NT version and the one delivered with
Win2K Server, we’ll call the newer version NTFS5. Windows Server can boot on and read and
write to NTFS5 volumes as well as legacy file systems such as FAT and NT’s NTFS4. (I’ll give
you more detail about Windows Server’s supported file systems in the next chapter.) NTFS5 lets
you use the Encrypting File System (EFS) to protect sensitive information and files placed on
NTFS volumes. EFS runs as a background service and uses public-key encryption integrated into
Windows Server and, if desirable, Active Directory (AD) to encrypt files. The entire process is
fairly transparent to the end users, as it is integrated with their logon credentials. Also, the
administrator can assist in key recovery, meaning that the encrypted information should never be
lost forever.
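As a quick example, a user can encrypt a folder and its contents from the command line with the built-in Cipher utility; the folder name here is hypothetical:

cipher /e /s:D:\Confidential

The /e switch encrypts, /d decrypts, and /s applies the operation to the given folder and everything beneath it. Running cipher with no switches reports the encryption status of the files in the current folder.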
Dynamic Volume Management
The most notable improvement in the core OS functionality is the support for dynamic volume
management (DVM), which is provided by both the Plug and Play (PnP) driver and the LDM.
This combination allows for DVM without requiring the OS to be restarted. I’m often amazed at
what I can do in Windows Server without a reboot, especially compared with its predecessor.
Given the right hardware and circumstances, I can add storage to a server and even make an
existing volume larger while the power is running. Part of the credit for this functionality is due
to the dynamic-disk support in LDM. After I upgrade a disk volume from basic to dynamic, it
becomes more like having a decent hardware RAID controller: I can extend the volume size
dynamically and I have more options for configuring software fault tolerance. In addition, I can
PnP a wider variety of storage devices and standards including FC, Intelligent Input/Output
(I2O), Institute of Electrical and Electronics Engineers (IEEE) 1394 (Firewire), and Universal
Serial Bus (USB) devices.
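As a minimal sketch of this capability, assuming a dynamic disk with unallocated space available, you can grow a volume while the system is running using the command-line DiskPart utility (the volume number here is hypothetical; list volume shows the actual numbering on your system):

diskpart
DISKPART> list volume
DISKPART> select volume 2
DISKPART> extend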
Although Windows Server’s support for a variety of hardware is much greater than NT 4.0’s support, if
you want to install the latest and greatest cutting-edge technology, you must still be prepared to press
F6 during the Windows Server text-mode installation to provide an original equipment manufacturer
(OEM) driver. If Windows Server Setup doesn’t find your SCSI hard disks or if you receive an
Inaccessible_boot_device STOP error during installation, see the Microsoft article “Text-Mode Setup
May Not Identify Some SCSI Adapters as Plug and Play Devices” at
http://support.microsoft.com/support/kb/articles/q267/5/65.asp. Fortunately, you’ll see less need for
creating the floppy diskette in WS2K3, as more storage drivers ship on the original CD-ROM.
Distributed-Link Tracking
Another core improvement is known as distributed-link tracking. You might have already seen
the direct benefit of this enhancement in action. For example, you might have moved a file
across NTFS volumes or even across computers and noticed that shortcuts and OLE links
automatically undergo a path update. This “magic” update takes place because NTFS5 creates a
volume-wide indexed ID for every file, which lets distributed-link tracking track the file, even
when the location changes. This feature is quite useful for your end-user applications, such as
file sharing on a network server.
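You can peek at the identifier that link tracking relies on by using the built-in Fsutil utility; the file name below is just an example. If the file has ever been the target of a shortcut or OLE link, NTFS will have assigned it an object ID, and this command displays it:

fsutil objectid query C:\Docs\Proposal.doc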
Volume ShadowCopy Service
Perhaps the biggest file system and storage improvement that we’ve been waiting for in WS2K3
is the Volume ShadowCopy Service (VSS). We’ll look at VSS in much more detail in later
chapters, but, for now, you should know that there are two implementations of VSS designed to
provide different services. It seems that the most widely known or talked about implementation
of VSS is the one you are least likely to use right away. This implementation is the framework
for taking snapshots or clones of databases while the application is still running. The second
implementation, often referred to as timewarp (the original project codename, which is still
visible in the executable bits), is directly beneficial for protecting shared files. As you will see,
timewarp provides immediately valuable benefits to your file shares.
You’re most likely familiar with the scenario: You (or someone you know) open a file to
essentially use it as a template for creating another file. You edit away, making tons of changes
and doing search and replace. A reminder pops up and you must hurry off to a meeting, so you
quickly hit File, Save. As you head off to the meeting, you slowly realize that you have
overwritten the original file, which was created just hours earlier for an important client
proposal. Throughout the meeting, you have trouble paying attention, as you wonder how you
will be able to recreate the original file, as no backups exist. No doubt, your boss will call up
with a few changes that he or she wants to make immediately. If only you had the ability to do
snapshots, this situation might not have happened.
The idea behind snapshots is to take a point-in-time backup of the files on a system, but unlike a
traditional full backup, the files are not backed up by copying to another device (which, of
course, takes quite a bit of time). Instead, the OS is instructed to keep a copy of any file that is
changed from the point-in-time that the snapshot was taken. The virtual view of the data, which
is a point-in-time snapshot, can even be presented to another host. The second host sees the data
as it existed previously, which can be a logically constructed representation of physical blocks on
the first server. We’ll explore this option in more detail, as it presents considerations and has
limitations.
In WS2K3, snapshots are implemented through VSS. Microsoft has been demonstrating this
feature and working with vendors to ensure that their storage drivers and devices work well with
VSS. This feature will be widely used because it lets recovery be delegated: end users can recover accidentally deleted files or folders on network shares without requiring administrator intervention. VSS is even used in the Windows Backup application in Windows XP to enable
backup of open files.
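On WS2K3 itself, you can experiment with snapshots from the command line using the built-in Vssadmin utility, for example:

vssadmin create shadow /for=C:
vssadmin list shadows

The first command takes a point-in-time snapshot of the C volume; the second lists the existing shadow copies and their creation times.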
System File Protection
Similarly, System File Protection (SFP) helps protect the Windows Server OS from becoming
unstable or unusable as the result of a critical file being replaced. This built-in Windows Server
feature, which you don’t need to install or maintain, prevents the replacement of essential system
files by protecting OS files. In the event that one of these files is deleted or overwritten, SFP
replaces the file with the original from a cache that it maintains or requests the files from the
original installation media if the cached version is unavailable. SFP even maintains a cache of
applied service packs so that it doesn’t inadvertently use an out-of-date version after you have
applied a service pack. On non-NTFS volumes, you can take a look in
\WINNT\system32\dllcache and see the files, but NTFS provides the advantage of using the
compressed files and folders feature (as was available in previous versions) to limit the amount
of disk space that these files consume.
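If you suspect that protected files have been tampered with, you can ask for an immediate verification pass rather than wait for a replacement to be caught; the System File Checker front end does so:

sfc /scannow

This command scans all protected system files at once and replaces incorrect versions from the dllcache folder or the installation media.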
Indexing Service
Another built-in feature that you might take for granted is the Indexing Service. This service and
the functionality it provides are essential for one of the goals of SRM: the ability for end users to
be able to access stored information. Initially created for the Site Server component of Internet
Information Server (IIS), the Indexing Service is a core component of WS2K3. Once you enable
the service by clicking Yes in the Enable Indexing dialog box that Figure 1.5 shows, the service
automatically starts at system startup, tracks files, and provides index creation, index updates,
optimization, and crash recovery in the event of a server crash or power failure. This feature
offers administrators management flexibility. For finding the indexed information, you can use
the Windows Server Search function, the Indexing Service query form, or a Web browser.
Figure 1.5: Enabling the Indexing Service to start at system startup.
You can also fine-tune the Indexing Service’s performance. To do so, expand Services and
Applications in the Computer Management window, and select the Indexing Service (see Figure
1.6). You must stop the Indexing Service by right-clicking Indexing Service, selecting All Tasks,
then selecting Tune Performance. In the resulting Indexing Service Usage dialog box, click
Customize to reach the Desired Performance dialog box. For the best end-user experience, move
the Querying slider to the right to High load, but keep the Indexing load at Lazy (unless files
change rapidly, in which case the indexes will not be 100 percent up to date if this slider is set to
Lazy).
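Because the Indexing Service runs as a standard service (service short name CiSvc), you can also stop and restart it from a command prompt, which is handy if you want to script the tuning change just described:

net stop cisvc
net start cisvc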
Throughout this book, if I instruct you to click an object (usually in the MMC), then right-click and
select a menu choice, and you don’t see the menu selection to which I am referring, you must have
simply right-clicked the object rather than first clicking the object. In Windows Server, the OS presents
a different menu depending on whether you have first selected the object. Notice in Figure 1.6 that
the Indexing Service is highlighted, which means that I have first selected it.
Figure 1.6: Tuning the Indexing Service’s performance.
Although the Code Red outbreak is, hopefully, a distant memory as you read this, it is important to be
aware that the Code Red worm (and its variants) involved the Indexing Service. Most of the publicity
focused on IIS as the cause because the worms spread through IIS. However, the original security
alert was titled “Unchecked Buffer in Index Server ISAPI Extension Could Enable Web Server
Compromise.” I clarify this point because a new Win2K server has IIS installed and enabled by
default (even if the server is just a file server and not a Web server), so you should apply the security
patch (available at http://www.microsoft.com/Downloads/Release.asp?ReleaseID=30800) to the
Indexing Service until Service Pack 3 (SP3) is released and includes the security patch. Fortunately,
as part of the Trustworthy Computing initiative, Microsoft ensured that WS2K3 ships “secure by default,” and IIS is not installed or enabled.
The features we’ve just discussed are all available when you install Windows, even Win2K
Professional. The following set of features is available to administrators, mostly as storage
management applications.
Administrative Benefits
One of Windows Server’s most immediate benefits is having several storage and share-management functions available in one place—the Computer Management console, which
Figure 1.7 shows. This console lets you create file shares, access information about open
sessions to those file shares, create or manage disk volumes, launch the disk defragmenter, and
manage removable storage and the Indexing Service.
Figure 1.7: Share management functions in the Computer Management console.
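These share-management tasks are also scriptable. As a quick example (the share name and path here are hypothetical), you can publish a folder from the command line with the built-in net share command:

net share Projects=D:\Projects /remark:"Project files"

Running net share with no arguments lists the existing shares, and net share Projects /delete removes the share without deleting the underlying folder.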
Figure 1.8 illustrates a limitation of connecting to a remote computer from the Computer
Management console; you are limited to read-only mode for managing devices. To change a
device driver property or to disable it entirely, you must use a remote control application such as
Virtual Network Computing (VNC) or Win2K Terminal Services.
Figure 1.8: Connecting to a remote computer through the Computer Management console.
WS2K3 enables new remote management functionality. The Terminal Services functionality of
NT 4.0 has evolved into two modes: one for application sharing and one for remote server
management. The latter is now installed by default on WS2K3, but it must be enabled. To do so,
select the Allow users to connect remotely to this computer check box on the Remote tab of the
My Computer system properties window.
In Win2K, Terminal Services isn’t installed by default. To install it, from the Control Panel,
launch the Add/Remove Programs applet, and select Add/Remove Windows Components.
Terminal Services operates in two modes: as an application server or in remote-administration
mode. Select remote-administration mode during the installation wizard-guided setup. One
“disadvantage” of Terminal Services is that each administrator (limited to two connections by
default) maintains a separate window, so they can’t see what the other is doing to the system.
This shortcoming allows for potentially conflicting changes.
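To open one of those remote-administration sessions from a WS2K3 or Windows XP machine, you can launch the Remote Desktop Connection client directly; the server name here is an example:

mstsc /v:server01

Adding the /console switch attaches you to the server’s console session, which avoids creating one of the two separate administrative sessions mentioned earlier.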
Another remote control option is VNC, which shows one, and only one, current logon. VNC is available on a wide variety of platforms.
If you prefer using a browser to administer your remote servers, you might consider adding the
Remote Desktop Web Connection, which Figure 1.9 shows. To access this window, open the
Control Panel, and select Add/Remove Programs. Click the Add/Remove Windows Components
(you can now close the Add/Remove Programs window). Select Internet Information Services,
click Details, select World Wide Web Service, click Details, then select the Remote Desktop Web
Connection check box. You will need to ensure that the Allow users to connect remotely to this
computer check box on the Remote tab of the My Computer system properties window is
selected, as mentioned earlier. You will be amazed at the similarities between WS2K3 and
Windows XP!
Figure 1.9: Adding the Remote Desktop Web connection.
Windows Server Backup
The first storage-management application that you should be aware of and skilled at using is the
Windows Server Backup program, also known as NTBackup (based on the .exe file name). This
application was originally provided to the Microsoft OS by Seagate Software and will be familiar
to you if you’ve ever used the VERITAS Backup Exec program. The big difference between
NTBackup in NT 4.0 and in Win2K Server and WS2K3 is that the Win2K and WS2K3 versions
of the program will back up the System State. For a Win2K Workstation, Win2K Server, and
WS2K3 system, the System State includes the folders and files needed to boot the OS, the
COM+ Class Registration Database, and the registry. On a Windows Server domain controller, a
System State backup includes AD, Certificate Server, and the File Replication Service (FRS).
As Figure 1.10 shows, the NTBackup in WS2K3 introduces a new option, the Automated
System Recovery (ASR) Wizard. ASR replaces the Emergency Repair Disk (ERD) process from
NT and Win2K. ASR provides faster recovery and is easier to use than the ERD process. To use
ASR, you will need the ASR-created floppy disk, the original OS installation CD-ROM, and the
backup media. To start the ASR process, press F2 when prompted during the text-only portion of
Setup. You will then be prompted to insert the ASR floppy disk.
Figure 1.10: Selecting the WS2K3 ASR Wizard option.
In addition, Windows Server Backup supports backing up to hard drive media, including
network-attached devices. The big disadvantage or limitation of Windows Server Backup
compared with the full VERITAS Backup Exec program is that you must manage Windows
Server Backup on each server—there is no central console for enterprise backup. However, you
can run Windows Server Backup from the command line and store the backups on a centralized
server. If you have not yet done so, make a System State backup and ERD now—you’ll be glad
you did.
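As a minimal sketch of that command-line approach, assuming a D:\Backups folder (or a UNC path to a central server), a System State backup looks something like this:

ntbackup backup systemstate /j "System State" /f "D:\Backups\systemstate.bkf"

The /j switch names the job in the backup logs, and /f directs the backup to a file rather than to tape; a scheduled task can run this command nightly.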
Windows Server Backup can communicate with another Windows Server service, RSM, so that
the backup application doesn’t need to deal with device drivers directly. This feature is
especially important when dealing with the type of storage that would cause serious system
problems when removed during file operations, such as automated tape libraries. The backup
application rides on top of the file system drivers, so the backup application is file system
independent; that is, it doesn’t need to know about the different file systems and can read from or
write to tape and FAT or NTFS volumes equally well.
LDM
The second storage management application that you should be skilled at using is the LDM.
Windows Server’s LDM is an improvement to the Disk Administrator application found in NT.
LDM runs as a service and is presented to the end user within the MMC Computer Management
console by selecting Disk Management, as Figure 1.11 shows.
Figure 1.11: The LDM interface.
It is also possible to use the Computer Management console for remote administration, assuming
that you have sufficient rights. (Figure 1.12 shows the resulting dialog box that is presented if
you don’t have sufficient rights.) LDM offers the same RAID levels as NT 4.0 offers for creating
software-based fault-tolerant RAID sets, but the management features for these fault-tolerant
volumes is much improved in LDM. (Many of the improvements result from the benefits of
dynamic disks, which I’ll cover later.) For example, you can now create, break, and recover
NTFS-formatted stripe sets and mirrors without taking the volume offline. Under NT 4.0, if you
wanted to change or repair an existing mirror set or a RAID5 set, you needed to disconnect all
users and reboot the server. Similarly, you can now grow an NTFS volume without taking the
volume offline, assuming that you have a system with hot-pluggable drives. Thus, if a particular
share is running out of disk space, you can add a new disk to the server, then add the free space
provided by the new disk to the existing volume.
Figure 1.12: LDM checks permissions for remote management.
There are two types of disks available to you in LDM: basic disks, which are the default, and
dynamic disks, which are an upgrade that you can choose. We’ll get into more detailed
information later, but for now, think of basic disks as what you might have used in NT 4.0 if you
used fault-tolerant disks (provided by the FTDISK.SYS driver). Existing fault-tolerant disks
created in NT 4.0 will be supported in Windows Server as you upgrade the server, but the disks
can only be repaired and can’t provide all the features of Windows Server disks without being
upgraded to dynamic disks. Windows Server dynamic disks have several advantages over basic
disks. For example, basic disks can’t be extended online, and in order to repair fault-tolerant sets,
you must take the volume offline.
Dynamic disks also have the advantage of being “self-describing”—meaning that all the disks in
a dynamic disk set have a unique identifying signature, and a small database on every disk tracks
all the members in its collection or group. In addition, other disk information is kept in the
database so that changes in SCSI cabling that affect SCSI IDs, Logical Unit Numbers (LUNs),
and host adapter ordering doesn’t adversely affect the disk or host. Because volume
configuration information is no longer contained in the registry as NT FTDISK, you can move
the disk (or set of disks) to another server in the event of a disaster, and the new server will
recognize the disk configuration. Also, the disk configuration database is transactional in nature,
which helps to protect it against loss of updates to the disk configuration.
When adding a new disk, the default type will be a basic disk. When creating a new partition on the
disk, reserve at least 1MB of free space for the disk configuration database to allow for future
upgrade to dynamic disks. If you forget to leave this space, you might be unable to perform the
upgrade to dynamic disk at a later date. Such will be the case when the partition is created outside of
Windows by third-party software or another OS. I have not found this situation to be a problem when
the disk is created within WS2K3, as I have been able to partition the entire disk space, yet convert to
dynamic disks later.
LDM also allows Windows Server to break the 26-drive letter limitation by letting you mount
volumes to directories. This functionality, known as volume mount points, is quite useful even if
you aren’t in danger of using up all 26 possible drive letters. To illustrate, suppose that you just
added a new drive to a system and didn’t want to add it as another drive letter; instead, you
would rather add it to the \Projects folder, which has been growing rapidly. You could mount the
new volume to a folder within the \Projects folder and begin using the space there.
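The built-in Mountvol utility handles this task from the command line. Run without arguments, it lists each volume’s GUID path; you then mount a volume to an empty NTFS folder (the GUID below is a placeholder, not a real value):

mountvol
mountvol C:\Projects\Extra \\?\Volume{xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx}\

Note that the target folder must be empty and must reside on an NTFS volume.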
Disk Quotas
Perhaps the most anticipated feature for file servers in Win2K Server was the new disk quota
feature. Although it is true that NT 4.0 SP4 implemented a quota feature (Profile Quota Manager
or Proquota), it was based only on limiting NT user profiles rather than controlling how much
disk space a user could fill on a particular volume.
I’ll cover Proquota in a bit more depth later and explain the differences between Proquota and
WS2K3’s disk quota feature.
The new NTFS quotas in Win2K and WS2K3 allow for either soft or hard storage limits on a per
user basis for a given NTFS volume. The two types of quotas have been well defined in the
UNIX world, with the soft quota being the threshold for the space that you’re allocated, and the
hard quota being the point that no further disk access is allowed. In Win2K, when you exceed the
soft quota, it is only recorded to the event logs and you’re still allowed disk access; you can
continue running programs without losing any files. However, when you reach the hard quota,
you receive an insufficient disk space error from Windows. In fact, as quotas affect the free disk
space reported to applications, a running program might abort if it attempts to use disk space that
exceeds the hard quota. (A situation that you definitely want to avoid!) As an administrator, weigh carefully which is more important: denying disk access or risking the loss of important business data.
In Figure 1.13, the soft quota is set to 450MB, and the hard quota is set to 500MB. You can also monitor quotas without actually enforcing them by clearing the Deny disk space to users exceeding quota limit check box.
Figure 1.13: Setting disk quotas before enabling them on a volume.
Figure 1.13 shows sample entries for a disk quota, but the quota hasn’t yet been enabled on the
volume (until I click OK or Apply). After you apply the quota, you will receive a pop-up
warning message asking whether you are sure that you want to apply the quota and informing
you that a volume rescan will take place to update disk usage statistics. When you click OK, the
traffic light will turn yellow as the system rebuilds quota system information, then eventually
turn green. A similar message also appears when you disable the quota.
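In WS2K3, the same settings can be applied from the command line with the Fsutil utility. As a sketch matching the figure’s values (quota limits are given in bytes, and the account name is an example):

fsutil quota track C:
fsutil quota modify C: 471859200 524288000 DOMAIN\jsmith
fsutil quota query C:

The track subcommand enables quota tracking without enforcement, modify sets the 450MB warning threshold and 500MB limit for one user, and query displays the per-user quota entries.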
We will explore quotas more in upcoming chapters, as they can be quite complex and there are
some situations in which the built-in or native quota system falls short. In such cases, you must
turn to a third-party quota-enforcement application. You might have noticed that I stated earlier
that you set a quota on a particular volume. Notice in Figure 1.13 that the quota is a property of
my C drive. You must either accept this level of detail or look for an alternative quota-enforcement application because Windows Server’s quota management is a property of each
volume, which can mean a lot of administrative overhead.
HSM
Another feature or application in Win2K that benefits systems administrators is HSM, which also
works with RSS. The idea behind HSM is to automatically move data between high-cost and
low-cost storage media if the files aren’t in active use. Faster, more expensive hard disks can be
reserved for high-demand files, and removable optical disks and tapes can be used as secondary
storage. When the data is needed, it is copied back on demand. RSS (provided by Seagate
Software) monitors the amount of space available on the hard disk, and when the free space
drops below a set level, RSS moves data to remote storage and keeps the directory and property
information up to date. (To do so, it uses reparse points, which we’ll look at later.) Because the
file recall from secondary storage involves latency, there is a performance penalty.
End-User Benefits
WS2K3’s storage features offer many benefits to systems administrators, but without clear
benefits to the end users, the perception of Windows Server in your organization will be less than
favorable. Thus, let’s look at the benefits to end users.
Information at Your Fingertips
What good is a massive amount of storage if you can no longer keep track of the information on
it? As I mentioned earlier, there is a new, improved Indexing Service in Windows Server that
promises to find more information stored in files in faster search times. What I didn’t mention is
that this service integrates with Windows Server’s other storage features, such as HSM and RSS.
Thus, you need not search multiple places nor do you need to be concerned with which type of
media the information is stored on; the OS will find the information and deliver it to you.
Similarly, Dfs allows systems administrators to build one centralized hierarchy of file folders
under a single root. End users will no longer need to ask “Which server is that share on?” In
addition, Dfs has the ability to replicate content for fault tolerance as well as performance. The
response time is much better directly on the LAN than over the WAN for a user opening a file
during peak business hours, so replicating the file from one Dfs server to another over the WAN
during the wee hours of the morning saves bandwidth when it is needed and makes for a more
satisfactory end user experience. Obviously, designing and setting up Dfs takes some
administrative work, but the benefits of doing so are considerable for end users.
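Part of that setup work is scriptable with the Dfscmd utility included in WS2K3. As a hypothetical example, the following maps a link in an existing Dfs root to a share on a member server, then lists the root (the domain and server names are placeholders for your own environment):

dfscmd /map \\corp.example.com\Public\Projects \\FileServer1\Projects "Project files"
dfscmd /view \\corp.example.com\Public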
AD isn’t a requirement of running Windows Server, but it can provide centralized directory
services and administration. As such, it can help users to search and locate file share information
as well as provide you, the administrator, with a more fault-tolerant environment.
File Encryption
Some employees have information that must be kept confidential even from systems
administrators. The EFS is designed to allow end users to keep their files private. As mentioned
earlier, the process is fairly transparent to the end user and is integrated with their logon
credentials; administrators can only assist in recovery. File encryption, however, is mutually
exclusive from file compression; the two can’t be combined, so just be aware that the users must
decide between one and the other.
What Is Missing?
We’ve covered the new, improved features in Windows Server, but this book is about SRM. So
what’s missing from WS2K3 that is needed for a successful SRM implementation? I’ll spell out
areas for improvement in great detail as we take a deeper look, especially in the areas of
managing storage for an actively growing, dynamically changing organization. The first phase of
our SRM methodology is to perform an analysis or audit of storage utilization, and this analysis
will uncover the first of many gaps left in Windows Server’s functionality. From there, we’ll
move into planning for and developing a solution based on the built-in functionality as well as
functionality provided by third-party add-ons.
Information Storage Options
As a storage administrator, you must decide which type of storage best fits your business users’
needs. Most likely, you will have several of the types of storage covered here. To illustrate, let’s
follow the life cycle of a typical source of information, the Office document. As I create the
document, I’ll save it, typically to the My Documents folder. If my systems administrators have
done their homework, this folder will be redirected to a network file server. (Benefits of offline
folders will be covered later). After I’m done with the file, I might want to share it with others,
which I can do by sending it via email, publishing it to a Web page, or placing it in a file share
on the network. Perhaps I have a collaborative application such as Microsoft SharePoint (either
SharePoint Portal Server or the SharePoint Services that shipped as an add-on to WS2K3). If you
are using the new Office System 2003, you will already see the greater emphasis on using
SharePoint as the de facto storage for Office documents.
Where I place the file helps determine how you (as another business user) deal with it—whether
you decide to leave it in place and use it or create your own copy, especially if you want to use
version tracking. Another important consideration about where to store the information is
whether you will ever be able to find it again—that is, how well is the content of the file indexed
for search and retrieval.
Suppose that I email you the file. When you receive the file, you can choose from the following
storage options:
• Keep it in your mailbox
• Move the mail message and attachment to a personal folder (.pst folder)
• Move the mail message and attachment to a public folder (.pf)
• Save the attachment to your local computer
• Save the attachment to a file share
• Save the attachment to another collaborative application (such as Microsoft SharePoint)
• Publish the attachment to an intranet or Internet Web server
• Save the attachment to another database system (such as SQL Server)—which is most often how email archiving solutions store the attachment files
• Delete it because you just can’t deal with another file, and you hope that someone else can cough it up should you ever need it again
As you can see, the storage end user has a variety of choices for storing information, and each
choice has unique advantages and disadvantages compared with the other methods or choices.
We’ll explore the advantages and disadvantages in greater detail in the following chapters
(except for the last option, as it, hopefully, isn’t very common in your organization).
SRM Products
Depending on which of the previous solutions you choose to store your information (mail server,
mail client, public folders, file shares, local storage, SharePoint, Web server, or database
system), you’ll need an appropriate SRM solution. SRM has been around for quite a while—as
long as there have been storage resources that need to be managed—but it is only over the past
few years that it has reached critical mass. It’s now viewed as a priority for IT departments.
SRM products come in two main flavors. The first is the reporting product, which tells you how much space is being used on which storage device by which types of files. The key word here is reporting, as this type of product requires that a report be run to present the information. The actual gathering of the information might have been done hours ago, as you set the schedule to run in the wee hours of the morning. I’ll cover as much as is possible for you to do using resource kit utilities before moving on to the third-party tools that you can purchase.
A more desirable alternative is a much more dynamic product that provides real-time
functionality and storage information. Static or scheduled reporting is useful, but it forces you to
be proactive and handle problems on a regular basis. If you don’t stay on top of the information
being generated by the reports, you’re asking for trouble. Being proactive is much more difficult
than reacting to the problems that the SRM tool reports—falling behind is easy. I much prefer
the type of application that handles the problems for me.
Realistically, your solution will be a combination of these two solution types and include
applications developed in-house to fit your needs. Hopefully, you can already tell that you will
have a much easier time administering Windows Server than any other storage OS in the past.
Summary
This chapter provides an overview of the topics that we’ll cover in this book, and takes an
introductory look at the Windows Server OS from a storage standpoint. The rest of the book will
be presented in the format of a deployment methodology, intended to guide you through your
SRM project. There is a substantial set of new storage features in WS2K3, and this chapter helps
you to ask questions about which features you’ll be able to take advantage of. A high-level view
of information storage options is presented so that you can begin to assess the differences and
apply them in your future storage decisions.
Chapter 2: Analyzing Your Storage
In Chapter 1, we took a quick look at Windows Server’s storage offerings. In this chapter, I’ll
show you how to make the most of the new WS2K3 product using the built-in features and a
healthy portion of resource kit and support tools to analyze your storage requirements. There are
many new tools and they are often free for download, so I’ll show you where to get them and
what they do.
In addition, I’ll discuss the process of analyzing your current storage environment. First, I’ll detail the levels, or hierarchy, of auditing: organization, network, domain, servers, storage systems, shares, folders, and files. I’ll then provide templates for the types of information that
you’ll want to gather, including determining storage utilization (used and available disk space)
and identifying the storage users.
Finally, we’ll explore what you can and can’t do with Windows Server’s native analysis tools,
such as Performance Monitor, as well as with tools in the Windows Server resource kit. We’ll set
the stage for taking a first-hand look at the need for third-party SRM tools to show how they can
improve the audit process and prepare you for the next phase—planning your SRM deployment.
Phase 1: Analyzing Storage Requirements
Analyzing storage requirements is an ongoing process, especially in a dynamically changing
environment. Table 2.1 shows this phase in the overall SRM deployment methodology. The first
step in the analysis process will be to take a snapshot of where you’re at today. The next step will
be to perform a gap analysis to determine where you want to be and what’s missing.
Phase: Analyze (Storage Resource Management)
• Gather usage information―How storage is being used: what types of information are being stored and where. Is storage space and performance adequate? Prepare to use storage reporting tools to gather this information.
• Gather business information―Understand the need for storage at the business level: not just for file sharing but also for collaboration on business functions.
Table 2.1: Phase 1 of an SRM deployment methodology.
The challenge is that SRM is a moving target. You must observe and take time to gather
information about the current situation before you can take action. So start your SRM
deployment by printing out some storage analysis reports and taking them with you to your next
extra-long meeting. Spend some extra time observing and asking whether the actions that you
intend to take are really the best possible.
Storage Analysis Activities
The following list defines the activities that your storage analysis will entail:
• Gather storage tools―Why start gathering information before you know which tools are available? For example, I’ll list some inexpensive resource kit utilities, but if your company has already purchased an SRM product, you’ll be much better off using that. Prepare to use the storage reporting tools to gather information.
• Gather storage-management information―In addition to knowing which tools are being used, determine which storage-management processes have already been established. How can you tap into those processes rather than reinvent them?
• Gather storage information―Which types of storage are available and in what quantity? What are the storage devices and how are they configured? What are the storage applications commonly in use?
• Gather storage-usage information―How is storage being used? What types of information are being stored and where? Where are the hot spots in which you’re running out of room and cold spots in which you have excess capacity? Identify the storage space that isn’t being used to its full potential.
• Gather user-usage information―Identify the users, groups, and departments that are using storage and how they’re using it. This activity is usually a requirement when performing charge-back analysis to bill users or departments for the cost of providing the storage resources (including administration in addition to hardware). Nevertheless, it is a foundation of storage management to understand how storage is being consumed.
• Gather performance information―Is the storage performance adequate? Where does it need to be improved? Identify current hardware configurations to determine which configurations need to be refreshed or phased out. In addition, analyze the environment to ensure that it can support newer technologies such as NAS.
• Gather data-protection information―Is the data adequately protected against hardware faults, operator error, malicious intent, physical access, natural disaster, and so on? Identify which forms or techniques of data protection are in use. Identify business-critical and even mission-critical data and how it is protected.
• Gather business information―What types of storage benefit the business? Try to understand the need for storage at the business level; that is, not just for file sharing but also for collaboration on business functions and workflow. Are there limitations in the storage or applications that can be addressed to improve storage functionality?
Storage Analysis Goals
Despite the many possible types of information you can gather, I prefer to keep this phase of an
SRM deployment simple. Doing so makes your chance of success much greater. That said, let’s
keep the following goals in mind for an SRM deployment:
• Reduce the cost of ownership―Reduce capital expenditures for new hardware as well as the cost of managing storage resources. Part of this goal includes the effort of server consolidation, reducing the number of systems that need to be managed. You can accomplish this goal only by making the larger, consolidated systems easier to administer with the same or fewer resources. By managing storage effectively, you also reduce the effort and cost associated with maintaining files that don’t need to be part of your critical data-protection strategy (fault-tolerant hardware and backup). The sum of these costs is known as the total cost of ownership (TCO).
• Improve end-user experience―Improve both the perception of performance and system availability. Although this goal isn’t directly quantifiable, the SRM deployment should provide some assistance in making users’ daily tasks easier to accomplish rather than interfere with their business functions and make their lives miserable.
• Improve protection of business-critical and mission-critical data―This goal ties in directly with the TCO, as a hit to storage-system availability can have a devastating effect on business profit and loss.
• Develop a sound organizational policy―Out of an SRM-deployment effort, you should gain a well-developed policy that defines storage best practices for your organization.
• Develop qualified personnel―Let’s not forget that the SRM-deployment process will let you identify the necessary tools to proactively manage storage resources, and it will give employees knowledge about how to accomplish SRM. Also, administrators will become aware of storage-management solutions and tools and how to use them.
Levels of Auditing
Now that you know what type of information you need and why, you can take action. As you
pick up the technical details of the SRM-deployment process, be sure to relate them back to the
business environment.
The different levels, or hierarchy, of auditing include the organization, network, servers, storage
subsystems, domains, users and groups, shares, machine accounts, OSs, storage applications,
logical drives, folders, controllers, interconnects, physical disks, and files, as Figure 2.1 shows.
To start organizing your collection of information, let’s start at the top—the organizational level.
[Figure 2.1 diagram labels: Organization; Network; Domains; Machine Accounts; Users & Groups; Shares; Servers; Operating Systems; Storage Applications; Logical Drives; Folders; Storage Subsystems; Controllers; Interconnects; Physical Disks; Files.]
Figure 2.1: Levels of auditing.
Most organizations consist of many pools of storage as well as a variety of directory services. In
the NT world, you have any number of domains that can contain storage resources. The idea
behind AD is to centralize the directory services and publish storage resources, such as file
shares, in the directory. You’ll see later how well this system works (or doesn’t work), but for
now, you’ll most likely need to deal with groups, users, and server accounts that aren’t integrated
into a single directory. You might also deal with other OSs and storage applications, but for this
guide, I’ll focus on Windows Server storage applications.
The storage subsystems layer contains storage controllers, storage interconnects, physical disks
and their logical representation, and the folders and files on those disks. Your method of
accessing this information is most likely dependent on the equipment vendor or manufacturer;
that is, there is such a variety of hardware-specific information that you’ll require a reporting tool
from the server or storage maker. There are a few products available for managing SAN
equipment that attempt to report on all devices in the SAN, but this task is difficult. For the
information to be specific enough to be of value, it typically must come from the hardware
manufacturer. For the purpose of SRM, our concern is the presentation and usage of the storage
resources at a higher level―as storage from which the end-user can benefit. This storage is
typically in the form of shares, folders, and the files they contain, as Figure 2.1 illustrates.
Types of Audit Information
There is quite a wide variety of information that you can acquire from a storage audit, and it
corresponds to the levels of auditing. You can identify who the storage users are in your
organization and can audit file and folder access at the server level. You can perform security
auditing to make sure that file and folder access is set properly and is being used as planned. This
type of network audit has become increasingly important as we have been hit by viruses that take
advantage of network share permissions allowing write access to unauthenticated users (for
example, the Everyone group).
You can conduct a performance analysis to measure how well your storage systems are
performing at different levels. You can also gather information such as the level of data
protection (for example, RAID and other fault-tolerance measures) and whether backups are
completing successfully. For SRM, we’re most interested in assessing disk space used and
available (storage utilization) on both a per-server and per-user or group level, and we’ll take an
in-depth look at how to use certain tools to accomplish this mission.
Auditing File and Folder Access
WS2K3 makes auditing file and folder access on NTFS volumes fairly easy. First, you or another member of the Administrators group defines which files and folders to audit, whose actions to audit, and which types of actions to watch for. In WS2K3, auditing is enabled through Group Policy, as Figure 2.2
shows. (To navigate to this Group Policy console, select Start, Programs, Administrative Tools,
Local Security Policy.) As you can see, the Group Policy interface for auditing is much simpler
in WS2K3 than it was in Win2K, in which you had to expand several levels.
Figure 2.2: Enabling auditing for object access in WS2K3.
Once the auditing policy is enabled, you can then apply it to specific folders and files using the
Windows Explorer interface. An entry is then written to the Event Viewer Security Log
whenever the file or folder matching your criteria is accessed. Then another entry is written to
the Security Log, and then another, and so on.
The problem is that this type of information can create overload fairly easily unless used
carefully. For example, auditing successful file access can indicate normal business usage, which
can occur quite frequently. Instead, what you should be looking for are the rare occurrences of
failed file or directory access attempts, which might signal a problem. Let’s explore the tools you
can use to help you accomplish this task.
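One such built-in tool is the Eventquery script, which ships with WS2K3 and filters the event logs from the command line. Assuming auditing is enabled, a query such as the following pulls the object-access entries (event ID 560 covers object open attempts) from the Security log:

cscript %windir%\system32\eventquery.vbs /l security /fi "id eq 560"

Combined with a failure-audit filter, this type of query lets you spot the rare failed access attempts without wading through the log by hand.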
Storage Tools
In this section, we’ll look at the tools that you can use to manage—including analyze—your
storage, starting with the tools that you’ve already paid for: the tools included in Windows
Server, including both the graphical consoles and the powerful command-line utilities.
Native Windows Server Tools
Some useful tools are available immediately when you install WS2K3. These tools had
previously been available only for purchase as resource kit utilities or for download (sometimes
only within Microsoft). Most of the built-in tools will be listed when you type
HELP
at a command prompt, although there are a few exceptions, such as Eventcreate (discussed later).
The best starting point for Windows Server storage auditing is to see what the built-in tools—
such as Performance Monitor and the Computer Management console—are capable of.
File Server Management MMC
As Figure 2.3 shows, the Manage Shared Folders MMC is different from the one in Win2K. In
the WS2K3 version, there are task commands that can be launched by a single mouse-click,
including the new task, Configure Shadow Copies. You can access this console by selecting
Start, Run, and typing
filesvr.msc
Alternatively, if you have configured the file server role in the Manage Your Server Wizard, you
will see this console available from the Administrative Tools menu. (Note that Fsmgmt.msc
brings up the simpler Shared Folders MMC).
Figure 2.3: The Manage Shared Folders MMC in WS2K3 Administrative Tools.
As you use the Manage Shared Folders MMC in WS2K3, you might notice changes in the
default security permissions. As Figure 2.4 shows, the new default permissions no longer allow
write access to unauthenticated users (for example, the Everyone group). This security setting is
a result of the recent viruses that have taken advantage of network share permissions.
Figure 2.4: WS2K3’s default network share permissions.
The Logical Disk Manager provides the Disk Management view that Figure 2.5 shows. This Disk
Management view of drive information includes important information about unallocated
storage. The big news about WS2K3 is that the Logical Disk Manager is kinder, gentler than it
was in previous versions. We’ll explore the Logical Disk Manager in more detail in later
chapters; for now, you should know simply that the underlying storage mounting mechanism is
improved in WS2K3. One of the big improvements of Win2K over NT 4.0 was how Win2K
handled mounting disks without the need to reboot. WS2K3 takes a step back and says “Do you
really want me to mount that disk?” The reason is to make it less disruptive and friendlier to
SAN environments—it will no longer grab every disk that it sees.
Another change in WS2K3 is that the option to upgrade a basic disk to a dynamic disk is turned
off by default. When you mount new disks, the Initialize and Convert Disk Wizard has only the
option to write a signature selected.
Figure 2.5: Disk Management view of drive information including unallocated storage.
The Windows Server Event Viewer performs storage management functions such as warning
you when a disk is nearly full (see Figure 2.6). However, depending on the level of activity, you
might already be in trouble by the time you get this message. You must be set up to get this type
of message before the situation is irreparable. To do so, you need to use an application—such as
Microsoft Operations Manager (MOM) or Hewlett-Packard OpenView—or a resource kit
utility—such as Eventquery and Eventtriggers, discussed later—to monitor the event logs and
notify you of the warning. The Windows Server tool also lacks features such as being able to
configure threshold settings. Thus, in this case, the native Windows Server tool alone isn’t
enough.
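The Eventtriggers utility can bridge that gap by attaching an action to the event itself. As a sketch, assuming a notification script of your own at C:\Scripts\notify.cmd and the Srv warning (commonly event ID 2013) for a nearly full disk:

eventtriggers /create /tr "LowDiskSpace" /l system /eid 2013 /tk "C:\Scripts\notify.cmd"

From then on, the trigger runs the script whenever the warning is logged, rather than waiting for you to open Event Viewer.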
Figure 2.6: An example disk nearly full event log warning message.
Performance Monitor
Despite its name, the Windows Server Performance Monitor is designed for more than just
monitoring performance. Technically, the new name for the Windows Server version of the tool
is System Monitor, but it will likely always be known as Performance Monitor or PerfMon (the
name of the actual executable). As Figure 2.7 illustrates, Performance Monitor can monitor the
amount of free disk space on a volume or drive. As you can see, the chart view can be messy and
difficult to read if you try to view too much information at once.
Figure 2.7: Performance Monitor chart view of % Free [Disk] Space.
When using Performance Monitor, you must know the name of the server that you want to
connect with because there is no browse button. After you enter the server name, and
Performance Monitor validates the name (this validation can take a bit of time), the next step is
to select the object and counters to monitor, such as Current Disk Queue Length, % Disk Time,
and Avg. Disk Bytes/Write. For a brief explanation of a counter, click Explain as you add the
counter.
Some of the counters that Figure 2.7 shows, such as % Free Space and Free Megabytes, are more
resource-management related than truly performance related. However, these counters provide
useful information that you can use in your analysis. Note that the object being monitored is the
LogicalDisk; however, you won’t see the LogicalDisk as an object until you enable the Diskperf
counters (as I explain later).
Once you add all the disk counters, you end up with a chart that is messy and difficult to read.
The chart view is useful for comparing how several servers are performing for the same counter
(for example, which servers have the least disk space). However, for comparing counters from
different objects, the report view provides information that is much easier to read at a glance.
Figure 2.8 shows the report view for PhysicalDisk counters for one server. If, for example, you
needed to know how much space is free on the D drive, you would look at the numbers in the
middle of the screen to get that information.
Notice that the following figure is missing the all-important LogicalDisk counters for % Free Space. To
see LogicalDisk counters, I need to enable them by typing
diskperf -y \\computername
at a command prompt (you can omit the \\computername for the local computer). The -y option sets the system to start all disk performance counters when the system is restarted. In NT 4.0 and earlier, none of the disk counters are enabled by default. However, in Windows Server, the PhysicalDisk counters are enabled but not active until you start polling them using Performance Monitor.
The option diskperf -n turns off all disk counters, and the -nv option disables only the logical counter. In all cases, a reboot is required after toggling the counters. Of course, there is a very slight performance impact for monitoring the disk counters, so you might not want to leave them on all the time.
Figure 2.8: Report view of PhysicalDisk counters for a single server.
If you can see the LogicalDisk object but can’t add the associated counters, the problem might be that diskperf -y was set but the server wasn’t rebooted.
In Win2K, to gather Performance Monitor information on one computer from another, you need
to configure the Performance Logs and Alerts service (which has a short name of SysMonLog)
with credentials at the domain level. By default, the Performance Logs and Alerts service is
configured with the LocalSystem account, which means that it will be unable to gather
performance information from remote computers. When you add the desired account and click
OK, the tool will present a message stating that the account will be granted the Log On As A
Service right, which saves you the step of performing that task manually.
By default, WS2K3 adds credentials (the NT Authority\NetworkService account) that allow the service to reach remote computers over the network. If you check your WS2K3 Performance Logs and
Alerts service, you will see that the change has already been made for you. The NetworkService
account presents the computer’s credentials to remote servers with a token containing the
security IDs (SIDs) for the Everyone and Authenticated Users groups. Oddly, Windows XP still
uses the Local System Account for log on when starting the service.
Let’s take a look at the Performance Monitor reports in several views; first, at the single server
view of both LogicalDisk and PhysicalDisk counters. As you can see in Figure 2.9, the circled
number for % Free [Disk] Space is a bit low for the comfort zone of most administrators.
Figure 2.9: Single server with LogicalDisk counters for % Free Space circled.
The key is that the report view makes it much easier to review multiple counters of information
at a glance. Figure 2.10 shows another example through the multiple server view of LogicalDisk
counters.
Figure 2.10: Multiple server report of LogicalDisk counters.
In Figure 2.10, you might notice that the % Free Space counter shows no value for some computers. The reason is that the counter wasn't selected for that computer. Although adding
counters for multiple systems can be a tedious process, you will benefit from learning how to
edit the Performance Monitor settings files. To do so, simply right-click in the monitoring area of
the console, and select Save As from the resulting menu. This action will create an HTML file.
Next, edit the HTML (you can use a program as simple as Notepad), and add servers or replace the
current server name with a new one, then save the file with a new name. When you want to start
capturing or logging performance data, simply right-click in Performance Monitor, select New
Log Settings From, and open the .htm file you just created.
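The exact markup in the saved file varies by version, but each counter appears as a PARAM entry inside an OBJECT tag, and these path values are the lines you edit. A sketch with placeholder server names:

<PARAM NAME="Counter00001.Path" VALUE="\\SERVER1\LogicalDisk(_Total)\% Free Space"/>
<PARAM NAME="Counter00002.Path" VALUE="\\SERVER2\LogicalDisk(_Total)\% Free Space"/>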
Does the % Free Space counter really provide useful information? What good does 20 percent free
space on each of five partitions do you? To gain more from the information provided, you could use
volume mount points to combine them under one share. We’ll explore this solution in detail later.
File Properties
Have you ever saved a downloaded file only to return to it weeks or months later with no idea what the file
was for or what the application did? Have you ever wondered where you downloaded the file in the first
place because you could not find it by searching the Internet? You can make use of the file metadata
information (stored in the properties fields) and store the relevant information in the Summary Properties,
as Figure 2.11 shows. This figure provides an example of a file that is somewhat difficult to find and that
has a somewhat misleading name (VCD commonly stands for Video Compact Disc). Although difficult to
find, this tool is a very handy utility that I can now recognize using file properties.
Figure 2.11: Using file properties to store metadata information.
Figure 2.12 illustrates a simple change in applying file properties in either Windows XP or WS2K3. This
dialog box reflects the fact that Microsoft is responding to feedback about the need to make its OSs
easier to use. When you are changing file properties, the default option is now to Apply changes to this
folder, subfolders and files. How often do you copy files from a CD-ROM (without using Xcopy, which
would reset the read-only bit) or want to compress an entire subfolder tree? With this slight change, it is
much easier to administer larger numbers of files. I remember in the NT days, you had to run a search
without any specific criteria (essentially for *.*) just to highlight all files and apply the same properties (or
attributes).
Figure 2.12: The default option in Windows XP and WS2K3 is to apply changes to subfolders.
Storage-Management Utilities
The following Windows Server utilities will make disk and storage management easier. These
utilities are available for download and installation; you simply need to know where to find
them. In addition, the new WS2K3 product ships with many additional command-line tools and
utilities that were previously available only in the resource kits. Many of these utilities are disk- and storage-related.
Here comes the good stuff: the powerful tools that remain hidden until you discover them, perhaps by reading this chapter. I'll run through these utilities alphabetically, weaving in related
tools as they become applicable. I hope that you will be impressed by what you find here.
You must be logged on as an Administrator or a member of the local Administrators group to use
these utilities. Ideally, use Run As to escalate your privileges to this level only when necessary. For
example, because Run As works on executables, I find it easiest to bring up a command prompt as
Administrator, then launch compmgmt.msc (the computer management console, which I can then use
to add an account to the local administrators group if necessary). Simply right-click on the Start menu
item Command Prompt, which is found under Accessories.
DiskPart
You can use DiskPart to manage disks, partitions, and volumes, including the all-important
extending of volumes (for example, volumes that are grown through storage virtualization on a
SAN). DiskPart runs in interactive mode, essentially bringing up its own command line. For
example, you would enter
diskpart
then enter
list volume
to make sure you are operating on the correct volume. Next you would enter
select volume #
where # is the volume number. Then you would enter
extend size=n
where n is the amount of space to add in megabytes, and finally
exit
When you receive a message that DiskPart successfully extended the volume, your new space is added to the existing drive without disrupting the data already on the volume.
You can use DiskPart only to extend volumes that were created on a dynamic disk. If the disk was
originally a basic disk when the volume was created and then converted to a dynamic disk, you will
receive an error and DiskPart will fail.
DiskPart lets you manage disks, for example, by extending a disk volume while the storage is
online to the OS. DiskPart is fully scriptable, using the syntax
Diskpart /s <script>
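For example, the earlier interactive session could be captured in a script file; the volume number and size below are placeholders that you would first confirm with list volume:

rem extend-d.txt -- extend volume 3 by 1024MB (placeholder values)
select volume 3
extend size=1024

You would then run diskpart /s extend-d.txt, ideally redirecting the output to a log file for later review.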
Figure 2.13 shows the commands for using DiskPart. This utility is also useful for rescanning the
server to detect any devices that have been presented from a SAN. For example, after breaking
off a Business Continuance Volume (BCV—such as a clone) and presenting it to the host, you
can use DiskPart to detect the new drive and mount it.
Figure 2.13: DiskPart commands for managing a disk volume (WS2K3 version).
DiskPart is available for Win2K both by download and as part of the Recovery Console as well
as in the default installation of Windows XP and WS2K3. Make sure that you are using the
appropriate OS version, as there are differences in how they operate (see the following note). As
Figure 2.14 illustrates, in Win2K, the DiskPart command is only available when you are using
the Recovery Console, so most of the benefit in a production environment for changing disks
will be to WS2K3 systems.
Figure 2.14: Installing the Recovery Console.
To install the Recovery Console as a startup option in Win2K, insert the Win2K CD-ROM, and hold
down the Shift key to prevent the CD-ROM auto-run feature from running, or wait for the auto-run
feature to bring up the installation options. Close the installation wizard, run a command prompt, and
type the following
x:\i386\winnt32.exe /cmdcons
where x is your CD-ROM drive letter. If you have the bits copied to disk, you can run the installation
directly from the hard drive. Answer Yes to the prompt, and installation will begin.
The installation won’t prompt you to reboot your system, but the Recovery Console will be available
as a boot option the next time you reboot your system. The installation did not prompt me for the SP2
source location, so I recommend running the installation from a Win2K source that has had SP2
slipstreamed in (by running
\i386\update\update -s <dir>
where <dir> is the location of your Win2K source files).
DiskPart can also add or break mirrors, assign or remove a disk’s drive letter, create or delete
partitions and volumes, convert basic disks to dynamic disks, import disks and bring offline disks
and volumes online, and convert master boot record (MBR) disks to GUID Partition Table
(GPT) disks. The options under CONVERT for DiskPart are as follows:
• BASIC: Converts a disk from dynamic to basic
• DYNAMIC: Converts a disk from basic to dynamic
• GPT: Converts a disk from MBR to GPT
• MBR: Converts a disk from GPT to MBR
Just because you can run it from a command line or script does not mean that it will not destroy your
data! Always test your backup before you perform these types of disk operations!
Driverquery
Another built-in utility is Driverquery, which displays a list of installed device drivers and can be
run remotely against a server. This utility is useful for checking driver status, especially using the
verbose mode, as the sample in Listing 2.1 shows. As you can see, the driver list is quite long (it was truncated for brevity for this example). However, Driverquery cannot be used for driver-management tasks such as stopping, starting, or removing drivers.
Module Name  Display Name            Type    Start Mode  State    Status  Link Date              Path
ACPI         Microsoft ACPI Driver   Kernel  Boot        Running  OK      3/24/2003 11:16:21 PM  C:\WS2003EE\system32\DRIVERS\ACPI.sys
ACPIEC       ACPIEC                  Kernel  Disabled    Stopped  OK      3/24/2003 11:16:26 PM  C:\WS2003EE\system32\drivers\ACPIEC.sys
aec          Microsoft Kernel Acous  Kernel  Manual      Stopped  OK      8/28/2002 6:09:10 AM   C:\WS2003EE\system32\drivers\aec.sys
AFD          AFD Networking Support  Kernel  Auto        Running  OK      3/24/2003 11:40:50 PM  C:\WS2003EE\system32\drivers\afd.sys
AsyncMac     RAS Asynchronous Media  Kernel  Manual      Stopped  OK      3/24/2003 11:11:27 PM  C:\WS2003EE\system32\DRIVERS\asyncmac.sys
...
USBSTOR      USB Mass Storage Drive  Kernel  Manual      Running  OK      3/24/2003 11:10:50 PM  C:\WS2003EE\system32\DRIVERS\USBSTOR.SYS
usbuhci      Microsoft USB Universa  Kernel  Manual      Running  OK      3/24/2003 11:10:43 PM  C:\WS2003EE\system32\DRIVERS\usbuhci.sys
vga          vga                     Kernel  Manual      Running  OK      3/24/2003 11:08:03 PM  C:\WS2003EE\system32\DRIVERS\vgapnp.sys
VgaSave      VGA Display Controller  Kernel  System      Stopped  OK      3/24/2003 11:08:03 PM  C:\WS2003EE\system32\drivers\vga.sys
ViaIde       ViaIde                  Kernel  Boot        Running  OK      3/24/2003 11:04:49 PM  C:\WS2003EE\system32\DRIVERS\viaide.sys
VIAudio      VIA AC'97 Audio Contro  Kernel  Manual      Running  OK      10/19/2003 8:37:04 PM  C:\WS2003EE\system32\drivers\viaudio.sys
VolSnap      Storage volumes         Kernel  Boot        Running  OK      3/24/2003 11:05:47 PM  C:\WS2003EE\system32\DRIVERS\volsnap.sys
VPCNetS2     Virtual Machine Networ  Kernel  Manual      Running  OK      12/3/2003 5:36:34 PM   C:\WS2003EE\system32\DRIVERS\VMNetSrv.sys
Wanarp       Remote Access IP ARP D  Kernel  Manual      Running  OK      3/24/2003 11:11:22 PM  C:\WS2003EE\system32\DRIVERS\wanarp.sys
WLBS         Network Load Balancing  Kernel  Manual      Stopped  OK      3/25/2003 12:41:10 AM  C:\WS2003EE\system32\DRIVERS\wlbs.sys
(The verbose columns for Description, Accept Stop, Accept Pause, Paged Pool, Code, BSS, and Init bytes are omitted here for readability; the output was also truncated between AsyncMac and USBSTOR.)
Listing 2.1: Sample output of the Driverquery utility (in verbose mode).
WMIC
Windows Management Instrumentation Command-line (WMIC) is an interactive command shell
for WMI and can do amazing things. WMIC is only available on Windows XP and WS2K3
(Microsoft has stated that the company cannot make WMIC available for Win2K—the coding
effort would be too great). The first time you run WMIC by typing
WMIC
the utility kicks off a self-installation. You are then at the WMIC command prompt. WMIC can
be used for remote management of multiple computers with a single command. For example, the
following command lists logical disk information, such as file system (FAT, NTFS, and so on),
and other driver parameters, such as free disk space, for the servers that you list following the
/node switch:
WMIC /Node:Server1,Server2,Server3 logicaldisk
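You can refine the query with a WHERE clause and a GET verb; for example, the following sketch (server names are placeholders, and DriveType 3 denotes a local fixed disk) limits the output to hard drives and a few properties of interest:

WMIC /Node:Server1,Server2 LogicalDisk WHERE "DriveType=3" GET DeviceID,FileSystem,FreeSpace,Size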
Note that node in the WMIC context is any server and not specifically a cluster node, as the term
often means. WMIC introduces a term called aliases, which can be thought of either as
commands or objects on which actions are performed. Using WMIC is such a huge topic that it cannot be covered in its entirety here. To illustrate this point, take a look at the list of aliases that
Listing 2.2 shows.
ALIAS                Access to the aliases available on the local system.
BASEBOARD            Base board (also known as a motherboard or system board) management.
BIOS                 Basic input/output services (BIOS) management.
BOOTCONFIG           Boot configuration management.
CDROM                CD-ROM management.
COMPUTERSYSTEM       Computer system management.
CPU                  CPU management.
CSPRODUCT            Computer system product information from SMBIOS.
DATAFILE             DataFile Management.
DCOMAPP              DCOM Application management.
DESKTOP              User's Desktop management.
DESKTOPMONITOR       Desktop Monitor management.
DEVICEMEMORYADDRESS  Device memory addresses management.
DISKDRIVE            Physical disk drive management.
DISKQUOTA            Disk space usage for NTFS volumes.
DMACHANNEL           Direct memory access (DMA) channel management.
ENVIRONMENT          System environment settings management.
FSDIR                Filesystem directory entry management.
GROUP                Group account management.
IDECONTROLLER        IDE Controller management.
IRQ                  Interrupt request line (IRQ) management.
JOB                  Provides access to the jobs scheduled using the schedule service.
LOADORDER            Management of system services that define execution dependencies.
LOGICALDISK          Local storage device management.
LOGON                LOGON Sessions.
MEMCACHE             Cache memory management.
MEMLOGICAL           System memory management (configuration layout and availability of memory).
MEMORYCHIP           Memory chip information.
MEMPHYSICAL          Computer system's physical memory management.
NETCLIENT            Network Client management.
NETLOGIN             Network login information (of a particular user) management.
NETPROTOCOL          Protocols (and their network characteristics) management.
NETUSE               Active network connection management.
NIC                  Network Interface Controller (NIC) management.
NICCONFIG            Network adapter management.
NTDOMAIN             NT Domain management.
NTEVENT              Entries in the NT Event Log.
NTEVENTLOG           NT eventlog file management.
ONBOARDDEVICE        Management of common adapter devices built into the motherboard (system board).
OS                   Installed Operating System/s management.
PAGEFILE             Virtual memory file swapping management.
PAGEFILESET          Page file settings management.
PARTITION            Management of partitioned areas of a physical disk.
PORT                 I/O port management.
PORTCONNECTOR        Physical connection ports management.
PRINTER              Printer device management.
PRINTERCONFIG        Printer device configuration management.
PRINTJOB             Print job management.
PROCESS              Process management.
PRODUCT              Installation package task management.
QFE                  Quick Fix Engineering.
QUOTASETTING         Setting information for disk quotas on a volume.
RDACCOUNT            Remote Desktop connection permission management.
RDNIC                Remote Desktop connection management on a specific network adapter.
RDPERMISSIONS        Permissions to a specific Remote Desktop connection.
RDTOGGLE             Turning Remote Desktop listener on or off remotely.
RECOVEROS            Information that will be gathered from memory when the operating system fails.
REGISTRY             Computer system registry management.
SCSICONTROLLER       SCSI Controller management.
SERVER               Server information management.
SERVICE              Service application management.
SHADOWCOPY           Shadow copy management.
SHADOWSTORAGE        Shadow copy storage area management.
SHARE                Shared resource management.
SOFTWAREELEMENT      Management of the elements of a software product installed on a system.
SOFTWAREFEATURE      Management of software product subsets of SoftwareElement.
SOUNDDEV             Sound Device management.
STARTUP              Management of commands that run automatically when users log onto the computer system.
SYSACCOUNT           System account management.
SYSDRIVER            Management of the system driver for a base service.
SYSTEMENCLOSURE      Physical system enclosure management.
SYSTEMSLOT           Management of physical connection points including ports, slots and peripherals, and proprietary connection points.
TAPEDRIVE            Tape drive management.
TEMPERATURE          Data management of a temperature sensor (electronic thermometer).
TIMEZONE             Time zone data management.
UPS                  Uninterruptible power supply (UPS) management.
USERACCOUNT          User account management.
VOLTAGE              Voltage sensor (electronic voltmeter) data management.
VOLUME               Local storage volume management.
VOLUMEQUOTASETTING   Associates the disk quota setting with a specific disk volume.
VOLUMEUSERQUOTA      Per user storage volume quota management.
WMISET               WMI service operational parameters management.
Listing 2.2: Complete WMIC alias listing.
As you can see from the listing, WMI provides access to the storage volume management and
quota settings. We will cover the built-in WS2K3 quota feature later, as it has some limitations
and difficulties (for example, in being able to include or exclude certain users or groups). For
now, you can see where it fits in for the storage analysis phase.
If you bring up the properties for a volume and click on the quota tab, you will see a Quota
Entries button in the lower right corner. Clicking this button brings up the quota report that
Figure 2.15 shows. This report is somewhat useful for storage analysis as it shows the amount of
disk space used by each user (or technically, SID; as you can see, there are some SIDs that are
not resolved in the figure).
Note that the BUILTIN\Administrators are excluded from the Quota Limit—but be warned that disk
utilities will be affected by the quota limit. Thus, if you enable a 40GB quota, as the example shows,
don’t be alarmed when your disk defragmenter reports a 40GB drive. Also note the Warning Level is
set at 40KB, which should be 40GB.
Figure 2.15: Using the quota entries table as a simple volume report.
Cleanmgr
The Disk Cleanup manager offers the option to compress files older than a certain number of
days, as Figure 2.16 shows. Of course, this setting only applies for NTFS volumes, and you
should use this option with caution to ensure that you know which files you are compressing
(some files are not worth compressing because doing so offers little gain and others—such as
system files—should be left alone).
Figure 2.16: The Disk Cleanup tool (Cleanmgr) option to compress files.
Type
cleanmgr /d x:
to launch the dialog box, and select the x drive.
Note that most of the information on the Disk Cleanup tool in Windows XP applies to WS2K3 except
the references to System Restore Points.
The Disk Cleanup tool option that Figure 2.17 shows launches the Add/Remove programs
applet. I have included it here as a reminder to watch that 200MB free space threshold for
Windows.
Figure 2.17: The Disk Cleanup tool option to launch the Add/Remove programs applet.
Defrag
The Windows Disk Defragmenter now offers a command-line option
Defrag –f
that will automate defragmentation and force it to run even if free space is low. However, as with
the GUI version (dfrg.msc), the volume should have at least 15 percent free space for
defragmentation to work properly.
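A typical sequence is to analyze first and force the defragmentation only if the analysis recommends it; -a analyzes without defragmenting and -v adds verbose output:

defrag d: -a -v
defrag d: -f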
How Important is Disk Defragmentation?
The concept of disk defragmentation makes great sense when dealing with a single spindle, but does it
make sense for drive arrays with RAID striping? There have been many studies that claim improved
performance as a result of disk defragmentation, but many of them have been funded by the software
company selling the product (http://www1.execsoft.com/pdf/NSTL-XP_mddvdk.pdf and
http://www.raxco.com/products/perfectdisk2k/whitepapers/benefits_of_defragmentation.pdf).
An odd side effect of disk defragmentation to watch out for is excessive replication of FRS files. For
more information about this potential problem, see the Microsoft article “FRS: Disk Defragmentation
Causes Excessive FRS Replication Traffic,” which states that you should not run defragmentation
utilities on volumes that contain FRS replicated files. In addition, see the Microsoft article “Shadow
Copies May Be Lost When You Defragment a Volume,” which basically states that you should avoid
defragmenting these volumes or use a 16KB or larger cluster allocation unit size when you format the
volume if you plan to use shadow copies of shared folders and defragment the volume. However, you
cannot change the cluster size on the fly—you must reformat, so, hopefully, this information is not too
late.
An indication that defragmentation is beneficial is that it is called internally by a new process available only in WS2K3: the logical prefetcher. The prefetcher runs when a WS2K3-based system is booted and records information about all logical disk read operations. On subsequent reboots, the files that were loaded are optimized on disk by running the defragmentation mentioned earlier. This feature can help your server boot faster (which, hopefully, does not happen too often) as well as speed up any applications that load after boot, such as your antivirus and SRM applications. The system also overlaps the disk reads needed to start the system with device initialization delays, providing faster boot and logon performance.
Prefetch is enabled for system boot by default. In order to enable the prefetch feature for
applications, you need to set the following registry key (simply search for a key value of prefetch
with the value and data check boxes cleared as you search):
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\Memory
Management\PrefetchParameters. Set the value name to EnablePrefetcher (DWORD) and the
value to
• 0x00000001 = application launch prefetching
• 0x00000002 = boot prefetching
• 0x00000003 = both application and boot prefetching
The values are bit flags that are combined with a bitwise OR, so to enable both boot and application prefetching, EnablePrefetcher would be set to 0x00000003. The setting takes effect immediately and does not require a reboot; of course, the prefetch process takes advantage of each reboot, so rebooting is still a good idea.
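Rather than browsing to the key by hand, you could set the value with the built-in reg utility; this sketch assumes you want both boot and application prefetching (value 3):

reg add "HKLM\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management\PrefetchParameters" /v EnablePrefetcher /t REG_DWORD /d 3 /f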
Every 3 days or so, during system idle periods, the Task Scheduler organizes a list of files and
directories in the order that they are referenced during boot or application start, and stores the list
in the prefetch directory. Then the defragmenter (mentioned earlier) is launched with a
command-line option to defragment based on the contents of this prefetch file instead of
performing a full defragment. The defragmenter finds a contiguous area on each volume large
enough to hold the listed files and directories, then moves them in their entirety to that area so that they are stored contiguously. As a result, future prefetch operations will be more efficient because the data is stored physically on the disk in the order in which it will be read.
Event Utilities
You can use a trio of utilities—Eventquery, Eventcreate, and Eventtriggers—to read, write, and
respond to Windows events, respectively. For example, suppose you’re watching for Event ID
2013 (disk is at or near capacity) and you launch a system cleanup utility such as the built-in
cleanmgr.exe (which, unfortunately, has no automated options). Alternatively, you can use a
different cleanup utility or a custom script that you have written. Some companies place a large
file (a gigabyte or two) as a buffer against the disk filling up—the automated script would then
delete this file, which is very safe, and notify the administrator. This gives you some time to
assess your disk cleanup instead of immediately being in panic mode. To enable this scenario, simply enter the following at a command line to create the event trigger:
eventtriggers /create /tr "my TriggerName" /l application /eid 2013 /tk c:\mycleanup.cmd
You will then be prompted for the Run As password for the account with which you are logged
on. You can then see the scheduled tasks listing by using the built-in utility schtasks or the
scheduled tasks under Control Panel in Windows Explorer. If the event fails to execute, check
the log file %systemroot%\system32\wbem\logs\cmdTriggerConsumer.log.
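The following is a minimal sketch of such a cleanup script, tying all three event utilities together; the buffer-file path, size, and the notification event ID are placeholder assumptions:

rem Run once, in advance, to create a 1GB buffer file on the volume:
rem   fsutil file createnew d:\spacebuffer.dat 1073741824
rem
rem mycleanup.cmd -- launched by the trigger when Event ID 2013 fires
@echo off
del /f /q d:\spacebuffer.dat
eventcreate /l application /t warning /id 999 /d "Space buffer on %COMPUTERNAME% deleted; investigate disk usage now"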
Forfiles
Forfiles selects files in a folder or tree for batch processing (for example, you can use it to select
files older than a certain date to move them to an archive location). Forfiles is more powerful in
its options than other commands such as xcopy; it allows you to specify a certain number of days
from today, which is handy for scheduling a cleanup or reporting script. For example, to list all
files on drive x older than 365 days:
forfiles /p x:\ /s /m *.* /d -365 /c "cmd /c echo @file is older than 365 days and the next command will archive it"
echo !Warning! This will move old files to z:\archive!
pause
forfiles /p x:\ /s /m *.* /d -365 /c "cmd /c move @file z:\archive"
Freedisk
Freedisk checks whether the specified amount of disk space is available before continuing with
an installation process. You can even use this utility in a logon script to check available disk
space set on a volume by a disk quota before running logon commands, because Freedisk runs
under the user context. You can use this utility to check quotas on a volume for specific users,
and you can provide it with username and password by using the /U and /P options.
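Because Freedisk communicates through its exit code, it drops neatly into a batch file; the drive letter and threshold below are placeholders (a bare number is interpreted as bytes):

freedisk /d d: 524288000
if errorlevel 1 (echo Less than 500MB free on D: - aborting & exit /b 1)
echo Sufficient space - continuing with installation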
Fsutil
Fsutil is another command-line utility that eases file-system-management tasks, such as
managing reparse points, managing sparse files, dismounting a volume, or extending a volume.
Figure 2.18 shows sample Fsutil commands and usage for the quota command.
Figure 2.18: Fsutil quota commands.
You can also use Fsutil to check for a dirty bit by running
fsutil dirty query x:
where x is the volume you want to check.
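Two more quick examples in the same vein, with placeholder drive letters: fsutil volume diskfree reports free space, and fsutil quota query lists the quota settings shown in Figure 2.18:

fsutil volume diskfree d:
fsutil quota query d: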
For more information about the powerful options available through Fsutil, see
http://www.microsoft.com/technet/prodtechnol/windowsserver2003/proddocs/entserver/fsutil.asp
Openfiles
Openfiles lists the open files and folders on a system and allows you to force a disconnect (using
the /Disconnect parameter). In order to list files opened locally, the system global flag 'maintain objects list' needs to be enabled by running openfiles /local on, which requires a reboot to take effect.
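The typical sequence looks like the following sketch; the server name and file ID are placeholders that you would take from the query output:

openfiles /local on
rem ...reboot, then later:
openfiles /query /s FILESERVER1 /fo table
openfiles /disconnect /s FILESERVER1 /id 1234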
RSS
RSS helps manage disk space by automatically copying infrequently used files from a volume to
a tape or disk library. When the amount of free disk space falls below the defined level on the
volume, RSS kicks into action. You can use the command-line RSS utility to manage RSS, but
be aware that you must first install RSS. To do so, open the Add or Remove Programs applet in Control Panel, click Add/Remove Windows Components, and locate the Remote Storage option. Select this option, and make sure that you have your WS2K3 installation media handy. You will also need to restart your
computer.
We will look at RSS in more detail in the next chapter.
Systeminfo
Systeminfo is one of many ways to get a listing of system information. A useful feature of this
utility is that it includes such items as Original Install Date, system uptime, and a list of
hotfixes—plus it is guaranteed to be on every WS2K3 machine because it is a built-in tool.
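For example, you can pull just the storage-relevant lines from a remote server with findstr; the server name is a placeholder, and the exact field names can vary slightly between OS versions:

systeminfo /s SERVER1 /fo list | findstr /c:"Original Install Date" /c:"System Up Time"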
TakeOwn
TakeOwn lets administrators take ownership of files, which allows you to recover access to a file
orphaned through incorrect file ownership. To use this utility, you must be a member of the local Administrators group (on a member server, the Domain Admins group is automatically a member of that group). This utility can also come in handy if you install WS2K3 on a computer that had
a previous WS2K3 installation on it, as the files in the \Installer folder may be locked and require
System permissions to delete them. Using TakeOwn can reset the ownership (you might also
need to use Windows Explorer or another utility, cacls, to reset all permissions).
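A sketch of that recovery sequence; the folder path is a placeholder, and the available switches differ between the resource kit and built-in versions of these tools:

takeown /f d:\orphaned /r /d y
cacls d:\orphaned /t /e /g Administrators:F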
Additional Windows Server Resources
A great starting point for discovering additional Windows Server resources is the WS2K3
Downloads page at http://www.microsoft.com/windowsserver2003/downloads/default.mspx.
Microsoft has split the available resources into several areas, including WS2K3 Feature Packs
and Tools.
On the WS2K3 Tools page, you’ll find the downloadable tools that help you support WS2K3
systems. After installing a new WS2K3 Server using the original installation media (CD-ROM),
I immediately install the Administration Pack Tools, the Support Tools, and the Resource Kit
Tools.
Administration Pack Tools
The Administration Pack Tools contains Active Directory Tools, Terminal Services Tools, and
other tools such as the Distributed File System console. This pack is installed by running the
Adminpak.msi found in the \i386 folder of the WS2K3 CD-ROM. It is also available on the Web at http://download.microsoft.com/download/c/7/5/c750f1af-8940-44b6-b9eb-d74014e552cd/adminpak.exe.
Support Tools
The WS2K3 Support Tools provides more than 40 tools installed by running the Suptools.msi
found in the support\tools folder of the WS2K3 CD-ROM. Table 2.2 offers examples of the
support tools that are relevant to storage management.
Utility | Description | Example Usage
Devcon.exe | Device Console Utility, a command-prompt alternative to the hardware Device Manager | List or disable problematic devices from the command line
Diruse.exe | Displays a list of disk usage for a directory tree, with additional options | An alert can be generated if folders exceed specified sizes
Dmdiag.exe | Lists disk volume configuration | Does a complete dump of disk and storage information (can be overwhelming, but also useful as input to other tools)
Efsinfo.exe | Displays information about encrypted files on NTFS partitions | The /T option can be used to 'touch' all files in a given folder or subfolder, forcing an update to their EFS information
Ftonline.exe | Mounts NT 4.0 fault tolerance disk sets | Forces mounting of an existing NT 4.0 disk volume after the upgrade to WS2K3
Remote.exe | Runs a command line on a remote computer | Useful when remote access has not been enabled and you need to run a utility
Rsdiag.exe and Rsdir.exe | Remote storage diagnostics and reporting | Requires remote storage to be installed on the system
Topchk.cmd | DFS and SYSVOL Replication Topology Analysis Tool | Shows the FRS replication topology
Xcacls.exe | Displays or modifies access control lists (ACLs) of files | The X stands for Extended, as it can change Special Access rights
Table 2.2: WS2K3 Support Tools that are relevant to storage management.
WS2K3 Resource Kit Tools
The Resource Kit Tools have always been valuable and worth having. They are now available
for download for WS2K3. You can find them at http://download.microsoft.com/download/8/e/c/8ec3a7d8-05b4-440a-a71e-ca3ee25fe057/rktools.exe. Table 2.3 lists some of the WS2K3 Resource Kit Tools that may be of interest for SRM.
Utility | Description
Cdburn.exe | Burn ISO images to CD-ROM
Cleanspl.exe | Flush print spoolers
Clusdiag.msi | Cluster diagnostics and verification tool
Clusterrecovery.exe | Server cluster recovery utility
Cmdhere.inf | Adds menu item in Explorer to launch a command prompt
Compress.exe | Compress files
Confdisk.exe | Disk configuration tool
Creatfil.exe | Create file
Csccmd.exe | Client-side caching command-line options
Diskraid.exe | RAID configuration tool
Diskuse.exe | User disk usage tool
Dvdburn.exe | ISO DVD burner tool
Fcopy.exe | File copy utility for message queuing
Gpmonitor.exe | Group Policy Monitor
Gpotool.exe | Group Policy Objects (GPOs)
Hlscan.exe | Hard link display tool
Iniman.exe | Initialization files manipulation tool
Moveuser.exe | Move users
Nlsinfo.exe | Locale information tool
Ntrights.exe | Grant or revoke NT rights to a user/group
Permcopy.exe | Copy file- and share-level permissions between shares
Perms.exe | Display a user's access permissions for a file or directory
Rcontrolad.exe | Active Directory remote control add-on
Robocopy.exe | Copy files between two locations
Showacls.exe | Show ACL for subdirectories
Showperf.exe | Performance data block dump utility
Sleep.exe | Batch file wait
Srvcheck.exe | Server share check
Srvinfo.exe | Remote server information
Srvmgr.exe | Server Manager
Subinacl.exe | Move security information between objects
Tcmon.exe | Traffic Control Monitor
Usrmgr.exe | User Manager for Domains
Vfi.exe | Retrieve and generate detailed information about files
Volperf.exe | Shadow copy performance counters
Volrest.exe | Shadow copies for shared folders restore tool
Vrfydsk.exe | Verify disk
Winpolicies.exe | Policy spy
Table 2.3: Storage-related WS2K3 Resource Kit Tools.
WS2K3 Feature Packs
The WS2K3 downloadable Feature Packs provide new functionality and extend the capabilities
of WS2K3:
• Automated Deployment Services (ADS)
• Identity Integration Feature Pack (part of the Metadirectory System for integrating Directory Services)
• Software Update Services
• Windows SharePoint Services
• Windows System Resource Manager (WSRM)
WSRM is only available for use with WS2K3 Enterprise Edition and Datacenter Edition.
We’ll touch upon some of these feature packs as they relate to storage management in later
chapters.
Win2K Server Resource Kit
The Win2K Server resource kit is available for purchase from Microsoft and is well worth the
investment.
If you haven’t already purchased the resource kit, check out the following Web links, as some of
these tools are available for free download. You can find a list of free Win2K Server resource kit tools
at http://www.microsoft.com/windows2000/techinfo/reskit/tools/default.asp and a complete list of tools
at http://www.microsoft.com/windows2000/techinfo/reskit/rktour/server/S_tools.asp.
The resource kit includes about 300 tools; Table 2.4 provides a list of some storage-related tools that can help you in your SRM deployment.
Utility | Description | Example Usage | Free Download | Notes for 2003
Diskmap.exe | Displays information about a disk and the contents of its partition table | Reports disk signature as well as cylinder, head, and sector information | Yes |
Dmdiag.exe | Saves disk volume configuration to a text file and writes a signature to a disk partition | Does a complete dump of disk and storage information (this much information can be overwhelming!) | Yes | Updated version in 2003 kit
Dumpcfg.exe | Reads and writes disk information such as signatures | Can be used for disk-signature repair (this utility is very useful for cluster disks) | No | Replaced in 2003 by the built-in DiskPart
Efsinfo.exe | Displays information about EFS NTFS partitions | Lists encrypted files, the user, and the recovery agent | Yes | Updated version in 2003 kit
Forfiles.exe | Enables batch processing of files in a directory or tree | Automated operation to find and clean up a directory tree or an entire drive | No | Updated version in 2003 kit
Freedisk.exe | Checks for free disk space, returning a 0 if there is enough space for an operation and a 1 if there isn't | One method to ensure that a disk doesn't run out of room | No | Updated version in 2003 kit
Linkd.exe | Links an NTFS directory to a target object | I'll discuss this utility when I discuss volume mount points and junction points later in this book | No |
Netcons.exe | Displays current network connections | Monitor or determine status of connections to a file server | No |
Permcopy.exe | Copies file- and share-level permissions from one share to another | Duplicate ACL permissions from one directory tree to another | No |
Perms.exe | Displays a user's access permissions for a file or directory | Handy for cleanup or migrations | Yes |
Robocopy.exe | Robust File Copy Utility | Create and maintain multiple mirror images of large folder trees on network servers | No |
Rsm_dbic.exe | Removable Storage Integrity Checker | Checks the integrity of the RSM database for media and removable media drives and libraries | No |
Rsm_dbutil.exe | Removable Storage Database Utility | Steps through the RSM database and inspects each database object attribute for valid values and referential integrity | No |
Rsmconfg.exe | Removable Storage Manual Configuration Wizard | Aids in manually configuring (from a command prompt) libraries that RSM autoconfiguration can't configure | No | Updated version in 2003 kit
Subinacl.exe | Moves security information between objects (users, groups, domains, printers, files, and services) | Changes the account used in service startup properties, as I previously mentioned | No | Updated version in 2003 kit
Vfi.exe | Retrieves and generates detailed information about files, such as attributes, version, and flags | Allows you to find duplicate files and do a size as well as a cyclical redundancy check (CRC) comparison | No | Updated version in 2003 kit
Xcacls.exe | Displays and modifies security options for system folders | More powerful than Perms.exe in that it lets you set the ACLs | Yes |
Table 2.4: Win2K Server Resource Kit storage-related tools.
Windows Server Resource Kit Security Tools
In addition to the storage-related tools, the resource kit provides security tools that can help you
during your SRM deployment. Do not overlook security, as recent virus outbreaks and network-based attacks have affected storage servers. What we've learned from the recent outbreaks is:
• Systems must be secured from the start: deployed from a secure baseline image (or built off the main network). The Security Readiness Kit (SRK) can be used for offline builds and preparation (http://www.microsoft.com/technet/security/readiness/232.mspx).
• Establish an escalation response team and communication procedures. It might become necessary to disable certain network services (or block certain types of attachments temporarily).
• All systems must be managed (for example, using Software Update Services for patch management). Scanning tools must be used to probe the network for unmanaged at-risk systems. Systems can be scanned with the Microsoft Baseline Security Analyzer (MBSA) and HFNetChk, which I'll discuss shortly.
• Corporate security must be proactively managed. Studies have shown that staying current on threats and patch management and actively filtering for malicious code reaps substantial rewards (in terms of reduced cost associated with outbreaks).
For additional reading and guidelines to assist you in securing Windows Server, the National
Security Agency (NSA) publishes about 20 Security Recommendation Guides that you can
download from http://nsa1.www.conxion.com/win2k/index.html. Many of the security guides
will still be helpful for WS2K3, although some configuration directions might not be as
necessary with the “secure by default” initiative of WS2K3. In fact, the NSA does not plan to
publish a security guide for WS2K3 and has stated that the “high security” settings in
Microsoft’s “Windows Server 2003 Security Guide” are close to the security level represented in
NSA guidelines. You can access the Microsoft publication at
http://www.microsoft.com/downloads/details.aspx?FamilyId=8A2643C1-0685-4D89-B655-521EA6C7B4DB&displaylang=en.
The following list highlights additional security tools that might be of interest:
• HFNetChk: This tool is designed to check which patches have been applied to your system. HFNetChk is available via the command-line interface of MBSA.
• MBSA version 1.2: A new version of the MBSA is available to work with WS2K3. Version 1.2 includes both a graphical and command-line interface to perform local or remote scans of Windows systems. The MBSA scans for common security misconfigurations, vulnerabilities, and missing security updates, and generates security reports. It is available at http://support.microsoft.com/default.aspx?kbid=320454.
• System Scanner: System Scanner for Windows is a security-assessment tool designed for Win2K Server. It performs security checks on files, the registry, and user-account settings and can verify the configuration of virus scanners. You must install it separately by running Sysscansetup in the \Apps\Systemscanner folder of the Win2K Resource Kit CD-ROM, then click Start, Programs, ISS, and System Scanner Help.
For more information about the System Scanner for Windows, check out the Microsoft article
“Description of the Windows 2000 Resource Kit Security Tools” at
http://support.microsoft.com/support/kb/articles/Q264/1/78.asp.
Analyzing Storage Usage Tools
We’ve covered some great new utilities for performing disk operations and storage management,
but what about analyzing storage usage? How do we know which types of files exist, how old
they are, who they belong to, if they are duplicates, and whether they are being properly
managed? There aren't any great tools built into the Windows Server products that can do all of
this. Many storage administrators use rather crude methods to access such information, but a
truly efficient enterprise will require the features that have defined the SRM market and
products, which we’ll explore later in this guide.
Summary
In this chapter, we covered the process and available tools for analyzing your current storage
environment. First, I laid out the levels or hierarchy of auditing from the organization to the files
level. I spelled out the types of information that you’ll want to gather, showing you sample
reports including storage utilization (disk space used and available) and identifying who the
storage users are. Next, we looked at the built-in tools of WS2K3 for analyzing your storage
requirements. We’ve reached the need to look at third-party SRM tools to show how they can
improve the storage audit process and prepare you for the next phase, planning your SRM
deployment.
Chapter 3: Analyzing and Planning Storage
In the previous chapter, we explored the built-in features, resource kit, and support tools of
WS2K3. I defined the requirements for the storage analysis phase, then we took a look at how
much of this work could be done using the available tools. Although there are many built-in tools
at our disposal, there are still some tasks—such as detailed analysis about who is using storage—
that will benefit from the features offered by third-party SRM products. We’ll explore such tools
in this chapter.
In addition, we’ll begin to explore the next phase—planning an SRM deployment. We’ll use the
information that we captured about both storage capacity and storage performance to determine a
course of action (or courses of action, if you’re used to contingency planning). I’ll present you
with a flowchart that will help to lay out the storage-management decisions that you must make;
most notably, whether you decide to live within your existing capacity or to add capacity. During
the course of this chapter, we’ll look at using SRM tools for performing trend analysis and
capacity planning.
Finally, we’ll look at how SRM solutions can help eliminate duplicate files, unused files, and
wasted space as well as reduce consumption. I’ll show you how to perform an analysis of your
current environment—highlighting SRM techniques and reporting—to improve your storage
usage efficiency. I’ll also explore the tools you can use to perform a detailed storage analysis,
gathering information about who is using storage and how much.
Phase 2: Planning SRM
Table 3.1 shows the planning phase in the overall SRM-deployment methodology.
Phase: SRM Planning
Process:
• What are the problems and the priorities to solve them? Can we find a better way?
• Lay out the organizational policies in these areas: Security Policies; Supported Storage Systems; Support of Storage Applications; Storage Quotas or Allocations; Restricted File Types; Enforcement Policies; Support and Escalation Plan; Backup and Disaster Recovery Procedures
Table 3.1: Planning phase of the SRM-deployment methodology.
In the previous chapter, we used several tools to take a snapshot view of storage usage, including
looking at several angles of usage either by user, folder, or even the type of application being
used. But a snapshot view is merely a point-in-time capture of information. To be of value for
storage-resource planning, the snapshot views must include more than one point in time. We
need to capture the information discussed in the previous chapter several times to evaluate
growth patterns and plan for anticipated growth. In the next section of this chapter, we’ll look at
growth analysis, then we’ll develop a flowchart for planning a course of action.
Outcome of Storage Analysis and Planning
The expected outcome of storage analysis and planning is twofold:
• Create a plan of action for storage design
• Develop an organizational storage policy
Although the first item is certainly important, it must be married with an organizational storage
policy. As we develop technological solutions for managing storage, the solutions will be used to
support the organizational storage policy that we’ll develop in the next phase of the project. In
fact, the two must complement each other, as a conflict between storage design and
organizational storage policy is disruptive to business process. For example, if user space is
limited by policy on a business-critical storage system, but the tools aren’t in place to enforce the
policy, then additional administrative burden is created to monitor and maintain the system.
The end result of this twofold strategy is to increase the availability of storage systems while
reducing the cost of administrative maintenance. These goals are difficult to attain; as we
increase the importance and usage of storage systems, we must also increase their fault tolerance
and ability to recover from any disruption to service.
In the area of TCO, entropy is the enemy—if a system is left alone over time, it will begin to
decay. For storage management, this decay means that if we don’t put the tools in place to
monitor and maintain the system in an efficient manner, the state of storage will naturally decay toward disorder and chaos. Perhaps you have seen the results of this corrosion on a
particular file server: an ever-increasing number of files being stored that have diminishing
access patterns but use ever more storage, which increases the backup and recovery window. In
the next section, we’ll look at the impact of increasing storage capacity.
One more point about the backup and recovery window: the concept of Service Level Agreements (SLAs) can have a powerful impact on storage-capacity planning. When users are
storing more and more information, throwing storage space at the problem can actually work—
for a short time. But before long, business-critical or even mission-critical data can be
compromised because it is mixed with files that are no longer essential, but all the data must be
backed up or recovered during the same time period. This scenario can put you in a situation
from which you cannot possibly recover in time to meet your SLA. Even if you don’t have a
formal SLA, you may have an informal Service Level Objective (SLO) that states the
expectation that any downtime is unacceptable and you should do everything within your power
and budget to reduce or eliminate it!
This situation is the driving force behind HSM: moving nonessential files away from business-critical and mission-critical information. The SRM plan can include how HSM will be
incorporated, and the benefit to the company (which drives the cost justification) is the improved
response and protection of business-critical and mission-critical information. Recently, the
storage industry has begun calling this overall process information lifecycle management (ILM).
The next sections will illustrate how to use an SRM product to address each of these problems.
In the previous chapter, I showed you how to use tools such as the Windows Server resource kit
tools and Performance Monitor to perform a storage analysis; however, these tools are quickly
left behind when compared with the functionality of SRM products.
Trend Analysis and Capacity Planning
In the previous chapter, I showed you how to use Windows Server’s Performance Monitor for
gathering disk-usage information. Unfortunately, Performance Monitor doesn’t make the grade
for performing trend analysis and capacity planning. You can use it to capture information at
periodic intervals, showing increasing disk usage over a 30-day interval, but its real weakness is
the lack of control over the level of detail. Performance Monitor is best suited for reporting on
the level of logical disk drives. The information we’ll want is disk usage on the file share or
directory level. For example, Figure 3.1 shows a third-party tool’s chart of Disk Space Used for a
30-day trend against a specific file share. This trend is fairly typical, showing disk cleanup
followed by ever-increasing storage usage, with a recent spike. Also, the ability to tie this
information back to the responsible user is paramount, and we’ll run reports to do so.
Figure 3.1: VERITAS' StorageCentral SRM Space Allocation Trend Summary.
To access the trend reports, you simply configure this tool to collect the data in advance. Right-click either the server or AD root (depending on whether you are using the Standard or Active Directory edition, respectively), and click Properties. On the Trending tab, enable trending and set any parameters you want to change, such as the time interval for collecting data.
Figures 3.2 through 3.4 illustrate the serious consequences of the storage growth battle that you
might be currently fighting. I’ve included this series of figures to illustrate—to you or perhaps
even to convince your management—that the issues are more serious than at first glance. The
series shows IT spending mapped against storage spending ranging from 20 percent to 60 percent
per year. I’ve deliberately left the storage growth lower than some market predictions, which
show storage demand increasing by more than 100 percent per year (doubling) for many
companies. Granted, that number might be hard to sustain with recent economic events, but over
a 10-year period, the bursts in growth may make a 20 percent to 60 percent rate easy to achieve.
Figure 3.2 shows IT resources increasing annually at 5 percent mapped against storage demand
increasing annually at 20 percent. The column labeled Effective Ratio illustrates the impact on
IT resources, given constrained resources to manage the growing storage. If storage demand is
growing at a rate of only 20 percent, then IT resources relative to the amount of storage that must
be managed reaches a ratio of 3.8 after a 10-year cycle. In other words, for every dollar of IT spending, the company must over time become efficient enough to manage nearly four times as much storage.
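The Effective Ratio column is simply compound demand growth divided by compound resource growth; as a quick check of the 10-year figure:

Effective Ratio (year n) = 1.20^n / 1.05^n
For n = 10: 6.19 / 1.63 ≈ 3.8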
Year | IT Resources at 5% | Storage Demand at 20% | Effective Ratio
1997 | $100 | $100 | 1.0
1998 | $105 | $120 | 1.1
1999 | $110 | $144 | 1.3
2000 | $116 | $173 | 1.5
2001 | $122 | $207 | 1.7
2002 | $128 | $249 | 1.9
2003 | $134 | $299 | 2.2
2004 | $141 | $358 | 2.5
2005 | $148 | $430 | 2.9
2006 | $155 | $516 | 3.3
2007 | $163 | $619 | 3.8
(Amounts are in thousands of dollars; the original figure charts the two series over time.)
Figure 3.2: IT resources mapped against storage demand increasing at 20 percent.
Figure 3.3, similar to Figure 3.2, shows IT resources increasing annually at 5 percent mapped
against storage demand; except in this case, storage demand is increasing annually at 40 percent.
The column labeled Effective Ratio illustrates the impact on IT resources, given constrained
resources to manage the growing storage. If storage demand is growing at a rate of only 40
percent, IT resources relative to the amount of storage that must be managed reaches a ratio of
17.8 after a 10-year cycle. In other words, for every dollar of IT spending, the company must over time become efficient enough to manage nearly 18 times as much storage.
Year | IT Resources at 5% | Storage Demand at 40% | Effective Ratio
1997 | $100 | $100 | 1.0
1998 | $105 | $140 | 1.3
1999 | $110 | $196 | 1.8
2000 | $116 | $274 | 2.4
2001 | $122 | $384 | 3.2
2002 | $128 | $538 | 4.2
2003 | $134 | $753 | 5.6
2004 | $141 | $1,054 | 7.5
2005 | $148 | $1,476 | 10.0
2006 | $155 | $2,066 | 13.3
2007 | $163 | $2,893 | 17.8
(Amounts are in thousands of dollars.)
Figure 3.3: IT resources mapped against storage demand increasing at 40 percent.
Figure 3.4 shows IT resources increasing annually at 5 percent mapped against storage demand
increasing annually at 60 percent. The column labeled Effective Ratio illustrates the impact on
IT resources, given constrained resources to manage the growing storage demand. If storage
demand is growing at a rate of 60 percent, IT resources relative to the amount of storage that
must be managed reaches a ratio of 67.5 after a 10-year cycle. In other words, for every dollar of IT spending, the company must over time become efficient enough to manage nearly 68 times as much storage!
Year | IT Resources at 5% | Storage Demand at 60% | Effective Ratio
1997 | $100 | $100 | 1.0
1998 | $105 | $160 | 1.5
1999 | $110 | $256 | 2.3
2000 | $116 | $410 | 3.5
2001 | $122 | $655 | 5.4
2002 | $128 | $1,049 | 8.2
2003 | $134 | $1,678 | 12.5
2004 | $141 | $2,684 | 19.1
2005 | $148 | $4,295 | 29.1
2006 | $155 | $6,872 | 44.3
2007 | $163 | $10,995 | 67.5
(Amounts are in thousands of dollars.)
Figure 3.4: IT resources mapped against storage demand increasing at 60 percent.
Here is where the going gets even tougher: technological improvements in storage mean that the same dollar amount buys steadily more storage every year. So you can use the information in the figures in two ways: either as a dollar-for-dollar comparison of administrative spending versus storage spending, or as a comparison of administrative spending versus the storage capacity that must be managed.
Storage Management Decision Points
You’ve analyzed existing storage and now you must decide the appropriate course of action.
Based on the capacity and performance analysis, are you going to upgrade your existing hardware to achieve more performance and more storage, or are you going to live within the existing capacity? If the latter is your choice, SRM tools will be indispensable in
achieving your aims. Even if your solution requires that you add more storage, SRM tools can be
your best bet for managing the new storage.
Figure 3.5 illustrates the storage-management decision points in a flowchart and is designed to
give you an overview of the SRM process. I’ll refer back to this flowchart throughout the
chapter, as it covers the primary concepts of SRM planning. The flowchart shows several
options, or plans of action, depending on whether you’re storage constrained or performance
constrained (and you might be constrained different ways for different systems or applications—
we’ll look at the different storage applications later). First, let’s look at the area covered in the
flowchart in the SRM Product and Process section.
Some of the options can accomplish both goals: by adding newer, faster, larger systems, you get
more storage space with higher performance. But this option assumes that you have the necessary
budget. For those of you lucky few, I’ll look at performance design and new technology later in this
chapter.
Figure 3.5: Storage-management decision points.
SRM Product and Process
In this section, we’ll look at how SRM tools can assist in the capacity-planning process. The first
step in the planning process is to identify what you will use the SRM tools to accomplish. The
primary functions of SRM solutions that we’ll include in the plan are to
• Eliminate duplicate files
• Eliminate unused files
• Eliminate wasted space
• Reduce consumption
Controlling future disk space usage has several benefits that might not be immediately apparent.
The most obvious is that the disk space is available for business usage. But controlling disk
usage through SRM also helps to improve the availability of servers by shortening backup
windows and reducing the amount of data that may need to be recovered. Controlling disk usage
through SRM can also reduce corporate liability when file blocking is used to eliminate
unwanted file types (such as MP3s that might involve copyright violations).
SRM Product Evaluation
For the rest of this chapter, I’ll show you how to use VERITAS StorageCentral SRM as an
example of an SRM product, although you can certainly use any competing product. Ideally, you
should be able to perform the same functions and generate the same reports. Extensive
information that compares SRM products is available from industry trade magazines. If you
already have another product, you can still use the SRM deployment methodology detailed here.
You can access an evaluation copy of the StorageCentral SRM suite from http://www.veritas.com.
You will need to provide an email address and fill out a brief survey for download registration.
StorageCentral Installation
VERITAS provides ample documentation with StorageCentral SRM. I’ll just point out a few
items to consider before you install this product. The first consideration is the service account
that will be used to run the StorageCentral services. As the installation wizard explains as it
guides you through the product installation, if you don’t specify a service account,
StorageCentral will use the LocalSystem account. As with the Performance Monitor service
account, you should change the StorageCentral account from LocalSystem to a domain account.
This modification is necessary for the service to be able to connect to other computers. Thus, if
you want to be able to report against network drives or use email notification, you’ll want to use
a domain account for the service startup account.
If you get an InstallShield error 1607 “unable to install,” try running the setup with the source file
copied to a local hard drive. Such a simple error can throw you; I attempted to run the install with the
downloaded setup file on my network share and received this error.
Another issue that you might encounter is that StorageCentral depends on the Messenger service
for receiving notifications from agents running on other servers (see Figure 3.6). On WS2K3, the
Messenger service is disabled by default; during installation, you will receive a pop-up message
that notification messages sent to this server by agents will not be received. The default property
for the Messenger service was changed in WS2K3 and it is no longer started automatically (as a
security precaution). You can change the service startup, but this issue will only affect you if you
are running StorageCentral in a multi-server environment in which you will be pooling the
information to a central server.
Figure 3.6: The Messenger service is required for receiving notifications.
Figure 3.7 shows another prompt that you might encounter during installation. You will receive
this prompt if you do not specify email server properties during setup. As the prompt states, you
can configure these settings later, so there is no need to be concerned or to let it delay your
installation.
Figure 3.7: Prompt during installation if no email server properties entered.
Auditing with StorageCentral
Auditing with StorageCentral is a simple three-step process:
1. Select the report(s)
2. Select the object(s)
3. Run the report(s)
Figure 3.8 shows a report running; in this case the Sample Storage Audit, which is a good place
to start. The figure also shows a list of the Report Sets that are available.
Figure 3.8: Running the Sample Storage Audit.
To run this report, launch the StorageCentral console, expand My Computer if you are running
the report on this same machine, expand Reports, and select Report Sets. Alternatively, you can
right-click Preferred Machines, and select the option to create a new Preferred Group. You can
then select the reports on one of those servers. Next, right-click on the Sample Storage Audit,
and select Run Report.
Double-clicking on the report set is the same as right-clicking and selecting Run Report. To get to the
report properties, you will need to right-click and select Properties.
At this point, you will see a dialog box similar to that shown in Figure 3.9, except that you will
not have any objects selected (I have already selected \\sanfile\f:\public). To select the objects,
which are the folders on disk that you want included in the report, click the ellipsis button next
to the Object(s) field. At this point, you can also choose whether to run the report just once or,
even better, set up a schedule to run the report every Monday morning to start your week off
with some good information.
Figure 3.9: Running the report interactively against the selected object(s).
Figure 3.10 shows the dialog box in which you select the folder objects for the report. Subfolders
are automatically selected, and you can select multiple disks and folders.
Figure 3.10: Selecting the objects for reporting information.
Figure 3.11 shows the HTML Report generated automatically for the Sample Storage Audit. If
you click on Sample Storage Audit, and choose Properties, you will find where you can change
the output properties and location. We’ll look at how you can manipulate CSV files in particular
in a moment. Notice that on the Content tab, you can select the Report Definitions that compose
this report. In this case, the Sample Storage Audit is mainly Space Allocated by User. So, at this
point, you should get a feel for the relation between definitions and reports.
Figure 3.11: The HTML Report generated automatically for the Sample Storage Audit.
From the report that Figure 3.11 shows, you get a quick graphical view of the top users of
your public share. So now let’s take a look at the types of files that are eating up that space and
whether they are files you should be managing.
Another useful report to run to get you started is the Duplicate Files report. This report provides
information that helps you immediately recoup the cost of StorageCentral. Through this report,
you can instantly see the largest files and how many copies of them exist. Is it time to set up a
\Public\MovieTrailers share just so that you can maintain only one copy of those 50MB movie
trailers instead of every other user keeping a copy in his or her home share? Whether you
take drastic steps depends on your corporate policy, but performing this audit will help you to
develop that corporate policy.
We’ll explore developing SRM corporate policy in the next chapters.
Tips for Selecting Objects
When selecting objects to run your reports against, you can benefit from a few tricks. For
example, you can add a server with network shares to the list of preferred machines, but you
can't simply retrieve data through a drive mapping or connection from a server that lacks
StorageCentral. The remote server also needs StorageCentral installed; otherwise, you'll receive
the report that Figure 3.12 shows.
Figure 3.12: Unable to retrieve data directly unless StorageCentral is installed.
StorageCentral reports are fairly processor intensive (even a canceled report takes a while to
shut down). In addition to scheduling reports for later, the product gives you the ability to select
and run multiple reports at one time, as Figure 3.13 shows. You can use this feature to group a
bunch of reports together because they can take a few minutes to run against the many thousands
of files that a company keeps. When you're selecting multiple reports, you can configure the
settings for each report (by clicking Properties). As the reports finish running, they'll generate
Active HTML, which allows you to deal directly with the files from within the report through
right-click enabled command menus. Alternatively, if you need to continue working with
StorageCentral, you can launch another instance of the console while the first is still running.
Figure 3.13: Selecting multiple reports.
Rather than hunt through the reports list, selecting reports individually, I prefer to use the Daily
Storage report set, which generates the following reports by default:
• Files by Type
• Large New Files
• Large Stale Files
• Most Commonly Used Files
• Nightly Backup Capacity Requirement
• Space by Disk Drive
Rather than grab additional reports individually from the reports list, I prefer to select additional
reports in the existing report set's properties. This functionality lets you select one report set and
let it run while you get other work done. You then have multiple reports ready and waiting in the
console for when you need them.
The following list highlights additional reports and their benefits:
• Space by User―The heart and soul of file server administration and SRM is knowing
who is using what, and this report tells you exactly that. Figure 3.11 shows the Space by
User report in graph view.
• Space by Directory―This report shows the space used at a root point by all folders and
files below that point. I'll use the Space by Directory report later as an example of
working with data imported into Microsoft Excel.
The following is an example of a situation in which the Space by Directory report came in handy. I
was working on a project in which I would burn a CD-R disc and create a new project directory
when the total size of files in the project directories hit 650MB. If I failed to keep track and the files
exceeded 700MB, I would need to create a second folder, move about 650MB into it, and then
create the project CD-R disc. The Space by Directory report let me see, at a glance, which folders
were ready to archive. Later, we'll look at quotas, which can automatically notify you when a
threshold is hit.
• Duplicate Files―As I mentioned earlier, this report is a high-return report, instantly
allowing you to free up a good chunk of space. Give this report a little more time to
run than the others, as it has a lot of comparing and sorting to do before presenting the
results.
The reports contained in StorageCentral aren’t static output. Right-click one of the files listed in
the Duplicate Files report, and notice that you’re given several menu options including Find,
Delete, Copy to, and Move to. These functions let you perform file management directly within
the reports. However, because the reports are geared for direct file management, they lack the
ability to manipulate the onscreen view of the information; there is no way to sort or shift the
columns or even copy the information to the clipboard to paste into another program. Figure 3.14
shows an example of the Disk Drive Summary report.
To change the view, such as the sorting of a report, you have two options. First, you can modify
the report’s properties and run it again. Second, you can output the report data and work with it
externally.
Before you get too extreme deleting files in the Duplicate Files report, remember that files deleted
from the report don't go to the Recycle Bin, even when you run the report and delete the files on
the file server itself. Thus, you might want to offload the files to tape or cheap disk storage before
deleting them.
Figure 3.14: Disk Drive Summary report.
Working with Report Data
Let’s take a look at exporting one of the reports and working with the data externally. Figure
3.15 shows the Disk Drive Summary Report in the .txt file format viewed in Notepad. The
columns are separated by a pipe symbol (|). An alternative to using the Save As option, is to open
the report properties, and on the Format tab, specify that you will output to an HTML CSV or
ASCII file. Also on this report tab, you can choose whether to display the file sizes in bytes,
kilobytes, megabytes, or gigabytes. By outputting to one of these formats, you can save the raw
data for custom presentation or mathematical manipulation.
Figure 3.15: Disk Drive Summary Report in the .txt file format in Notepad.
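If you would rather post-process the pipe-delimited output with a script than with Excel,
converting it to standard CSV takes only a few lines. A minimal Python sketch (the input file
name is hypothetical; the pipe delimiter is as shown in Figure 3.15):

    import csv

    # Convert a pipe-delimited StorageCentral report to a standard CSV file.
    with open("disk_drive_summary.txt") as src, \
         open("disk_drive_summary.csv", "w", newline="") as dst:
        writer = csv.writer(dst)
        for line in src:
            if "|" in line:  # keep only data rows, skipping titles and blanks
                writer.writerow(field.strip() for field in line.split("|"))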
As you can see in Figure 3.16, you don’t gain much clarity from simply bringing the file into
Microsoft Excel. The next step will be to break apart the data into cells and apply formatting.
Figure 3.16: Disk Drive Summary report in Excel before parsing and formatting.
After you apply formatting in Excel, as Figure 3.17 shows, the report is much easier to read,
especially once you eliminate unnecessary columns.
See the sidebar “Using Excel to Manipulate CSV Files” for more information about Excel.
Report Set: Disk Drive Summary    2/21/2004 8:16
Space By Partition
Server: 'SANFILE'
Description: This report lists all partitions and their capacity, free space and type.

Drive Name  Percent  Total   Free     Used     FileSys  Volume      Volume
            Used     SizeGB  SpaceGB  SpaceGB  Name     Name        Compressed
F:\         65       62.50   21.81    40.69    NTFS     StripedVol  No
D:\         41       37.29   21.94    15.35    NTFS     D_Data      No
C:\         32       37.25   25.00    12.25    NTFS     C_OS_Progs  No
E:\         90       37.24   3.60     33.64    NTFS     E_Apps      No
B:\         7        0.06    0.05     -        FAT      TEMP        No
All Local   58       174.35  72.41    101.94   N/A      N/A         N/A

This report was generated by VERITAS StorageCentral.

Figure 3.17: Disk Drive Summary report in Excel after formatting.
Using Excel to Manipulate CSV Files
The following list highlights a few quick tips for using Microsoft Excel to manipulate CSV files.
• Set your cursor just below the field headings in column A (such as Machine Name, Directory;
usually cell A9 in my worksheets), and select Alt + Window, Freeze Panes. This setting will let you
scroll about the report and still be able to see the column headings. For maximum viewing, set your
Scroll Lock button before freezing the panes, and scroll A9 up to the top position.
• With your cursor still in what is now the anchor cell, A9, select all cells by pressing Ctrl + Shift +
End, and you're ready to sort the data however you want. Just select Alt + Data, Sort and notice
that the Sort by options now show the field headers such as Machine, Directory, File Owner, Dir
Size Used KB, and so on.
• Do you find that you're constantly adjusting the print settings of CSV reports? Open a new
workbook and select Alt + File, Page Setup. Enter your print setup options; I usually select the
fit-to-one-page option (whenever possible), zero out the margins, center horizontally and vertically,
and enter the page headers and footers, as illustrated in Figure 3.18.

Figure 3.18: Sample page setup options for printing CSV reports using Excel.

Save the file in %ProgramFiles%\Microsoft Office\Office\XLStart, where %ProgramFiles% is usually
C:\Program Files. The file will be available when you start Excel. (You can even change to that
directory by using cd %ProgramFiles% at a command prompt.) If instead you save the file as a
template in the Templates folder (%APPDATA%\Microsoft\Templates), it will be available when you
select Alt + File, New. The difference is that instead of opening the CSV file, you'll select Alt + Data,
Get External Data, Import Text File, enter your CSV file name (or *.csv to see all), choose
Delimited, and select the comma as the delimiter in the Text Import Wizard.
• Do you find that you get an error when trying to save a CSV file because the file is still open in
Excel, yet you still need it open in Excel? If you use the previous method for bringing the CSV data
in (Get External Data from the Data menu), Excel creates a new worksheet and doesn't lock the
CSV file. (I still have to delete the first row and the first column.)
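If Excel isn't handy, the same sort can be scripted against the CSV output. A minimal Python
sketch (the file name and the column headers are assumptions based on the field headings
mentioned above):

    import csv

    with open("space_by_directory.csv") as f:
        rows = list(csv.DictReader(f))

    # Sort descending on the space-used column and show the top consumers.
    rows.sort(key=lambda r: float(r["Dir Size Used KB"].replace(",", "")),
              reverse=True)
    for row in rows[:20]:
        print(f"{row['Dir Size Used KB']:>12}  {row['Directory']}")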
Figure 3.19 shows the output of the Space by Folder report imported into Excel. It's not a pretty
picture, which is exactly why I included it here. There is always the risk of including too much
information in a report and losing focus on what you set out to audit. In this example, I wanted
to know the size of all files and folders below the root user folder, a figure known as the Branch
Size Used: the sum of all files and folders below a particular folder, which can be handy
information. The other option is to shell out to a command prompt and run the Dir /S command,
but even that can take a very long time to generate the last line of output, which is really all that
you want to know.
Figure 3.19: A difficult-to-read report shown in Excel.
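If all you want is that last line, a short script gets you there without scrolling. Here's a minimal
Python sketch (the project path is hypothetical) that computes the Branch Size Used for a folder,
which is essentially the summary line of Dir /S:

    import os

    def branch_size(root):
        # Sum the sizes of all files below root (the "Branch Size Used").
        total = 0
        for dirpath, _dirs, files in os.walk(root):
            for name in files:
                try:
                    total += os.path.getsize(os.path.join(dirpath, name))
                except OSError:
                    pass  # skip unreadable files
        return total

    mb = branch_size(r"D:\Projects\CurrentCD") / 2**20
    print(f"{mb:.1f} MB used")  # e.g., time to burn the CD-R at ~650 MB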
When working with Space By Folder or other reports on directories, one trick is to change the Number
of Levels from the default of 1, as Figure 3.20 shows. If you run this report at the root of a share or
drive, you might not get enough detail on those user directories and other subfolders. Note that this is
done on the Report Set that you are running, not in the Report Definition. Simply open the report
properties, select the Content tab, and click the report properties button. On the Settings tab, set
the Number of Levels input field.
Figure 3.20: Number of Levels for Folder Object Reporting.
Customizing Reports
When you look at Report Definitions, you'll see two types of icons, which are highlighted in
Figure 3.21. They represent standard report definitions (top icon, with the hammer) and custom
report definitions.
Figure 3.21: Report definition icons.
The first type has a limited set of properties (description, settings, and detail), and the second
type offers the full set of properties that Figure 3.22 shows. The engine behind many of the
reports in StorageCentral is the query. For the most part, you should be able to use the many
default reports supplied with the product. However, you might want to either modify an existing
query structure or create your own from scratch. Figure 3.22 shows a sample query expression.
Figure 3.22: Sample query expression in Custom Report Properties.
Preparing for the Next Phase
At this point, you have the capability to generate reports that will give you the following
information:
• Storage utilization (space used and free space)
• Storage users
• Types of files stored and where
• Duplicate files
• Storage hierarchy (where storage is being used and administered)
From this information, you can begin to generate a gap analysis, which compares where you are
now with where you want to be, both in terms of your existing storage resources and taking
advantage of the new WS2K3 features. Let’s start by exploring how to eliminate duplicate files.
We’ll continue this analysis in the next chapter as well as begin the next phase, planning an SRM
deployment.
Eliminating Duplicate Files
Figure 3.23 shows a sample SRM report for duplicate files. In this case, the amount of wasted
space from duplicate files is significant: 833MB of the gigabyte or so of storage.
Figure 3.23: Report showing wasted space from duplicate files.
By looking at the report, you can understand the reasons that the duplicate files are being
created:
• People are saving the same files; for example, email attachments that are sent to many
people.
• People are working on the same projects and files; for example, the video and white
paper files in the sample report.
• People are saving setup and application installation files; for example, the downloaded
setup files in the sample report.
• Some files share a common filename even though they're not actually duplicate files.
(The example in this report is Outlook.OST, which is an offline folders file used for
synchronization.) This occurrence isn't so much a problem that needs a solution as a
caution to the administrator: some files might have the same name, and in rare cases the
same size, without being duplicates.
Thus, your SRM plan must deal with the source of each of these reasons for creating duplicate
files. First, resolve the situation in which people are using the file share as a dumping ground for
files received by email. This behavior creates duplicate versions of the files because users are
receiving many of the same files and instead of maintaining the single-instance storage that an
email system (in this case Exchange Server) would maintain, they’re creating duplicate files. The
solution to this problem is twofold:
• The original file author could merely send out a link to the original file (I'll show a good
method for doing so as we develop a storage-management solution).
• If the users have sufficient space in their mailboxes, they may be less inclined to dump
the attachments.
Another reason for the duplicate files is that many of the users are working on a joint project and
have copied the files to their individual shares. The solution to this problem is more complex;
collaboration and workflow applications are designed to address it. The document author might
say, "I have created this document, and you are welcome to add to it," but the end user might
still feel the need to keep a personal copy.
In the report showing duplicate files, there is a column labeled Revised Days. From this column,
you can immediately see that some of the files are quite old and might fall into the next category,
unused files.
Eliminating Unused Files
Unused files are most often older files that no one is currently accessing, as illustrated in the
wasted files report shown in the next section. A more difficult-to-manage category of unused
files is files that have been orphaned, usually by a user who is no longer working for the
company. Those of you who provide working space to contractors and vendors know how
difficult a task cleaning up files can be. Who wants to wait for the files to age in order to clean
them up? SRM products can generate a report of files with undefined access control entries
(ACEs), which displays all the files whose ACEs contain an owner security identifier (SID) that
has been deleted or is invalid.
Eliminating Wasted Space
Wasted space can come in several forms: from files that are no longer being used, duplicate files,
prohibited files, or all of the above. The wasted space report, which Figure 3.24 shows, contains
a column labeled Wasted, which uses a string to indicate why the file was flagged for this report.
The string a is for aged, e is for expired, l is for large, and d is for duplicate. Not shown is o for
over-allocated, as I have not set any quotas on the users or directories. The wasted space report is
helpful for allowing you to see why space is being used and getting to the root cause.
Figure 3.24: Best practices report in StorageCentral SRM.
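When you export the wasted space report for post-processing, a tiny lookup makes the flag string
self-explanatory. A minimal Python sketch (the flag letters are taken from the report description
above):

    # Map the Wasted column's flag letters to their meanings.
    WASTED_FLAGS = {
        "a": "aged",
        "e": "expired",
        "l": "large",
        "d": "duplicate",
        "o": "over-allocated",
    }

    def decode_wasted(flags):
        return ", ".join(WASTED_FLAGS[c] for c in flags if c in WASTED_FLAGS)

    print(decode_wasted("ad"))  # prints: aged, duplicate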
Reducing Consumption
So you’ve run these reports and done all that you can to remove older, unneeded files and
developed a solution to reduce duplicate files. There are several additional methods for reducing
the level of storage consumption. The most severe is to set immediate storage quotas or
allocations on the storage used without any contingency for alternative storage locations. The
least severe is to provide a storage location alternative and assist users in migrating from one
storage system to another.
One method of reducing consumption is to remove files that fall under the category of restricted
file types, the most infamous of late being .MP3 files. You can remove these files by scavenging
existing shares and removing them, or you can prevent future storage of the files by using an
SRM product.
Enforcement Policies
The process of implementing storage quotas can be initiated with a soft touch (for example, by
providing informational messages and warnings) or by immediately imposing a hard limit and
preventing further file storage until the user cleans up enough to stay under the quota threshold.
Informational messages can be used to let users know that a hard limit is coming.
As an administrator, you might also be faced with making the initial decision of where to define
the quota limits. A third-party SRM product can be useful for this decision. For example, the
StorageCentral SRM product offers a template called Baseline directory utilization that allows
you to set the future quota at 200 percent of the current usage. Figure 3.25 illustrates using the
predefined template as a soft quota enforcement policy. This setting gives users a little bit of
breathing room before they hit their quota, but can also reward the space hog at the time the
SRM product is first deployed, so you might want to look at overall usage reports before setting
this quota.
Notice also the footnote in the dialog box, stating "Not applicable with Network Appliance."
This note is new to StorageCentral SRM and applies to NAS devices using Network Appliance
Filers. If you are using or planning to use one of these storage solutions, you should read the
StorageCentral SRM Help file information about why these exceptions apply. The Filer uses a
different file screening (blocking) mechanism that does not match StorageCentral's, so you'll
want to learn how best to deal with the differences.
Figure 3.25: Allocation Policy for 200% of Used Space as a soft quota.
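The arithmetic behind a baseline template like this is easy to preview before you turn it on. The
following is a minimal Python sketch (my own illustration of the 200 percent concept, not
StorageCentral's implementation; the home-share root is hypothetical) that proposes a quota for
each user folder:

    import os

    def used_bytes(folder):
        # Current usage: the sum of all file sizes below the folder.
        return sum(os.path.getsize(os.path.join(d, f))
                   for d, _subdirs, files in os.walk(folder)
                   for f in files)

    root = r"F:\Public\Users"  # hypothetical home-share root
    for entry in sorted(os.listdir(root)):
        home = os.path.join(root, entry)
        if os.path.isdir(home):
            used_mb = used_bytes(home) / 2**20
            # Baseline policy: future quota = 200 percent of current usage.
            print(f"{entry:<16} used {used_mb:8.1f} MB -> "
                  f"quota {used_mb * 2:8.1f} MB")

Previewing the proposed quotas this way makes the space hogs obvious before the template
doubles their allocation.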
How you implement storage policy is entirely up to you and your organizational policy; if you
want to increase acceptance of your quota-management system, people must benefit from the
SRM product and policy. You must make it clear that quota management is in the end users'
and the organization's best interests, not just something being done so that you, the
administrator, can leave work on time. In the next chapter, we'll look at defining administrative
boundaries and assigning roles as part of the SRM process.
The next step in the planning process is to look at different storage applications and the various
storage location alternatives. The plan will include which types of storage you are providing and
how you will manage them: which tools are available and which policies will be set. Perhaps
some of these storage alternatives will be new to your organization; for example, applications
such as Windows SharePoint Services (formerly known as SharePoint Team Services, or STS,
which explains why the download file is named STSV2.exe).
Support of Storage Applications
In the first chapter, I introduced the concept of storage location alternatives and gave a simple
scenario in which I send you an email with an attachment that you want to save and share with
others. Let’s look deeper at each of the storage location alternatives, both from the benefits to the
end user (the main reasons that they would choose to use one over the other) and the impact that
each has on the storage administrator.
Table 3.2 shows the advantages, disadvantages, administrative considerations, and deployment
recommendations for these types of storage: mailboxes, personal mail folders (PSTs), public
mail folders, local file folders, network shares, Microsoft collaborative applications, intranet or
Internet Web servers, and database applications.
Mailbox
  Advantages: Saved in context of message; usually some simple search functions.
  Disadvantages: Mailbox sizes are usually very limited; may be needed for business-critical
  communication.
  Administrative considerations: Must usually back up the entire mail store, so you must consider
  limits placed on mailbox sizes; depends on deployment of messaging application, such as
  Exchange Server.
  Recommendations: Keep the limits on mailbox sizes and use file shares; see the section about
  HSM solutions for multi-class storage tiers.

Personal folder (PST)
  Advantages: Saved in context of message; usually some simple search functions.
  Disadvantages: Depending on location, may not be RAID-protected storage (local computer);
  more prone to corruption; very difficult to share information.
  Administrative considerations: Difficult to back up if the file is open; loss of single-instance
  storage (one file or instance in the messaging database shared among many users); no quota on
  file or disk usage.
  Recommendations: Do not use for shared or critical information.

Public Folder (email)
  Advantages: Saved in context of message; better search functions through content indexing;
  open for sharing among multiple users.
  Disadvantages: Often difficult to navigate tree (separate from mailbox folders).
  Administrative considerations: Requires well-designed public folder hierarchy and some level of
  end-user training to be useful; depends on deployment of messaging application, such as
  Exchange Server.
  Recommendations: Create structure and encourage usage instead of personal folders for shared
  information.

Local computer folders
  Advantages: Personal/private information; ease of use; roaming information (notebooks).
  Disadvantages: Lack of fault tolerance (RAID, regular backups) and usually lesser-quality hard
  drives; notebooks can be lost or stolen.
  Administrative considerations: Distributed management (which can be a good thing or not);
  difficult to share or locate information.
  Recommendations: Use network file sharing with offline folders to synchronize information.

Network file share
  Advantages: Easiest structure (no Web pages or database to create); direct to the end user with
  very little administrative intervention; a storage appliance can add features such as file screening
  and data replication.
  Disadvantages: Used to be a problem for roaming users until offline folders allowed complete
  synchronization.
  Administrative considerations: Requires a third-party utility to really manage the storage and
  provide end-user quotas; can use replication to place copies of information closer to end users.
  Recommendations: Use network file sharing with offline folders to synchronize information.

Web Storage System (WSS)
  Advantages: Content indexing and permissions on stored files (similar to NTFS file shares); can
  be used for collaborative workflow (document versioning and controlled check-in and check-out);
  maintains single-instance storage (one copy of a file for many users).
  Disadvantages: Requires learning the new application (for collaboration).
  Administrative considerations: Depends on deployment of application, such as SharePoint or
  Exchange Server 2003; can use built-in replication model to place copies of information closer to
  end users.
  Recommendations: Consider for collaborative workflow and publishing environments; benefits
  over a Web server (see below).

Intranet or Internet Web server
  Advantages: Broad variety of client browser support (for reading and linking or indexing
  information).
  Disadvantages: Must go through a Web master to publish; difficult for the end user to act as
  document editor or publisher, which can lead to stale information.
  Administrative considerations: Depends on the amount of information to be published and the
  intended audience; not for use as a collaborative storage system.
  Recommendations: Useful for read-only publications and distributing software over the Internet.

Database system
  Advantages: Rapid retrieval of information based on indexes and relational structures.
  Disadvantages: Takes a lot of effort to set up and use, requiring database specialists who must
  act on behalf of the end user or information consumer.
  Administrative considerations: The only choice for bits of information that must be gathered into
  tables and structured to be useful (for example, human resources personnel information).
  Recommendations: Will continue to be used for information that must be fairly well structured to
  be shared.

Table 3.2: Storage location alternatives.
Table 3.2 describes eight storage location alternatives that you can provide to your end users.
Whether you offer one or all of them, you must develop the ability to include them in your SRM
plan and process.
Summary
In this chapter, we started the next phase, planning the SRM deployment. We used the
information captured on both storage capacity and storage performance to determine a course of
action, which I laid out in a flowchart. The flowchart identified the storage management
decisions that you must make, most notably whether to live within your existing capacity or to
add capacity, even if for performance reasons.
We then looked at how SRM solutions can help eliminate duplicate files, unused files, and
wasted space and reduce consumption. We also covered the tools and utilities you can use for
storage management. Phase 1 of your SRM deployment is analysis, so I presented some analysis
tools, showing what you can do natively in Windows Server with the Performance Monitor as
well as with the Windows Server resource kit tools. Later I showed why you may need a
third-party SRM suite to improve the audit process and prepare you for the next phase, planning
your SRM deployment.
In the next chapter, we'll cover structuring the SRM project and using storage-management tools
to make better use of either your existing storage or your newly deployed storage. We'll also
explore the options for creating additional storage capacity using the Windows Server RSS, a
basic HSM system. We'll work through a storage performance design example for Windows
Server, using Exchange Server 2003 and file servers as examples to illustrate storage
performance applications. Next we'll look at expanding storage through DFS and the hardware
selection process involved in migrating to new storage systems. This discussion will include
exhausting the core Windows Server functionality and SRM features (including quota
management) before turning to a more comprehensive SRM solution. We'll cover product
functionality that helps you reach your SRM goals of eliminating duplicate files, eliminating
unused files (aged and orphans), eliminating wasted space, and reducing excess consumption
through setting disk quotas.
Chapter 4: Developing the Storage Resource Management Solution
In the previous chapter, we took a look at capacity planning to ensure that you don’t
underestimate storage resources’ growth and the resulting impact on IT administrative
capabilities. I gave an example of using an SRM tool to analyze storage at more than one point in
time so that you can perform trend analysis. I covered storage resource planning, and explained
how to plan a course of action with a focus on increasing storage capacity or improving
performance. Then I illustrated how you can use an SRM tool to support your decision to more
efficiently use your existing storage rather than deploy new storage. We’ll dive into more detail
about this decision as we develop an SRM solution for your organization.
In this chapter, we’ll explore two broad areas: structuring the storage management project and
using storage management tools to make better use of your existing storage or your newly
deployed storage. To address the need to expand storage, you can turn to WS2K3 DFS; I'll
discuss this option and the hardware-selection process involved in migrating to new storage
systems. In addition, we’ll take a brief look at the Windows Server RSS to decide whether it’s
right for you. Finally, to illustrate how to design with performance in mind, we’ll look at
hardware and a performance design example for Windows Server that uses Exchange Server
2003 as a storage application.
A word on methodology: As stated in the first chapter, I've organized this book around a project
methodology based on the MSF. Because the MSF was created for software development, its
structure is slightly different from that of IT infrastructure solutions or projects. This difference was
reflected in a recent update to the Microsoft solutions architecture; however, I prefer to stick to the
original MSF because it calls out the development and testing (pilot) portions of the project with
greater emphasis. This point is important, as beta testing is often emphasized for software
development solutions but not for infrastructure solutions. The net result of more beta testing is
higher-quality solutions, as you learn about and change your design to better fit the business
workflow.
Phase 3: Developing the SRM Solution
Table 4.1 shows Phase 3 in the overall SRM deployment methodology.

Phase: SRM Development
Process: Build the solution (for testing). Use the storage reporting tools and prepare to move
them to real-time production usage.
Table 4.1: Phase 3 of the SRM deployment methodology.
Before we look at what will be accomplished in this phase and how, you’ll need to decide who
will be performing which type of work.
SRM Project Team Roles
As I break out each of these roles, don’t be surprised if you fill many of the roles. In a small
business environment, one person often fulfills all or most of the duties. In a larger or enterprise
environment, identifying the duties that one person could perform (given infinite resources such
as time and leadership charisma) and offloading duties to other individuals becomes an essential
task. By no means does Table 4.2 imply that each role requires a separate person or that one
person couldn’t handle multiple roles. The number of people involved simply depends on the
scope or size of the project (for example, whether you will build a new SAN) and the resources
available.
Project Sponsor
  Provides funding and overall go or no-go decisions. Narrows the vision (what could be
  accomplished) to the scope (what will be accomplished). Defines which problems will be solved.

Project Manager
  Allocates resources (people, tasks, funding, and so on). Manages the project scope, ensuring
  that the tasks performed and resources used are targeted at successful project completion.

Technical Architect
  Designs the solution to be implemented and defines how problems will be solved.

Technical Advisors
  May be called in on short notice to provide guidance about which direction to proceed or how to
  repair a specific problem.

Subject Matter Experts
  May be called in on short notice to provide guidance about how to repair a specific problem.

Design and Build Engineers
  Make decisions about the selection and installation of SRM products. Perform the installations or
  create the process to delegate the installations to regional administrators or server operators.

Test Engineers
  If the team is large enough, separate the test engineers from the design engineers because the
  test engineers lend a different perspective about how the solution will be used; this group might
  consist of pilot users drawn from the company or organization.

Communications Specialists
  Communicate the goals and benefits of the project to the business or end-user community. This
  duty is a specific role in larger organizations and is mainly an issue when the enterprise has an
  existing channel in place (such as Human Resources); otherwise the project manager or team
  may fulfill this role.

Regional Administrators
  Perform the same functional responsibility as the design and build engineers, but might not be
  able to participate in the centralized project in a distributed environment; might apply the
  architecture to their regional systems and provide feedback to the centralized team.

Server Operators
  Perform some installations and routine maintenance. Can provide feedback about the quality of
  project documentation, such as operations guides.
Table 4.2: SRM project team roles.
A note about technical advisors and the wisest use of consulting resources: Doesn't the word consult
mean that you are asking for advice? Ideally, this person is someone whose experience you can draw
upon. They are consultants, not employees; they should tell you how to do your job (or how to do it
better), not actually do your work for you. This role in the project should be limited in duration and
scope, and if a prospective consultant is not willing to work in this manner, they're looking for
long-term employment!
General Project Guidelines
These project guidelines are based on lessons learned from previous projects. Regardless of your
end goals and the means to accomplish them (which we identified in Chapter 3), these are the
principles that will hold true across the widest variety of projects.
Project Tools
In Chapter 3, we explored how to make SRM decisions such as which options or choices to make
to better utilize existing capacity or to expand capacity. These ideas were central to the themes
covered in that chapter. Similarly, in this chapter, the deployment template that Table 4.3 shows
plays an instrumental role and is the focal point for this chapter. This tool is meant to be a project
aid for your SRM deployment and will serve you well printed out and pinned to your cubicle
wall or distributed to management at that all-important meeting.
Deployment Template
Table 4.3 outlines a deployment template that you can customize to suit your project. You may
also use it for communication to upper management, as it outlines the SRM goals, processes,
benefits, and status. I have designed it to be used as a template for your project, meaning that you
can customize each item for what you plan to accomplish, removing items that may not be
immediately relevant.
Goal: Delegate administrative functions; increase end-user knowledge and awareness
  Process and end result: Create and evaluate communication channels, conduct regular project
  meetings, and train the involved parties in the SRM solution and administrative duties.
  Priority and estimated benefit: 1. Increases storage-consumer buy-in and reduces total cost of
  ownership (TCO).
  Status: Get feedback on customized reports. Test communication channels such as email and
  the corporate intranet/portal as well as traditional paper-based means (brochures).
  Team members: (Insert the names of the team members responsible for each task.)

Goal: Match storage location alternative to type of information
  Process and end result: Evaluate the storage location alternatives (print out the table from
  Chapter 3).
  Priority and estimated benefit: 2. Reduces administrative overhead and increases functionality to
  business users, such as customer responsiveness.
  Status: Identify storage location alternatives and provide best-matched storage options.

Goal: Improve system availability and reduce backup and recovery windows
  Process and end result: Separate business-critical storage from non-essential storage.
  Priority and estimated benefit: 3. Reduces administrative overhead of maintenance and recovery
  operations.
  Status: Requires a combination of the other solutions detailed below.

Goal: Eliminate unused/orphan files
  Process and end result: Perform SRM reporting and administrative cleanup.
  Priority and estimated benefit: 4. Reduces wasted space on file servers and application servers.
  Status: As simple as a weekend cleanup job (assuming you have the right tools).

Goal: Eliminate unnecessary files (based on type)
  Process and end result: Implement file blocking.
  Priority and estimated benefit: 5. Reduces wasted space on file servers and application servers.
  Status: Evaluate and test SRM products.

Goal: Reduce storage consumption (that is, reduce non-essential files)
  Process and end result: Implement file system quotas per user or group.
  Priority and estimated benefit: 6. Encourages users to maintain appropriate levels of information
  storage.
  Status: Evaluate and test SRM products.

Goal: Eliminate duplicate files
  Process and end result: Perform SRM reporting and administrative cleanup. In addition, this step
  requires measures to ensure prevention of future file duplication (involves giving users better
  ability to share files).
  Priority and estimated benefit: 7. Reduces wasted space on file servers and application servers.
  Status: Create shared file mechanisms (common storage or collaborative applications), run
  duplicate file reports, and migrate duplicate files.

Goal: Eliminate unnecessary files (based on aging)
  Process and end result: Use HSM.
  Priority and estimated benefit: 8. Offloads infrequently used files.
  Status: Decide on date criteria, select an HSM product (native Windows Server versus a third
  party), and evaluate a physical storage medium.

Goal: Expand existing storage for well-known \\Server\shares (part of mapped drives for logon
scripts)
  Process and end result: Create DFS architecture.
  Priority and estimated benefit: 9. Consolidates existing servers and storage into a unified
  namespace.
  Status: Pilot in a lab to test replication over the WAN.

Goal: Expand existing storage (as previously mentioned)
  Process and end result: Expand storage arrays (more physical drives).
  Priority and estimated benefit: Low priority; requires the ability to add to storage capacity; few
  solutions support online capacity expansion (see OVG and VVM in Chapter 3).
  Status: Test volume mount points versus expanding existing arrays (if this testing is possible, it
  usually requires downtime).

Goal: Expand existing storage (as previously mentioned)
  Process and end result: Replace servers or storage.
  Priority and estimated benefit: Low priority; requires additional funding and cost-benefit analysis.
  Status: Investigate DAS, NAS, and SAN alternatives, both the initial cost and as an ongoing
  management solution.

Goal: Improve existing storage performance
  Process and end result: Replace or upgrade servers or storage devices.
  Priority and estimated benefit: Low priority; requires additional funding and cost-benefit analysis.
  Status: Investigate DAS, NAS, and SAN alternatives, both the initial cost and as an ongoing
  management solution.
Table 4.3: Deployment template outlining the SRM goals, processes, benefits, and status.
Communication Plan
An essential part of an SRM project is identifying who needs to know what and how. For
example, you must develop a process for communicating to your upper management the ongoing
status and progress of the project. Have you ever lifted your head from what you are working on
to realize the entire day is gone, and you have come nowhere near accomplishing what you
originally set out to accomplish? Where did the day go? What did you get done? And most
importantly, how will you communicate this day in your status report to management? Looking
back over the day, why did your immediate focus shift to the new task, and did it move your
project closer to the goal? This type of information needs to be communicated not so much on a
daily basis but more likely on a weekly basis: Is the project moving closer to the originally stated
goal, and are resources being used in the right manner?
Table 4.3 is provided as part of developing your communication plan, and can be used to
communicate to upper management the individual goals, benefits, and the status of your SRM
project.
Minimizing Disruptions
At some point in the project, you will make tradeoff decisions between doing an in-place
upgrade of an existing storage system and building a new storage system and migrating files to
it. An analogy for doing an in-place upgrade of an existing storage system versus migrating to a
new one is whether you should remodel your existing home or move to a new one. You know the
hassles of moving, but the reason you face those hassles is to get a bigger, better home. But if
you cannot afford that, the piecemeal approach is to add on to your existing house as much as
you can afford at the time. Just as in remodeling a home, the in-place storage upgrade can be as
disruptive as moving to a new home.
Deployment Topology
If you do not have a clear picture or diagram of the topology of your organization, you will need
to develop one as part of the SRM project. You will need to map out the storage topology of the
organization to reflect both the geography and the administrative model, whether it is centralized
or distributed. Your organization may exist primarily in one or a few locations, making for a
centralized model, or it may have branch or division offices spread across the world that are
maintained autonomously. The hierarchy of the deployment model—centralized or distributed—
will have an influence on the design of your SRM solution and project. For example, in the
distributed model, the decisions that you make as the top-level administrator may or may not be
accepted by the administrators at the distributed locations. Perhaps they will not have
administrative override and must live with your decisions. Either way, you need to consider the
topology as part of testing the SRM solution because the solution may work well in the main
office data center but break down at the remote offices.
Delegating Administrative Functions
The benefit to the organization of delegating administrative functions is twofold. First,
delegating administrative functions is cost-effective if the SRM solution can be operated and
maintained by lower-cost employees. The more employees that are able to use the system, the
greater the chance of its survival in the long-run, as the organization will never fall into the trap
of losing the only employee who knows how to operate the system. Second, as you increase
end-user knowledge and awareness of the SRM solution, you increase end-user buy-in: the
solution isn't just something imposed from on high but a tool that is designed for their
benefit. In the case of managing files and important business information, the capabilities and
decisions must absolutely be distributed to the information owners and end users of the files, as
there is no way that the storage systems administrators can know the relative importance of each
and every file.
Developing an Organizational Storage Policy
Although a plan of action for storage is certainly important, it must be married with an
organizational storage policy. As we develop our technological solutions for managing storage,
they will be used to support the organizational storage policy that we will develop in the next
phase of the project. In fact, the two must complement each other, as a conflict between storage
design and organizational storage policy is disruptive to business process. For example, if user
space is limited by policy on a business-critical storage system, but the tools are not in place to
enforce it, additional administrative burden is created to monitor and maintain the system.
The end result of this twofold strategy is to increase the availability of storage systems while
reducing the cost of administrative maintenance. These are difficult goals to attain. As we
increase the importance and usage of storage systems, we must also increase their fault tolerance
and ability to recover from any disruption to service. In the area of TCO, entropy is our enemy:
if a system is left alone over time, it will begin to decay. For storage management, this idea
means that if the tools are not in place to monitor and maintain the system in an efficient manner,
the state of storage will naturally decay into disorder. Perhaps you have seen the results of this
corrosion on a particular file server: an ever-increasing number of files being stored, with
diminishing access patterns, using ever more storage, which increases the backup
and recovery window. In the next section, we will look at the impact of increasing our storage
capacity.
Implementing storage policies without the aid of your compliance department can be difficult. Many
corporations have very specific rules about what can and should be kept and for how long. These
rules are often the first place to start in developing a policy, and if a policy is developed without
checking with the compliance department first, you can get into a situation in which the policy that the
IT department decides on isn’t backed up by a corporate policy.
One more point about the backup and recovery window: the concept of SLAs can have a
powerful impact on storage capacity planning. When users are storing more and more
information, throwing storage space at the problem is a temporary solution at best. Before long,
the end result is that business-critical or even mission-critical data can be compromised because
it is mixed with files that are no longer essential, and all the data must be backed up or recovered
during the same time period or window. This behavior is the driving force behind HSM—
moving non-essential files away from business-critical and mission-critical information. The
SRM plan can include how HSM will be incorporated, and the benefit to the company (which
drives the cost justification) is the improved response and protection of business-critical and
mission-critical information.
Product Support and Escalation Procedures
As part of the assignment of roles in the project, an ongoing role will be that of supporting and
maintaining the SRM solution, including dealing with any end-user issues that will arise. In the
next section, we will dig into SRM tools, so carry forward this reminder that product support and
escalation needs to be laid out before you roll out the SRM solution; otherwise you may stress
your Help desk and support system. For example, when we look at setting disk quotas on end
users, consider the wording of the notification message that users will receive. Ideally, it will not
create any confusion and will give users a point of reference for more information, such as a
Web page or a Help desk phone number.
SRM Goals
As we dig into SRM policies and the tools used to create them, keep in mind the four goals of
SRM:
1. Eliminate duplicate files (and improve the sharing of files)
2. Eliminate unused files (based on aging and orphaned files)
3. Eliminate wasted space (from non-essential files, based on file type)
4. Reduce storage consumption (by setting disk quotas)
We will translate these SRM components into five distinct actions in our development. Five
actions rather than four because eliminating unused files consists of cleaning up two different
types of files that are no longer used, orphan files and aged files, and they need to be identified
separately.
SRM Tools
In this section, we will see how the core Windows Server functionality and features, including
quota management, can address the SRM goals. After we have exhausted the capabilities
of Windows Server, we will turn to a more comprehensive third-party solution.
Windows Server Disk Quotas
Regarding the goals of eliminating duplicate files, eliminating unused files (aged and orphans),
eliminating wasted space, and reducing excess consumption, the core Windows Server
functionality is limited to the Disk Quota feature. (I covered some limited reporting functionality
in Chapter 3 and introduced disk quotas in Chapter 1).
You access the Windows Server disk quota feature by right-clicking a disk volume (usually
synonymous with a logical drive that has a letter such as D) in Windows Explorer, and selecting
the Quota tab, as Figure 4.1 shows. In the example that the figure illustrates, I have enabled
quota management and set several of the options that are not set by default. For example, I have
set the warning level at 190MB and prevented users from writing more than 200MB. I
have selected the property so that when users exceed their disk space, the system will log an
event on the server. Because I selected the Deny disk space to users exceeding quota limit check
box, the quota limit is a hard quota. If I did not select this option, the quota limit would be a soft
quota, which is used more for informal reporting and alerting end users as well as administrators
of disk-usage levels.
Figure 4.1: Properties for a Windows Server volume showing quota-management settings.
Disk quota notifications are not written to the event logs immediately; they are written to the event
logs every hour. To modify the default one-hour disk quota notification time, locate the
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\FileSystem registry key, and create a
new REG_DWORD type entry called NtfsQuotaNotifyRate. Set the value of this entry to the desired
interval (specified in seconds).
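If you prefer to script the change rather than edit the registry by hand, it amounts to writing one
DWORD value. A minimal Python sketch (run locally on the server with administrative rights;
the 600-second interval is just an example value):

    import winreg

    key_path = r"SYSTEM\CurrentControlSet\Control\FileSystem"
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, key_path, 0,
                        winreg.KEY_SET_VALUE) as key:
        # NtfsQuotaNotifyRate is specified in seconds; the default is one hour.
        winreg.SetValueEx(key, "NtfsQuotaNotifyRate", 0, winreg.REG_DWORD, 600)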
There are a few things to consider about Windows Server disk quotas before you rush to
implement them:
• Windows Server disk quotas are based on disk volumes and cannot be set at the level of
individual files or folders. (This limitation doesn't exist in more robust SRM products;
many such products let you specify that a folder cannot grow beyond, for example,
500MB, regardless of which user is placing the files in that folder, including anonymous
or guest accounts.) Windows Server disk quotas are assigned to users and group
members.
• Windows Server disk quotas are based on file ownership. This setup is generally true of
any disk quota system or software, as it helps to fairly assign disk usage based on who is
using or storing the files. The one exception is that members of the Administrators group
(either at the local server level or at the domain level, depending on the server's domain
membership) are not assigned individual file ownership. This exception makes applying
disk quotas to administrators difficult, as file ownership is assigned to the Administrators
group instead of each logon account.
• The Quota tab is not displayed on the volume properties unless the volume is formatted
with NTFS and the logged-on user is a member of the Administrators group for that
machine.
• Windows Server disk quotas are based on uncompressed file sizes, meaning that end
users cannot increase the amount of available space by compressing the data.
When you enable disk quotas and click OK or Apply, you will see the pop-up warning that
Figure 4.2 shows. The message informs you that the disk quotas will not take effect immediately.
Figure 4.2: Pop-up warning when you enable disk quotas.
Notice that the quota settings specified in Figure 4.1 are for new users on the volume. To
selectively control individual user settings, use Quota Entries, as Figure 4.3 shows. When you
click Quota Entries on the Quota tab, the Quota Entries table will show you any existing quotas
and allow you to add user-specific quotas by selecting New Quota Entry from the Quota menu.
The system takes a very long time to fill the table; you will see the [Retrieving Name] entry for quite
some time. I have found this report to be slow even on the faster systems in my company.
Figure 4.3: The Quota Entries table for Windows Server disk quotas on the D volume.
Figure 4.4 shows the Add New Quota Entry dialog box that you see when adding multiple users
to the disk quota on a volume. Note that you can set this disk quota at different levels than the
primary disk quota for new users that you set on the Quota property page (which Figure 4.1
shows).
Figure 4.4: The dialog box for adding multiple users to a Windows Server disk quota on a volume.
Any time that you make a change to the quota settings, you will need to wait for the Quota Entries table to refresh. The table retrieves information so slowly that I have often seen it process only two or three users per minute. Thus, I advise leaving the console open, or you will suffer this painful refresh period every time you launch the Quota Entries table. However, the table isn't updated in real time, so eventually you will have to click Refresh.
Using this information in a report is quite easy. To drag-and-drop information from the Quota
Entries table to a document or spreadsheet, open a Word document or Excel spreadsheet while
the Quota Entries table is building. When it is done, select the Quota Entries you want to include
in your report, and drag the rows to the program you are using to create the report. You might
need to drag and hold the selection on the taskbar until the program pops up to the foreground.
At this point, you can apply formatting, as Figure 4.5 shows, and save the file.
(Report contents: the Quota Entries window for E_NTFS_280GB (E:) exported to Excel, with columns for Status, Name, Logon Name, Amount Used (MB), Quota Limit (MB), Warning Level (MB), and Percent Used. Most user accounts carry a 30MB limit and a 20MB warning level and show statuses of OK, Warning, or Above Limit; BUILTIN\Administrators shows 17,311MB used with no limit, and NT AUTHORITY\SYSTEM has a 200MB limit with a 180MB warning level.)
Figure 4.5: Windows Server Quota Entries report in an Excel spreadsheet.
Settings that you create for user quota entries can even be copied from one server to another by
using the quota export and import functionality. Just set up your quotas, and export to a file (the
system won’t suggest a file extension to use, so you will have to come up with one such as
.UQE). Then start the Quota Entries report for the volume to set quotas on, and import the file.
User quota entries cannot be deleted until all files on the volume owned by the user are either deleted
or moved to another volume. You will be prompted to delete, take ownership of, or move the files as
Figure 4.6 shows. So be careful before applying the quotas, as you may have to raise the disk usage
limit instead of deleting the user quota. Also note that you can only delete files and not folders
through this interface.
Figure 4.6: Prompt to delete, take ownership of, or move files when deleting user quota entries.
A final caution about using Windows disk quotas—when implemented, they can give a disk the appearance of being limited in size, as Figure 4.7 shows. This appearance can cause some confusion, especially if a new administrator on the system does not realize why the disk size displays incorrectly and why the free space is so low; both values are calculated against the quota limit. The quota-limited size even shows in some disk utilities, such as the defragmenter, which then incorrectly calculates the percentage of free space.
Figure 4.7: Disk Quota limits appearance of disk size.
Creating Additional Storage Capacity
When you decide to create additional storage capacity, you have three choices: you can expand
an existing array, migrate to a new storage system, or attach to another storage system using the
Windows Server DFS. We’ll look at each of these options in turn.
Expanding an existing array may or may not be possible depending on several criteria. First, if
using DAS, the physical cabinet boundaries may be a constraint. If you’re unable to add drives,
the alternative may be to pull the existing drives and recreate the array using larger-capacity drives. To present the same drive to the OS, you'll most likely need to recreate the array and restore the data from backup. If adding more drives to the storage cabinet is a possibility, the methods to
expand the array (the physical drive presented to the OS) must be considered. The methods will
depend on what is available to you. For example, your method will be different if you’re using
dynamic disks or a full-featured third-party management tool.
Windows Server DFS
You can use the Windows Server DFS to expand storage on an existing server by creating a
logical view of the storage that is actually distributed across multiple disks or servers. When an
application or a user attempts to access a data resource, the DFS server redirects the request to
the server that actually hosts that resource. This feature can be powerful, as it hides the
distributed nature from the end user. Instead of having to know that Human Resource documents
are at \\server143\hrdocs and sales presentations are at \\server192\salespres, the user can attach
to \\server1\fileshare and locate any folder as a subfolder underneath. The additional power of DFS is its ability to replicate the information so that several servers have a copy. Thus, if you need to perform maintenance on server X, you can take it down and add new drives.
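To make the redirection concrete, here is a toy sketch (not the DFS referral protocol, just an illustration of the namespace mapping described above) using the example share names from the text:

```python
# Toy illustration of the logical-to-physical mapping a DFS root provides.
dfs_root = {
    r"\\server1\fileshare\hrdocs":    r"\\server143\hrdocs",
    r"\\server1\fileshare\salespres": r"\\server192\salespres",
}

def resolve(logical_path: str) -> str:
    # A real DFS server hands the client a referral to the hosting server;
    # here we simply look up the target share.
    return dfs_root.get(logical_path, logical_path)

print(resolve(r"\\server1\fileshare\hrdocs"))  # \\server143\hrdocs
```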
That is how Windows Server DFS is designed to work on the surface. But note a few cautions:
As with any product based on replication, you must proceed carefully or risk creating a topology
that is unable to complete its replication. Too many sites over slow links will cause the topology
to fall apart. DFS works great for LANs, but be sure to test the impact on available bandwidth
before designing across WANs. Based on experience, it is a difficult task to develop a rule of
thumb stating that you can replicate a DFS topology of x servers, holding y gigabytes over a
WAN of z bandwidth. Regardless, DFS-controlled replication can be advantageous over users
pulling files across the WAN during peak business hours.
Another area of caution is that SRM products aren't really designed with a DFS topology in mind, but how could they be? DFS is designed to consolidate file shares, and SRM products must operate directly at the server. For example, StorageCentral SRM loads a driver that you would need to install on each server in the DFS topology.
Windows Server RSS
Rather than providing additional storage through DFS, perhaps the time has come to create a
secondary tier of storage using cheaper (and slower, unfortunately) storage such as tape to ease
the burden on the primary storage systems. Enter the world of HSM in Windows Server through
the new RSS. This service is based on an application originally developed by Seagate (which is
now part of VERITAS). Thus, if the Windows Server version is enough to whet your appetite
and you’re serious about this functionality, there is a full-blown version available.
RSS is not included in WS2K3 Standard Edition; you must upgrade to the Enterprise Edition. WS2K3 Datacenter Edition also includes RSS.
HSM makes a lot of sense when you look at how you and others in your company access data. Access is hectic for a few days or weeks; then the files sit, just to satisfy the Murphy's Law that says "don't delete that file or else you'll need it." Key to HSM is that it looks no different than ordinary file storage to the end user, except that retrieving a file takes longer. Especially during busy times, HSM may saturate the system and test the patience of the end users. The worst case is a file that is accessed just infrequently enough to be rotated to tape but is then requested by many users.
To set up HSM or RSS in Windows Server, the administrator defines a policy such as the amount
of free space to maintain on a volume, which files are candidates for migration to the secondary
storage, and the latency period (how long the system should wait before a file is moved to
secondary storage). To pull off this magic, Windows Server maintains a database that contains
the reference to the storage of the actual file. When the file is accessed, it is copied from tape to
disk. RSS depends on the Windows Server Removable Storage Manager to handle all the device
library functions, such as mounting and dismounting media.
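As a rough illustration of the policy elements just described (free-space goal, candidate criteria, and latency period), the following sketch walks a volume and flags files that a hypothetical HSM policy might migrate. The age and size thresholds are invented for the example; they are not RSS defaults.

```python
# Minimal sketch: flag files whose last access is older than the latency
# period and that are large enough to be worth migrating to secondary storage.
import os
import time

def migration_candidates(root, min_age_days=180, min_size_kb=64):
    cutoff = time.time() - min_age_days * 86400
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            info = os.stat(path)
            if info.st_atime < cutoff and info.st_size >= min_size_kb * 1024:
                yield path, info.st_size

# Example: list candidates on the D volume.
# for path, size in migration_candidates("D:\\"):
#     print(path, size)
```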
If you can, give RSS a try and see whether users can put up with the latency of offloading files to
tape, or whether you should add more disk-based storage. Whether you decide to add a new
storage system because you’re performance constrained or storage constrained, let’s look at the
hardware-selection process and the decision points for the types of physical storage and location
of physical storage. Perhaps you’re even considering additional servers and storage to build your
DFS. We’ll look at storage design and selection criteria in the next section.
Supported Storage Systems
At the highest level, there are only a few types of storage architectures that you can deploy on
your Windows Server systems: DAS, SAN, NAS, and a new type of storage architecture based
on the iSCSI protocol (this architecture is somewhat of a hybrid product and is new to the
market). The server-to-storage relationship can be one-to-many, with one server attached to
multiple storage devices or arrays, or many-to-one, with one storage network (either SAN or
NAS) attached to multiple servers.
The many-to-one model introduces new levels of complexity and issues that must be dealt with,
namely that anything that is shared must be managed as such, and conflicts in shared access must
be dealt with gracefully. Unlike DAS, both NAS and SAN introduce the possibility of
heterogeneous environments in which Windows Server must share storage with other OSs. We’ll
consider this important aspect as we look at NAS and SAN.
If you’re building a server from scratch, you pop in some form of removable media (CD-ROM
or diskette), boot to that media, copy files to magnetic media (a hard drive), and once the OS is
running, configure other devices such as tape or rewriteable optical media. These are all forms of
DAS. By its nature, DAS can be used only by the server to which it’s connected, and it’s
connected by a channel (usually a SCSI cable or an IDE cable) that connects all devices directly
to the system bus. As the need for more advanced forms of computer architectures, such as
clustering, drove the need for shared access to storage devices, the shared architecture of SAN
arose.
Networked Storage
The SAN is often depicted as a cloud (similar to a network cloud through which clients gain
access to servers) that lets any number of hosts (servers) be attached to any number of storage
devices (including disk, tape, or other media). Alternatively, NAS is a network cloud that exists
between the application servers and the storage devices. Each NAS system is known or accessed
by a name or IP address, just as you would address a server. Figure 4.8 illustrates the DAS model
versus SAN and NAS models.
There are a few emerging hybrid technologies on the market. Most of these products provide a protocol bridge between SCSI (used natively in DAS), Fibre Channel (used in the SAN), and IP (used for NAS). The protocol bridge allows multiple hosts access to multiple storage subsystems, but the storage systems typically remain isolated from each other—each retains the unique characteristics of DAS, SAN, or NAS.
An upcoming technology is storage virtualization, the ability to treat all storage subsystems as one big pool and allocate useable space much more dynamically, with less concern for individual cabinet or subsystem boundaries or storage capacities. I'll provide more information about this newly emerging technology in future chapters. For now, the virtualization capability exists primarily on a single storage system or on a single storage network.
Figure 4.8: DAS model versus virtualized storage model.
These islands of information soon became networked, sometimes as a peer-to-peer network, which perpetuated the same problem of isolated pockets of information. Centralized computing through the client/server model placed the information in a shared-access repository, isolated only by the reach of the network.
Enter the Internet era, which removed even more barriers, by extending the network and
flattening the standards for information sharing (through the Web browser and XML), reaching a
broader variety of mobile devices, such as the Internet-capable Windows CE PDA that I’m using
to type this book in-flight. Much of the growth in Internet-enabling traditional brick-and-mortar
companies has been in making data repositories accessible through the browser and allowing
business-to-business communication or data transfer through a common standard language.
Does this same removal of boundaries apply to storage? Most definitely—any barrier that says
“It’s not that way, it’s over there” must be removed or at least virtualized so that the end user
connects to information at the highest level and doesn’t have to make a decision about which file
format to retrieve from which application on which server on which network.
Virtualization
Virtualization is a key concept in storage and computer systems in general. From my viewpoint
in the computer industry, the progress of the past 20 years has been about removing barriers to
information flow and access. We don’t need to go into a lengthy discussion about the demise of
the mainframe, but keep in mind that some of the same concepts will apply to removing storage
barriers. The benefit of PCs is that they allow individual workers to perform computing on their own schedule. You can play what-if on your spreadsheet or compose and
revise a memo (before printing it out and sending it through the postal mail) without having to
schedule mainframe time or otherwise interfere in IT operations.
Virtualization also plays a key role in fault tolerance and performance, as can be seen from load-balanced Web server-caching farms. The user connects to a virtualized namespace served by any one of a number of Web hosts, which may even contain a virtualized view of the data repository. The same concept can be found in storage, as an application server connects to a pool of fault-tolerant storage, accessing its allocated storage as opposed to its installed devices. On the OS side, you need an OS that is aware of this virtualization or at least tolerates it better than NT does. (NT tends to want to own any device it can see.)
So storage can be direct attached, remote (on a SAN or NAS), or some hybrid mixture. A differentiating characteristic is cost. In the area of cost, I'm not a big fan of SANs—they're expensive beasts. However, if you must create shared storage or use advanced volume-cloning or replication features, SANs are your best choice.
NAS has been coming on strong by providing relatively less expensive storage that can also
provide some data-protection features. Although NAS is solving one problem by giving you a
massive amount of storage space, it’s creating another problem in that you must somehow
manage that storage. Think of all the storage-related applications or functions that you can add to
your NT and Windows Server servers—backup, antivirus, performance monitoring, utilization
monitoring, quota management, file recovery, and so on—and consider how you’ll make sure
that you have the same level of functionality on your NAS devices. A final note about NAS:
NAS can’t be used with some applications, such as Exchange Server. Thus, nothing can beat
DAS for cost-effective raw performance. By the time this book is complete, next-generation
RAID controllers will hit the market that can outperform the high-end SAN controllers provided
by those huge data-storage systems targeted at the Intel market and beyond.
Storage Solutions
The storage market typically has a few dominant players in each category or type of device or product, and many categories of products are necessary to put together a complete solution. How does this market setup apply to your Windows Server storage
deployment?
You’ll most likely be using products from a wide variety of vendors to put together an end-to-end solution (from hardware to software). One bit of advice about the difficult process of
selecting storage devices and storage-management products: No doubt, you’ll put a lot of work
into research, through reading or talking to others at technology events. You’ll probably come
into contact with what I call technology bigots. Personally, I try to keep an open mind and keep
an eye out for new solutions. Technology bigots will attempt to sway you or cause you to
question the value of the solutions that you put together. They’ll claim that your product vendor
is no good and that the one they use is the right one. Some people just feel the need to justify the
expense of their solution. Many times, what is actually taking place is a comparison of two completely different solutions (like comparing a 4 × 4 truck with a two-seater roadster). Is one solution better than the other? Yes, for approaching different problems or needs. You might find that
using products and solutions from a vendor that is familiar to you and well-tested in your
Windows Server Intel architecture is more appropriate than being swayed by some massive
storage behemoth lumbering down from the mainframe world.
The same goes for the storage-resource management solutions—I like to focus on the ones that
have a solid background in NT and Windows Server, and solve my pains as a Windows Server
administrator. It’s OK if the product can talk to 16 flavors of heterogeneous environments, but it
had better be the best choice in the environment that I care about—Windows.
Storage Service Provider
Another form of storage architecture that you might consider or already find in place within an
organization is the use of a Storage Service Provider (SSP) for outsourcing storage needs. The
appeal of the SSP is that you can purchase (or lease) your storage on a monthly basis and avoid
the associated headaches of storage expansion: space, power, cooling, and so on. This solution
might be appealing for the start-up company that faces growing storage needs but isn’t sure that
it can survive through the next few years. I’m just mentioning this option as a possibility,
although Windows Server makes hosting your own storage easy enough. Perhaps even easy
enough that you can offer storage as an outsourced utility that you resell to others, assuming that
you have the bandwidth to your facilities. The most likely benefit that an SSP can provide is
keeping a secondary data set at off-site storage for disaster recovery protection.
Storage takes a lot of bandwidth—we’re not talking about network transmissions, we’re talking
about moving huge amounts of data to and from channel devices. Let’s take a look at these
devices.
Storage Devices
In deploying your storage, you’ll need to choose amongst myriad storage devices. We’ll look at
the characteristics of each type to help you decide. No doubt you’ll be constrained by financial
resources and must make some tradeoffs—should you spend more to get that 15,000rpm drive or
stick with the 10,000rpm drive and get more drive space? Meanwhile, you must protect your investment from obsolescence before you have achieved a satisfactory return.
Most likely you can relate to the experience of explaining to a non-technical person how a computer works. Remember the frustration of dealing with the fact that he or she could not keep the concept of memory (RAM) distinct from storage (the hard drive)? You were forced to explain that storage differs from memory in that it holds larger amounts of information for longer terms, sacrificing performance for a more cost-effective way to increase capacity. At times the boundaries may blur, with memory technologies being used for storage, but the distinctions still hold true. Earlier computer systems treated memory as primary storage, and other media, such as magnetic tape or even punch cards, as secondary storage. You may be familiar with the concept of a RAMDRIVE, which emulates the properties of a storage device by presenting memory as a drive volume—with one exception: the information is lost when power is no longer supplied. This exception is where we'll draw the line between storage and memory; a storage device must not be quite so volatile. The characteristics of the RAMDRIVE also prevent any portability between systems, and portability is important functionality for many storage devices.
From a practical standpoint, storage isn't just about capacity and speed. Storage devices' so-called "abilities" are extremely important: reliability, availability, scalability, and manageability. We'll take a look at applications that can be added in each of these areas as well as define the processes necessary to achieve these abilities.
As Figure 4.9 shows, performance typically comes at a greater cost per megabyte.
(Chart contents: tape, optical, magnetic, and solid state devices, ordered by increasing performance and increasing cost per megabyte.)
Figure 4.9: Relative positioning of storage devices.
Performance can be measured in many ways. The first is the data-access method: random access, which can reach any point on the device in near real time, or linear access, which can take a substantial amount of time to reach the end of the media. After the initial data point is located, performance is a matter of access speed (retrieving the bits) and sustained throughput (moving the entire chunk of information to the requestor). At some point, we must make a purchasing decision: which type of storage provides sufficient performance at a reasonable cost? Most likely, the answer will be magnetic disk drives.
Disk Drives
As you plan your storage, you’ll need to select storage devices, so I’ll provide a quick overview
of the types of storage devices supported in Windows Server. The main types of storage devices
are hard drives, solid state drives, tape, and optical drives and media. The most common type of
storage technology that we’ll work with is magnetic disk drives. Most of the time when I talk
about storage devices, I'll be talking about hard drives. Hard drives, hard disks, and fixed disks are all synonyms for storage that sits near the middle of the chart in Figure 4.9.
Storage Density
For a while, hard drive vendors had reached a ceiling and squeezing more storage into the
existing form factor of hard drives was becoming increasingly difficult. The problem is that the
magnetic particles used to store the data on the spindle platters can be only so close before they
start interfering with each other. There are some workarounds to this problem; for example, the IBM "pixie dust" breakthrough claims to quadruple disk drive density.
From your knowledge of the storage market, you know that hard drive density has been steadily
increasing. A hard disk drive is typically composed of multiple platters, all of which are accessed together as a single spindle that provides the total formatted drive capacity. Single-platter density has been increasing,
recently reaching 40GB, and is expected to double to 80GB over the next year. The next
doubling, to 160GB, will take place the following year, but will be of less interest until a barrier
at 137GB is broken. The 137GB barrier exists because current drive technologies use a 28-bit
architecture for accessing data and can’t handle greater capacities. Maxtor and other industry
players such as Microsoft are working on 48-bit logical block addressing in the disk interface
that will allow as many as 144 petabytes of data (once the 32-bit address space of most OSs
grows beyond the 2.2 terabyte limit).
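The arithmetic behind these barriers is easy to verify; a quick sketch, assuming the conventional 512-byte sector, shows where the 137GB and 144-petabyte figures come from:

```python
# The addressing math behind the barriers mentioned above.
SECTOR_BYTES = 512

print((2**28 * SECTOR_BYTES) / 10**9)   # 28-bit LBA: ~137.4 (GB)
print((2**48 * SECTOR_BYTES) / 10**15)  # 48-bit LBA: ~144.1 (PB)
```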
Since the greatest storage capacity is achieved with multiple platters, you may be asking why
hard drive manufacturers would ever build a single platter drive. If you actually had the choice of
buying two drives of 40GB capacity for the same price and found out that one was multiple
platters and the other was single platter, why would you prefer one to the other? The reason that
you might pick the single platter drive is that it should achieve greater levels of reliability
because the single platter means fewer mechanical components in the drive. It might even be a
smaller form factor and cost less. But the multiple platter drive most likely will outperform it, for
reasons that we’ll soon see.
SCSI versus IDE
When designing high-end storage systems, SCSI hard drives are preferred over IDE drives for
many reasons. IDE drives are more common in personal desktop systems, but are definitely
available in some NAS. SCSI drives have always been faster than the fastest IDE drives and
allow more devices in larger disk arrays. But a newer type of IDE drive, referred to as a Serial ATA (SATA) drive, has entered the range of SCSI performance. Boasting speeds of as fast as 150MB/sec (in short durations, rarely sustained), SATA drives are also more highly instrumented, like SCSI drives, meaning that they use self-monitoring to allow alerting on predicted drive failures. This instrumentation is important, as SATA drives do not have the hardware reliability of SCSI drives and must be monitored.
Until recently, only SCSI drives have been used in RAID systems, but IDE RAID is certainly
available and could provide fault-tolerant drive protection for a small workgroup server.
However, IDE can attach only two devices per channel (with most IDE controllers offering only two channels), and SCSI definitely has the advantage of supporting chains of 8 or 16 devices.
Table 4.4 compares IDE and SCSI standards as well as Fibre Channel, which is essential for
SAN. The variety of choices can be confusing; fortunately, newer designs such as ULTRA3 for
hard drives are backward-compatible even though the cabling usually isn’t.
Channel Name or Standard | Transfer Speed (Maximum) | Comments
UDMA 33 (IDE) | 33MB/sec | Technical specification for the channel; however, rarely achieved in any duration, especially with the IDE-device limitation of two per channel.
UDMA 66 (IDE) | 66MB/sec | Same as above.
SCSI or SCSI-1 | Asynchronous data transfer rates of 1.5MB/sec and synchronous transfer rates of 5MB/sec | As many as seven devices per adapter; narrow (8-bit), single-ended cabling as long as 6 meters; SCSI bus clock rate of 5MHz.
SCSI-2 | A standard; the speeds are defined by the implementations below | The standard that defines the differential interface, allowing as long as 25-meter differential cable length and single-ended SCSI cable length as long as 3 meters. Adds 16-bit and 32-bit wide data bus, doubles data throughput by doubling the clock rate, and a smaller 50-pin, high-density connector. Includes other improvements in reliability through synchronous negotiation and parity checking.
FAST SCSI or FAST SCSI-2 | 10MB/sec | Narrow (8-bit) at 10MHz; doubles the SCSI bus clock rate of 5MHz.
WIDE SCSI | 10MB/sec | 16-bit at 5MHz but requires two cables (A+B). The wide designation indicates a 16-bit data path.
FAST WIDE SCSI | 20MB/sec | Transfers data over a 16-bit wide SCSI bus at 10MHz.
SCSI 3 and SPI | A standard; the speeds are defined by the implementations below | Establishes the SCSI Parallel Interface (SPI) using a 68-pin, high-density connector for 16-bit wide SCSI.
ULTRA SCSI | 20MB/sec | Also called Fast-20. Doubles the FAST SCSI throughput for 8-bit and has the same cable lengths as WIDE ULTRA SCSI.
WIDE ULTRA SCSI | 40MB/sec | Doubles the FAST WIDE SCSI throughput to 40MB/sec for 16-bit. Maximum single-ended SCSI cable length of 1.5 meters for five to eight addresses (devices) and 3 meters for four or fewer addresses. Maximum differential (HVD) SCSI cable length of 25 meters.
ULTRA 2 and SPI-2 | A standard; the speeds are defined by the implementations below | A second SCSI Parallel Interface (SPI-2) that doubles the bus speed to ULTRA 2 (Fast-40) SCSI throughput. Uses a new electrical interface known as Low Voltage Differential (LVD) as opposed to the older TTL-based differential SCSI (now called High Voltage Differential or HVD). SPI-2 adds two new connectors, the 80-pin Single Connector Attachment (SCA-2, which includes the 16-bit SCSI signals and power for the peripheral) and the Very High Density Cable Interconnect (VHDCI) connector, a small 68-pin wide SCSI connector allowing for as many as four connectors on a controller back plate.
WIDE ULTRA2 SCSI | 80MB/sec | 16-bit data path with ULTRA 2 SCSI SPI-2, effectively doubling the bandwidth. Maximum cable length of 12 meters.
Fibre Channel | 100MB/sec to 2Gb/sec | 100MB/sec is the single-loop speed. Fibre Channel is capable of combining multiple paths in a SAN to boost channel bandwidth several times. Encapsulates channel transmissions for SCSI devices on both ends. Maximum cable length of 10km to 100km (depending on the cable type and wavelength), far exceeding the native SCSI capabilities without SCSI extenders. Can theoretically support 126 devices in a loop and more in a switched environment.
ULTRA 3 and SPI-3 | A standard; the speeds are defined by the implementations below | SPI-3 doubles the SCSI bus speed to Ultra 3 (also known as Ultra160 and Fast-80). Requires a 16-bit minimum bus width and uses Double Transition (DT) clocking (on both the rising and falling edges). Also includes Domain Validation (a write verification test at full data rate, which will fall back to the next lower speed if not successful) and a 32-bit cyclic redundancy check (CRC) for better data integrity. U160/m is a limited set of SPI-3 including these three improvements: DT clocking, parts of Domain Validation, and CRC. SPI-3 also defines Quick Arbitration and Select (QAS) and Information Units (Packetization). Maximum cable length of 12 meters or 25 meters for point-to-point.
ULTRA160 SCSI | 160MB/sec | Also known as Ultra 3 SCSI. Based on Fast-80, capable of 160MB/sec at as long as 12 meters.
ULTRA320 SCSI | 320MB/sec | Also known as Ultra4. At this point, the next step in the development of SCSI standards, not yet a production technology.
Table 4.4: Channel standards and resulting speed.
Now let’s turn our attention to designing a storage system with fault tolerance to protect our data
against catastrophic loss (or even ordinary, everyday events).
RAID
RAID was designed to eliminate individual disk spindles as a single point of failure. Hardware
RAID controllers, sometimes with vendor-specific designs or feature sets, can implement RAID.
RAID can also be implemented in the OS itself, through software RAID in Windows Server or
an application. Table 4.5 shows the different RAID levels.
Unlike the others, RAID0 is merely a multi-spindle technique and provides no fault tolerance. In fact, it increases the probability of data loss: as more spindles are added to the stripe set, the chance of one of them failing increases. The other RAID levels offer distinct advantages in certain
situations. The same approach for fault tolerance or performance striping can also apply to other
storage devices, such as Redundant Array of Independent Tapes (RAIT).
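To see why striping raises the risk, consider this sketch, which assumes each spindle fails independently with the same probability over some period; the 3 percent figure is purely illustrative:

```python
# Sketch: probability that a RAID0 stripe set loses data.
def stripe_failure_probability(p_single, spindles):
    # RAID0 fails if ANY spindle fails: 1 - P(all spindles survive).
    return 1 - (1 - p_single) ** spindles

for n in (1, 2, 4, 8):
    print(n, round(stripe_failure_probability(0.03, n), 4))
# 1 0.03 / 2 0.0591 / 4 0.1147 / 8 0.2163 -- risk grows with every spindle
```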
RAID Level | Implementation | Percentage of Useable Storage | Explanation | Comments
RAID0 | Stripe set | 100 percent | Data blocks written across several spindles | No fault tolerance, but best performance
RAID1 | Mirror set | 50 percent | Same data blocks written to two spindles | Typically a two-spindle implementation; for more disks, see RAID0+1
RAID2 | Hamming error correction | n/a | Provides error correction for drives that don't have built-in error detection | Rarely implemented, as SCSI drives provide error detection
RAID3 | Byte-level striping with parity spindle | n-1* | Similar to RAID4 dedicated parity spindle, except byte-level striping | Not widely implemented due to intense activity on, and increased likelihood of failure of, the parity spindle (although read performance is similar to RAID0)
RAID4 | Known as Data Guarding; block-level striping with parity spindle | n-1* | Similar to RAID5 block-level striping, except dedicated parity spindle | Not widely implemented due to intense activity on, and increased likelihood of failure of, the parity spindle (although read performance is similar to RAID0)
RAID5 | Known as Distributed Data Guarding; stripe set with parity | n-1* | Data blocks and the information necessary to rebuild any missing blocks if a single spindle fails, written across several spindles | Write-performance penalty, as all disks must be read, parity recalculated, and all disks written to, to update information on disk; some read-performance impact, as parity information must be read over or skipped
RAID0+1 | Mirrored stripe sets | 50 percent | Stripe sets mirrored to another stripe set with the same number of spindles | Can be considered the same level of fault tolerance as RAID10, depending on whether it can survive loss of two disks in the same stripe set
RAID6 | Multiple parity stripes | Less than n-1 | Similar to RAID5 but a second parity calculation is striped across all drives, adding more fault tolerance | Rarely implemented
RAID10 | Striped mirror sets | 50 percent | Mirror pairs combined into a stripe set | Can be considered the same level of fault tolerance as RAID0+1, depending on whether it can survive loss of two disks in the same stripe set; typically a vendor-specific design
RAID3/5 or RAID53 | Hybrid approach | n-1* | Vendor-specific hybrid approach | Designed to remove the parity update penalty of RAID5 and soften the impact to the RAID3 drive; often a RAID-controller-specific implementation (not really RAID3, but RAID3-like)
Other | Vendor specific | n-2* | A variety of other techniques such as Advanced Data Guarding (ADG), which writes parity information to more than one spindle | Designed to provide more fault tolerance than other RAID without sacrificing performance
*n-1 useable storage for RAID means that one spindle in a set of n is given up for parity information and is not available for data storage.
Table 4.5: RAID implementations and analysis.
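The useable-storage percentages in Table 4.5 translate directly into a capacity calculator. Here is a sketch covering the common levels, assuming n drives of equal size (vendor-specific hybrids omitted):

```python
# Sketch: useable capacity implied by Table 4.5 for common RAID levels.
def usable_gb(level, drives, drive_gb):
    if level == "RAID0":
        return drives * drive_gb           # 100 percent, no redundancy
    if level in ("RAID1", "RAID0+1", "RAID10"):
        return drives * drive_gb / 2       # 50 percent, mirrored
    if level in ("RAID3", "RAID4", "RAID5"):
        return (drives - 1) * drive_gb     # n-1: one spindle's worth of parity
    if level == "RAID6":
        return (drives - 2) * drive_gb     # two parity stripes
    raise ValueError("unsupported level: " + level)

print(usable_gb("RAID5", 6, 36.0))  # 180.0GB useable from six 36GB drives
```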
RAID Controllers
As I previously mentioned, RAID support in Windows Server is built-in through software RAID
and can also be purchased from third-party vendors as software or hardware (RAID SCSI
controllers). As is often the case with other features and functionality in the OS, Windows
Server’s software RAID can provide only limited functionality in certain circumstances and will
fall short for many applications. The built-in software RAID can be used to mirror (RAID1),
stripe (RAID0), or parity stripe (RAID5) disks.
Hardware-based RAID justifies its cost by providing high performance and manageability. The
high performance is achieved by adding high-speed cache and a dedicated processor, essentially
a dedicated server on a card. Replacement of failed disks in RAID sets and recovery of disks is
usually much easier with a RAID controller, as long as the following caution is observed.
Highest performance is achieved on a RAID controller by enabling write-back caching, which
acknowledges to the OS that the data is safe on disk even though the data is really in the controller
cache waiting for the disk array to accept the data. In the event of a sudden loss of power, the data
would be lost unless the controller or the server has a battery (Uninterruptible Power Supply—UPS).
Some RAID controllers have a battery-backed cache onboard, which is a must if enabling write-back
caching, but the UPS is a must to protect the server, especially if the hard drives use caching.
Hardware RAID purists can be rather snobbish at times, turning up their noses at software RAID, but there is a definite need for and benefit to using software RAID. Hardware RAID is the way to go for high performance and a wider variety of migration and recovery options. But software
RAID can be useful when riding on top of a hardware-based RAID controller. For example,
software RAID enables you to create larger disk volumes than are possible using the hardware
RAID controller itself. Recovery of software RAID in NT can be tenuous, as information is
stored in the system registry. But in Windows Server, the information is stored on the disks
involved in the form of metadata, which means that the storage should be recoverable even in the
event of total system loss.
Performance Design
Now that we’ve looked at fault tolerance, let’s take a look at performance considerations.
Because the cost of storage has been dropping rapidly while capacities have been growing, we're able to store greater amounts of information. But the speed of moving the information from the hard disk into the system memory hasn't grown at the same rate. So what you end up with is a relative loss in serving up information—much more information moving only slightly faster means that storage can become the bottleneck in modern information systems.
There are two broad categories of devices that you’re probably already familiar with: random
access and linear access. Hard drives are random access and can quickly find data located
anywhere on the drive. Tape drives are linear access and must move a portion of the tape, which
varies depending on how far from the beginning the data is. Even hard drives have an inherent delay when seeking the location of information, though it is nowhere near as long as for a tape drive.
If a request for data is made and the head is not in the right position, it must wait for the platter to
complete its rotation before it can begin serving the data. Much emphasis has been placed on
rotational speed, as SCSI drives have increased from the 5400rpm to 7200rpm range to a
10,000rpm to 15,000rpm range. Rotational speed typically has a large impact on performance,
but the point to understand is that rotational speed is only a single factor affecting performance,
and it may not be the limiting factor.
Performance of a storage subsystem is measured in several variables. First is the access speed, or
how long it takes for the storage device to be ready to read or write a requested piece of data
after an access command has been issued. Once the data is found, there is a time period, or
latency, that it takes to move the bits from the storage medium to the channel. Once the data is in
the channel, another measure of performance is how fast the device can continue to deliver the
data—the measure of sustained throughput or data-transfer rate, often referred to as bandwidth.
Oddly enough, this measurement is the performance criterion most people are familiar with and
is often a factor determining purchase, even though it may not be the most important criterion for
system performance. Quite often, applications are characterized as being either bandwidth
intensive (the most common example being streaming media such as video) or input/output (I/O)
intensive (a common example being a transactional database).
The total access time of a storage subsystem is the sum of all these factors: seek time, latency,
rotational speed, bandwidth, I/O capacity, and controller overhead. For example, an array of
several drives may be replaced with drives offering a faster rotational speed, but no performance
gains may be achieved if the controller is the limiting factor or if the number of drives in the
array can’t sustain the I/O requirements. I’ll illustrate this idea with a performance design
example in a moment.
A typical SCSI hard drive listing may include the following attributes: 36GB, U160W, 6.7ms,
68Pins, 7200rpm, 4MB cache. At this point, we should know enough to decipher this hard drive
as being an ULTRA3 SCSI hard drive with a 6.7 millisecond seek time, and a rotational speed of
7200rpm. The connector uses 68 pins, introduced in Table 4.4 under the ULTRA2 VHDCI
connector. The 4MB cache helps to speed disk access as the high-speed chips reduce the
occurrence of physical disk access. Most likely you’ll be asking “Is this disk the right disk for the
job?” as you compare it with higher-priced offerings for faster rotational speeds, lower seek
times, and greater capacities.
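Putting the listing and the earlier access-time discussion together, the following sketch estimates the service time of a single random I/O for that 36GB drive. The transfer and controller-overhead figures are assumptions added for illustration; they are not part of the listing.

```python
# Back-of-the-envelope service time for the drive described above
# (6.7ms seek, 7200rpm).
seek_ms = 6.7
rpm = 7200
rotational_latency_ms = 0.5 * 60000 / rpm  # average half revolution: ~4.17ms
transfer_ms = 0.3                          # assumed: one small block
overhead_ms = 0.2                          # assumed controller overhead

access_ms = seek_ms + rotational_latency_ms + transfer_ms + overhead_ms
print(round(access_ms, 1), "ms per random I/O,",
      round(1000 / access_ms), "IOPS per spindle")
# ~11.4ms and ~88 IOPS: rotational speed is only one term in the sum
```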
As you can see, there are quite a few variables to consider when designing the optimal storage
system. The following list provides questions to ask in determining the optimal storage system:
• Application characteristics: Is the application bandwidth intensive or I/O intensive?
• What is the read/write mix ratio?
• Does the application perform sequential or random I/O patterns?
• Does the application have several components that perform with different characteristics?
Through most of this book, we look at the primary use of Windows Server storage as file-sharing
services, but we also look at managing application servers as a storage resource. To illustrate an
application server with a mix of performance characteristics, in the next section, we’ll look at
Exchange Server 2003.
Performance Design: Exchange Server Example
Let’s look at a Windows Server application for a moment because it helps to illustrate many of
the different types of performance characteristics. For example, an application such as Exchange
Server 2003 acts as several types of services or applications and performs a mix of several disk access patterns. As a result, the optimal storage design for an Exchange Server, from a performance
standpoint, is to build separate storage volumes for each type of disk access. Let me emphasize
this point: the biggest bang for the buck comes not from tweaking disk array parameters but from
designing separate storage based on disk access patterns.
To illustrate, Exchange Server is a transactional database, which means that every incoming
transaction must be committed to a log file in case there is an interruption of service (power loss
or server crash). Each transaction is a messaging action, such as when you open an email, or
when you type a new email message and save it to a particular folder. Transactions are the
smallest unit of work that the Exchange Server needs to keep track of (to ensure that they’re not
lost or unnecessarily repeated in the case of server crash and recovery). The write to the log file
is sequential, and once it is committed to the log file, it can also be committed to the database
volume. No reads need to be made from the log file as long as the current transaction is in
memory, which it will be, unless the server had a hard crash and needs to replay the transactions
from the log files to determine where it left off. This circumstance, the reading of log volumes, is
unusual enough that you would not design the log drives for it, and instead design them for
continuous sequential writes.
The database part of a transactional database system is responsible for writing these committed
transactions to the database files. Quite often, this functionality extends or grows the size of the
database file, although this growth isn’t always the case, as it could write the transactions to any
available white space in the database file (space that has been freed up by deleting items). While
the database volume is writing these transactions, such as email coming in to your Inbox, you’re
also reading other email or checking your calendar, which means that the database must be read.
So you have a random mix of reads and writes on the database files, usually in the range of 50/50, which is quite a different pattern than on the log files. Thus, for highest performance, the log files are placed on a separate volume from the database volume.
In Exchange Server 5.5, this process was much simpler than it is in Exchange Server 2003,
which provides the ability to create multiple storage groups within the same server information
store process—each storage group having its own set of log files. So the tough question became
“Do I place each set of log files on its own disk array or should they share the same log disk?” If
they share the same log disk, the disk access is no longer pure sequential writes, and the disk
heads must move between multiple locations on the disk, much as they would on a heavily
fragmented drive.
Each log file array would most likely be a mirrored set of two spindles; thus, creating two logical
volumes on the same array hardly makes sense. It might help from an administrative or logical view,
but it would do nothing from a physical performance standpoint—you would still have the loss of pure
sequential writes.
Exchange 2000 Server introduced another database file format to the information store, known as
a streaming media file (it has a .STM file extension). The purpose of this file is to allow direct
reads and writes in native Internet protocols—such as SMTP, POP3, and IMAP—for much faster
disk access and higher bandwidth streaming. This functionality can have an impact on your
storage design, as it gives you the possibility of separating the .STM files from the .EDB files for
higher performance. The size of the database pages is also enlarged from 4KB for the native Exchange .EDB file to 32KB for the .STM file, so larger writes can be performed. This factor can be
taken into account when selecting the size of the allocation unit when formatting the disk. So a
possibility is to design an Exchange Server with multiple disk arrays, for each set of transaction
logs associated with each storage group (for the .EDB database file and possibly even the .STM
database file).
Another reason to use Exchange Server as an example for designing storage subsystems is that it
also performs another type of disk access: an Exchange Server can exchange messages through
SMTP queues, either as part of inter-server routing or as part of SMTP routing to the Internet.
The SMTP queue disk access pattern is a constant stream of writing messages in and reading
messages out. Messages must be committed to disk, to prevent data loss, before they’re
forwarded or delivered to another SMTP server.
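Pulling these Exchange access patterns together, here is one illustrative volume layout that follows the separate-volume-per-pattern advice. The drive letters and RAID choices are assumptions for the example, not a prescription.

```python
# Illustrative Exchange volume layout, one volume per access pattern.
layout = {
    "C: (RAID1)": "OS and Exchange binaries",
    "D: (RAID1)": "Storage group 1 transaction logs: pure sequential writes",
    "E: (RAID1)": "Storage group 2 transaction logs: pure sequential writes",
    "F: (RAID5)": "Databases (.EDB/.STM): random ~50/50 read/write mix",
    "G: (RAID1)": "SMTP queues: constant write-in/read-out stream",
}
for volume, role in layout.items():
    print(volume, "->", role)
```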
Designing for Windows Server file and print services may actually be a bit less precise, as the
read/write mix and patterns may shift over time, making it more difficult to define a specific
storage volume. If there is a situation in which you can separate the sequential, read-intensive
disk access from the random write-intensive pattern, this separation would be beneficial, just as
in the example of Exchange Server.
Let’s apply what you have read so far to designing a streaming media server that will have online
training videos (such as Windows Media or Real Media files) uploaded to it for trainees to
watch. Because the files change infrequently and read performance is paramount, you would
place the video files on a RAID5 volume for best read performance and fault tolerance. If fault
tolerance isn’t important (if this is a replica or mirror of another server and the data drives can be
easily rebuilt) then RAID0 would provide excellent performance.
Summary
In this chapter, we covered two areas: structuring the SRM project and Windows Server
functionality. We looked at various ways of creating additional storage capacity—from linking
systems together through Windows Server features such as DFS and volume mount points to
adding storage systems. I briefly covered storage hardware to make you aware of some of the
design features to look for and some of the tradeoffs between performance and capacity. Also
important is the choice of storage applications and storage location alternatives, as this selection
affects your choice of SRM product and process. The outcome of your storage analysis and
planning is both the storage-management decision points—such as whether to add more storage
to the management problem or to make better use of your existing storage capacity—and the
development of an organizational storage policy.
We exhausted the capabilities of the core Windows Server functionality and SRM features,
including quota management, before turning to a more comprehensive solution. I detailed the
product functionality that assists you in reaching SRM goals through setting disk quotas:
eliminating duplicate files, eliminating unused files (aged and orphans), eliminating wasted
space, and reducing excess consumption.
In the next chapter, we will look at using storage management tools to make better use of either
your existing storage or your newly deployed storage. We’ll cover testing the SRM solution,
getting feedback, and assessing the effectiveness of the implemented solution. I will give you a
heads up about problems that you should anticipate, and assist you in creating a communication
and education plan to smooth the impact of your project. In addition, I will give you several
deployment templates that can be quite valuable in guiding your pilot phase of the SRM
deployment.
Chapter 5: Piloting and Revising the SRM Plan
In the previous chapter, we explored structuring the SRM project and Windows Server
functionality. We looked at various ways of creating additional storage capacity—from linking
systems through Windows Server features such as DFS and volume mount points to adding
storage systems. I briefly covered storage hardware to make you aware of some of the design
features to look for and the tradeoffs between performance and capacity. Also important is the
choice of storage applications and storage location alternatives, as this selection affects your
choice of SRM product and process. The outcome of your storage analysis and planning is both
the storage-management decision points—such as whether to add more storage to the
management problem or to make better use of our existing storage capacity—and the
development of an organizational storage policy.
We exhausted the capabilities of the core Windows Server functionality and SRM features,
including quota management, before turning to a more comprehensive solution. I detailed the
product functionality that will assist you in reaching your SRM goals through setting disk quotas:
eliminating duplicate files, eliminating unused files (aged and orphans), eliminating wasted
space, and reducing excess consumption.
In this chapter, we will look at using storage management tools to make better use of either your
existing storage or your newly deployed storage. We will look at the process of installing,
documenting, and testing the SRM solution, preparing a staging of your real-world deployment.
In this chapter, I’ll give you some sample tests and tools to use in your lab evaluation to help test
and select an SRM product.
This chapter will use a specific SRM product as an example, but I encourage you to download and evaluate any SRM products that interest you.
There are several aspects of an SRM product to test, from measuring performance impact to
assessing system stability and watching out for any negative interactions with third-party add-on
applications (antivirus, backup, and other disk utilities) and Windows Server storage features.
We’ll cover testing the SRM solution, getting feedback, and assessing the effectiveness of the
implemented solution. Testing more complex storage systems and scenarios, such as NAS, DFS,
and clustered servers, must also be considered. I will give you the heads up about some problems
that you should anticipate, and assist you in creating a communication and education plan to
smooth the impact of your project. In addition, I will give you several deployment templates that
can be quite valuable in guiding your pilot phase of the SRM deployment.
After lab testing, you can subdivide the pilot phase rollout into two mini-phases, the first
involving technical pilot users, and the second involving production business users. The goal of
the pilot is to maximize knowledge about the SRM solution interaction and issues and to
minimize the occurrence of negative outcomes and problems during the production rollout. In
addition to stability testing, we will gain feedback in other areas, such as measuring the
effectiveness of the implemented solution. This feedback will come from the system testers and
the end users involved in the pilot phase. When assessing the effectiveness of the implemented
solution, there are several angles to the measurement criteria, such as asking “Does it work?”
from a technical, business, and personal—meaning “How do the users like it?”—perspective.
In addition to giving you information about problems that you should anticipate, I will assist you
in creating a comprehensive plan to soften the impact of your project. This phase in the project
will help to test the communication channels, especially if problems escalate with the SRM
solution. At this point, you will continue to refine production team roles, support boundaries, and
contact points (beyond those laid out specifically for the project in the last chapter, as the roles
might need to be transitioned from external consultants to internal administrators or operators).
Several communication documents can be prepared for the end users, ranging from pre-emptive,
informative brochures to an intranet Web site that provides answers to frequently asked
questions (FAQs). Other documents created during this phase may include a share creation
request form. I’ll give you templates to help get you started creating such documents. In addition,
we will explore the share creation process, including migrating files from other systems. Finally,
we will prepare for the production deployment and look at where we will revise the solution as
needed. You will find that this chapter has quite a few tables in it, designed for you to use as
templates and give you a head start in your SRM deployment project. There is extensive
information, but it is not complex, so you should not get bogged down in the details. Table 5.1
shows the pilot phase of the SRM process.
Phase | Process
SRM Pilot (test the solution and revise it based on feedback) | Install and test the SRM product(s). Communicate policies, educate end users, and assess effectiveness of the implemented solution.
Table 5.1: Phase 4 of the SRM process.
Policy-Based Management
Newer SRM products are designed around a model called policy-based management. The idea is to create a set of storage-management policies and apply those policies to storage objects, hopefully in a centralized, cascading manner, as is made possible by AD. I have used a policy-based SRM product for years (obviously not AD-integrated), but it lacked a dynamic approach to storage management—it mostly generated on-demand reports to let me know who or what was operating outside of policy. Thus, I was excited to discover that there are third-party products available that enable you to actively control disk usage in real time. Although earlier versions of many of these products, such as StorageCentral SRM, have let you use storage policies that are stored in predefined templates, the newest versions provide greater integration of storage allocations and file blocking as well as centralized management when integrated into AD.
Installing and Testing the SRM Solution
As part of this phase of the project, you will be installing the SRM solution. Be sure to take the
time to capture information about the installation process, as this information is much easier to
gather while you’re performing the installation rather than trying to recreate the same scenario
later.
Documenting Installation Procedures
To monitor and administer storage policies on servers that have an SRM suite installed, you will
want to install the management console on your computer (as opposed to the server component,
which, in the case of StorageCentral SRM, installs the filter driver used for policy enforcement).
Depending on the type of policy-based storage manager you use, installing and configuring your
polices on a per server and per share basis can be either quite easy (using a centralized and
template-driven product) or repetitive and quite unpleasant (exporting and importing or manually
configuring Windows Server quotas as detailed later).
I designed Table 5.2 for you to use as a template for setting your organizational storage policies.
This table lists some of the information that should be decided upon before handing over the
SRM implementation to the systems administrators or server operators. This information should
be captured and distributed to any site or location within your organization that wants to
implement policy-based storage management.
Installation Requirements | Sample Setting | Your Setting
------------------------- | -------------- | ------------
Product | StorageCentral SRM 5.2 |
Product source share | \\FileServ\Product\SRM |
Setup type | Server or workstation (monitoring console) |
Is the AD schema being extended? | If yes, contact the Schema Admins group because setup must be run on a domain controller |
Are you performing a new installation or an upgrade? | If you're performing an upgrade, uninstall the earlier version and reboot |
Product dependencies | Windows Server 200x Service Pack x |
Product serial number or license key | xxxx-xxxxx-xxxxx |
Default installation path | %systemroot%\Program Files\<StorageCentral> |
Installation destination folder | %programfiles%\VERITAS\StorageCentral SRM\5.2\ | Default
Selected components | Agent: Command Line, Collector, and the Web UI; Server Appliance: MMC UI | UI: MMC only or Workstation (Monitoring Console)
Service account configuration | See the Service Accounts section later in this chapter | Do not record the service account and password here; check with your supervisor for the secure location of the password
SMTP email configuration | Email address for receipt of administrative notifications; name or IP address of the SMTP gateways (note that using the fully qualified domain name—FQDN—of the mail host requires DNS name resolution, so the IP address is sometimes used instead) | Ping the host and record the IP address
Reboot server after install? | Use the shutdown.exe resource kit utility (for Windows Server) or the one included in Windows Server 2003 (WS2K3) to schedule a shutdown and restart after normal business hours (option /r will close all running applications and reboot) |

Table 5.2: SRM product installation requirements.
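As a sketch of the reboot row above, the WS2K3 built-in shutdown.exe can restart the server with a short warning, and the at scheduler can defer it until after hours; the time shown is a placeholder:

rem Restart with a 60-second warning, closing running applications
shutdown /r /t 60 /c "Rebooting after SRM product installation"
rem Or schedule the same restart for 11:00 P.M. tonight
at 23:00 cmd /c "shutdown /r /t 60"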
Unattended Installation
Depending on how your organization prefers to deploy software, you might have a requirement
that the SRM product support unattended installation and configuration. Check with the vendor
to determine whether the product comes with a Windows Installer package and supports
deployment through Microsoft Systems Management Server (SMS). The pilot phase offers the
opportunity to learn and test the automated deployment.
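If the vendor does supply a Windows Installer package, a quiet, logged installation can be scripted in one line; the package name and log location below are hypothetical:

rem Silent install from the product source share, with verbose logging
msiexec /i \\FileServ\Product\SRM\StorageCentralSRM.msi /qn /l*v C:\Temp\srm_setup.log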
Service Accounts
Most likely, the SRM product that you install will use service accounts. Many applications
provide the option to install under local system authentication under service startup logon
options, but this functionality can limit network access for the service between servers. Instead,
you will need to define necessary service accounts and grant them appropriate privileges.
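As a sketch, assuming a hypothetical service named SRMService and an account named svcSRM, you could create the account and point the service at it from the command line. Note that sc requires a space after each equal sign, and the account still needs the Log on as a service right (the Services snap-in grants that right automatically; sc does not):

rem Create the domain service account (you will be prompted for the password)
net user svcSRM * /add /domain
rem Configure the installed service to log on with the new account
sc config SRMService obj= MYDOMAIN\svcSRM password= YourPasswordHere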
To assist in delegating the configuration portion of the installation, the following table defines
the objects to be managed and the storage policy configuration settings. Rather than starting from
scratch, hopefully, your policy-based storage management product has predefined templates or
policy definitions that you can reuse and copy from one system to another. You use the template
that Table 5.3 shows during the installation. It will require customization depending on the server
resources and how you want to allocate them.
SRM Policy Configuration | Sample Setting | Your Setting
------------------------ | -------------- | ------------
Group and Machine | Windows File Server or Application Server |
Managed Object | Server\Share |
Predefined Allocation Policy | 250MB passive limit with blocking |
Custom Modifications: | |
  Disk space limit | Default = 250MB |
  Overdraft limit | Default = 0MB |
  Passive limit | Default = Off |
  Send disk full error code | Default = On |
  Always save open files | Default = On |
  Exclude from folder limit | Default = Off |
  Reset high water mark | Default = Off |
Thresholds: | |
  First | Trigger above (Default) = 100 percent |
  Second | |
  Third | |
  Fourth | |
  Fifth | |
Alarm Actions—Notifications: | |
  Notify User | Default = On |
  Send to Event Log | Default = On |
  Notify Administrator | Default = Off |
  SNMP Trap | Default = Off |
  Record Alarm | Default = Off |
  Event Log Server | Default = None |
  Mail to | Default = None | Exchange Server mailbox or SMTP mail
  Execute Program | Default = None |
  Run Report | Default = None |
  Extend (Adjust disk space limit) | Default = Off |
Include file blocking policy? | Yes |
File Types to Block | Backup Files, Executable Files, Media Files |
Alarm—Blocking Options: | |
  Passive | Default = Off |
  Check file content | Default = Off |
Optional File Filters (Exceptions) | Include = ; Exclude = |
Propagate changes to managed objects | Default = On |
Transparent Policy | Default = Off |
Group Associations | Default = None |

Table 5.3: SRM policy configuration template.
Pre-Deployment Testing of the SRM Solution
The key to evaluating SRM products is properly setting up the test environment. Ideally, you want to
evaluate the SRM product’s performance and features relative to your needs. But you will also
want to measure the impact of SRM on performance and the end-users’ perceptions of how the
SRM product either hinders or helps their daily tasks.
Testing System Stability
To ensure that you are thorough during testing, a helpful practice is to break this phase of the
project into sub-phases and assign roles. After all, the production deployment is going to be
thorough in its testing! And the problems that are uncovered during deployment will need to be
solved at a much more hectic pace than during the pilot phase. Table 5.4 assists in identifying the
project sub-phases and assigning roles.
Project Task | Action | Contact Person | Status of Testing
------------ | ------ | -------------- | -----------------
Step 1: Create the test environment | Configure the storage and build the Windows Server OS, including the production-approved service packs and hotfixes (for example, critical updates) and OEM device drivers | |
 | Create a directory tree on an NTFS partition | |
 | Create users and groups | |
 | Create files under logon context (ownership) | |
 | Create user home shares and a public share | |
 | Optional: include DFS in the test if you plan to use it | |
 | Populate shares with file types such as Office documents, synchronized files (offline files), and encrypted files | |
 | Create public shares | |
 | Add files not owned by the Administrators group | |
 | Create duplicate files in each location | |
Step 2: Audit the environment | Create and run desired reports | |
 | Evaluate storage capacity | |
 | Find restricted files and duplicate files | |
Step 3: Configure the environment | Set up SRM policies | |
 | Set up storage management policies | |
 | Quality test the procedural documentation | |
Step 4: Evaluate dynamic environment | Add new folders (demonstrate Directory Learn Mode) | |
 | Add new users (demonstrate User Learn Mode) | |
Step 5: Evaluate impact on end users | Run reports | |
 | Exceed quotas | |
 | Create blocked files | |
 | Mimic common user actions (scripted): copy files to several locations, delete files, open files, modify and save files, move files, rename files | |
Step 6: Evaluate exception handling | Add third-party utilities, including backup software, antivirus software, disk maintenance, and management software | |
 | Consider unusual actions and exceptions | |
 | Simulate workstation lockup, leaving open file handles | |
Step 7: Performance measurement | Load the file server with NetBench, Iometer, and others | |

Table 5.4: Status report template for system stability tests.
SRM Application Monitoring
During this system testing you will also be working to integrate the SRM product within your
application management framework. This task might be as simple as setting the service to notify
you of failure or involve using an application monitor from Microsoft (such as MOM), Hewlett-Packard (such as OpenView), IBM (such as Tivoli), or NetIQ to trigger an alert if the service
does not respond.
NT 4.0 to Windows Server Upgrade and Application Compatibility
Upgrading from NT 4.0 to Windows Server might be part of your SRM project—and it is a
worthy upgrade. As mentioned in Chapter 1, not all applications will be compatible with
Windows Server. Most of the time, applications that are written for Win9x or the Win16
environment have a compatibility problem; however, even applications that work on NT are not
guaranteed full compatibility. I have seen changes in the Windows Server filter driver model
cause problems for storage subsystem applications, so a wise practice is to check with the
application vendor before upgrading the OS. As part of your storage migration plan, you will
have a list of applications that are verified or in the process of being verified, as Table 5.5 illustrates.
Application | Windows Server Verification Status | Business-Critical Ranking | Contact Person or Group
----------- | ---------------------------------- | ------------------------- | -----------------------
SRM Product: | | |
Antivirus Product: | | |
Backup Product: | | |
Other disk utility: | | |

Table 5.5: NT 4.0 to Windows Server migration and application compatibility verification status.
As part of your SRM deployment, you will need to test the interaction with the following
Windows Server features to guarantee compatibility.
DFS
SRM policies are typically designed to operate and provide functionality at the server and disk-partition level, but the Windows Server DFS architecture is designed to provide a layer of
abstraction into the enterprise-directory level. We have looked at some of the issues around
deploying SRM into the DFS architecture, and at this point in your project, you will want to test
your SRM product with DFS if you are using them in combination. To do so, you can use the
performance tools that I cover later. In addition to testing under normal work conditions, stress
test DFS replication, pushing it until it breaks.
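For the DFS portion of the test environment, the dfscmd utility can script the link creation that the DFS snap-in performs interactively; a sketch, with hypothetical root and share names, assuming dfscmd is available on your server:

rem Map a DFS link in an existing root to a share on a test server
dfscmd /map \\MYDOMAIN\Public\Home \\TestServer\Home "Pilot home directories"
rem Verify the resulting DFS tree
dfscmd /view \\MYDOMAIN\Public /full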
MSCS Clusters
There are several methods or products for the clustering of file servers; the most common
product being Microsoft Cluster Server (MSCS), a component of WS2K3 Enterprise Edition. In
the past, clustering has been an exclusive technology, reserved for high-end systems or attempted
with mixed results in NT systems—often yielding poor or disappointing high-availability
statistics. But the tide is turning, and Microsoft and many of its enterprise customers are now
successfully deploying production clusters as the new standard.
Clustering requires external storage that can be shared either on the SCSI bus or through Fibre
Channel interconnects. These systems are becoming more affordable as we see the adoption of
SAN. Thus, clustering in WS2K3 is reaching a wider audience. Clustering can assist
organizations in the efforts of server consolidation, in which a large number of file servers are
combined into one large system, effectively increasing its importance to a business-critical level.
Clustering can reduce the downtime of routine system maintenance as each node can have
hotfixes and service packs applied while the other node is actively servicing users. Setting up
clustering in Windows Server is wizard driven, including configuring a resource, and is much
easier than in NT 4.0. Figure 5.1 shows the interface for setting up a clustered file server share in
Windows Server MSCS.
Figure 5.1: Setting up a clustered file server share in Windows Server MSCS.
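If you prefer the command line to the wizard that Figure 5.1 shows, the cluster.exe tool can create the same resource. The following is a rough sketch with hypothetical group, resource, path, and share names; check the private property names against your own cluster documentation:

rem Create a File Share resource in an existing cluster group
cluster resource "Users Share" /create /group:"File Group" /type:"File Share"
rem Point the share at a folder on the shared disk and give it a share name
cluster resource "Users Share" /priv path="S:\Users" sharename="Users"
rem Bring the new resource online
cluster resource "Users Share" /online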
Check with your SRM vendor before attempting to install the SRM product into an MSCS
cluster, as the SRM product should be cluster-aware and not cause any interference during
cluster node failover as the resources are brought online. Windows clustering has been around
for quite some time now, from NT to WS2K3, so you should expect all of the major SRM
products to be cluster compatible.
Additional Client-Side Impact
As part of the test plan, don’t forget to consider additional aspects of client-side impact that
haven’t been specifically covered by this guide. For example, perhaps you have a wide variety of
OSs that require continued support and you must guarantee that the SRM product enforces
quotas properly and sends appropriate messages. You might also need to assess the impact of
roaming users or differences in how you allow network access.
Offline Files
Offline files were introduced with Win2K Professional and provide performance benefits and
convenience features (even when used against an NT 4.0 file server). The reason for the
performance benefit is that when a Windows client opens a file, say a Word document, if the file
is cached locally and the local copy is up-to-date, the Windows client opens the file instead of
pulling the file over the network. If the network link to the file server is lost, the client can
continue to work on files available in the client-side cache, and the user will receive a
notification when the file server becomes available.
If the file server copy is somehow destroyed, offline files can provide a measure of protection. If
you have not implemented offline files, you are in for a pleasant surprise, as they make storing
files on a file server much more reliable from the client perspective (at some point, documents
and other files can get too large to justify saving over the network, but you really want the
protection of being on the file server, so it becomes a difficult choice). Using offline files does
not necessarily require WS2K3 on the server. As long as the client is running Win2K Professional
or Windows XP Professional, it can use offline files with NT as the file server.
For this phase of the project, we need to determine whether storage policies interfere with offline
files. What happens when a user attempts to synchronize a large file he or she has created but
that places the user over the quota (including overdraft)? You test to determine whether your
SRM product causes any error condition on the client side that would prevent the client from
successfully saving the file.
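One way to stage this test is to generate a file just larger than the user's remaining quota (including overdraft) inside a folder that is synchronized offline, and then force a synchronization. A sketch using fsutil; the path and the 300MB size are placeholders:

rem Create a 300MB zero-filled test file in the offline-enabled folder
fsutil file createnew "%USERPROFILE%\My Documents\quota_test.dat" 314572800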
Interaction with Third-Party Utilities
If you are not in the business of testing software, the best strategy is to ensure that your third-party applications are up to the latest revision level that you can afford and support. However,
introducing new code presents an element of risk, so I am not suggesting that you install a beta
just to get the latest code. Instead, I recommend installing the latest release that has application
hotfixes or service packs available.
Antivirus
Many of us can share horror stories of antivirus software causing severe problems on servers,
including data loss and OSs that no longer boot. It is no secret that there were known
compatibility issues with previous versions of some antivirus software filter drivers and that
these issues have even affected SRM applications. Verify that you have the latest antivirus
software and include catching innocuous viruses as part of your test plan.
There are several ways to test the action of antivirus software when it identifies and attempts to
either repair or quarantine a suspected file. The safest way is to use the EICAR test string, which
should be recognized by all antivirus scanners. As part of my antivirus testing, I typically use a
real virus included in a .ZIP file. But I caution you to be careful with a real virus, and if it is an
executable, rename it to a file extension that will not automatically execute. For safety’s sake,
use the EICAR test virus, as it does not carry any payload, and will do no damage.
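For reference, the EICAR test file is nothing more than the following 68-character line saved with a .COM extension. Paste it into Notepad and save it as eicar.com; echoing it from a batch file is awkward because cmd.exe treats several of its characters specially:

X5O!P%@AP[4\PZX54(P^)7CC)7}$EICAR-STANDARD-ANTIVIRUS-TEST-FILE!$H+H*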
When you configure your file server antivirus scanner, you will most likely exclude certain files that
contain system data, change frequently, and are not a likely target for a virus, in order to avoid false
positives (in which the antivirus scanner detects a code string matching a known virus signature). If
you are using StorageCentral SRM, you will find a file called QaQuotasV4.dat (for version 4.1 and
version 5.2 standalone) or QAPolicyV5.mdb (for the AD-integrated version) in the root of the system
drive and any drive that has storage policies set on it. The attributes are hidden, so you will need to
enable hidden files in Windows Explorer to see it. You might want to exclude it from your antivirus
scanning because a false positive might adversely affect the system if the antivirus scanner
attempted to quarantine this file.
Another suite of tests that I perform covers several abnormal situations that can occur either
naturally or as a Denial of Service (DoS) attack, as Table 5.6 lists.
Antivirus Test | Purpose | Sample File
-------------- | ------- | -----------
Known virus | Assess impact of quarantining a file while the SRM quota manager is in effect | Known virus (in a .ZIP file)
Zero-byte file | Previous issue with zero-byte .COM files that could negatively impact antivirus scanners | Zerobyte.com
Very large file | Previous issue with extremely large files that could negatively impact antivirus scanners | Create a document or spreadsheet with one letter repeated for hundreds of pages (save to a compressed NTFS partition) and place it within a .ZIP file
Other | Any other tests that present situations that might interfere with antivirus scanners |

Table 5.6: Sample antivirus tests.
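The zero-byte and very large sample files in Table 5.6 can be staged quickly with fsutil; the paths and the 2GB size are placeholders:

rem Zero-byte .COM file for the scanner edge-case test
fsutil file createnew D:\AVTest\Zerobyte.com 0
rem Crude 2GB large-file test (the compressed-document approach in
rem Table 5.6 exercises the scanner more realistically)
fsutil file createnew D:\AVTest\verylarge.dat 2147483648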
Backup Applications
Validating the interaction of SRM products with your backup applications is as simple as
performing routine backups and testing the restoration of data. You will determine whether you
need to disable or turn off your SRM product during large restorations. The matter is made a bit
more complex if you are using some form of snapshot services or volume replication to perform
backups and restores.
Other Disk Utilities
Finally, you will want to test interaction with any other third-party disk utilities such as volume
managers and defragmenters. The reason is not so much that I anticipate that you will encounter
problems, but if you do, better to find them in this phase, the pilot testing, than the deployment
phase.
If you do quite a bit of administration on NT or WS2K3 servers and find that you are constantly
bringing up a command prompt that is in the wrong location, so that you have to change drives and
directories before you can enter any commands, there is an easy way to launch the command prompt
directly on the folder that you are viewing in Explorer. Simply paste the following text into a Notepad
document and save it as a .REG file. Double-click the file to enter it into your registry. Note that the
cmd.exe path in the last line might need to be changed—my OS is installed at C:\WINNT (you can
verify the location of your OS by typing echo %systemroot% at a command prompt):

REGEDIT4

; Adds a "Command Prompt Here" entry to the right-click menu for folders in Explorer
[HKEY_CLASSES_ROOT\Folder\shell\DosHere]
@="Command Prompt &Here"

; Opens cmd.exe and changes to the selected folder (passed in as %1)
[HKEY_CLASSES_ROOT\Folder\shell\DosHere\command]
@="C:\\WINNT\\System32\\cmd.exe /k cd \"%1\""
File Blocking and Quarantining
In the past, some SRM products allowed file blocking or screening, which would prevent a user
from saving a specific type of file to a location. File blocking was typically based on file
extension, instead of looking into the file header to determine the actual file type. But new
versions of SRM software will actually determine the file type, and can be configured to
quarantine the file in a location configured by the administrator, just as antivirus software usually
does. If you plan to use this feature, you will want to add it to your test suite to make sure that
there are no interference issues with the previously mentioned third-party utilities.
Essential features in a file-blocking product are, first, the ability to recognize and block files in
real time, or as close to real time as possible. The reason for this need is that it would be quite
easy to just run a nightly script (for example, to delete *.MP3 files), but this approach does little to
notify the end users immediately that they are violating an organizational policy. Notification
actions are an important part of the product, with the ability to control who is notified and how.
For example, StorageCentral SRM lets you notify both the user and the administrator, log an
event to an audit file, and run a report when necessary. Also, a file-screening or blocking product
can assist your efforts by providing a predefined list of file types to filter.
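For contrast, here is roughly what the nightly-script approach amounts to: a batch file that silently deletes offending files after the fact, with no notification, no audit trail, and a full day's exposure between runs (the share path is a placeholder):

rem Delete every .MP3 file under the share; the user learns nothing until the file is gone
for /r "D:\Shares" %%f in (*.mp3) do del "%%f"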
We will look at the practical usage of file screening in upcoming chapters.
Performance Impact
You might not need to perform performance testing or benchmarking of your file systems. As I
will show you later in this chapter, the impact of a well-developed SRM product should be
negligible. You can take it on faith or perform a before and after snapshot, as I have done. Either
way, a good idea is to baseline the systems so that in the future you have some frame of
reference to say whether the servers are performing well or poorly. Personally, I was interested
in measuring the performance drag of quota-management software.
SRM Filter Drivers
One big difference between many third-party SRM products and Windows Server built-in disk
quotas is that the third-party products can enforce the quota as the user attempts to exceed a
quota limit, instead of after the file has been written (as Windows Server disk quotas must do).
The benefit to you as a systems administrator is that a user is prevented from far exceeding his or
her quota and affecting others on the system (or crashing the server because the disk fills up).
This difference is made possible by the installation of a proprietary filter driver, QAFilter.sys,
which Figure 5.2 shows. To see the filter driver, you must select the Device Manager node so
that it is highlighted, then right-click to bring up the menu, and select the option Show hidden
devices.
Figure 5.2: Showing hidden devices to see the QAFilter driver.
A couple of cautions about benchmarking: First, do everything you can to simulate the configuration
of the production environment, including using all the same hardware, drivers, software applications,
and utilities. Second, consider the time cycle involved in a workload, as a production file server has
time to perform house cleaning of its memory and disk cache during the normal peaks or bursts of
client activity. Some load-testing tools are designed to stress a server and might not give the server
the rest periods during which Windows Server would dynamically adjust itself for performance.
Table 5.7 lists benchmarking tools that I have used for testing server performance.
Benchmarking Tool | Description | Source
----------------- | ----------- | ------
NetBench | Measures how well a file server handles file I/O requests from 32-bit Windows clients. Reports throughput and client response time measurements. Includes standard test suite files (DM_NB702.TST and ENT_DM_NB702.TST for larger server configurations). | http://www.etestinglabs.com/benchmarks/netbench/netbench.asp
NTiogen | An NT port of a UNIX benchmarking tool that shows average response time, I/O operations per second, and KB per second. | http://www.acnc.com/benchmarks/ntiogen.zip or http://www.seeksystems.com/old/public/perftest.exe
Nbench | Benchmarking program originally for NT that reports disk read and write speeds. | http://www.acnc.com/benchmarks/nbench.zip
Iometer | A workload generator and performance analysis tool for disk I/O subsystem measurement and characterization; originally from Intel, now open source. | http://www.iometer.org
HDTach | Shareware/commercial disk I/O benchmark that uses a special kernel-mode driver to bypass the file system. Logs the read speeds to a text file so that you can load them into a spreadsheet and graph them. NT support requires the registered version. | http://www.simplisoftware.com/Public/index.php?request=HdTach

Table 5.7: Benchmarking tools, usage, and sources.
Performance Monitor Objects
Table 5.8 summarizes the Performance Monitor counters and some values used to evaluate the
testing:
Object | Counter | Explanation
------ | ------- | -----------
CPU: | |
Processor | % Processor Time | Primary indicator of processor activity. Calculated as 100 percent minus the Idle thread's time. Should only peak at 70 to 80 percent for short durations.
System | Context Switches/sec | Combined rate at which all processors are switched from one thread to another. Establish a baseline for a comparison of what is normal and what is out of bounds.
Server Work Queues | Queue Length | Server work queue for the CPU. A sustained queue length greater than 4 indicates processor congestion.
Memory: | |
Memory | Cache Bytes | Sum of several caches used by the system. Establish a baseline for comparison.
Memory | Cache Bytes Peak | Maximum number of bytes used by the cache since the last system restart.
Memory | Pages/sec | Number of pages read from or written to disk to resolve hard page faults. Establish a baseline for comparison. If it climbs over time, add more memory.
Memory | Pool Non-paged Bytes | Number of bytes in system memory for objects that cannot be paged to disk.
Memory | Pool Paged Bytes | Number of bytes in system memory for objects that can be paged to disk.
Cache | Copy Read Hits % | Percentage of read requests that use the cache instead of a disk read.
Server | Pool Non-paged Failures | Indicates insufficient memory when allocations from nonpaged pool fail.
Server | Pool Paged Failures | Indicates insufficient memory when allocations from paged pool fail.
Network: | |
Network Interface | Bytes Total/sec | Rate at which data is sent and received on the interface (including framing characters). Should closely match the disk bytes/second.
Network Interface | Packets Outbound Errors | Outbound packets that could not be transmitted. Establish a baseline so that you will know what to expect and can recognize an abnormally high amount.
Network Interface | Packets Received Errors | Inbound packets that could not be received. Establish a baseline so that you will know what to expect and can recognize an abnormally high amount.
Server and Disk Throughput: | |
Server | Bytes Total/sec | Bytes the server has sent to and received from the network.
Server | Bytes Received/sec | Part of the above total; bytes the server has received from the network.
Server | Bytes Transmitted/sec | Part of the above total; bytes the server has sent to the network. The server must be able to transmit more data than it receives or it will become flooded.
Physical Disk | % Disk Time | Percentage of time the disk drive or array is servicing read and write requests; should be kept less than 80 percent.
Physical Disk | Average Disk Queue Length | Average number of both read and write requests outstanding. Should not build or exceed the number of spindles in a hardware RAID set.
Logical Disk | % Disk Read Time and % Disk Write Time | Measured to calculate the Read/Write Distribution Mix (%). This value will vary depending on the application or user profile.
Logical Disk | Average Disk Bytes/Read and Average Disk Bytes/Write | Measured to calculate the Transfer Request Size (usually converted to KB). This value too will vary.

Table 5.8: Suggested Performance Monitor counters.
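Beyond the Performance Monitor UI, the WS2K3 typeperf utility can log these counters to a file for baselining. A sketch that samples three of the counters from Table 5.8 every 15 seconds for an hour; the interval, count, and output file are placeholders:

rem Log three counters to a CSV file: 240 samples at 15-second intervals
typeperf "\Processor(_Total)\% Processor Time" "\Memory\Pages/sec" "\PhysicalDisk(_Total)\% Disk Time" -si 15 -sc 240 -o baseline.csv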
Performance Testing Results
I was interested in measuring the performance impact of quota-management software on an
‘aging’ file server. The server was due for an upgrade, so I selected it for this testing.
Even though some of these tests can be directed against a specific drive or network share on a
storage subsystem, be cautious about running them against a production server or network—they
can congest the network and bring the server to a crawl. If you want to test a new disk subsystem on
an existing server, only do so after business users are off the system, after hours, or on the weekend.
Here is the configuration for the tests, with the RAID array illustrated in Figure 5.3:
• Compaq ProLiant 1850 dual processor, 550MHz Pentium III, 512KB cache
• 1GB RAM
• Dual Compaq Nettelligent 100Base-T adapters, each to a different network segment
• Windows 2000 Server SP2 (with security hotfixes)
• SmartArray 4200 RAID Controller, 56MB cache, firmware 1.30
• Two arrays and three logical disks, all formatted with NTFS:
  • 9GB RAID1 internal for OS and page file
  • External storage array with fourteen 9GB drives configured for RAID5, two logical disks
• As many as 16 Compaq Deskpro clients (load generators) using 100Base-T network cards, on the same switched network segment
Figure 5.3: File server test physical drive layout.
NetBench
NetBench tests how well a file server handles file I/O requests from 32-bit Windows clients, and
reports throughput and client response time measurements. One of the useful aspects of
NetBench is that it includes standard test suite files. In this case, I used the enterprise test suite
ent_dm_nb702.tst designed for larger server configurations. You will need to run as many test
clients as it takes until you see a knee in the NetBench results curve, as Figure 5.4 shows.
Figure 5.4: NetBench file server test before I installed the SRM product.
The following is not meant to replace the extensive documentation for NetBench, but is just a
quick-start guide. The process of running NetBench is as follows:
1. Download and unzip the client files (nbver#CL.exe) and the controller files
(nbver#CO.exe) to a shared network location.
2. Run each setup program to place the client files on as many clients as you will need, and
the controller files on the server.
3. Create a network share on the server and storage system that you want to test.
Defragment the drive before testing, if possible.
4. Run the batch file that I suggest creating in the following tip. This file will map the
necessary F drive (if F is not available on the client, do not use that client), copy the
necessary hosts file, and log the client IP address to a text file on the server
(\\fileserver\fileshare\client_ip.txt).
5. Run the NetBench controller and select Session, Options from the menu. You will see a
client ID database pointing to a specific client.cdb file. You can click Edit and paste the
IP addresses for the clients listed in client_ip.txt. I prefer to create an Excel spreadsheet
with IP addresses for all clients on my subnet in one column and ID in the next column
(even though I might not use all client IDs).
6. At the NetBench controller on the file server, start client logon (Ctrl+L).
7. Launch NetBench on each client by running the Client.exe.
8. After as many clients show up as possible (some might not log on), press OK, and select
the test suite (e.g. ent_dm_nb702.tst). You will be prompted to add more test suites if
desired, and to select the name for the results file.
9. Start executing the test suites. Capture as much information as desired using
Performance Monitor.
There are some requirements for properly setting up NetBench. The following tips make the
process much easier.
Type or paste the following into a batch file (for example, CL_NBSet.bat) to speed the process of
configuring NetBench clients. Don't forget to replace fileserver and fileshare with your own server
and share names:

rem Copy the hosts file so that the client can resolve the controller's name
copy hosts %systemroot%\SYSTEM32\DRIVERS\ETC
rem Map the F drive that the NetBench test suite expects
net use f: \\fileserver\fileshare
rem Record this client's IP configuration on the server for the client ID database
ipconfig >> \\fileserver\fileshare\client_ip.txt
Create a file called hosts and place it in the same folder as the batch file. The batch file will copy the
hosts file to the correct location. At the end of the hosts file, add an entry that maps the controller's
name to its address, in the standard hosts format:

<IP address of the file server>    <name of the NetBench controller>
The NetBench product license requires that I follow certain disclosures, which I do below. This
test is an informal test, meant to illustrate the type of test tools that you can use to generate your
own results, and is not meant to represent how a third-party SRM product will perform in your
environment.
I tested using eTesting Labs’ NetBench 7.0.2 with the standard test suite ent_dm_nb702.tst without
independent verification by eTesting Labs. eTesting Labs makes no representations or warranties as
to the results of the test. NetBench is a registered trademark of Ziff Davis Publishing Holdings, an
affiliate of eTesting Labs in the U.S. and other countries. The server and client configurations were
listed earlier.
Figure 5.4 shows the NetBench test results before the SRM product was installed, establishing a
baseline for normal file server operations. Establishing a baseline for a new system is a good idea
in general, as it allows you to repeat the tests later to determine whether changes to the system
have been for the better or worse. My server peaked out and leveled off around 50. Figure 5.5
shows the NetBench file server testing after I installed the SRM product, running the same test
suite.
Figure 5.5: NetBench file server test after I installed the SRM product.
NetBench Test Conclusions
As you can see from the figures, there is a slight difference in the file server performance, as can
be expected when adding any additional layer of software that provides functionality. So the
question is really “Is this performance penalty reasonable for the functionality gained?” I think it
is, considering that I can turn my back on the server for extended periods of time and not have to
sort through file listings to weed out old and unused files, or clean up those funny videos, or
worry about the disks filling up! I bet you too would give up a tiny bit of I/O to be able to go on
vacation!
Iometer
Iometer is an I/O workload generator and performance analysis tool for disk and network stress-testing. In the following tests, I use it to determine the maximum I/O capacity for my disk
subsystem before and after I deploy the SRM product.
The first step in running Iometer against a storage subsystem involves preparing the drives, and this
step can take a long time. The reason is simple: Iometer creates a file, iobw.tst, at the root of the drive
(or the shared folder) and writes to it until the drive is full. You can save a lot of time by letting this file
grow to the desired size and then shutting down the test. When you restart the test, it will find the
iobw.tst file and launch without the preparation process. You can also keep a copy of this file and
copy it to a new server to avoid the preparation process.
The following figures show the results of running Iometer before and after installing the SRM
product. In the Edit Access Specifications window, which Figure 5.6 shows, you will input the
following information:
• Transfer Request Size (KB)
• Percent Read/Write Distribution
• Percent Random/Sequential Distribution
Figure 5.6: Disk access specifications in Iometer.
To determine the first two input values (transfer request size and read/write distribution mix),
capture the Performance Monitor values (listed in Table 5.8) for one of your production servers
(remember to enable disk counters using diskperf -y, if necessary, and reboot). The
random/sequential distribution mix most likely is 100 percent random, unless you have some
logging or transactional writes, as I have allocated at 20 percent in the test setup. Figure 5.7
shows the Iometer test results with no quota software installed.
Figure 5.7: Iometer test results with no quota software.
Figure 5.8 shows the Iometer test results with quota software installed.
Figure 5.8: Iometer test results with StorageCentral SRM installed.
Performance Testing Conclusions
Based on the results from these performance tests, you can see that installing a third-party SRM
product created a very slight drop in the I/O throughput under peak conditions. Note the
emphasis on peak conditions, as this means that the impact on I/O throughput under normal
conditions will be much less noticeable.
Inter-System Interactions
Some words of wisdom that I carry with me when performing migration projects: you never
fully know how users are using the old system until you bring in a new system and it
breaks their old way of doing things. This belief is the guiding philosophy of this
section: leave an empty page in your project plan and assign some resources for fixing any
problems that might not be fully anticipated. A likely scenario is that these unexpected problems
will be the result of some complex inter-system interactions that might not be fully revealed in
pilot testing.
Testing Quotas
As I discussed in earlier chapters, the full-fledged SRM products have more than one quota type,
unlike the Windows Server quota functionality, which can only apply quotas toward users and
groups (as usual, identified by their SIDs). In your tests, be sure to include any types of quotas or
storage policies that will be applied toward managed objects in production, as Table 5.9
suggests.
Quota Type | Description | Test Objectives
---------- | ----------- | ---------------
User (SID) Quotas | Quota product that monitors and limits space by user, disk, directory, and file. This quota type is useful for limiting disk space on a per-user basis. | Does the SRM product prevent users from exceeding threshold limits? Is it possible for the users to thwart or bypass enforcement efforts?
Directory or Object Quotas | Quota product that monitors and limits space on directories and their subdirectories. Useful for limiting the size of a directory or share regardless of who places files in it. | Can the SRM product control application growth? How does it prevent access on a folder-by-folder basis?

Table 5.9: Testing different types of quotas.
Communication and Education Plan
As part of your pilot program, you will be developing and refining the communication plans
discussed in the previous chapter. This step is your chance to test how well the communication
channels work, and to communicate storage policies to the pilot users. You will also be
developing the user education program. The pilot users can help you identify how best to
communicate answers to the following questions. A FAQ format works well for an intranet
communication site. The trick is to know which questions the end users are likely to ask.
• What is storage resource management (SRM)?
• What is the organizational policy on storage?
• How much storage is available and where?
• What types of files are allowed?
• Why is this change being made (for example, more efficient use of business resources)?
• Who can end users contact?
• Where can end users get more information?
Practice the communication and end-user education during the pilot phase and get feedback not only
on the information communicated but also on how it was communicated—the communication channel
used (email, intranet pages, cafeteria bulletin board, hands-on training seminars, computer-based
training). Was it an effective medium and message? You might even need to deliver different
messages for different levels of technologically sophisticated users—some might know just enough
about technology to question why their quota is only 100MB on the server when they have a local
hard drive of several gigabytes ("Why not just store all of my files locally?"). As you can see, you can get
into the issues of high-cost SCSI disks, RAID controllers, and backup infrastructure, but you certainly
don't want to blast out detailed technical messages to every employee—that would be information
overkill.
Table 5.10 shows a sample communication template, designed for you to revise as necessary for
your environment. Many companies have used printed brochures delivered to end users who are
impacted by SRM.
Company Storage Resource Management (SRM) Announcement

FAQ | Answer
--- | ------
What is happening? | Your IT department is installing an upgrade to Windows Server and an improved SRM system.
What is SRM? | Storage Resource Management—effectively managing the storage space where we all store our files.
How will this change affect me? | We will be able to give you more feedback about the files you have stored, how old they are, what types, and how much space you have left to store new files.
Why is this change being made? | To gain more efficient use of business resources and save the company money in the time we spend managing our files.
Why don't we just add more storage? | Many companies have done so in the past and have found that it just compounds the problem, because more storage requires quite a bit of time to sort through the ever-larger amounts of storage, and only you can tell what is important to you. It is important that we instead reduce wasted storage consumption, as storage management can cost many times more than the physical costs of storage devices.
How much storage is available and where? | This amount might vary with location and department, as well as how far along your IT support group is in the SRM migration. We'll let you know with plenty of warning as you near your storage limits.
What about all of this disk space on my local PC? | You are welcome to use that disk space, but it is probably not being backed up as regularly as our business-critical file servers. Your desktop PC also has a single drive, so if it fails, all your files that are not backed up are lost. Our file servers use high-performance disks that are self-correcting for many errors, including some drive failures.
What types of files are allowed? | We allow any type of file that has a direct business benefit. If you store personal files that have no direct business purpose, they might be removed from the file servers. Remember that our IT group must protect all files on the file servers, and the non-business files slow the backup and recovery procedures.
What can I do to help keep our systems running at peak performance? | We know that keeping your information is important, but the more files we keep that are not frequently used, the more unnecessary overhead on our file servers. We will help you identify files that might no longer be needed and large files that are already stored in a public share.
Who can I talk to if I need more information or help? | Dial your local Help desk at: _____________
Where can I access the organizational policy on storage? | Access our organizational storage policy on our intranet at <insert URL here>

Table 5.10: Sample communication template.
Pilot Deployment
In larger organizations, the pilot phase involves two groups and is broken into two mini-phases:
first the technical group of users, and second the business pilot group. Each group has a
different set of characteristics; the technical group is, by job function, more devoted to
solving technical issues, while the business group can provide invaluable feedback about how the
SRM solution either helps or hinders their daily tasks. Once you have performed a lab
evaluation, it is time to test the selected SRM product in a staging of a real-world deployment.
Assess Effectiveness of Implemented Solution
How do you know when you are ready to exit the pilot phase (assuming that it is not just because
you have run out of time)? Before deploying the product, an important task is establishing
measurement criteria that answer the following questions:
• Does it work on a technical level?
• Is it stable and can administrators work with it?
• Does it work for the business?
• Does it disrupt or prevent people from working?
• Does it work on a personal level?
• Is it helpful and non-intrusive?
Preparing for Deployment
At this point, you should be ready to answer the following questions:
• Are you ready to deploy the SRM solution?
• Are you confident that the deployment will go smoothly with little interruption to business users?
Remember that software deployment is an ongoing process of learning and revisions, and you
will need to revise the solution as needed. I’ll provide more detail about this process in the next
chapter, as I touch on some of the issues that I have had to resolve.
Assigning SRM Support Boundaries and Roles
Out of the pilot testing, you will be developing an escalation plan and defining the production
team roles. The template that Table 5.11 shows is an example that can be used for assigning
SRM support boundaries and roles.
Role | Support Boundary | Assigned To
---- | ---------------- | -----------
Storage implementation | SRM architecture team |
File share storage request | Local Help desk |
Share creation and setting permissions | Local server and storage group |
Share management | SRM team |
Change permissions on existing shares | Local Help desk |
Help desk support | Local Help desk |
Server-side troubleshooting | Local server and storage group |
Backup and recovery | Local server and storage group |

Table 5.11: Template for SRM support boundaries and roles.
Summary
In this chapter, we looked at testing the SRM solution, getting feedback, and assessing the
effectiveness of the implemented solution. I continued to focus both on the technical aspects of
an SRM solution and the project-management tasks, giving you numerous sample templates to
work with.
I gave you a heads-up about some problems that you should anticipate, and assisted you in creating
a test plan for evaluating your solution. I showed you how to get started with your own
performance testing and covered the results of my own testing. Then we looked at the
communication channels and end-user education plan to convey the appropriate messages about
your SRM deployment.
In the next chapter, we will look at deploying the SRM solution, continuing the project
management discussion including issues such as change control. We will also focus on risk
mitigation, as you bring the tested pilot systems into production and gather critical feedback. We
will then extend the product focus to look at the other, broader types of storage-management
products in addition to SRM.
Chapter 6: Deploying the SRM Solution
In the previous chapter, we developed a test plan for your WS2K3 SRM solution, using
StorageCentral as an example product for defining your pilot phase. In this chapter, we will
extend that product focus to look at the broader category of storage management. As you will
see, SRM and storage management are two distinct categories and do not necessarily solve the
same problems. For example, making sure that your servers don’t run out of disk space is a
different task than using a storage-management product to monitor the performance of your
fibre-channel fabric. If you work in an environment in which you perform storage-management
functions such as creating RAID arrays and allocating them to certain applications, you know
how many tools you need to really manage a storage environment. To put it another way, storage
management products manage storage devices, and SRM is about managing how people
consume those storage resources.
We will cover the categories of storage management in this chapter so that you can see where the
different approaches fit. I will give representative example products in each category; however, this
chapter is not meant to be a cross-product comparison.
Previously, I gave you a heads-up about some problems that you should anticipate, and assisted
you in creating a test plan for evaluating your solution. I showed you how to get started
performance testing, and covered the results of my own testing. We also looked at the
communication channels and end-user education plan necessary to convey the appropriate
messages about your SRM deployment so that you can get feedback and assess the effectiveness
of the implemented solution.
In the last chapter, I continued to focus on the technical aspects of an example SRM solution as
well as the project-management tasks, giving you numerous sample templates to work with. In
this chapter, we will continue the focus on the project-management aspects, as you should be
over the hurdle regarding the technical challenges if you have been following this book so far.
I will give you some best practices in this chapter in the areas of hardware standardization and
server-naming standards. We will look at the pros and cons of server consolidation as well as
some best practices for SANs, as they can be an essential part of server consolidation. Finally,
we will define the criteria for success in this phase of the deployment.
We introduced the topic of change control during the project by looking at changes in the new
version of StorageCentral, and this chapter furthers the change-control discussion. Change
control is crucial to deploying an SRM solution, as we are impacting production servers and
might also be extending the AD schema. Both of these changes will factor into our risk analysis,
which we will detail as part of our project management. We will also focus on risk mitigation,
dealing with high-risk events (as opposed to risk avoidance, which can mean avoiding
opportunities that entail some risk, as most do). Table 6.1 shows Phase 5: Deploying the SRM
solution.
Phase | Process
----- | -------
SRM Deploy: Deliver the solution to the target audience | Bring the tested pilot systems into production and gather feedback.

Table 6.1: Phase 5 of the SRM process.
The Deployment Phase
At this point, let’s spend a moment reviewing the SRM deployment process without
describing a specific methodology. The reason for this limitation is that it may
help you to understand how the methodology is actually derived, and how you benefit from it. I
know that some readers are mainly interested in SRM and what it can do on a technical basis, but
to understand the technical aspects of SRM, you must follow a path, and that path is what the
methodology defines. The SRM deployment methodology is not an imaginary set of principles
and tasks; it is based on studies of what must be done to deploy SRM (or other infrastructure
projects). Let’s follow an example deployment process that was done in a more informal manner,
as the company is fairly small:
1. Individuals in the organization perceived a problem—some servers ran out of disk space
and crashed, preventing business users from saving files to the shared location.
2. To fix the problem for the short term, individuals in the organization deleted some files.
However, it was clear to all involved that the problems would reoccur if something was
not done (a long-term solution needed to be developed and implemented).
3. At a staff meeting, members of the IT department brought these problems to the
awareness of all teams, including managers. The managers decided to make the problem
a priority, allocating team members and resources (time costs money) to identifying and
solving the root cause of the problem.
4. The task force (project team) met to discuss the problem, making sure that they fully
understood it, instead of hastily putting together a solution. They also discussed
environmental issues—factors affecting the company’s storage resource growth and plans
for new equipment.
5. A team structure emerged and individuals were assigned responsibility of certain tasks,
such as communicating progress to management, evaluating different SRM solutions,
documenting the process, and testing and deploying the solutions.
6. Several SRM products were obtained. An SRM solution was put together in the test lab
and tested against business systems to ensure little or no interference.
7. A specific SRM product and solution was chosen. The SRM solution was integrated into
a more comprehensive SRM strategy that included monitoring and management.
8. The SRM solution was placed into limited production use and problems that arose were
identified and resolved.
9. The SRM solution was placed into further production use and end-user interaction was
further refined.
10. The focus shifted to future viability of the SRM solution including the support and
upgrade path.
Looking back over this process, how the SRM deployment methodology is defined should be
clear. The ideal solution would avoid the following scenarios:
• The SRM solution was placed into production use with little advance notice or thought given to communication or support issues.
• Users started complaining to the Help desk that their computers were not working right, perhaps due to a virus, and needed to be reinstalled. Several business-critical systems crashed during the worst possible time of day, and the SRM solution was blamed. Systems engineers and support personnel never got a chance to resolve the issue.
• The SRM solution was postponed as new storage was brought online. Users’ disk quotas were increased, and IT staff went back to running monthly reports in the hopes of identifying which server would be the next to run out of storage.
These scenarios leave you with a failed project for several reasons. First, there are steps in the
ideal methodology that were not followed. Second, there were clear risks introduced to the
project, and project managers were either unaware of these risks or unable to deal with or mitigate them.
SRM Goals and Components
As I thought back over our SRM deployment, I began to develop a visual model, a three-dimensional view of the deployment, which would clearly illustrate the importance of each phase
in the methodology. The three-dimensional view of the SRM deployment that Figure 6.1 shows
helps to illustrate some key concepts: First, the layers in the solution correspond to the SRM
goals that we defined earlier. Second, the number of layers, or height, of our SRM deployment
represents the level of complexity in our solution, which is driven by the analysis and planning
phases. That is, the perceived problem and our solution will be based upon these phases. Third,
the depth of each of these solutions is driven by the design and test phases of the project; the
more effort and thought that we put into each of these, the more robust our solution will be.
Finally, the available resources determine the width or coverage of our SRM deployment, in the
deployment and management phases.
Figure 6.1: Three-dimensional view of the SRM deployment.
Return on Investment
Evaluating the financial returns for productivity software is a difficult task. But formal return on
investment (ROI) analyses are rarely done for financial expenditures such as insurance, as it is
well known that substantial risks to the organization need to be mitigated. SRM software can be
considered an insurance policy for preventing downtime. The cost of downtime saved by
cleaning up unnecessary files and preventing a disk-full condition can be substantial. SRM
software can provide ROI with the prevention of the very first outage.
Storage Management
There are essentially two approaches to storage management at the broadest level. These two
approaches are either bottom-up or top-down. In bottom-up, storage management begins at the
layer of physical devices and moves up through the storage network, the attached servers and
OSs, finally to the application consumers of storage. For many, bottom-up makes the most sense,
as it essentially follows a chain of dependencies: the application depends on the OS,
which depends on the logical volumes, which depend on the storage network, which
depends on the physical devices. In the other approach to storage management, top-down, the
priority is on application management, as that is the desired end result of all this storage gear,
and provides the actual value to the business.
Organizational View of Storage Management
While thinking about our SRM deployment, I pictured an organizational view of the different
layers or approaches to storage and storage management, as Figure 6.2 illustrates. What you
might gain from studying this organizational view is insight into the multiple layers and the shared
responsibility for making sure that the organization benefits from SRM.
I want to be totally clear on the distinction between SRM and storage management. In the
previous chapters, I have focused on deploying an SRM solution, and chose the SRM product
that I personally use. I encouraged you to evaluate any other product you are considering against
this particular SRM product. Whichever you choose, the deployment methodology will stand. At
this point, we must begin considering the broader scope of storage-management products that are
designed for tasks other than the four listed in Figure 6.1. There are many storage-management
activities that an SRM product does not do, as we will see in a moment, such as configure
switches or RAID controllers. In addition, an SRM solution might not display vital status
information for these devices. For this reason, I’ll explore the broader category of storage
management. Your company might also choose a storage-management product for managing a
specific application, so we will look at that focus too.
Figure 6.2: Organizational view of storage management.
Storage Management Strategies
Our view or vision of storage management has been expanding as the project evolves, and will
continue to expand in the future, especially as the technology world evolves. As an example of
this constant evolution, StorageCentral introduces a new paradigm for storage management,
using a model called policy-based object management. So far we have been focused on solving
the problems of eliminating duplicate files, eliminating unused (aged and orphan) files,
eliminating wasted space, and reducing storage consumption (through file blocking and disk
quotas). There are several other approaches to storage management, often defined by different
product offerings, and we will discuss each of them so that you can be aware of where your SRM
deployment fits in the larger picture. At least, if the CIO asks you why the company is using
Product X instead of Product Z, you can explain that Product X costs much less and provides
more value by letting you enforce quotas in realtime rather than just report on who is using what.
The different approaches we will cover are policy-based object management, device
configuration and management, enterprise storage management, application-centered storage
management, and fibre-channel SAN-centric storage management. Figure 6.3 helps to illustrate
the concept that there are many approaches to storage management, some of them being
complementary and several of them able to exist in the organization at the same time.
Figure 6.3: Alternative, complementary approaches to storage management.
Interestingly, as our view of storage management has been expanding, the product market has
been consolidating, as large companies have been swallowing up smaller players in an effort to
instantly create their own storage-management portfolio. Most recently, VERITAS Software
acquired Precise (which had previously acquired W. Quinn, the original manufacturer of
StorageCentral); before that, Sun Microsystems acquired HighGround, EMC acquired Softworks
and Terascape, Computer Associates acquired Sterling Software, VERITAS Software merged
with Seagate, Legato acquired Fulltime and Vinca, and Hewlett-Packard acquired Transoft.
Product Selection Criteria
Your selection of storage management products will most likely be based on the criteria shown
in Table 6.2. Storage management products range in cost from less than $1000 on a per-server
basis to several hundred thousand dollars for an enterprise license. Which option you choose
depends on the type of coverage that you need.
Criteria: Platform
Range: Must support the OSs that you need to manage (or at least provide management agent support)
Example: NT, WS2K3, Novell NetWare, Sun Solaris, HP-UX, IBM AIX, HP Tru64 UNIX, OpenVMS, Linux

Criteria: Objects Managed
Range: Types of devices: SAN fabric, logical disks, physical disks, folders, files
Example: Does the storage have to be attached or visible from a server? Or does the software discover SAN topology by querying fabric switches? How detailed or extensive is the object-management level?

Criteria: Features
Range: Alerting and reporting tool or realtime storage management? Directory integration?
Example: Hard quotas in addition to soft quotas, alerting, and reporting? Trend analysis and capacity planning? Realtime file blocking? AD integrated?

Criteria: Cost
Range: Is it really worth the price (relative to other products), based on all the above capabilities?
Example: Less than $1000 per server versus more than $50,000 for an enterprise license

Criteria: Support
Range: How comprehensive is the support? Is a range of support options available?
Example: Basic product warranty plus additional gold support on purchase?

Table 6.2: Storage management product selection criteria.
Policy-Based Object Management
In the previous chapters, we have looked at WS2K3’s native storage management abilities
compared with third-party storage management products’ abilities to manage the storage objects
of WS2K3. I have focused my efforts on demonstrating StorageCentral SRM for an example
SRM deployment. As I previously mentioned, in addition to policy-based object management,
there are four other categories of storage management products: device configuration and
management, enterprise storage management, application-centered storage management, and the
fibre-channel SAN approach to storage management. I’ll briefly discuss each of these storage
management product types.
Device Configuration and Management
The first step in deploying any storage is the configuration, often called provisioning because it
involves allocating the resources for end users to consume. Typically this step is performed by
device-specific utilities created by the storage vendor (for the purpose of creating arrays on a
Smart Array or StorageWorks RAID controller, for example) and by the management utilities in
the OS. However, there are some storage-management applications or utilities designed to be
multi-vendor, especially in the area of fibre-channel SANs. For example, HP (Compaq) provides
some specific tools for storage configuration and has licensed and developed a version of the
HighGround product line that Sun Microsystems recently acquired. The SAN management
appliance from HP (which Figure 6.4 shows) will play an increasingly important role in SAN
management as it is the principal device used for configuring the new enterprise virtual storage
array (HSV110).
Figure 6.4 shows both the ability to perform RAID controller configuration and the ability to
gather status information, such as the disk DISK30100 marked as failed (which has a red X through it).
Figure 6.4: Fibre-channel management appliance showing RAID controller configuration.
Enterprise Storage Management
The broadest category of storage management is enterprise management. By enterprise
management, the industry is referring to managing massive amounts of storage both in the data
center and distributed sites, across a variety of OSs and storage product vendors. In some sense,
the phrase enterprise storage management represents the Holy Grail of single-seat administrative
control over a vast empire of storage; however, its definition is somewhat diluted by its use as a
marketing term. Many vendors would like you to think of their product as the enterprise storage
management solution, when in reality, there is no one-size-fits-all solution yet, and it will be a
long time coming, as we will see in the next chapter.
An example of an enterprise storage management product, HighGround SRM is designed for
monitoring, reporting, and policy-based management of storage across multiple servers and OSs.
Version 5.0 is the latest release and includes the ability to report on database files, tables, and
objects as well as to define thresholds and alerts. When an application is nearing its maximum
disk space, it is important that the administrator be alerted. Most storage management
applications can only alert the administrator and are unable to take action, whether that action is
adding more storage from a disk pool or forcing truncation of database log files. Storage quotas
can be defined per user or per group, and notifications (soft quotas) sent via email. What the
product lacks, however, is the ability to set hard quotas; soft quotas alone can lead users to
quickly disregard the warnings as all bark and no bite.
We will look at the future of storage management in Chapter 8, including the resolution of this
problem through smarter storage management products.
Application-Centered Storage Management
Another approach to storage management is to focus on the applications that consume the
storage. Typically you purchase an application-monitoring product first, then purchase additional
modules for storage management features. Application storage management is usually driven by
its own necessity; that is, a need to monitor a specific application to ensure that it remains always
available.
Fibre-Channel SAN Approach to Storage Management
The SAN-centric storage management approach is a fairly new market, and I have not read exact
market numbers for this segment. Unlike the solutions from hardware vendors, these types of
solutions must work across all storage architectures and components and manage more than one
vendor’s hardware. The last time that I was on site at VERITAS, the company was hard at work
on its latest SAN management software, SANPoint Control, and was showing how the product
taps into the knowledge of the SAN fabric at the fibre-channel switch to gain insight into the
status of the SAN. Softek SANView is a SAN fabric management product (from Fujitsu) for
monitoring multi-vendor SAN devices. Like the VERITAS product, SANView uses SAN
discovery to map the physical connections between all fabric components. Other products
adopting this approach include BMC’s PATROL Network Storage Manager and IBM’s Tivoli
Storage Network Manager. Many vendors adopting this approach know that much can be
gleaned about the SAN from the fibre-channel switches. However, because these products’
specialized function is to manage SAN environments, they might do little to solve your SRM
issues, such as controlling disk usage and performing trend analysis.
Project Management
In this section, I will define some basic but important project management functions. At this
point in the project, you should have a clear understanding of how the SRM solution functions
from a technical standpoint. The majority of your time will be spent dealing with organizational
issues and problems that were not discoverable during the pilot test phase.
Critical Path
Defining the critical path in general terms is easier than defining what it is for your SRM
deployment. The critical path is the chain of dependent tasks (tasks that cannot be started before
another is finished) that determines the shortest possible duration from start to finish. Any
change to the duration of tasks on the critical path results in the overall project schedule
slipping, and that is why the critical path is so critical. The best way that I have found to determine the critical path is
to use a planning tool such as Microsoft Project to lay out all of the dependencies.
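If you want to see the arithmetic behind what a tool such as Project does, the following Python sketch computes the critical path for a small, hypothetical SRM task list (the task names, durations, and dependencies are illustrative only). The critical path is simply the longest chain of dependent tasks:

    # Each task maps to (duration in days, list of prerequisite tasks).
    # All names and durations are hypothetical.
    from functools import lru_cache

    tasks = {
        "choose SRM product":    (10, []),
        "pilot test":            (15, ["choose SRM product"]),
        "write deployment plan": (5,  ["choose SRM product"]),
        "train staff":           (5,  ["write deployment plan"]),
        "deploy first servers":  (10, ["pilot test", "train staff"]),
    }

    @lru_cache(maxsize=None)
    def finish(task):
        # Earliest finish time: the task's duration plus the latest
        # finish time among its prerequisites.
        duration, deps = tasks[task]
        return duration + max((finish(d) for d in deps), default=0)

    end = max(tasks, key=finish)        # the task that finishes last
    path = [end]
    while tasks[path[-1]][1]:           # walk back along the longest chain
        path.append(max(tasks[path[-1]][1], key=finish))
    print("Project duration:", finish(end), "days")
    print("Critical path:", " -> ".join(reversed(path)))

With these sample numbers, the sketch reports a 35-day project whose critical path runs from choosing the product through the pilot test to the first server deployments; the planning and training tasks have slack.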
Milestones
Milestones are defined as markers or events of essentially zero duration that mark the course of a
project. For your SRM deployment, milestones may consist of reaching a decision on a new
storage system (SAN or NAS), choosing an SRM product, finishing the development of the
deployment plan, deploying the first five servers, reducing wasted storage by 20 percent, and so
on. Milestones are important for measuring the progress of a project, but they must also be
measured against resource consumption, as the milestone may be achieved, but you might have
expended too many hours of labor in the process.
Risk Analysis
Identifying the forces that put your project at risk and that can cause project failure is one of the
most difficult tasks for the project manager. However, this task is extremely important, as you
are dealing with resource tradeoffs and trying to use them wisely to prevent project failures.
Table 6.3 shows a sample risk-management analysis. Keep in mind that this table is a general
guideline—you’ll want to hammer out the specifics. For example, if you are concerned about the
SRM software causing server crashes, name the SRM product and what configurations have been
tested versus what is unknown.
You might notice that some of the priorities in the table aren’t in the order you might have put
them in. An example is the identified risk of the SRM solution interfering with server availability
(system crashes). This item is certainly a high-impact item, as identified by its severity level of
4, but its probability is low enough (a rough estimate of 15 percent) that it becomes a low priority.
Total risk is composed of two multiplied components: the risk severity times the estimated
probability. One of the major reasons for performing the risk analysis is to assign relative priority
based on the impact of BOTH components and not just one or the other. Although server
availability is definitely important to the company, the risk that is being introduced is so low
(usually due to proper testing) that we can focus on other risks.
The first column in Table 6.3 assigns a relative priority so that you can sort the table with the
items that deserve the most attention at the top; however, this first column will be the last column
that you have enough information to fill in. The next column is the Source of Risk, which helps
to identify whether the risk is within your range of control. Even if it is totally out of your
control, there are still steps that you can take to mitigate or manage the risk. For example, if you
are purchasing new hardware and fail to insure it during transit, then you are taking the risk that
something may happen to the gear before it arrives. Even if you are financially covered, there is
a risk that replacement may cause severe project delays.
The Risk Name is just a short, descriptive name, useful for identifying what you are talking
about in project meetings. The full Risk Description helps to identify and understand the source
of the risk. As described in the previous example, the Risk Severity is the level of impact to the
business if this event or series of events happens. In Table 6.3, I use a scale of 1 to 5, with 5
being the most severe level. Next is the Risk Probability, which is expressed as percent
likelihood of occurrence. By multiplying the severity and the probability, you obtain an estimate
of the Total Risk. It is from this relative number that you can rank the priorities. The column
labeled Mitigation allows you to define the procedures or steps you can take to minimize the
probability or impact of the risk. Also, you can assign an owner to the risk.
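The ranking itself is trivial to automate. Here is a minimal Python sketch that multiplies severity by probability for the rows in Table 6.3 and sorts the result to assign priorities:

    # (risk name, severity on a 1-5 scale, probability as a fraction),
    # taken from Table 6.3.
    risks = [
        ("Lack of resource funding", 5,   0.20),
        ("Lack of training",         3,   0.30),
        ("SRM issues",               2,   0.40),
        ("Deployment resource",      3.5, 0.20),
        ("SRM interference",         4,   0.15),
    ]

    # Total risk = severity x probability; the highest total gets priority 1.
    ranked = sorted(risks, key=lambda r: r[1] * r[2], reverse=True)
    for priority, (name, severity, probability) in enumerate(ranked, 1):
        total = severity * probability
        print(f"{priority}. {name}: {severity} x {probability:.0%} = {total:.1f}")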
Priority: 1
Risk Source: Organizational and External (market forces)
Risk Name: Lack of resource funding
Description: Project success depends on funding for SRM deployment. Project funding priority is relative to other company projects.
Severity: 5
Probability: 20 percent
Total Risk: 1.0
Mitigation: Assurance of funding based on fiscal year budget
Risk Owner: (to be assigned)

Priority: 2
Risk Source: People
Risk Name: Lack of training
Description: Inability to properly implement and support the SRM solution.
Severity: 3
Probability: 30 percent
Total Risk: 0.9
Mitigation: Develop training program and schedule sessions
Risk Owner: (to be assigned)

Priority: 3
Risk Source: Technical
Risk Name: SRM issues
Description: SRM solution not working as designed in all situations.
Severity: 2
Probability: 40 percent
Total Risk: 0.8
Mitigation: Lab testing, consultant expertise, and vendor support
Risk Owner: (to be assigned)

Priority: 4
Risk Source: Technical
Risk Name: Deployment resource
Description: Ability to keep SRM a priority and deploy on schedule (may lose focus due to putting out other fires).
Severity: 3.5
Probability: 20 percent
Total Risk: 0.7
Mitigation: Project status as a discussion point in weekly IT staff meetings
Risk Owner: (to be assigned)

Priority: 5
Risk Source: Technical
Risk Name: SRM interference
Description: SRM solution interfering with server availability (system crashes).
Severity: 4
Probability: 15 percent
Total Risk: 0.6
Mitigation: Purchase premier support contract from SRM vendor
Risk Owner: (to be assigned)

Table 6.3: Sample risk-management analysis.
Risk Mitigation Roles
Do you need another title? How about Chief Risk Officer (CRO)? This title is defined as the
person responsible for making sure that the project team is identifying and mitigating risk. It is
quite the responsibility, as risk is usually associated with the high visibility of project failure
rather than with the expected result of success. But notice the team approach; the CRO needs to
bring up the open discussion of risk factors at status meetings and make sure that the team
members are alert to project risk and empowered to use the following mitigation techniques. You
may need to experiment a bit with how much you formalize this process, as too much formality
can become a burden and hinder the positive benefits. At least ensure that all team members are
aware that there are risks associated with the project and that some of the risks may not be 100
percent controllable.
Mitigation Techniques
Remember that risk is not something inherently bad or something to avoid at all costs, for
achieving success means taking on some risks. I have given the acronym NIP’EM (as in “nip
them in the bud”) to the process that I have developed to make sure that I incorporate risk
management into the project. Spelled out, the acronym is as follows:
• Notice—risks (be aware and be alert to risks)
• Identify—the source of risks and the CROs
• Plan—the mitigation of risk (using the provided template)
• Enact—the plan and people involved
• Manage—the ongoing status of the risk
Just by reading this book and raising your awareness of the following issues, you are taking the
first step toward risk management. Your chances of project success are increased with the
knowledge you gain.
Identifying Project Issues
The following section gives some example issues that you may have to resolve. By knowing
these issues in advance, you will be assured a smoother deployment. Your team will likely have
several levels of support for these issues. The first level will be to restore service to the business
users as quickly as possible. The next level will involve deeper analysis of the technical issues.
Wherever possible, the end user should be given the resources necessary for self-sufficiency, for
example, knowing the location of Help files and access to maintenance upgrades for software
installed on their computers.
Technical Issues
The following problems are some of the most common headaches and issues that I have seen
during our internal deployment and also at customer sites:
• Server crashes—Hopefully, you will have followed the previous chapter and conducted pilot testing to make sure that your SRM solution (filter drivers and any agents that it installs, if applicable) is compatible with the drivers, filter drivers, and agents already on your systems. Nevertheless, expect the worst and have a contingency plan, and maybe you will be pleasantly surprised instead of vice versa. These types of issues are rarely predictable. The best advice I have gained is to try to change only one thing at a time; otherwise, you’ll never isolate the root cause of the problem. I know that this advice is not always easy to follow, as you are often in a hurry to resolve the issue.
• Upgrade issues—Similar to unexpected server crashes, upgrading software that is working fine can lead to a new set of problems. And often you don’t have the option of rolling back to the previous version. Occasionally, there is loss of functionality—at least until systems administrators learn how to do familiar tasks in the new software environment.
• Pre-existing blocked files—I’ve seen the odd circumstance in which a company implemented file blocking and created a set of policies (vbs, mp3, and so on), but existing files were not affected. Any existing blocked files remained intact and could even be moved around within the users’ shares! The solution was to run a report to find the existing files and delete them (see the sketch following this list).
• Blocking leaves a zero-byte file—Another side effect of file blocking was that a zero-byte file was left behind. Not a major issue, but the clean-up of these file stubs was incorporated into the monthly reporting and clean-up procedures.
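Here is a minimal sketch of the clean-up report mentioned in the last two items. It walks a share, flagging files of blocked types that pre-date the policy as well as zero-byte stubs. The share path and extension list are hypothetical; match them to your own file-blocking policies:

    # Report pre-existing files of blocked types and zero-byte stubs.
    import os

    SHARE = r"\\fileserver\users"   # hypothetical share to scan
    BLOCKED = {".vbs", ".mp3"}      # extensions your policy blocks

    for root, dirs, files in os.walk(SHARE):
        for name in files:
            path = os.path.join(root, name)
            ext = os.path.splitext(name)[1].lower()
            if ext in BLOCKED:
                print("blocked type still present:", path)
            elif os.path.getsize(path) == 0:
                print("zero-byte stub:", path)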
People Issues
Some of the issues that you deal with will not be the result of an improperly functioning SRM
product; on the contrary, they will be the result of a properly functioning SRM product. Perhaps
you have some political or organizational issues that will surface in the middle of your
deployment:
• Lack of user education—Suppose that you are the first user to receive the quota message that Figure 6.5 shows. If no one has told you ahead of time that a quota-management system is in place, you might end up calling the Help desk for support. If they are uninformed, who knows how many hours can be wasted trying to track down the source of the message.
Figure 6.5: An example of what can be an obscure quota message to an uninformed end user.
• One size fits all?—Remember that not all users in your organization will respond to the SRM solution in the same way. Some will demand more storage, and they might be entirely justified, especially if they are dealing with large graphic projects, performance monitor captures, and so on. For this reason, I recommended in earlier chapters that you establish a baseline of storage consumption.
Resource Constraints
Most project managers have been faced with the situation of not having enough resources to
accomplish every task in a project. Often there must be tradeoffs made between using resources
here or there. You will likely be faced with two opposing members in your project team asking
for time or money that only one of them can have. Do you add more cache memory to the RAID
controllers to get better server performance, do you upgrade the network switches to get faster
user access, or do you hire that external consultant to conduct training? As you make the tradeoff
decisions, keep in mind two things. First, consider the project success criteria and how success
or failure will be measured; if a need is not on that list, don’t bother with it unless you really see
it as part of risk mitigation to protect one of the success-measurement points. Second, consider
the critical path. If tasks on the critical path are starved of resources, the entire
project can be delayed.
Change Control
Given the rapid development of new software these days, most likely something in your SRM
deployment project will change, introducing an element of risk to the project. Perhaps the SRM
product itself will enter a new release version, and the features of the upgrade will appeal to your
decision makers. Or the change could merely be a change to the environment that might mean
unknown interaction with the SRM product, such as a new desktop OS or new handheld portable
device, such as the Pocket PC. During the course of writing this book, Microsoft released a new
Windows OS and several vendors released new SRM products.
You will be faced with the decision to continue rolling out your current product and upgrade
later, perhaps as a separate project, or to start over with your pilot testing using the new product.
Obviously, you cannot assume that the new product will behave in the same manner as the old;
there will be new features to take advantage of and potentially new system interferences. If you
are too far along in the deployment, you won’t have the option of merely substituting the new for
the old. Upgrading the SRM product might require a reboot after the installation of the new
version, so there is obviously some change control needed. Figure 6.6 illustrates this point,
showing the prompt to uninstall StorageCentral SRM 4.1 before installing StorageCentral SRM
5. Originally, this upgrade required a reboot between the uninstallation and the installation of the
new version, but the process has been streamlined to incorporate the old version uninstall and the
new version install before the single reboot.
Figure 6.6: Upgrading to StorageCentral SRM 5 requires uninstalling version 4.1.
Extending AD
StorageCentral is also an example of a product that is capable of extending the AD schema. You
will find that more and more products request this type of modification to AD to provide
directory integration. As Figure 6.7 shows, the product will prompt you during the installation to
ask whether you want to make this optional modification to AD. Schema extension is typically
required to create application-specific containers, and StorageCentral uses these to move
information to AD that was previously stored in each server’s registry. The net benefit is that
SRM is managed from a central point, AD, and activities such as searching or applying settings
in a cascaded manner are made more efficient.
The option to modify the AD schema will affect your deployment depending on whether you are:
• Upgrading from NT 4.0 to WS2K3 and AD and introducing SRM
• Deploying a new, first-time SRM product on WS2K3
• Upgrading an existing SRM product
In the first case, if you are upgrading to WS2K3 (or Win2K) and AD and introducing SRM, you
may be in the best position to make the schema changes with the least interruption or risk to your
existing environment. If you are upgrading an existing SRM product, you will have server and
file share settings that need to be migrated to AD, either handled by the product or done
manually.
Any product affecting AD creates considerations for you during the installation. First, you must
consider the permissions required to perform the installation. In addition to having Administrator
permissions on the server and at the domain level, if you are modifying AD, you must be a
member of the Schema Admins, Domain Admins, and Enterprise Admins groups.
Second, the system on which you perform the installation will need to be considered. Your first
installation should be at a domain controller instead of at a member server if you plan to extend
the schema. You don’t necessarily need to log on to the domain controller that holds the schema
master Flexible Single-Master Operation (FSMO) role—the changes will take place at that
domain controller regardless, because it owns schema operations. Some people prefer to cut out
the middleman, so to speak, and log on to the schema master FSMO anyway, just for safety’s
sake in case there is any disruption to the network during the schema extension.
In addition to making sure that you have the right permissions, make sure that you select The schema
may be modified on this domain controller check box on Win2K domain controllers. Only the schema
master FSMO has this check box. To navigate to this property, open the Schema Master MMC snap-in by clicking Start, Run, typing
MMC
in the text box, and clicking OK. In the MMC, click Add/Remove Snap-in from the Console menu, and
select Add. Click Active Directory Schema, click Add, then click Close. If the AD Schema MMC snap-in is not listed, click Start, Run, and in the text box, type
regsvr32 schmmgmt.dll
then click OK. Look for the dialog box stating DllRegisterServer in schmmgmt.dll succeeded, then try
adding the snap-in as explained earlier. Once the snap-in is added, right-click Active Directory
Schema, and select Operations Master from the menu. Don’t bother looking for The schema may be
modified on this domain controller check box on WS2K3 domain controllers, as it has been removed
and isn’t required.
Figure 6.7: Example of a product extending the AD schema.
A good practice is to know where the FSMO roles are located before you deploy, especially because
there is no automatic failover of these roles between domain controllers. Grab the WS2K3 resource
kit Dumpfsmos.cmd utility to get a dump of where the FSMO roles are located. Or you can follow the
procedures at http://support.microsoft.com/support/kb/articles/Q234/7/90.ASP.
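If you have the WS2K3 Support Tools installed, the netdom utility offers another quick check; to the best of my knowledge, the following command lists which domain controller holds each FSMO role:
netdom query fsmo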
After the installation of any product that modifies the AD schema has completed, I highly
recommend that you use tools such as the Active Directory Users and Computers MMC snap-in
and ADSIEdit to check whether the new environment (including containers and objects) appears
in these tools before running the product. Figure 6.8 shows the view of StorageCentral SRM
containers in Active Directory Users and Computers, showing how the information previously
held in the local registry is now in AD.
To see the StorageCentral containers in AD, you must enable the Advanced View in Active Directory
Users and Computers. Alternatively, you can use a tool such as LDP or ADSIEdit from the WS2K3
Support Tools.
Figure 6.8: StorageCentral SRM containers in Active Directory Users and Computers.
Best Practices
Throughout this book, I have focused heavily on the deployment methodology in addition to the
feature set and goals of SRM. In the next chapter, we will look at management and monitoring of
the SRM solution. I will draw heavily from best practices developed primarily for data center
operations, and will discuss both the Microsoft Operations Framework (MOF) and the
Information Technology Infrastructure Library (ITIL). The difference between the two is that
ITIL is a wide-coverage set of guidelines for running the business of IT, and includes several
levels of certification, while the MOF is focused on Microsoft technology and provides more
detailed guidance (referred to as prescriptive as opposed to descriptive). If you are looking for
the next level in your IT career and are moving into data center operations, you will want to
consider ITIL training and possibly certification.
Hardware Standardization
Essentially a risk-management technique, hardware standardization is used by larger
organizations to ensure consistent deployment results. Particularly if you are deploying new file
servers, you will need to publish standardized configurations such as the ones listed in Table 6.4.
Also frequently standardized is the drive layout (RAID type, partition sizes, file system, and
drive letters) so that file servers have consistent mass storage.
Configuration: Branch Office
Server Model, Processor, Memory, NIC, Options: Pentium III processor, 512MB of RAM, dual NICs, redundant power supplies
Vendor Part Numbers: __
Storage: Direct Attached RAID5, 36GB drives x __ number
Storage Vendor Part Numbers: __

Configuration: Departmental
Server Model, Processor, Memory, NIC, Options: Dual Pentium III processor, 1GB of RAM, dual NICs, remote management option board, redundant power supplies
Vendor Part Numbers: __
Storage: Direct Attached RAID5, 36GB or 72GB drives x __ number
Storage Vendor Part Numbers: __

Configuration: Data Center
Server Model, Processor, Memory, NIC, Options: Dual Pentium III processor with cluster option, 1GB of RAM, dual NICs, remote management option board, redundant power supplies
Vendor Part Numbers: __
Storage: SAN, redundant fabric and array controllers, RAID5, 36GB drives x __ number (or __GB of virtual disk array)
Storage Vendor Part Numbers: __

Table 6.4: Standardized hardware configurations for deployment.
Server Naming Standards
Similar to hardware standardization, establishing a set of server naming standards helps to
ensure consistency. Developing server naming standards is really not difficult, but I have sat in
meetings in which very large organizations just could not come to agreement on the standard. If
you need a suggestion, you can use a combination of a code based on geographic location (3
digits), server role (4 digits), and a unique number (1 or 2 digits).
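As a minimal illustration of that suggestion (treating the site and role "digits" as characters, and with entirely hypothetical codes), the following Python sketch builds and validates names of the form site code + role code + sequence number:

    # Compose a server name: 3-character site code, 4-character role
    # code, 2-digit sequence number. Codes are hypothetical examples.
    import re

    def server_name(site, role, number):
        return f"{site.upper()}{role.upper()}{number:02d}"

    print(server_name("NYC", "FILE", 1))    # NYCFILE01
    print(server_name("LON", "PRNT", 12))   # LONPRNT12

    # Validate an existing name against the standard.
    STANDARD = re.compile(r"^[A-Z]{3}[A-Z]{4}\d{1,2}$")
    print(bool(STANDARD.match("NYCFILE01")))  # True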
Server Consolidation
If you want to catch a CIO’s ear, just whisper the phrase “server consolidation.” Server
consolidation is quite the hot topic in IT. The following reasons might be why your CIO is so
gung ho about server consolidation:
• Fewer servers means less chance of hardware faults
• Less management overhead with a centralized environment
• Fewer servers means fewer software licenses (third-party add-ons and so on)
• Pooled resources means more efficient use of storage and balancing of hot spots and cool spots in performance (such as network and processor)
• The chance to unify disparate server models and normalize server configuration standards
• The ability to create higher SLAs by using more fault-tolerant hardware systems
The bottom line is that all of these advantages save money. But the following list provides
reasons why server consolidation might not be such a good idea in your environment:
• Your locations and servers are too distributed, so network traffic over the WAN would increase
• Greater impact of an outage if you lose the data center server
• Political reasons, such as objections to sharing administrative control
• Single point of failure (“placing all of your eggs in one basket”) vs. distributed systems
Server consolidation has an important impact from the perspective of SRM, and might affect
your project. You might be managing fewer systems (which is a good thing), but you must also
create greater assurance that the system will be stable. In the process of server consolidation, you
will first need to focus on the storage consolidation, migrating isolated DAS to networked
storage (SAN or NAS).
Here are some lessons learned and best practices from my own server-consolidation efforts.
First, if you place all of your eggs in one basket, make sure it’s a really nice basket! Start with
fully fault-tolerant hardware and consider availability enhancements such as clustering if you can
justify the cost. Focus your efforts on becoming an expert on recoverability; Business
Continuance Volumes (BCVs), created by cloning or snapshotting, are designed to speed
recovery. We’ll look more at them in Chapter 8. In addition, systems-management software and
processes become more important as you place more users on a single server or storage
system—the sooner that you can find out about hardware faults or leaky applications (those
consuming memory or locking up processes on the server), the better your chances of avoiding
or minimizing downtime.
But overall, storage is critical. It used to be that storage was a peripheral that you attached to a
server, but, as I heard it expressed recently, storage is the core and the servers are peripherals.
SAN Best Practices
It is quite likely that you are very familiar with DAS and familiar enough with NAS to be able to
manage those devices. But SANs are newer technology and more difficult to deal with, so you
might not be as familiar with them. The following list provides some best practices in case your
storage deployment includes SANs. If you are new to SANs, one day you will look back and
realize how vital it is to know this information. I gathered these best practices from working with
some enterprise-class SAN deployments:
• A SAN is really a fibre-channel fabric. Avoid arbitrated loops except for the smallest environments with very few hosts. If this is your first foray into the world of SANs, the recommendation has been to start with a loop, but be advised that you will not be able to get a very long lifecycle out of the product, especially in production.
• Work with your switch vendor to define your fibre-channel fabric and zones. The fabric should have no single point of failure.
• Standardize on vendors at each layer of the SAN fabric, for example, consistent use of switches, host bus adapters (HBAs), and storage devices. Although multi-vendor can work, it must be done carefully.
• Standardize on firmware and drivers for the devices on the SAN. It may be difficult to verify these, and even more difficult to update them once the SAN is in production.
• When you bring a new system onto the SAN, be sure that the system does not interrupt any other systems. WS2K3 does not always play well with others; if another OS (such as UNIX) has disks without proper Logical Unit Number (LUN) masking (or storage security), you might lock those disks if you mount them. So devices on your SAN should be properly secured through LUN masking or hardware-based storage security. Storage security based in software might not work if the server is booted in safe mode or the OS is rebuilt.
• Build your systems with a single path to start with before attempting multi-path. Once you add the multi-path, test the failover and also test safe mode.
Success Measurement Criteria
Finally, but importantly, keep in mind the criteria that will be measured to determine whether
your project is successful. Table 6.5 gives you another template that you can print out and tack to
your project-management bulletin board.
SRM Goal: Eliminate duplicate files (and improve the sharing of files)
Objective: Reduce storage consumption by 10% over the initial 30 days and again over the next 90 days
Actual Outcome: Day 0: __; Day 30: __% reduction; Day 90: __% reduction

SRM Goal: Eliminate unused files (based on aging and orphaned files)
Objective: Reduce storage consumption by 10% over the initial 30 days and again over the next 90 days
Actual Outcome: Day 0: __; Day 30: __% reduction; Day 90: __% reduction

SRM Goal: Eliminate wasted space (from non-essential files, based on file type)
Objective: Reduce storage consumption by 10% over the initial 30 days and again over the next 90 days
Actual Outcome: Day 0: __; Day 30: __% reduction; Day 90: __% reduction

SRM Goal: Reduce storage consumption (by setting disk quotas)
Objective: Reduce storage consumption by 20% over the initial 30 days and again over the next 90 days
Actual Outcome: Day 0: __; Day 30: __% reduction; Day 90: __% reduction

Table 6.5: Sample success measurement criteria.
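Filling in the Actual Outcome column is simple arithmetic. The sketch below uses hypothetical consumption figures and measures the second reduction against the day-30 baseline, matching the way the objectives are worded:

    # Hypothetical consumption snapshots, in GB, for one SRM goal.
    day0, day30, day90 = 500, 450, 405

    pct_30 = (day0 - day30) / day0     # reduction over the first 30 days
    pct_90 = (day30 - day90) / day30   # reduction over the next 90 days
    print(f"Day 30: {pct_30:.0%} reduction")   # 10%
    print(f"Day 90: {pct_90:.0%} reduction")   # 10%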
Summary
This chapter covered the deployment phase of your SRM solution. We started by reviewing the
SRM goals and components and developing an organizational view of the SRM solution. Next,
we looked at SRM products and strategies and listed product selection criteria for the following
types of solutions: device configuration and management, enterprise storage management,
application-centered storage resource management, fibre-channel SAN approach to SRM, and
policy-based object management. We also covered project-management fundamentals such as
the critical path, setting milestones, and performing a risk analysis. I gave you a template for
risk-mitigation techniques and identifying sources of problems such as technical issues and
people issues. We also looked at change control in the context of extending AD. I gave you some
best practices in the areas of hardware standardization and server-naming standards. We looked
at the pros and cons of server consolidation, and some best practices for SANs, as they can be an
essential part of server consolidation. Finally, I gave you some sample success measurement
criteria to help you define your objectives for this phase of the project.
In the next chapter, we will continue the focus on the project-management aspects, as you finish
the deployment by setting up systems to monitor and maintain the SRM solution. In addition, we
will cover the technical aspects of what you need to monitor and what solutions are available.
Chapter 7: Manage and Maintain the SRM Solution
In the previous chapter, we covered the deployment phase of your SRM solution. We started by
reviewing the SRM goals and components, and developed an organizational view of the SRM
solution. Next, we looked at the different storage management strategies and the various
products, listing product selection criteria for the following types of solutions: device
configuration and management, enterprise storage management, application-centered SRM,
fibre-channel SAN approach to SRM, and policy-based object management. I continued to cover
project-management fundamentals, such as defining the critical path, setting milestones, and
performing a risk analysis. I gave you a template for risk-mitigation techniques and how to
identify sources of problems such as technical issues and people issues. We also looked at
change control in the context of extending AD. Finally, I gave you some sample success-measurement criteria to help you define your objectives for this phase of the project.
In this chapter, we will continue the focus on project management, as you complete the
deployment by setting up systems to monitor and maintain the SRM solution. In addition, we
will cover the technical aspects of what you need to monitor and which solutions are available. I
will give you a complete list of systems management recurring tasks that you can use to make
sure that you have all your operations—including SRM functions—in place.
The goal of this chapter is to aid you in developing a daily approach to SRM that automates the
repetitive tasks—for example, monitoring disk usage by using the SRM software that we have
discussed. This automation will free your time for other crucial tasks—such as maintaining your
security defenses—that might be overlooked, as you only have so much time, and must
continually fight to ensure that your priorities match those of the business. Table 7.1 shows
Phase 6 in the overall SRM deployment methodology.
Phase: SRM Maintain (continue to support the solution and prepare to improve as needed)
Process: Monitor disk usage; add storage as needed (hopefully, only for performance upgrades or to replace defective hardware).

Table 7.1: Phase 6 of the SRM deployment methodology.
Project Management Aspects
At this point, you should be polishing off any rough edges in your SRM deployment, and you’ll
find that this task is the easiest part of the deployment. You will have the opportunity to see the
benefits of SRM, and to work on automating the monitoring and management process. Change of
plan—this urgent message just came in—we have a security violation on our network that must
be dealt with immediately!
Security Issues
When your security model is compromised, all else takes a back seat—SRM becomes less
important than storage resource protection. Security should not be omitted from any deployment.
Throughout this SRM discussion, security may have been given lesser priority, but in this
chapter, we will increase the priority of security in the context of your SRM deployment. Recent
virus outbreaks have either tested whether you have been updating your systems’ security
patches or given you a chance to validate your data recovery procedures. Lately, much effort has
been spent just keeping systems safe from harm, and from this effort have come new attention to
security and a new security initiative from Microsoft.
There are several things that you can do to improve your security immediately. First, subscribe to
the Microsoft Security Notification Service. To subscribe to this service, send an email to
[email protected] (no need to put anything in the subject line or message body). More
information can be found at http://www.microsoft.com/technet/security/bulletin/notify.asp.
The next thing that you can do is to download and run several Microsoft-provided security tools.
Many security flaws or problems have been found in IIS, which is a component of the default
installation of WS2K3, so many of the tools focus on IIS. A good starting point is the Microsoft
Security Tool Kit, as it packages several tools and recent patches to the OS, IIS, and some
applications, such as Internet Explorer (IE).
Microsoft Baseline Security Analyzer
The newest version of the Microsoft Baseline Security Analyzer (MBSA) can be downloaded
from the MBSA site at http://www.microsoft.com/technet/security/tools/mbsahome.mspx. The
newest version scans not only for missing security updates and incorrect configurations in
Windows but also in IE, IIS, SQL Server, Exchange Server, and many other Microsoft products.
This tool should become a constant companion, and you should use it to run regular security
checks of all your servers.
MBSA will point out any missing security updates as well as poor configurations (such as weak
or missing passwords). Obviously, in order to have the most secure system possible, you should
carefully review MBSA’s report and consider its recommendations for hardening your servers.
In addition, Microsoft recommends uninstalling all Windows services not in active use—
particularly IIS, but also services such as DNS, DHCP, and so forth. Any service not actively
needed on a server should be uninstalled; merely disabling the service still presents the
opportunity for an attacker to re-enable it and exploit any vulnerabilities.
Systems Management and Monitoring
Now that we’ve secured our systems, we can get back to the business of SRM. The challenge in
managing any storage environment is how to be proactive instead of reactive to catastrophic
events. What can we learn from the top professionals in enterprise organizations, the largest
consumers of storage? If we follow their lead, we will already know the questions that must be
asked, and how to find the answers, such as “How do I know if my storage is online and
performing as well as it should?”
Anticipating Changes
Table 7.2 lists the types of events most likely to happen in your environment, and some planned
responses. What you can gain from the table is an appreciation of the importance of SRM. Perhaps you are
reading this guide with the thought of maintaining SRM yourself, without a third-party
application. If so, you will need the following contingency plans.
Anticipated Event: Running low on disk space
Planned Response: Notify users and prepare user report and administrative report about what can be removed; if space is dangerously low, block writes until files are removed

Anticipated Event: New users and home directories
Planned Response: Add new users to the existing storage policy (how much space allocated, which types of files are not allowed); ensure that no users exist outside of policy

Anticipated Event: New folders or subdirectories
Planned Response: Ensure that existing storage policies are applied to the new folders

Anticipated Event: New viral attacks
Planned Response: Prevent viruses from writing files by using NTFS permissions and identifying the viral files (sometimes creating a read-only pre-existing file to stop the viral action)

Anticipated Event: New storage systems and SANs
Planned Response: Carve out the storage and allocate to application servers and file servers; apply storage policy to allocated storage; ensure that storage systems and SANs are part of the management framework; understand how to deal with specific events, such as device failure

Anticipated Event: Disk drives added to the servers
Planned Response: Add the disks to existing arrays (if supported) or create new arrays and logical disks; ensure that the disks are part of the storage policy

Anticipated Event: Non-events
Planned Response: User accounts and files will become orphaned through long periods of inactivity; plan to identify and clean up these objects periodically

Anticipated Event: Minor service interruptions
Planned Response: Resolve items such as loss of power, cable damage or failure, operator error, or other service errors such as GC or domain controller failures precluding authentication

Anticipated Event: Catastrophic events
Planned Response: Device failure or data corruption necessitating recovery procedures; identify proper procedures and ensure that recovery hardware is on standby

Table 7.2: Anticipated storage events and planned responses.
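For the non-events row, the periodic clean-up can start with a simple aging report. This Python sketch (the path and threshold are hypothetical) lists files that have not been accessed in 180 days; note that last-access times are only meaningful if access-time updates are enabled on the volume:

    # List files not accessed within the last 180 days.
    import os, time

    SHARE = r"\\fileserver\users"          # hypothetical share to scan
    CUTOFF = time.time() - 180 * 24 * 3600

    for root, dirs, files in os.walk(SHARE):
        for name in files:
            path = os.path.join(root, name)
            if os.path.getatime(path) < CUTOFF:
                print("aged file:", path)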
OS Monitoring
You can monitor and measure WS2K3 stability using methods similar to those you use with other
applications, primarily by using an application monitor (such as Microsoft Operations
Manager—MOM, which I’ll discuss later) to watch event logs for the dirty shutdown event (ID
6008) followed by the system startup event (ID 6005).
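The pairing logic is simple enough to script yourself. The following Python sketch works through a hypothetical, pre-exported list of System log records, pairing each dirty shutdown (6008) with the next startup (6005) to estimate each unplanned outage:

    from datetime import datetime

    # Hypothetical records exported from the System event log,
    # oldest first: (timestamp, event ID).
    records = [
        (datetime(2003, 5, 1, 2, 14),  6008),
        (datetime(2003, 5, 1, 2, 31),  6005),
        (datetime(2003, 5, 9, 11, 2),  6008),
        (datetime(2003, 5, 9, 11, 20), 6005),
    ]

    last_crash = None
    for when, event_id in records:
        if event_id == 6008:              # dirty shutdown
            last_crash = when
        elif event_id == 6005 and last_crash is not None:
            print(f"unplanned outage of {when - last_crash}, ending {when}")
            last_crash = None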
If your server is experiencing stop errors, see the following articles for information about how to use
the crash dump information recorded in the Memory.dmp file:
“Gathering Blue Screen Information After Memory Dump” at
http://support.microsoft.com/directory/article.asp?ID=KB;EN-US;Q192463&
“Blue Screen Preparation Before Contacting Microsoft” at
http://support.microsoft.com/directory/article.asp?ID=KB;EN-US;Q129845&
For information about troubleshooting failed applications, see the article “How to Install Symbols for
Dr Watson Error Debugging” at http://support.microsoft.com/directory/article.asp?ID=KB;EN-US;Q141465&
Another product that performs WS2K3 monitoring offers a visual perspective—Quest Software’s
Spotlight on Windows. This product’s UI looks like it should also play CD-ROMs or mp3s, but
it actually provides an all-in-one view of how a server is performing, including items such as free
disk space and disk I/O (reads/second and writes/second). This tool offers more functionality
than Windows Performance Monitor provides, featuring an analysis of performance data and an
online tuning guide.
Storage Event Monitoring
When we are forced to be in reactive mode, which is inevitable as devices fail and software
crashes, the key is how quickly we can find out that there is a problem and how extensive the
information is that we can gather. Quite often we find that the failure was preceded by several
warnings, such as repeated instances of Event 9, source: scsi miniport driver, which states “The
device, \Device\ScsiPortX, did not respond within the timeout period,” followed by an Event 11,
source: scsi miniport driver, which states “The driver detected a controller error on
\Device\ScsiPortX.”
The WS2K3 Performance Monitor can be set to monitor many servers with a very infrequent
polling interval—just remember to change the service startup to use a domain account with
sufficient credentials. You can monitor free disk space on logical drives (by enabling disk
counters using the diskperf –y command, as discussed in previous chapters), or you can monitor
a counter such as system uptime, just to make sure the system is online.
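If you prefer a script to a Performance Monitor counter, free space is easy to poll directly. Here is a minimal Python sketch (the drive letters and threshold are hypothetical):

    # Warn when free space on a logical drive drops below a threshold.
    import shutil

    THRESHOLD = 0.10   # warn below 10 percent free

    for drive in ("C:\\", "D:\\"):
        usage = shutil.disk_usage(drive)
        free_fraction = usage.free / usage.total
        if free_fraction < THRESHOLD:
            print(f"WARNING: {drive} is down to {free_fraction:.0%} free")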
Another choice is a vendor-provided storage monitoring application, such as the Web-based
view of a direct-attached RAID controller, which Figure 7.1 shows. The information that this
figure shows is from Compaq Insight Manager, which provides information about devices
attached to the server: controllers, disks, storage boxes (cabinets), and so on, and is available at
http://www.compaq.com.
Figure 7.1: Web-based view of a direct-attached RAID controller.
This type of vendor-provided application is useful for storage event monitoring, as it shows
degraded and failed devices, which you can see in the Condition Legend. As the figure shows,
the controller is in a degraded state as an array is being rebuilt (the error code states Expand in
Progress). Where this product falls short is that a view or state must be determined for each
server and rolled up to a centralized hierarchy, and perhaps this model does not apply when you
are dealing with multiple applications sharing a pool of storage. So, we must also consider
storage monitoring from an application perspective.
Storage Application Monitoring
In the previous chapters, we have gone through the process of designing, testing, and installing
your deployment of an SRM application. At this point, we must address the questions, Who will
monitor the storage resource monitor? and How will we know that the SRM application is online
and performing its duties? The answers lie in another layer of monitoring—in application
monitoring and management. There are a wide variety of application-monitoring products,
including those focused on SAN device management, which we touched on in the past chapter.
The wide variety of storage and SAN monitoring tools presents several challenges. First, it
presents a variety of interfaces or methods of managing the storage, as there is little commonality
between the vendors. Second, vendors must develop a product that manages a wide variety of
devices that offer varying degrees of interoperability or have limited standards. Thus, the end
result is a multitude of specialized management applications with little centralization. At this
point in technology evolution, our best choice for centralization is a management and monitoring
application that relies on gathering information from the servers (and other similar devices that
include event logging, such as a SAN management appliance based on WS2K3) attached to the
SAN.
What will monitor the monitoring application—how will we know that the application-monitoring application is running? Fair questions; let’s take a look at one application-monitoring
package, MOM, which includes management packs to monitor itself.
MOM
So much press and publicity has been focused on MOM lately that I think it is beneficial for
storage architects and storage administrators to pay attention. I have heard critical reviews of
MOM’s difficulty and shortfalls, but as with any Microsoft product, the lessons from the field
will be turned into a better and perhaps more successful product. If you are unfamiliar with
MOM, it is a server- and application-monitoring product for which Microsoft bought the code
from NetIQ. So, if you are familiar with the NetIQ product functionality, MOM will be familiar.
If not, you might find getting started with MOM difficult and overwhelming. I’ll give you a
quick lesson in how to get started with MOM, and we’ll look at how MOM integrates or will be
integrated with storage management.
Perhaps the most difficult part of getting started with MOM is to meet the prerequisites. It is
doubtful that you have all of them in place. The server on which you choose to install MOM is
known as the central computer. This server will act as the database collection point and the
management console. It should be a member of a domain, but not a domain controller, or MOM
will refuse to install.
First, run Office Setup, and install the Office graphing component and Access 2000 (the full
version of Access 2000 is required for creating or customizing reports, whereas the run-time
version of Access 2000 is required to run and view reports). The run-time version of Access
2000 is available on the MOM CD-ROM in the \Intel\Access2000RT folder.
Next, update %systemroot%\system32\inetsrv\browscap.ini if you’re not using IE 6.0.
Supposedly (according to the MOM product documentation), this file is downloadable from the
Microsoft Web site, but I couldn’t find it, and the MOM setup application takes care of updating
browscap.ini. Optionally, you can install Outlook 98 or later to send email notifications through
Microsoft Exchange.
Next, install SQL Server 2000, and set a password on the sa account. If you’re installing MOM
on an existing SQL Server, run the
sp_helpsort
query to ensure that the sort order is case insensitive, and verify that the audit level of SQL
Server is set to None or Failure (check the audit level on the Security tab of the server’s
properties page). Ensure that the MSSQLServer, MSDTC, and SQLServerAgent services are
running and set to start automatically on computer startup.
MOM requires Microsoft Data Access Components (MDAC) 2.6 or later. As Figure 7.2 shows,
the MOM setup program will verify this prerequisite and give you the option to install MDAC
2.6.
Figure 7.2: MOM installation verifies prerequisites and can update MDAC.
Next, increase the log file size of the Microsoft Distributed Transaction Coordinator (MSDTC).
As Figure 7.3 illustrates, the MOM installation program gives you the option to increase the
MSDTC log file size, and it can launch the Component Services MMC for you. In the MMC,
right-click My Computer, and select Stop MSDTC. Right-click My Computer, and select
Properties to access the MSDTC log file settings. Increase the log file size as much as possible,
with 512MB being a recommended minimum for production environments (possibly on its own
drive array), and 64MB a recommended minimum for small or test environments. Clicking OK
to confirm the changes has the same effect as clicking Reset Log. On the pop-up warning
message, click Yes only if you are sure that it is OK to reset this log on your system. Finally,
right-click My Computer, and select Start MSDTC.
Figure 7.3: The MOM installation program gives you the option to increase the MSDTC log file size.
During installation, you might want to add Management Pack Modules as Figure 7.4 shows.
Figure 7.4: Adding Management Pack Modules during MOM installation.
The next step in setting up MOM is to add the servers that will be monitored. MOM will
discover the servers and push out agents to them if you authorize it. This process isn’t well
documented in MOM, so I have illustrated it. The first step is to right-click the Agent Managers
folder in the MOM Administrator Console, and open the properties, as Figure 7.5 shows.
Figure 7.5: Accessing the properties of the Agent Managers folder is the first step in selecting the computers
to be managed by MOM.
Next, on the Managed Computer Rules tab, click Add, which will take you to the window that
Figure 7.6 shows. This window lets you enter the domain name of the servers and a rule for
matching the server names. If you want to find all computers in the domain, simply enter an
asterisk (*). You can approve or reject the installation of MOM agents individually, so you don’t
need to worry about finding too many computers at this point (unless MOM has been previously
configured to install without confirmation, but that is not the default setting).
Figure 7.6: Selecting servers to monitor in MOM.
After MOM discovers the servers, you will see them listed in the Pending Installation folder
under Configuration, as Figure 7.7 shows. In this figure, I have three new servers on which to
install MOM, pending my approval.
Figure 7.7: List of computers pending installation of MOM agents.
Integration with Other Applications
As Figure 7.8 shows, you can use Microsoft Visio Professional 2002 or later to diagram a SAN,
a process that is made easier by an add-in called BrightStor SAN Designer from Computer
Associates. This product works with Visio and allows you to create even complex SAN designs
more easily, using either equipment from a specific vendor or generic SAN equipment icons.
Figure 7.8: Using Microsoft Visio and BrightStor SAN Designer to diagram a SAN.
Improving the System
Although maintaining the status quo and avoiding problems is a good starting point, there is also
a need to work to improve your systems. From the business standpoint, information systems are
designed to give your business competitive advantage. On the horizon, there are always
competitors to internal information systems departments—service providers—whose mission is
to sell to the business the same IT functions that you, as a network administrator, provide, but on
an external basis. For the service providers to be successful, they must provide competitive
systems offerings, such as more efficient operations at a lower cost. They can also provide
competitive advantages such as higher performance or guaranteed availability. If the business’
internal information systems operations cannot provide these desired advantages, you are in
danger of losing your job to service providers. One way to ensure job security is to maintain
availability.
Maintaining Availability
In working with customers, I’ve determined that the key to maintaining and improving
availability is to understand two components: the mean time between failures (MTBF) and the
mean time to recover (MTTR). Combined, MTBF and MTTR determine the system availability.
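To make the relationship concrete, the standard availability arithmetic (a general reliability formula, not tied to any particular vendor methodology) is

availability = MTBF / (MTBF + MTTR)

For example, a server with an MTBF of 2,000 hours and an MTTR of 4 hours yields 2,000 / 2,004, or roughly 99.8 percent availability; cutting the MTTR in half to 2 hours raises that to about 99.9 percent. This is why improving recovery time is often a cheaper path to higher availability than chasing ever-larger MTBF numbers.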
Improving MTBF
Most studies of system downtime, especially those focused on storage systems, list the top
causes of service interruptions as hardware failures, software crashes, and operator errors. So the
best way to ensure availability is to implement redundant hardware systems and provide
adequate operator training and change control. To protect against software crashes and any remaining hardware faults, you can use clustering technologies (change control also helps here by preventing untested configurations from being deployed).
There are many types of clustering in the Windows environment, from the Wolfpack clusters of
Microsoft Cluster Server (MSCS) to devices that run multiple servers in lockstep, such as
Marathon Technologies’ Endurance solution (http://www.marathontechnologies.com). Both of
these clustering options require specialized hardware that can add significantly to the cost. For
MSCS, the storage must be on a bus that can be shared, either external SCSI (which limits both the distance and number of hosts) or fibre-channel (which is more flexible but also more costly). Endurance requires a dual set of servers to separate the compute element from the I/O Processor (IOP), which maintains the storage and network connections. To connect the compute element and the IOP, the solution uses proprietary boards (Marathon Interconnects, or MICs), essentially as an extension of the system bus. This setup lets the compute element and IOP pair be redundant (for a total of four physical servers acting as one logical server) and separated at a distance, up to the acceptable latency limits of the fibre-channel interconnects. In the near future, we may see these proprietary interconnect boards replaced by industry-standard InfiniBand boards.
Another option is to use servers that mirror all internal devices, including processor and memory,
running all internal operations in lockstep. For any of these, the additional cost must be justified
against the desired improvement in availability, or at least the improvement in MTBF; if
something does go wrong on one of these specialized systems, you need to look at MTTR, and it
may be more difficult to recover than on a standard server.
Improving MTTR
The primary method for reducing MTTR is to ensure that suitable system and information
backups are being performed, and to ensure that recovery procedures are valid. The difficulty is
gauging the value of these operations against the cost of performing them. The experienced IT
manager or CIO knows the value of practicing recovery operations and decreasing recovery
times, but you can easily let this necessity fall behind in day-to-day priorities and activities.
File Share Security
Up to this point, I have touched briefly on NTFS security, so let’s consider this a final review of
the subject, and perhaps a final examination to see how well you do. Recently, I worked on a
project in which a shared directory was needed so that vendors and employees could place files
in it, but they could not browse the directory or open the files. I was surprised at how difficult
this process turned out to be for some systems administrators. Once you follow these steps, the
process will make sense as you relate it to your knowledge of inheritable NTFS permissions, but
if you perform the steps in the wrong order, the process will not work, which is what was
preventing the systems administrators from creating the secure drop share.
Creating a Secure Drop Directory
The design goal is a drop folder into which anyone on the network can drop files but whose contents only a select administrator can view or execute. For example, suppose the secure drop folder is called ztest. Two users, AB User and Secure (or groups of users instead of the individual accounts used in this scenario), can see the folder over the network, but a member of the local Administrators group must be logged on to view the folder's properties. The share ztest should be under the Full Control of the account Secure; it should be visible to AB User over the network and allow write-only access to this user. As Figure 7.9 illustrates, if AB User attempts to open the ztest folder, access is denied.
Figure 7.9: Attempts by a non-administrator to open the secure drop share ztest are denied.
However, as Figure 7.10 illustrates, any authenticated user can copy a file to the folder without being denied access.
Figure 7.10: Non-administrator users can copy a file to the ztest folder.
The process of creating a secure drop share is not that tricky, but it has a few steps that must be
done in the right order or it just won’t work. Figure 7.11 shows the desired permissions for the
Secure account, which will have Full Control access to the ztest folder.
Figure 7.11: Desired permissions for the Secure account.
Figure 7.12 shows the correct permissions for the AB User account, which allow the user write-only access to the ztest folder.
Figure 7.12: Desired permissions for AB User account.
Figure 7.13 shows what happens if you attempt to set the NTFS permissions during the share-creation process. This error message can discourage some administrators from attempting this configuration. As the figure shows, if the steps are performed in this order, you cannot even create the share!
Figure 7.13: The error message that results when permissions are applied during the shared folder creation.
Instead of using Windows Explorer to create shares, you can use the Computer Management
MMC snap-in, which Figure 7.14 shows. Doing so has two advantages: you can create shares on a remote computer, and setting permissions on the folder is easier.
Figure 7.14: Using the Computer Management snap-in to remotely create a shared folder.
The following steps walk you through how to create and configure the ztest folder:
1. Create the folder on an NTFS partition, as Figure 7.15 shows.
Note that the default permissions are Everyone Read for a newly installed server, but your default permissions may be different depending on your configuration and whether your server was upgraded. If you have Everyone Full Control, you can leave it for now.
2. Share the folder either from the computer using Windows Explorer or remotely using the
Computer Management snap-in.
3. Modify the Share Permissions to be Everyone Full Control. The default permissions
provide only Read ability for the Everyone group.
Figure 7.15: Creating the shared folder.
4. Open the newly created share properties and select the Security tab, as Figure 7.16
shows. Remove the Everyone group if it exists.
Figure 7.16: The default share permissions.
If you used Windows Explorer to create the share, you usually would not be able to remove the Everyone group. As Figure 7.17 shows, an error message occurs if you attempt to remove the Everyone group from a folder that inherits permissions.
Figure 7.17: Error message that results from attempting to remove the Everyone group.
The big advantage of using the Computer Management snap-in to create and configure the
shared folder is that the snap-in automatically clears the Allow inheritable permissions check
box. However, if you are using Windows Explorer, you can work around the error message in
Figure 7.17 using the following steps:
1. As Figure 7.18 shows, if you’re using Windows Explorer to create and configure the
share, you must clear the Allow inheritable permissions check box before you can
configure the permissions. As the figure shows, the permissions check boxes will be
grayed until you clear this check box.
Figure 7.18: Clear the Allow inheritable permissions check box to configure the folder’s permissions.
2. After you clear this check box, you will be presented with the pop-up message that Figure
7.19 shows. Click Remove in this dialog box.
Figure 7.19: The pop-up message that results from clearing the Allow inheritable permissions check box.
3. In the Customize Permissions window, which Figure 7.20 shows, clear the Read check
box under Allow, but leave the Write check box under Allow selected.
Figure 7.20: Setting write-only permissions for the secure drop share.
As Figure 7.21 shows, AB User cannot even see the size of files and folders.
Figure 7.21: AB User cannot see the size of files and folders.
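For administrators who prefer the command line, the following sketch approximates the same configuration (the path and account names are placeholders; net share's /GRANT switch requires WS2K3, and cacls /G without /E replaces the existing explicit permissions rather than editing them, so verify the resulting ACLs against Figures 7.11 and 7.12 before trusting the share):

md C:\ztest
net share ztest=C:\ztest /GRANT:Everyone,FULL
cacls C:\ztest /G Secure:F ABUser:W

The W permission in cacls approximates the write-only setting that Figure 7.20 shows, and account names containing spaces, such as AB User, need quoting. As with the GUI procedure, test with a non-administrator account before relying on the share.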
Ongoing Process of Storage Management
From here, the process of storage management will consist of everything from designing and
deploying the appropriate storage systems to developing the techniques to manage them.
Traditionally, storage management starts with defining the types of information that will be
stored, and developing the appropriate type of storage to house it. Next, information is classified
and prioritized so that appropriate protection and disaster recovery procedures can be
implemented. Once the storage is online and adequately protected (from both a security
perspective as well as fault protection, such as RAID), the storage monitoring begins, ensuring
availability and performance. Integrity of the information will need to be maintained as well
(protecting the data from corruption as well as ensuring that the information is necessary for the
business). Finally, future storage requirements will need to be anticipated and met; this process
involves selecting the best technology and making sure that the technology is distributed and
used wisely. Table 7.3 will help you keep track of the recurring tasks that comprise storage and
information systems management.
Priority 1: Monitor server and storage availability. Recurrence: real-time. Time needed: hours, unless automated. Details and example: verify that servers are online and not reporting any failed hardware or predicted hardware failures (such as members of RAID disk sets), verify that services are running, and verify that disk storage is available to users and applications. Resources: SRM tools, WS2K3 Performance Monitor, application monitors, hardware reporting tools, and the system information in the Computer Management console.

Priority 2: Monitor information flow. Recurrence: periodically, such as a ping every hour. Time needed: hours, unless automated. Details and example: ensure that network transport is active and mail queues or file transfers are not queuing up. Resources: application- or transport-specific automation tools.

Priority 3: Troubleshoot and support. Recurrence: daily, on demand. Time needed: 2 to 4 hours. Details and example: respond to support requests on the end-user or server level. Resources: Help desk support system.

Priority 4: Document change control. Recurrence: daily, as changes occur. Time needed: 30 minutes. Details and example: record changes to the servers, storage, and network environment. Resources: auditing or surveying tools.

Priority 5: Perform backups. Recurrence: daily and weekly. Time needed: 4 to 8 hour window; hands-on time is minimal. Details and example: application data and system state backups. Resources: NT Backup and other backup software.

Priority 6: Review security logs. Recurrence: daily, as changes occur. Time needed: 30 minutes. Details and example: review Windows event logs and application-specific logs. Resources: event log filtering and monitoring tools that read the event logs (for example, MOM).

Priority 7: Monitor storage utilization. Recurrence: at least weekly; automation makes this process more real-time. Time needed: hours, unless automated. Details and example: check for available free disk space, perform usage forecasts (trend analysis), and run reports on duplicate, aged, and unwanted file types. Resources: SRM tools and WS2K3 Performance Monitor.

Priority 8: Routine maintenance. Recurrence: weekly. Time needed: 1 to 2 hours per system. Details and example: includes offline defragmentation, removing temporary files, and so on. Resources: disk defragmenter or database-application-specific utilities.

Priority 9: Patch or update systems. Recurrence: monthly, or more often if there is a security issue. Time needed: 1 hour per system; schedule change control if updates involve downtime. Details and example: hotfixes and service packs for OSs, and firmware updates such as ROM flashes. Resources: Windows Update, security bulletins, and the QChain tool for applying WS2K3 hotfixes.

Priority 10: Document environment and current project status. Recurrence: daily or weekly, as changes occur. Time needed: 30 minutes. Details and example: update the documentation of the network and storage systems and prepare reports for management. Resources: Visio, SMS, and other tracing tools; project status emails; and Microsoft Project Gantt charts.

Priority 11: Monitor server and storage performance. Recurrence: at least weekly; automation makes this process more real-time. Time needed: 15 to 30 minutes. Details and example: measure the storage performance compared with the baseline or last-known state: is performance adequate? Resources: WS2K3 Performance Monitor and hardware-specific tools such as SCSI or fibre-channel diagnostic utilities.

Priority 12: Review directory. Recurrence: weekly. Time needed: 15 to 30 minutes. Details and example: check for inactive user and computer accounts. Resources: depend on the directory (for example, Active Directory Users and Computers for AD).

Priority 13: Validate backups. Recurrence: monthly. Time needed: 2 to 6 hours. Details and example: perform an offline server recovery and ensure that backups and recovery procedures are valid. Resources: backup software and standby recovery systems.

Priority 14: Perform security audit. Recurrence: annually or quarterly. Time needed: 8 hours. Details and example: perform intrusion-detection audits and attempts to breach security. Resources: intrusion detection tools, security consultants, and the Microsoft security toolkit.

Priority 15: Research new technologies. Recurrence: annually, as needed. Time needed: varies. Details and example: keep up-to-date on improvements in technologies and update professional certifications. Resources: Web sites and email subscriptions such as InfoStor news and Storage UPDATE from the Windows & .NET Server Magazine network.

Priority 16: Upgrade systems. Recurrence: annually, as needed. Time needed: entire days. Details and example: ongoing upgrades to servers and storage systems. Resources: keep track in a configuration log, including needed changes.

Priority 17: Pointless meetings. Recurrence: too often. Time needed: eternity. Details and example: keep a sense of humor here, I'm only joking! Resources: Dilbert books and cartoons.
Table 7.3: Storage systems management recurring tasks.
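As one small example of automating the highest-priority tasks in this table, the WS2K3 schtasks utility can schedule an hourly reachability check with a one-line command (the task name, server name, and log path are placeholders; a real deployment would more likely use MOM or another monitoring tool):

schtasks /create /tn "HourlyPingCheck" /sc hourly /tr "cmd /c ping -n 2 server01 >> C:\monitor\server01.log"

The same pattern can schedule free-space checks or SRM report scripts.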
Summary
The goal of this chapter is to aid you in developing a daily approach to SRM that automates
repetitive tasks, such as monitoring disk usage with the SRM software that we have discussed. This approach will free your time for other crucial tasks, such as maintaining the security defenses that we explored. We finished the SRM deployment by setting up systems to
monitor and maintain the SRM solution. We covered the technical aspects of what you need to
monitor and what solutions are available. I gave you a complete list of systems management
recurring tasks that you can use to make sure that you have all your operations in place,
including SRM functions. Without a daily approach, important tasks might get squeezed out, as
you only have so much time, and must continually fight to ensure that your priorities match those
of the business.
In the next chapter, we will look at the future of storage and SRM, including both hardware
technology and software changes. First, we’ll look at the immediate future—at changes that are
happening all around us—that you’d be wise to learn about and consider. Then I’ll take a more
predictive look into the future and attempt to divine what the predominant or surviving
technologies and standards will be.
Much of the next chapter will focus on networked storage, as that is clearly where the most
improvement and increases in adoption will occur. In the area of hardware, we’ll look at changes
in speeds and feeds as we get faster pipes and possibly even greater distances. One of the
upcoming changes is in virtualization of devices and storage, which we touched upon earlier. In
the next chapter, we will also look at what these changes mean from a storage management
perspective. We will look at the server side of storage networks, changes in host bus adapters
(HBAs), booting from the SAN, and multi-path I/O and what it means for performance and fault
tolerance.
No discussion would be complete without covering disaster recovery, so we will look at distance
mirroring, cloning, snapshots, and serverless backup. Some of these technologies exist today,
albeit in their infancy, so we will look at where they will need to go to speed adoption.
Chapter 8: SRM and Storage Futures
In this final chapter, we will look at the future of storage and SRM, including both hardware
technology and software changes. First, we’ll look at the immediate future, at changes that are
happening all around us that you may be wise to learn about and consider. Then I’ll take a more
predictive look into the future and attempt to divine what the predominant or surviving
technologies and standards will be. Much of this chapter will focus on networked storage, as
clearly that is where the most improvement and increases in adoption will occur.
In the area of hardware, we’ll look at changes in “speeds and feeds” as we get faster pipes and
possibly even greater distances. Some changes will be more about being able to achieve greater
performance over greater distance; but what we are really interested in for this chapter is how
these changes will affect storage management. Will they make it better or worse? One of the
pending changes that will definitely improve storage management is Directory Enabled
Networks (DEN) or, more precisely, directory-enabled storage networks.
One of the upcoming changes is in virtualization of devices and storage, which we touched upon
earlier. In this chapter, we will also look at what virtualization means from a storage-management perspective. We will look at the server side of storage networks, changes in host
bus adapters (HBAs), booting from the SAN, and multi-path I/O and what it means for
performance and fault tolerance. No discussion would be complete without covering disaster
recovery, so we will look at distance mirroring, cloning and snapshots, and serverless backup.
Some of these technologies exist today, albeit in their infancy, so we will look at where they will
need to go to speed adoption.
One thing to keep in mind during this chapter is the idea of principles over protocols—that is,
keep the business value in mind whenever you are investigating new technology. To give you a
concrete example, some new technologies promise to give you the ability to bridge storage
islands (which may be defined or isolated as such by specific protocols or cabling). Perhaps this
technology is of interest to you, but what will be the benefit to the business? Do the applications
running in each of these islands really need to be bridged or are they better off in the safety of
isolation?
Immediate Future
This section covers changes to your environment that you may face immediately or over the next
6 months. These are not wild predictions; instead, they offer guidance on where you can define
your storage strategy. Many of these technologies are currently available to you.
The Future of SRM
With storage capacities and storage consumption growing at phenomenal paces, SRM will play
an increasingly important role in the organization. Storage administrators will need to manage
complex environments without regard to whether the storage is SAN or NAS or whether the
storage is connected to server X, Y, or Z. The desire is for single-seat administration using a
single set of management tools; however, that dream is a long, long way from reality.
In the near future, the best that we can hope for are changes in the management capabilities at the
lower levels of the storage architecture (from the disks, controllers, switches, host adapters, OS,
and applications) that enable information to roll up to the directory services. The first phase will
be enabling the view of the entire enterprise storage infrastructure, followed by the ability to
apply storage resource policies to every level of configuration detail.
Storage-Management Utilities
The following Win2K and WS2K3 utilities will make disk management easier for you. WS2K3
ships with additional command-line tools and utilities that were previously available only in the
resource kits. Many of these are disk- and storage-related (for example, Freedisk, which lets a command run only if a specified amount of free disk space is available, and TakeOwn, which lets administrators take ownership of orphaned files).
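As a quick sketch of how these two tools might be used in a batch file (the threshold, script, and file paths are placeholders; Freedisk returns success only when the requested space is free):

rem Run the nightly report only if at least 500MB is free on the current drive.
freedisk 500mb && cscript C:\scripts\usage-report.vbs

rem Take ownership of an orphaned file so that its permissions can be repaired.
takeown /f D:\users\orphaned.doc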
DiskPart
DiskPart lets you manage disks, for example, by extending a disk volume while the storage is
online to the OS. DiskPart is fully scriptable, using the syntax
Diskpart /s <script>
Figure 8.1 shows the commands for using DiskPart, which is also useful for rescanning the
server to detect any devices that have been presented from a SAN. For example, after breaking
off a Business Continuance Volume (BCV—such as a clone) and presenting it to the host, we
use DiskPart to detect the new drive and mount it.
Figure 8.1: DiskPart commands for managing a disk volume (Windows XP version).
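For example, a minimal DiskPart script for the BCV scenario just described might look like the following (the volume number and drive letter are placeholders; run list volume first on your own system to identify the new volume):

rem mountbcv.txt -- run with: diskpart /s mountbcv.txt
rescan
list volume
select volume 4
assign letter=R

Scripting the sequence this way reduces presenting a clone to a backup host to a single scheduled command.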
DiskPart is available for Win2K by download or as part of the Recovery Console (installation instructions follow) as well as in the default installation of Windows XP and WS2K3. Be careful to use the version appropriate to your OS, as there can be differences in how the versions operate (see the following note). In Win2K, the DiskPart command is available only in the Recovery Console, so in a production environment, most of the benefit of online disk changes applies to WS2K3 systems.
Microsoft previously released an earlier version of DiskPart on the resource kit Web site. To verify
that you have the correct version of DiskPart, check the properties. Earlier versions of the
DiskPart.exe file have a file version of either 0.52 or 1.0 and the later version has the following
properties:
Created: September 21, 2001
Size: 146,432 bytes
File version: 5.1.3553
To install the Recovery Console as a startup option in Win2K, insert the Win2K CD-ROM, and hold
down the Shift key to prevent the CD-ROM auto-run feature from running or wait for the auto-run
feature to bring up the installation options. Close the installation wizard, run a command prompt, and
type the following
x:\i386\winnt32.exe /cmdcons
where x is your CD-ROM drive letter. If you have the bits copied to disk, you can run the installation
directly from the hard drive. Answer Yes to the prompt that Figure 8.2 shows, and installation will
begin.
Figure 8.2: Installing the Recovery Console.
The installation won’t prompt you to reboot your system, but the Recovery Console will be available
as a boot option the next time you reboot your system. The installation did not prompt me for the SP2
source location, so I recommend running the installation from a Win2K source that has had SP2
slipstreamed in (by running the update utility from the extracted service pack
i386\update\update.exe -s:<dir>
where <dir> is the location of your Win2K source files).
DiskPart can also add or break mirrors, assign or remove a disk’s drive letter, create or delete
partitions and volumes, convert basic disks to dynamic disks, import disks and bring offline disks
and volumes online, and convert master boot record (MBR) disks to GUID Partition Table
(GPT) disks. The options under CONVERT for DiskPart are as follows:
• BASIC—Converts a disk from dynamic to basic
• DYNAMIC—Converts a disk from basic to dynamic
• GPT—Converts a disk from MBR to GPT
• MBR—Converts a disk from GPT to MBR
For information about GPT disks, see the “GPT Disks” section later in this chapter.
Just because you can run it from a command line or script does not mean that it will not destroy your
data! Always test your backup before you perform these types of disk operations!
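To illustrate, a conversion script follows this pattern (the disk number is a placeholder; the clean command removes all partition and volume information from the selected disk, which is exactly why the disk must be empty or expendable):

rem Convert an empty data disk from MBR to GPT on 64-bit WS2K3.
select disk 3
clean
convert gpt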
Fsutil
Fsutil is another command-line utility that an administrator can use to perform file-system-related tasks, such as managing reparse points, managing sparse files, dismounting a volume, or
extending a volume. Figure 8.3 shows sample Fsutil commands and usage for the quota
command.
You must be logged on as an administrator or a member of the Administrators group to use fsutil.
Figure 8.3: Fsutil quota commands.
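A few representative commands follow (drive letters, file paths, sizes, and the account name are placeholders; the quota threshold and limit values are specified in bytes):

fsutil quota query C:
fsutil quota track C:
fsutil quota modify C: 3000000000 4000000000 DOMAIN\user1
fsutil volume diskfree C:
fsutil sparse queryflag D:\data\bigfile.dat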
Enhanced Device Support
Although SAN support has been available for quite some time (from back in the NT days), a great number of improvements are yet to be made; for example, in how the OS handles new devices on the SAN. From what I have seen and heard, the storage subsystem will be able to use much more of the bandwidth available in new devices such as HBAs. Vendors will need to provide new miniport drivers, however, and you may not be able to upgrade the underlying OS without first getting new storage device drivers.
GPT Disks
In the 32-bit Intel world, we are used to using MBR disks, but the 64-bit Intel world brings a new
type of disk known as a GPT disk. Starting with the Intel Itanium processor, the new 64-bit
servers use the Extensible Firmware Interface (EFI), instead of a BIOS, between the computer's firmware, hardware, and the OS.
Disk partitions on basic GPT disks are created by using the EFI firmware utility diskpart.efi, the diskpart.exe command-line utility, or the Disk Management MMC snap-in in 64-bit WS2K3. You will be able to manage both MBR disks and GPT disks in the Disk Management MMC snap-in. However, on a 32-bit machine, a GPT disk appears as a basic MBR disk with a single partition, and the data cannot be accessed, as there is no translation to the MBR disk format that the 32-bit machine can understand. Understandably, you cannot easily move GPT disks between 32-bit and 64-bit machines. The 64-bit servers can certainly house MBR disks, but they must boot to a GPT disk. Interestingly, you can combine MBR and GPT disks in dynamic disk groups, but I find it highly unlikely that you would want to do so. Converting between MBR and GPT is data destructive, so it should be done only on a disk that is considered empty.
MBR disks support volumes as large as 2 terabytes (TB) with as many as four primary partitions
per disk or three primary partitions and one extended partition with unlimited logical drives.
GPT disks support volumes as large as 18 exabytes in size and as many as 128 partitions per
disk. Also, GPT disks keep redundant primary and backup partition tables for fault resilience.
Cluster Cluster
This subhead illustrates the redundancy of clustering. However, clustering in Win2K and
WS2K3 has been a mixed success. At first, it looked rosy, especially compared with the
difficulty of clustering in NT. In the application arena, SQL Server 2000 has done well, but
Exchange 2000 Server has not, with severe limits being placed on users per server in active-active clusters. Other services, such as file and print and DHCP, have enjoyed fair success, though
they aren’t as easily cost-justified in clustering as more business-critical applications.
WS2K3 clustering has increased the number of nodes in a cluster, up to eight nodes in WS2K3
Enterprise Edition, for example. Another new clustering enhancement in WS2K3 is in the area of
geographically dispersed clusters. With four nodes or more, you can have a pair of failover
partners on either side of the geographically dispersed locations (called a majority node set
cluster). Because failovers most often result from a server hardware fault or an application crash, they should be handled by the local failover partner; you don't want to fail over to the remote cluster unless a catastrophic failure causes loss of the data center.
A potential improvement I’d like to see for a future version of Windows Server is the inclusion
of clustering support in the base OS. With Win2K and WS2K3, you add the clustering
component and reboot to load the driver. If you are having trouble with a node, you must evict
the node and reboot to remove the driver. Microsoft may realize that this requirement is less than
desirable, especially in an environment in which the customer has chosen to spend extra money
on clustering to gain higher availability. In addition, when the clustering driver is assumed to be
installed by default, running compatibility tests, such as for filter drivers in antivirus
applications, is easier. Clustering is, of course, now supported in the WS2K3 64-bit environment,
but all nodes need to be 64-bit, as it is not possible to mix 64-bit and 32-bit nodes in the same cluster.
Yet another change to look for down the road is in how the quorum resource is handled. One of
the prerequisites for installing a cluster is to have access to a device on the shared storage
system. In Win2K, if the cluster installation wizard does not detect an external device, it will not
allow cluster installation. By changing to the concept of a majority node set in WS2K3, the
quorum can be kept on a local disk volume and other cluster members look to that source to be
kept in sync. This setup allows cluster installation before the node is on the SAN, and allows for
a greater number of nodes in a geographically dispersed cluster. However, not all WS2K3
clusters are majority node set clusters; the restrictive external storage device requirement for
traditional clusters remains, even under WS2K3.
Although I use the phrase shared storage system, this terminology can cause some confusion as the
Microsoft Cluster Services (MSCS) use a shared-nothing model. Perhaps a better phrase would be
networked storage.
Another badly needed change in Microsoft clustering is a change in the cluster arbitration
mechanism. Current Windows clustering technology uses a bus reset, which is extremely
disruptive in a SAN environment and should only be used as an absolute last resort.
Finally, one other area that needs improvement in clustering is the ability to break past the 22
drive-letter limitation when configuring clusters. No, that is not a typo—I know that our alphabet
contains 26 letters, but 4 letters are usually consumed by the floppy drive, boot drive, CD-ROM,
and the quorum, leaving 22 letters. You might rarely encounter this 22-letter limitation, but I
have run into it in a few very large database environments. The solution to this problem is to
allow volume mount points in clusters so that a database volume can be mounted as a folder on
an existing drive letter.
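Outside of clusters, Win2K and WS2K3 already support this technique through NTFS volume mount points, which the mountvol utility manages; the sketch that follows grafts a volume into an empty NTFS folder (the folder path is a placeholder, and the volume GUID shown is a dummy; running mountvol with no arguments lists the real volume names on your system):

mountvol
md E:\SQLData\Log1
mountvol E:\SQLData\Log1 \\?\Volume{0a1b2c3d-0000-0000-0000-100000000000}\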
SAN Boot
Although technically possible in Win2K and WS2K3, placing the boot drive (usually C) on a
SAN device has been recognized by Microsoft as a feature that needs to be officially supported
in upcoming Windows Server releases. I remember the first time that I built a Win2K Server
without any internal disk drives. At the time, the ability to SAN boot had just become available
in the latest firmware of the Fibre-Channel HBAs. It was a strange sight to see a rack full of
servers running without internal disks. However, it worked without a hitch, and we were able to
fulfill our primary mission, which was to boot the servers to disks on either side of a remotely
mirrored storage network. This capability is the type of flexibility offered by SAN boot—the
ability to place an OS image on the SAN, enable access to a server, and bring the server online.
There is potential for the large data center in being able to manage an OS image library—
bringing new servers online in a matter of minutes to meet growing or shifting demand—rather
than using the traditional setup build processes. I’ll give you more information later about why
this ability is important from a storage manager’s standpoint.
Multipath I/O
When placing multiple storage adapters in a server, it is necessary to use some form of vendor-provided software to manage the multiple data paths. Typically, software such as Compaq
Secure Path and EMC PowerPath is used to ensure that failover performs without any
interruption to the data flow. But the greater benefit of multiple adapters goes beyond fault tolerance: both paths can be combined into one wide data path. This type of functionality could actually be provided at the OS level, detecting multiple paths and managing the failover.
Volume Mounting
One method of finding out how much others on your SAN know about proper volume security
(LUN masking or selective storage presentation) is to bring up an NT or Win2K server on the
SAN. This method is not the best, and I do not recommend doing it if you have any doubts about
the safety of other hosts’ disks. Windows servers are known for claiming any disks that they can
see as their own, attempting to mount them, and even requesting that a disk signature be written.
The issue is that NT was developed back when the majority of storage was direct attached, and
the implications of SAN storage were not fully realized.
Another feature that would be nice in the next-generation Windows server would be more
administrative control over whether Windows attempts to mount any disk that it can see.
Granted, if the disk can be seen, proper volume security (LUN masking or selective storage
presentation) may not be in place; however, you should still have control over whether the disk
belongs to that particular host and should be mounted.
DAS vs. SAN vs. NAS
Why consider networked storage (SAN or NAS)? Perhaps you just know that your company or
department would benefit from the technology, but you need to further explain it to higher-level
decision makers in your organization. Networked storage technology must meet the requirements
of your business application or it is just expensive, complex technology, which in the wrong
hands can put your business information at risk. The following questions capture enterprise organizations' common business requirements for a separate storage network:
• Do you copy or move large files over the network? Is this usage of the network adversely affecting other business functions?
• Do you back up or restore a large amount of data over the network? Does the restore take longer than acceptable SLAs?
• Is the network a source of congestion for these operations (especially for recovery operations, where congestion may mean failure to meet SLAs)?
• Do you need to share data amongst multiple servers, such as in a clustered environment?
• Is data (or more appropriately, information) so critical to your business function that it needs to be replicated to another location (in case of fire, flood, or earthquake damage at the primary site)?
• Would the business benefit from the ability to present the logical disk units to multiple hosts with few configuration changes (for example, in a disaster recovery situation)?
• Is the recovery window so critical that the cost of BCVs (cloning or snapshotting data volumes) is a sound business investment? The cost of BCVs includes the additional disk space used, such as triple-mirroring or a virtual drive pool; the administrative overhead to develop, test, and manage the solution; and any additional hardware or software required to support the solution.
Now let’s turn to NAS. There have been several recent developments that will impact the future
of NAS. Fibre-Channel SANs have had a head start, and NAS has been playing catch-up. To
some degree, Fibre Channel is synonymous with SAN. NAS has excelled at offering cheaper
storage, as long as it is file-level access that you need. This functionality is fine for file sharing
but not for relational databases. What NAS has lacked, it is beginning to provide—block-level
access to data. There are NAS devices that provide a block-level driver to ensure write-ordering
on the storage devices, which is usually not possible using the TCP/IP network redirector.
Recent market studies have shown greater adoption of both SAN and NAS as opposed to DAS.
Although I have long been a fan of DAS for its initial low cost and the high performance of
newer DAS devices, the high cost of DAS's ongoing management will steer many companies to NAS or SAN instead. In addition, the differences between SAN and NAS will begin to lessen as we see higher-performance NAS and easier-to-configure-and-manage SAN environments. Although we will eventually see convergence of SAN and NAS, it won't take place for some time, and in the meantime, we must consider interoperability the primary mission. Table 8.1 compares common storage obstacles for DAS, SAN, and NAS.
Storage network. DAS: Isolated storage may mean over-utilization or under-utilization. SAN: Requires a separate Fibre-Channel network; the capacity of 1Gbps and 2Gbps networks will soon be surpassed by 10Gbps Ethernet (if it can move the same amount of data). NAS: Must redesign your IP networks to ensure that any problems, such as security and saturation, do not affect storage.

Data mode. DAS: Block mode, but unable to share devices. SAN: Block mode; not designed to share data across applications (unless managed by a cluster service). NAS: File mode; not appropriate for some applications, such as high-end database servers, although vendors are starting to provide block-mode filter drivers.

Difficulty. DAS: Easiest to set up, except when large amounts of storage are needed; becomes the most difficult to manage in the long run. SAN: Most difficult to set up; requires specialized knowledge (expert roles from many vendors, especially to interoperate). NAS: Main difficulty is in getting NAS to work with applications and environments that may demand high performance and near-zero latency.
Table 8.1: Common storage obstacles to overcome.
The Demise of NAS?
The demise of NAS is a difficult scenario to envision; however, there is some industry
speculation that NAS will fill only a niche market as storage is dominated by SAN. After the
previous explanation of NAS becoming more SAN-like by providing block-level access to data, I
offer this counterpoint: The big changes in SANs that will lessen the ability of NAS to compete
are virtualization and volume shadow copies. Virtualization in the SAN may allow for more
efficient utilization of those expensive SAN resources. And volume shadow copies allow the
data to be replicated over distance and protected from data center disasters. I find it hard to
believe that SAN could dominate NAS, but I bet you will find someone arguing just this point in
a storage planning meeting over the next few years.
Interoperability
Although each storage vendor has its own interests centered on a line of products, your interest as the customer is for the storage to interoperate, or work well together. Multi-vendor
interoperability is a key future direction, and some vendors have a definite interest in ensuring
compliance and are investing resources in that effort. For example, the Supported Solutions
Forum is a multi-vendor group sponsored by the Storage Networking Industry Association
(SNIA). The forum is designed to bring together competitors such as EMC, IBM, and Compaq to
create solutions involving servers, HBAs, and storage components such as switches and backup
devices.
SAN and NAS Interoperability
Traditionally, deployments of storage networks have focused on SAN devices and excluded
NAS devices; however, the technological reasons for this gap are disappearing, which will soon leave only organizational and political reasons. There are existing devices designed to connect
SAN and NAS networks. Future gateway devices will allow protocol bridging or routing
between SCSI, Fibre-Channel, and enterprise system connection devices, in addition to the
newly emerging iSCSI for storage over IP. What will emerge is the greater ability to classify and
organize storage based on qualities such as
• Cost
• Performance
• Availability and fault tolerance (through RAID levels and redundant hardware, which are intrinsically aligned with cost)
The ability to present multiple classes of storage to a server is highly beneficial. The direct
benefit to you will be that you can prioritize and provision storage based on how critical the data
is to business and the ROI to the organization.
SAN Management API
A recent trend is that storage-product vendors are exchanging information on managing storage
devices (not necessarily by giving the information away; they often charge for it). Did you ever think that you would see the day when rivals EMC and Compaq would be sharing information? Such was the case recently when the two companies decided that the need for storage management (for either company's hardware) was a higher calling that required their cooperation. The results of the joint effort will be that Compaq will add an Element Manager to
its SAN appliance to manage the EMC Symmetrix initially and the EMC CLARiiON later. On
the other side, EMC will use both the Compaq APIs and the SANworks Command Scripter
(which provides a command-line interface at the host) in the EMC AutoIS SAN-management
software. The other development to watch is a common model, such as the SAN Management
API version 2.0, and the work groups at the SNIA.
For more information about storage-group developments, go to http://www.snia.org/ and
http://www.T11.org/. There are two excellent white papers (although the information may be getting
old by now, the overall framework still applies) at
http://www.snia.org/English/Collaterals/Whitepapers/Shared_Storage_Model.pdf and
http://www.snia.org/English/Collaterals/Whitepapers/SANWP2.PDF
Hardware Technology’s Future
In this section, we will look at some pretty sure bets in the future of hardware technology, and
how these developments will impact storage management.
Speeds and Feeds
Let’s look at some of the imminent changes in getting data to and from storage. We will see
some changes in the hard wiring of storage networks. But will we see wireless SANs? My first
reaction is to say doubtful, but we have recently seen wireless speeds increase, especially in burst
mode over short distances, which could eventually be used for SANs. Perhaps I’ll look back one
day at my doubt and laugh, just as I laugh at my doubt in 1991, when I read that the typical
personal computer in 1999 would have more than 100MB of memory.
2Gbps Fibre Channel and Beyond
2Gbps Fibre Channel is essentially here, just as Gigabit Ethernet to the desktop may be here, but
that doesn’t mean that you and I have it! The change to 2Gbps Fibre Channel will require
upgrades to our HBAs and switches, and our Gigabit Interface Connectors (GBICs—the
converter between the electrical and optical signals) will need to be changed to Small Form-Factor Pluggable (SFP) transceivers, which require a new connector on the fiber cables. Unlike
Ethernet equipment, a Fibre-Channel fabric will fall back to the lowest speed of a device
communicating on the fabric, so if you do not upgrade all devices in the data path, you will not
benefit.
Some people may wait for the next big jump in speed (for example, 10Gbps) before upgrading
from 1Gbps. As we covered in the previous chapter about storage performance, your applications
may not even benefit from the upgrade. It is analogous to upgrading from Ultra2 SCSI to Ultra3
SCSI and increasing bandwidth from 80MBps to 160MBps: This upgrade provides an increase in
throughput, but your application may be constrained by disk I/O operations or even disk
response times. In fact, in testing the new 2Gbps hardware, it is very difficult to push the
performance test hard enough to demonstrate the throughput capabilities of 2Gbps—it takes such
a huge array of disks to fill a 200MBps pipeline that the only real way to max out the throughput
is to use a large cache so that you are reading the bits from solid state memory instead of
spinning magnetic disks.
2Gbps Fibre Channel may be the standard for future implementations, especially if you’re setting
up your first SAN and are buying new equipment. However, upgrading from 1Gbps to 2Gbps
does not make sense unless you know that your existing Fibre-Channel fabric is the bottleneck. It
will be an estimated 2 to 3 years before you need to look into 10Gbps Fibre Channel.
Fibre-Channel Topology
In the near future, Fibre-Channel switches will drive down prices into the current range of Fibre-Channel hubs, slowing the adoption rate of looped environments or driving them to specialized tasks such as isolated clusters. Of course, some of them may end up under your desk! Arbitrated loop environments may be fine for learning the ropes of Fibre Channel, but they will not get you into fabrics, zoning, and multi-host environments. In addition, port densities on Fibre-Channel switches are increasing to the range of 256 ports (albeit at a price premium).
10Gbps Ethernet
Storage networks have been leapfrogging each other in maximum speed, with 2Gbps Fibre
Channel surpassing Gigabit Ethernet. Meanwhile, work is being done to push Ethernet networks
to 10Gbps Ethernet. Even though the capacity of 1Gbps and 2Gbps networks will soon be
surpassed by 10Gbps Ethernet, the determination will be whether it can move the same amount
of data. Theoretically, a 10Gbps link can fill even the largest of today’s hard drives in a matter of
minutes! However, Ethernet-based storage typically has much higher processing overhead than
Fibre Channel, including packet re-sends. Much of this processing is moved to a processor
onboard the NIC, yet we still see some server CPU overhead remain, and the impact of this
overhead for 10Gbps or 1.25GBps has yet to be seen. So, the end result in data throughput may
be that 2Gbps Fibre Channel is still a viable competitor for 10Gbps Ethernet for a few more
years.
Volume Management
Changes in how disk volumes are secured and how host access is determined may be
forthcoming. Arguments are being made in the industry that the old method of securing access by HBA address (the adapter's worldwide name) may not work in a directory-enabled storage network. Some of the access
may be determined by use of a SAN appliance on the Fibre-Channel fabric.
The Role of HBAs
What role will HBAs play in the future of storage management? Any device that is connected to
a storage network can play a pivotal role in the gathering of information about other attached
devices. Already, we have seen changes in HBA abilities have an impact, for example, in
providing SAN boot capability. In the next section, we will also see how HBAs can play a role in
virtualization.
Virtualization
Probably the biggest change in hardware technology is that of storage virtualization. This change
will impact everything from performance to how storage units are created and how storage is managed. I opted to include virtualization in this section rather than the earlier section about the immediate future because virtualization is such a drastic shift (in essence, one of the few places where it would be appropriate to use the phrase "paradigm shift" and not be in a Dilbert cartoon).
Virtualization is really in its infancy, and we will see much to come in this area. I can understand
virtualization when it comes to disks and have worked with new storage systems that abstract or
virtualize some disk array creation, but imagine what is possible when entire pools of storage
including disk, optical, and tape are virtualized and categorized to support an HSM system.
The central concept of virtualization is to create a large pool of storage resources and control
host access, presenting only what is needed or requested. Virtualization requires upgraded
capabilities in the SAN, either in the form of new storage systems and a SAN appliance or new
drivers for the OS and HBAs. There are several different methods of implementing storage
virtualization, usually either in-band or out-of-band.
In-Band
The first method of virtualization, in-band, uses a SAN appliance or software directly in the data
path to control access to storage devices at the block level. When a block request is made for
data, the SAN appliance or software is responsible for locating and retrieving the data. The main
advantage of this approach is that there is no access to data on the SAN without the “permission”
of the virtualization manager, which helps to prevent unwanted access to disk resources. If you
have ever had someone plug a Windows server into the wrong Fibre-Channel switch, you know
the reasons why this advantage is important—Windows has had a tendency to think that any
devices it can see are local devices and that it should own those devices, including writing new
disk signatures.
The disadvantage of the in-band approach is that the SAN appliance or software can become a
limiting factor in SAN scalability, as it must control all I/O, both to and from the servers and
storage. In the event that there is a problem with the SAN appliance, it is critical that the
appliance hardware be fully fault tolerant, including redundant components and paths, otherwise
you could lose your entire SAN.
Out-of-Band
Another implementation of storage virtualization is out-of-band, in which the SAN appliance (or
even memory on the HBAs) acts as an intermediary, storing the location of data blocks requested
by the host. A table in memory must be kept updated by a virtualization manager and any new
devices added to the SAN must comply with the virtualization scheme. The main advantage of
out-of-band virtualization is that there is no central route for data access, as each server can
transfer data directly to storage once the location is known.
Distance Mirroring
The replication of storage data to a second location, also known by names such as distance
mirroring and remote copy sets, is not new to SAN technology and has long been available. The
differences that we will see in the immediate future are the ability to use a wider variety of
transports and to increase the distance between storage systems. Currently single-mode Fibre
Channel reaches 10 kilometers, and other transports can be used for greater distances (100km
and beyond with distance-enabling technology), albeit at a certain latency penalty. The primary criterion for determining distance capability is the application's ability to tolerate latency and keep I/O synchronous. As latency increases, the ability to perform synchronous I/O becomes
increasingly difficult, and the replication must be done asynchronously if the application can
tolerate it.
The transport choices for Fibre-Channel distance mirroring include native (dark fiber),
Synchronous Optical Network (SONET), asynchronous transfer mode (ATM), and IP. In
addition, as we see 10Gbps Ethernet networks hit the market, Fibre Channel over IP (FCIP)
and iSCSI will play a greater role in distance replication, extending the potential distances while
lowering the cost. The complexity and cost of these solutions will decrease to the extent that
more businesses will look to distance mirroring as a disaster-tolerant solution.
SAN and WAN Converters
There are really two categories to pay attention to when it comes to the transport of SAN
protocols over WANs. The first category comprises the boxes that act as protocol bridges or routers and perform
the protocol translation. As you may well know, these are expensive boxes and are often
connected to expensive runs of fiber. Fibre-Channel bridges to FCIP, and in the future perhaps
iSCSI, enable wide-area connectivity over IP or even ATM.
The second category to watch is the companies that purchase this expensive capital equipment
and sell it by the connection or channel. Quite a bit of investment has been made recently in
metropolitan area optical networks, and companies such as CNT act as a communications
provider for data.
Native Fibre Channel to Disk
The ability to use native Fibre Channel all the way from the HBA to the disk devices has been
available for a number of years, but the technology has been prohibitively expensive except for
high-end data centers, which usually do not include Windows servers. We will see that change,
as pure Fibre Channel becomes more affordable. Yet it must still be justified from a cost-performance standpoint, and the primary advantage is that there is no SCSI translation at the
disk.
SAN Boot
The ability to place the OS boot drive on the SAN is in ever-increasing demand these days. Part
of the demand is the result of the increasing density of servers (especially blade designs)—
environments in which servers may have no internal storage, instead relying on the internal HBA
to gain access to the boot device on the SAN. In addition, having the OS boot drive on the SAN
aids in recoverability, as a standby server can be brought in to replace a failed server and access
the boot drive (assuming the volume security has been changed to allow the second host access).
The future potential of SAN boot also includes the possibility of extremely rapid server
deployment—imagine a scenario in which another Web server or application server is needed to
satisfy increasing load. Either through administrator intervention or dynamically through
software, a boot image is placed on a SAN disk and a server is brought online, fully configured
and ready to perform.
New Device Classes
Of course the future will bring entirely new types of devices, but we have little idea what those
devices will be (other than the obvious extension of existing capabilities). It seems that the
weakest link in the data center lately is in the area of backup media. Disk capabilities have grown
at a rate that is difficult to keep up with, and tape technologies have lagged behind. Some
companies are building backup systems using inexpensive IDE drives (these can be rotated
offsite and treated very similarly to disposable tape media), and we may see these become
common in the data center if they prove to be a viable substitute.
In the area of extending our existing capabilities, we will see technological innovation create
higher performance designs. For example, with memory cost relative to density dropping so low,
it is surprising that we do not see greater adoption of solid state disk (SSD) technology. SSDs are
still expensive, but there are uses for this technology where RAM (memory) is just not a suitable
replacement (for example, in high-speed transaction logging disks that must be non-volatile,
surviving a power interruption). So perhaps we will see a hybridization of this technology in
which greater amounts of solid state memory are pooled across an array of disks, similar to the
way that caching is used currently but accessible to the host as a disk volume for when it needs
to make temporary disk swaps or transactions.
Bus Architecture
For quite some time, we have been hearing about a new bus architecture, InfiniBand, designed to
eventually replace the PCI bus. If InfiniBand is slow in coming, that is all right by me—changes in
the bus architecture of servers are pretty disruptive. They can be beneficial in relieving one
source of potential data bottlenecks, but you may be just now implementing PCI-X as the next
generation of server bus architecture and aren’t quite ready for another drastic change.
Software Technology Futures
In this section, we will look at the direction of storage futures beyond the immediate future.
These are extremely speculative, as the current next release of Windows Server—codenamed
“Longhorn Server”—is still under definition by Microsoft and scheduled for a 2006 or 2007
rollout. Even that release is intended only to add support for the Longhorn Client release of that
same timeframe; the next major release of Windows Server, codenamed “Blackcomb,” is even
further out. So take this with a grain of salt.
Perhaps you are the type of storage administrator or systems administrator who has a personal
wish list of future OS features. You can easily perform what is called a gap analysis by taking a
look within your organization at where you spend what you consider to be your most unfulfilling
or least productive time. At times, I bet it is performing frustrating hours of troubleshooting only
to find a simple but overlooked cause, such as a device that is intermittent or a configuration
change that had a larger impact than the operator knew about. At other times, I bet it is the
process of adding or expanding existing storage and recreating the information to manage it. No
doubt the features that you most desire are the ability to provision and expand storage systems
while you also control the configuration of your environment.
WinFS
Windows Future Storage (WinFS) is a new file system built upon the existing NTFS file system.
WinFS combines NTFS and SQL Server to provide a number of enhanced features and
capabilities. For example, objects within the file system—such as files and folders—can have
complex relationship with one another, and can be searched by a powerful full-text search engine
more rapidly than today’s Indexing Service permits. However, there’s no current information on
whether WinFS will make it into the initial release of Longhorn, or be scheduled for release later.
When released, WinFS should provide benefits primarily in the area of data and storage
management, making it easier to find out what data is stored where, and to make that data more
readily accessible to users and applications.
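WinFS APIs have not been published, so any code is speculative; the following Python sketch uses the built-in sqlite3 module merely to illustrate the idea of relational file metadata with searchable relationships (all table and column names are invented):

# Illustration only: relational file metadata plus text search,
# modeled with Python's built-in sqlite3, not with WinFS itself.
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT, body TEXT)")
db.execute("CREATE TABLE relations (parent INTEGER, child INTEGER, kind TEXT)")
db.execute("INSERT INTO items VALUES (1, 'Q3-budget.doc', 'storage budget forecast')")
db.execute("INSERT INTO items VALUES (2, 'Q3-summary.doc', 'summary of the budget')")
db.execute("INSERT INTO relations VALUES (1, 2, 'summarizes')")

# Find documents related to any item whose content mentions 'budget'.
rows = db.execute("""
    SELECT i2.name, r.kind, i1.name
    FROM items i1
    JOIN relations r ON r.parent = i1.id
    JOIN items i2 ON i2.id = r.child
    WHERE i1.body LIKE '%budget%'
""").fetchall()
for child, kind, parent in rows:
    print(f"{child} {kind} {parent}")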
DEN Enhancements
The goal of DENs, or directory-enabled storage networks, is to be able to provision (configure
new devices) and manage your entire storage infrastructure from a centralized point with as little
repetition of duties or tasks as possible. Similar efforts are being made in the server and network
infrastructure toward the goal of plugging a server blade or switch blade into a rack, having it
poll the master directory for its role, and letting configuration take place with little human
intervention. During the rest of its production life cycle, the hardware, whether server, network,
or storage, may be redeployed dynamically to meet shifting demand in Web transactions,
transaction processing, or any other current need.
One of the milestones on the road to directory-enabled storage networks is a centralized
repository and a common set of protocols. The Distributed Management Task Force (DMTF) is
working on this task—defining a common information model, CIM 2.5, and mapping the schema
from CIM to an X.500 LDAP directory. The design goal is to deliver network services based on
pre-defined business criteria; for example, multiple priorities and classes of service could assist
in the provisioning of storage hardware. But first, all the devices that make up the enterprise
storage infrastructure must fit into the data model, and the storage software must understand the
rules that we define.
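As a thought experiment, the following Python sketch models that design goal: devices register themselves in a central repository, and provisioning consults a pre-defined class of service. Every name and attribute here is hypothetical:

# A sketch of the DEN design goal: devices publish themselves to a
# central directory, and provisioning is driven by business classes
# of service. All class names and attributes are invented.
directory = []   # stands in for an LDAP/CIM repository

def register(device):
    directory.append(device)     # a new blade "polls in" and is recorded

def provision(class_of_service, gigabytes):
    # Pick the first registered array that satisfies the policy.
    for dev in directory:
        if dev["class"] == class_of_service and dev["free_gb"] >= gigabytes:
            dev["free_gb"] -= gigabytes
            return f"{gigabytes} GB carved from {dev['name']}"
    return "no device satisfies the policy"

register({"name": "array-01", "class": "gold", "free_gb": 500})
register({"name": "array-02", "class": "bronze", "free_gb": 2000})
print(provision("gold", 100))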
Think about the explosion of the Internet when it seemed to hit critical mass (although we know
that it will continue to grow and evolve). The main factors were bandwidth at a reasonable price
(initially dial-up), a standardized set of protocols (HTTP over TCP/IP), and the means to create a
virtually centralized (but distributed) repository in the form of the World Wide Web. Even
search engines have evolved their techniques from Gopher to Webcrawler to Alta Vista to
Google and beyond. A research associate and I once discussed this topic at length, and he added
that it also took a robust user experience, initially provided by the Mosaic browser and later IE
and Netscape. I remember my Lynx browser, which worked fine until I started getting more and
more graphics placeholders. My point is that we can find parallel patterns in the growth and
evolution of storage management. From an administrative standpoint, you will need a centralized
management starting point, even if the configuration information is stored in distributed devices.
As long as the devices speak the same language or use common interchangeable protocols, you
can essentially hyperlink or browse them. The rich UI is, of course, very helpful when dealing
with everything from disk spindles to enterprise data centers in one console.
So it may be some time before we are fully directory-enabled, but in the meantime, you can keep
a watchful eye in the industry press or at storage conferences for the emerging standards. You
can’t expect to pick the winner in every detail of directory-enabled storage networks (unless of
course you were just sure of the winner in Super Bowl XXXVI), but expect the vendors that you
use for your storage networks to participate in the creation of an open DEN standard.
For more information about DENs, see http://www.dmtf.org/standards/standard_den.php.
Dynamic Volume Management
The rules that govern adding space to an existing drive (dynamic volume management) have
typically been very limiting. If you are most familiar with NT storage, you are probably used to
rebuilding RAID sets manually. With WS2K3 and the addition of third-party utilities such as
VERITAS Volume Manager, it is possible to extend a disk size without having to back up,
rebuild, and restore data. In the not-too-distant future, these limitations will seem like a bad
memory and will be hard to explain to the storage newcomer. We will pull from a pool of storage
resources to dynamically grow any disk that requires additional capacity. Obviously, this
possibility will take some re-work of the Windows OS.
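Until then, extending a volume remains a manual, tool-driven task. The following sketch drives the built-in diskpart utility from Python; the volume number is hypothetical, and you should run something like this only against a volume with contiguous free space:

# Sketch of today's workaround: extending a volume with the built-in
# diskpart utility, scripted from Python. "volume 3" is hypothetical.
import subprocess, tempfile, os

script = "select volume 3\nextend\n"
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
    f.write(script)
    path = f.name

try:
    # diskpart /s runs the scripted commands non-interactively
    subprocess.run(["diskpart", "/s", path], check=True)
finally:
    os.remove(path)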
Multipath I/O
As mentioned in the previous section, multipath I/O will eventually become integral to the core
OS. However, there will still be a market for third-party vendors to extend multipath to include
the ability to dynamically load balance across a variety of adapters (even from multiple vendors)
and even to multiple devices. New device classes will include optical and tape, in addition to
disk. We will also see support for larger numbers of paths beyond the usual two.
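To illustrate what dynamic load balancing across paths means, here is a toy Python sketch of round-robin path selection with failover; the Path class and its behavior are invented stand-ins for real HBAs:

# Toy model of a multipath driver: rotate I/O across healthy paths
# and fail over when a path drops. Path names are hypothetical.
from itertools import cycle

class Path:
    def __init__(self, name, up=True):
        self.name, self.up = name, up
    def send(self, io):
        if not self.up:
            raise ConnectionError(self.name)
        return f"{io} via {self.name}"

class MultipathDevice:
    def __init__(self, paths):
        self.healthy = list(paths)
        self.rotation = cycle(self.healthy)

    def submit(self, io):
        for _ in range(len(self.healthy)):
            path = next(self.rotation)
            try:
                return path.send(io)          # round-robin across adapters
            except ConnectionError:
                self.healthy.remove(path)     # mark the path failed
                self.rotation = cycle(self.healthy)
        raise RuntimeError("all paths to the device have failed")

dev = MultipathDevice([Path("hba0"), Path("hba1", up=False)])
print(dev.submit("read block 42"))            # served by a surviving path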
Security
As we watch the evolution of storage networks, especially as we get close to bridging our SAN
with our LAN/WAN infrastructure, we will see analogous attention given to security. Recently,
security has been driven to the forefront of IT by painful experience, something I hope we can
avoid in our storage networks by being proactive. For example, LUN masking and selective
storage presentation methods may vary across vendors and hosts, and controlling access to
physical devices is of the utmost importance.
Shared File Systems
In a shared file system, multiple hosts have access to data on the same disks at the block level.
Contrast this setup to the Microsoft approach to clustering (which is known as a shared-nothing
approach), which uses the Cluster Manager to limit disk access to one host at a time. The
advantage of sharing the file system is that multiple hosts can respond to client requests, such as
allowing a large bank of read-only analysis servers and several data-input or transaction-update
servers. This setup requires the use of a Distributed Lock Manager (DLM) to coordinate the
updates to disk, ensure that there are no write collisions, and ensure that the read-only servers see
the same data. There are currently third-party shared file-system solutions for Windows
available from companies such as VERITAS and IBM Tivoli, and we may see one of these
vendors' solutions adopted into the Windows core, much as a subset of the VERITAS Volume
Manager functionality was licensed for the core OS Logical Disk Manager (LDM).
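To see why a DLM is essential, consider this toy Python version of the coordination it provides: any number of concurrent readers, or exactly one writer. A real DLM distributes this state across hosts; this sketch keeps it local:

# A toy lock manager illustrating the coordination a shared file
# system needs: many concurrent readers, or one exclusive writer.
import threading

class DiskLock:
    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0
        self._writer = False

    def acquire_read(self):
        with self._cond:
            while self._writer:          # readers wait out a writer
                self._cond.wait()
            self._readers += 1

    def release_read(self):
        with self._cond:
            self._readers -= 1
            self._cond.notify_all()

    def acquire_write(self):
        with self._cond:
            while self._writer or self._readers:
                self._cond.wait()        # writer needs exclusive access
            self._writer = True

    def release_write(self):
        with self._cond:
            self._writer = False
            self._cond.notify_all()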
Storage Protocols
Much of the focus on storage protocols is on storage internetworking, as each type of storage
network (SAN or NAS) has its own strengths and proponents. What it boils down to is Fibre
Channel versus IP. Most SANs use the Fibre Channel protocol, but IP networks have great appeal
because of the existing cabling and routing infrastructure, wide choices of standardized
hardware, and the highly skilled workforce available to support them. In addition, Ethernet IP
networks have been more open and inter-connected and have struggled through some of the
security issues that Fibre-Channel networks have yet to face.
FCIP
FCIP is a protocol standard developed by the IPS Working Group of the Internet Engineering
Task Force (IETF). FCIP is designed for connecting geographically dispersed Fibre-Channel
SANs over IP networks, with a key distinction being that all Fibre-Channel devices are unaware
of the presence of the IP network.
For more information about FCIP, see
http://www.snia.org/English/Forums/IP_Storage/IP_Storage_FCIP.html and
http://search.ietf.org/internet-drafts/draft-ietf-ips-fcovertcpip-09.txt (note that this document name will
change, so you can find information about FCIP by searching at
http://search.ietf.org/search/brokers/internet-drafts/query.html).
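Conceptually, FCIP simply carries whole Fibre-Channel frames inside a TCP byte stream, so distant fabrics behave as though directly connected. The sketch below uses a simplified 4-byte length prefix in place of the real FCIP encapsulation header:

# Conceptual sketch of FCIP's job: wrap whole FC frames in a TCP
# stream. The length-prefix framing is a simplification, not the
# actual FCIP encapsulation header.
import struct

def encapsulate(fc_frame: bytes) -> bytes:
    return struct.pack("!I", len(fc_frame)) + fc_frame

def decapsulate(stream: bytes):
    frames, offset = [], 0
    while offset < len(stream):
        (length,) = struct.unpack_from("!I", stream, offset)
        offset += 4
        frames.append(stream[offset:offset + length])
        offset += length
    return frames

wire = encapsulate(b"FC-frame-A") + encapsulate(b"FC-frame-B")
print(decapsulate(wire))    # the remote fabric sees the frames unchanged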
Storage over IP and iSCSI
iSCSI is one of the contending protocols for storage over IP, enabling more efficient transport
than file-level network protocols. iSCSI is intended to provide block-level access to storage, which would
include application databases in addition to the traditional file-based storage. Although there are
other contenders, such as Storage over IP (SoIP), iSCSI is positioned to be the dominant
protocol, and the one I choose to watch.
In addition to using existing network infrastructure, iSCSI may be an enabler for distance
replication beyond the 10 kilometer limit of single-mode Fibre Channel. However, iSCSI will
need to overcome the early momentum being gained by FCIP through Fibre-Channel bridges.
In fact, I would bet that iSCSI is a greater market threat to DAS than to Fibre-Channel SANs,
because iSCSI allows much easier interconnection and networking of low-end storage systems
that would traditionally be SCSI-cabled.
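The distinction between block-level and file-level access is easy to show in code. In the sketch below, an initiator requests numbered blocks of a LUN rather than named files; the framing is simplified and hypothetical, not the actual iSCSI PDU encoding (although 0x28 is the real SCSI READ(10) opcode):

# Sketch of block-level access: the initiator asks for numbered
# blocks of a LUN, so a database can lay out its own structures.
import struct

BLOCK_SIZE = 512

def read_request(lun: int, lba: int, count: int) -> bytes:
    # opcode 0x28 is SCSI READ(10); the framing around it is invented
    return struct.pack("!BIIH", 0x28, lun, lba, count)

def serve(request: bytes, disk: bytes) -> bytes:
    _, _, lba, count = struct.unpack("!BIIH", request)
    start = lba * BLOCK_SIZE
    return disk[start:start + count * BLOCK_SIZE]

disk = bytes(range(256)) * 1024            # a pretend 256 KB LUN
pdu = read_request(lun=0, lba=2, count=1)
print(len(serve(pdu, disk)))               # 512 bytes: block 2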
The Direct Access File System
Several large players in the NAS arena, such as Network Appliance and IBM, are also counting
on the implementation of the Direct Access File System (DAFS) to give them the performance
that they will need to become more widely adopted in the database and application server
market. SNIA has created the DAFS Implementers’ Forum to carry on the work started by the
DAFS Collaborative.
For more information about DAFS, see http://www.snia.org/English/Forums/DAFS/DAFS_Docs.html
or http://www.snia.org/English/Forums/DAFS/DAFS.html
Storage Management
SAN devices will play a key role in SAN management, with current work progressing on
providing APIs for managing common device classes. As I previously mentioned, major storage
vendors are even sharing information, which is a surprise considering that they often have
competing products. But the call to action in the industry is clear—storage management lags far
behind the capabilities and importance of the storage devices being implemented.
SAN Devices
Whatever standards emerge for the storage components of DENs, the devices on the storage
network will need to provide manageability to be compliant. For example, you may be familiar
with network device elements providing a Management Information Base (MIB) to comply with
SNMP standards. Over the past few years, we have seen storage and fabric elements provide a
MIB, which will change slightly as the Common Information Model (CIM) and Web-Based
Enterprise Management (WBEM) shift the format to XML. Overall, look to the Distributed
Management Task Force (http://www.dmtf.org) for the definition of these standards.
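The shift from SNMP MIB variables to CIM/WBEM means expressing management data as XML instances of CIM classes. The following Python sketch emits an abbreviated instance; consult the DMTF schema for the authoritative class definitions:

# Sketch of CIM-XML: management data as XML instances of CIM
# classes. The property set shown is abbreviated.
import xml.etree.ElementTree as ET

instance = ET.Element("INSTANCE", CLASSNAME="CIM_DiskDrive")
for name, value in [("DeviceID", "DISK0"),
                    ("Capacity", "73400320000"),
                    ("OperationalStatus", "OK")]:
    prop = ET.SubElement(instance, "PROPERTY", NAME=name)
    ET.SubElement(prop, "VALUE").text = value

print(ET.tostring(instance, encoding="unicode"))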
Policy-Based Management
Just recently, we have seen progression and adoption in the area of policy-based SRM. The
primary benefit of policy-based SRM is the ability to define policies at the highest level of the
organization and place storage within those policies, thus reducing the effort, and the duplication
of effort, required to administer a typical storage environment. Policy-based SRM will have a vital role in
virtualized enterprise storage, as the creation and application of that storage can fit under the
same policy (as opposed to having two storage-management systems and hence two sets of
policies to maintain). The latest generation of software (for example, InterSAN PATHLINE)
helps to automate the entire process of device discovery, provisioning, monitoring, and control,
including storage virtualization, as we discussed earlier.
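The heart of the idea can be shown in a few lines: define the policy once, then let every share or pool inherit it. Policy names and attributes below are hypothetical:

# Sketch of policy-based SRM: one policy definition at the top of
# the organization, inherited by every share. All values invented.
POLICIES = {
    "finance": {"raid": "RAID-1", "snapshot_hours": 4, "quota_gb": 200},
    "archive": {"raid": "RAID-5", "snapshot_hours": 24, "quota_gb": 2000},
}

def provision_share(name, policy_name):
    policy = POLICIES[policy_name]          # one definition, many shares
    return (f"{name}: {policy['raid']}, snapshot every "
            f"{policy['snapshot_hours']} h, quota {policy['quota_gb']} GB")

print(provision_share("\\\\filer1\\ledger", "finance"))
print(provision_share("\\\\filer2\\old-projects", "archive"))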
Operations and Procedural Futures
Although constant change can be somewhat unsettling, there is comfort to be had in knowing
that storage will continue to play a key role in the Windows strategy and storage resources will
need to be managed. The difficult part is knowing that we must drive the rickety old stagecoach
for a few more years before the shiny new locomotives appear to whisk us into the future. Just
think about how the Internet changed what was possible from an application-development
standpoint, with everything from Smart Tags (or just hyperlinks) in documents that let you
cross-reference a world of information, to peer-to-peer applications that let you create a virtual
network of like-minded users. At this point, we don't really know how changes in storage will
transform the underlying applications, other than to say that we should need less duct tape and
baling wire to hold it all together.
The concept of partitioning disks and each application/server owning its little piece of disk real
estate will vanish. And the next thing to vanish will be the old concept of each application/server
owning its little piece of data, as multiple hosts will be able to pool that as well.
Storage Certifications
SNIA has been developing certification for storage administrators, implementers, and resellers
focused on knowledge of Fibre-Channel SAN technology. The SNIA FC-SAN Certification
Program certifications are classified by the following levels:
• Professional—Limited technical knowledge required; mainly an understanding of the
terminology, principles, and purposes associated with Fibre-Channel SANs. Targeted at
sales and marketing or other support personnel with limited exposure to Fibre-Channel
SANs.
• Practitioner—More technical background in the features, functions, and underlying
technology of Fibre-Channel SANs; required for pre- and post-sales support, consulting,
and field-service engineers.
• Specialist—Similar to Practitioner but adds the ability to plan, build, and configure a
complex Fibre-Channel SAN. Substantial technical background required; intended for
system architects, consultants, and other full-time SAN support personnel.
• Expert—Most comprehensive of the technical levels, requiring the ability to analyze,
diagnose, and troubleshoot Fibre-Channel SAN system problems.
• Master—Reserved for future use; to be awarded to innovators and architects in SAN
technology as recognition for lifetime achievement in storage networking.
Enterprise Backup Strategies
Another significant change to consider from an operational standpoint is disaster recovery in a
networked-storage environment. As a result of the critical nature of recovering enterprise
storage, we will use more advanced techniques to back up and recover. From a
storage-management standpoint, the tools that you use to ensure that your disaster-recovery
procedures are operational will need to be updated. For example, if you are monitoring the
Win2K event logs for successful completion of tape backup, you may instead need to watch for
successful completion of the SCSI Extended Copy (serverless backup).
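In scripting terms, the change is just a new monitoring rule. The event sources and message text below are hypothetical placeholders for whatever your backup software actually logs:

# Sketch of updating a monitoring rule: match the serverless-backup
# completion instead of the old tape-backup event. All strings are
# hypothetical placeholders.
OLD_RULE = {"source": "NTBackup", "keyword": "completed"}
NEW_RULE = {"source": "ExtendedCopy", "keyword": "xcopy complete"}

def backup_succeeded(log_lines, rule):
    return any(rule["source"] in line and rule["keyword"] in line
               for line in log_lines)

log = ["ExtendedCopy: LUN 4 -> tape library, xcopy complete"]
print(backup_succeeded(log, NEW_RULE))   # True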
BCVs
BCVs (business continuance volumes) are not a new or emerging technology so much as one that
will gain wider acceptance and support. Both clones and snapshots are called BCVs, as both offer the ability to perform
much more rapid recovery than traditional restore mechanisms. Some people do not make a clear
distinction between snapshots and clones, mainly because both methods reduce the data-recovery
window. However, there is a crucial difference—a clone is a triple-mirrored copy of the data on
an array of disk drives, so it can be broken off and set aside or used for other production work
such as integrity analysis or data warehousing and information analysis.
By contrast, the snapshot is a hybrid of blocks from the original data and the new updated
blocks. Setting aside the original files (such as a very large database) requires a substantial
amount of disk processing for the host. For example, if you wanted to copy VLDB.dbf from disk
E to disk Z on a database server, the action would also impact the ability of the host to respond to
client requests. If you take the database offline and want to restore the original, the difference
between snapshots and clones becomes clear. With a clone, you can merely swap the logical disk
units, but with a snapshot, the original data is a virtual representation, and restoring it would
require swapping out the disk blocks, which is a very slow process when compared with merely
replacing the files.
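The copy-on-write behavior that makes a snapshot a hybrid is easy to sketch. In the following illustrative Python model, the snapshot preserves a block only when the live volume first overwrites it:

# Copy-on-write snapshot model: unchanged blocks are served from the
# live volume; only overwritten blocks live in the snapshot's store.
class Snapshot:
    def __init__(self, volume):
        self.volume = volume     # live volume: block number -> data
        self.frozen = {}         # blocks preserved at snapshot time

    def volume_write(self, block, data):
        if block not in self.frozen:                  # first write since snap:
            self.frozen[block] = self.volume[block]   # preserve the original
        self.volume[block] = data

    def read(self, block):
        # snapshot view: preserved block if changed, else live block
        return self.frozen.get(block, self.volume[block])

vol = {0: "boot", 1: "ledger-v1"}
snap = Snapshot(vol)
snap.volume_write(1, "ledger-v2")
print(vol[1], "/", snap.read(1))    # live sees v2, snapshot still sees v1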
Consider also the difference in backing up the snapshot or clone data. With a clone, the broken-off
mirror set can be presented to the backup host for backup to a tape device with little or no
impact to the database host. A snapshot can be presented, but the original host must service the
blocks—again, as it is a virtual representation, a hybrid of the original blocks and the new ones.
In either case, snapshot or clone, the files on disk are useful for recovery only if they are
consistent from the perspective of the application writing them. The application, such as a
database or file server, could be in the process of writing file data or metadata (information
about which files are on the disk) from its cache. There are hardware solutions that can break off
BCVs without any application knowledge; however, there is little guarantee that the resulting
clone or snapshot will be useful in recovery. Enter the introduction of volume snapshot services
in .NET Server. The role of the .NET OS in snapshots and clones is to interact with applications
and backup processes to ensure that in-flight disk I/O does not make the data inconsistent.
Serverless Backup
Unless you have tape drives attached to every single application server, it is most likely that you
back up your data over some kind of network. If the backup device is attached to a backup
server, the data passes from one server to another over the network. In a SAN, the data can be
directly transferred from disk device to tape device without involvement of the backup server.
Typically, the backup server passes a request to an agent running on the application server,
asking it to mount a tape device and stream the backup to it. The most common
implementation of this type of backup uses the Network Data Management Protocol (NDMP) to
move data from the application server to the backup device over the SAN instead of the network.
NDMP was created by Legato Systems and Network Appliance and has been handed over to
SNIA for future development.
In the next phase of evolution, true serverless backup, the backup server is not necessary. Some
of you early adopters may even be using this new setup now, as the technology is available and
will be maturing over the next few years. The SCSI-3 Extended Copy command lets one device
execute a series of commands directly against other devices; for example, copying data from a
source device to a destination device without a server having to read the data from the source
into memory and write it back out to the other device. Of course, for this approach to work, the data needs to
remain unchanged on disk or the files will be inconsistent between the two sets. Some software
intermediary is still needed on the host server to pause I/O to disk during the backup process. In
addition, this setup currently will not work in a Windows clustered environment because
Windows environments use the shared-nothing storage design and each node owns the devices
through disk reservations. I’m sure that Microsoft has considered this fact, and we will just have
to wait and see whether the company changes its clustering model or develops another solution
to allow SCSI Extended Copy for serverless backup.
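Conceptually, Extended Copy moves the data mover out of the host. In the sketch below, the host issues a single copy request described as segments, and the mover shuttles blocks device to device; the descriptor format is simplified, not the real SCSI CDB layout:

# Sketch of the Extended Copy idea: one command describes source and
# target segments, and a data mover copies device-to-device so the
# data never passes through host memory. The format is simplified.
class Device:
    def __init__(self, blocks):
        self.blocks = blocks
    def read(self, lba, count):
        return self.blocks[lba:lba + count]
    def write(self, lba, data):
        self.blocks[lba:lba + len(data)] = data

def extended_copy(source, target, segments):
    # each segment: (source lba, target lba, block count)
    for src_lba, dst_lba, count in segments:
        target.write(dst_lba, source.read(src_lba, count))

disk = Device(list(range(100)))       # source LUN
tape = Device([None] * 100)           # destination device
extended_copy(disk, tape, [(0, 0, 50), (50, 50, 50)])
print(tape.blocks[:5])                # [0, 1, 2, 3, 4]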
Summary
This chapter wraps up The Definitive Guide to Windows 2003 Storage Resource Management.
We looked into the future of storage and SRM including both hardware technology and software
changes. The most immediate changes involve technology that is already available; even if you
are not an early adopter, it will be only a short time before you are using it.
Much of this chapter focused on networked storage, as that is clearly where the most
improvement and the greatest increases in adoption will occur. In the area of hardware, we
looked at changes in speeds and feeds as we get faster pipes and even greater distances. One of
the upcoming changes is the virtualization of devices and storage, and we looked at what that
means from a storage-management perspective. We covered the server side of storage networks,
changes in HBAs, booting from the SAN, and multipath I/O and what it will mean for performance and fault
tolerance. In the area of disaster recovery, we looked at distance mirroring, cloning and
snapshots, and serverless backup.
I also covered the next generation of storage technologies, and the features they may provide to
enhance our storage management. Finally, we looked at the changes we will need to make from
an operational and procedural perspective. All in all, it is both a frustrating and exciting time to
be a storage administrator, and I wish you the best in your journey.
Download Additional eBooks from Realtime Nexus!
Realtime Nexus—The Digital Library provides world-class expert resources that IT
professionals depend on to learn about the newest technologies. If you found this eBook to be
informative, we encourage you to download more of our industry-leading technology eBooks
and video guides at Realtime Nexus. Please visit http://nexus.realtimepublishers.com.
Appendix A: SRM Software and Hardware Vendors
Vendor | Focus | URL
Astrum (now a part of EMC) | Policy-based object management | http://www.astrumsoftware.com
NTP Software | Policy-based object management | http://www.ntpsoftware.com
BMC Software | Enterprise storage and application storage management | http://www.bmc.com
Brocade | Fibre-Channel switches and management | http://www.brocade.com
Compaq | Fibre-Channel SAN configuration, virtualization, and device management | http://www.compaq.com/storage
Computer Associates | Application storage management and backup | http://www.computerassociates.com
Dot Hill (SANnet storage solutions) | SAN configuration and management | http://www.dothill.com/products/software/sanpath.htm
EMC | Enterprise storage and device management | http://www.emc.com
Hitachi Data Systems | Fibre-Channel SAN devices and management | http://www.hds.com
HP | SAN configuration and management | http://welcome.hp.com/country/us/en/prodserv/storage.html
IBM | Enterprise storage and device management | http://www.storage.ibm.com and http://www.tivoli.com
McData | Fibre-Channel switches and management | http://www.mcdata.com/
NetIQ | Application storage management | http://www.netiq.com
Sun Microsystems | Enterprise storage management | http://www.sun.com/storage
VERITAS Software | Fibre-Channel SAN management and policy-based object management | http://www.veritas.com
Appendix B: SRM and Storage Web Sites, Portals, and Mailing Lists
Resource | URL
Storage Innovators | http://searchstorage.techtarget.com/tipsIndex/0,289482,sid5_tax287587,00.html
InfoStor | http://www.infostor.com
Search Storage | http://www.searchstorage.com/
StorageManagement.org | http://www.stormgt.org/
Enterprise Systems | http://www.esj.com/
Storage Magazine | http://storagemagazine.techtarget.com/
Information Week | http://www.informationweek.com