Secure Programming Cookbook for C and C++
By Matt Messier, John Viega
Publisher: O'Reilly
Pub Date: July 2003
ISBN: 0-596-00394-3
Pages: 784
Secure Programming Cookbook for C and C++ is an important new resource for developers serious
about writing secure code for Unix® (including Linux®) and Windows® environments. This essential
code companion covers a wide range of topics, including safe initialization, access control, input
validation, symmetric and public key cryptography, cryptographic hashes and MACs, authentication
and key exchange, PKI, random numbers, and anti-tampering.
Table of Contents
Copyright
Foreword
Preface
More Than Just a Book
We Can't Do It All
Organization of This Book
Recipe Compatibility
Conventions Used in This Book
Comments and Questions
Acknowledgments
Chapter 1. Safe Initialization
Section 1.1. Sanitizing the Environment
Section 1.2. Restricting Privileges on Windows
Section 1.3. Dropping Privileges in setuid Programs
Section 1.4. Limiting Risk with Privilege Separation
Section 1.5. Managing File Descriptors Safely
Section 1.6. Creating a Child Process Securely
Section 1.7. Executing External Programs Securely
Section 1.8. Executing External Programs Securely
Section 1.9. Disabling Memory Dumps in the Event of a Crash
Chapter 2. Access Control
Section 2.1. Understanding the Unix Access Control Model
Section 2.2. Understanding the Windows Access Control Model
Section 2.3. Determining Whether a User Has Access to a File on Unix
Section 2.4. Determining Whether a Directory Is Secure
Section 2.5. Erasing Files Securely
Section 2.6. Accessing File Information Securely
Section 2.7. Restricting Access Permissions for New Files on Unix
Section 2.8. Locking Files
Section 2.9. Synchronizing Resource Access Across Processes on Unix
Section 2.10. Synchronizing Resource Access Across Processes on Windows
Section 2.11. Creating Files for Temporary Use
Section 2.12. Restricting Filesystem Access on Unix
Section 2.13. Restricting Filesystem and Network Access on FreeBSD
Chapter 3. Input Validation
Section 3.1. Understanding Basic Data Validation Techniques
Section 3.2. Preventing Attacks on Formatting Functions
Section 3.3. Preventing Buffer Overflows
Section 3.4. Using the SafeStr Library
Section 3.5. Preventing Integer Coercion and Wrap-Around Problems
Section 3.6. Using Environment Variables Securely
Section 3.7. Validating Filenames and Paths
Section 3.8. Evaluating URL Encodings
Section 3.9. Validating Email Addresses
Section 3.10. Preventing Cross-Site Scripting
Section 3.11. Preventing SQL Injection Attacks
Section 3.12. Detecting Illegal UTF-8 Characters
Section 3.13. Preventing File Descriptor Overflows When Using select( )
Chapter 4. Symmetric Cryptography Fundamentals
Section 4.1. Representing Keys for Use in Cryptographic Algorithms
Section 4.2. Generating Random Symmetric Keys
Section 4.3. Representing Binary Keys (or Other Raw Data) as Hexadecimal
Section 4.4. Turning ASCII Hex Keys (or Other ASCII Hex Data) into Binary
Section 4.5. Performing Base64 Encoding
Section 4.6. Performing Base64 Decoding
Section 4.7. Representing Keys (or Other Binary Data) as English Text
Section 4.8. Converting Text Keys to Binary Keys
Section 4.9. Using Salts, Nonces, and Initialization Vectors
Section 4.10. Deriving Symmetric Keys from a Password
Section 4.11. Algorithmically Generating Symmetric Keys from One Base Secret
Section 4.12. Encrypting in a Single Reduced Character Set
Section 4.13. Managing Key Material Securely
Section 4.14. Timing Cryptographic Primitives
Chapter 5. Symmetric Encryption
Section 5.1. Deciding Whether to Use Multiple Encryption Algorithms
Section 5.2. Figuring Out Which Encryption Algorithm Is Best
Section 5.3. Selecting an Appropriate Key Length
Section 5.4. Selecting a Cipher Mode
Section 5.5. Using a Raw Block Cipher
Section 5.6. Using a Generic CBC Mode Implementation
Section 5.7. Using a Generic CFB Mode Implementation
Section 5.8. Using a Generic OFB Mode Implementation
Section 5.9. Using a Generic CTR Mode Implementation
Section 5.10. Using CWC Mode
Section 5.11. Manually Adding and Checking Cipher Padding
Section 5.12. Precomputing Keystream in OFB, CTR, CCM, or CWC Modes (or with Stream Ciphers)
Section 5.13. Parallelizing Encryption and Decryption in Modes That Allow It (Without Breaking
Compatibility)
Section 5.14. Parallelizing Encryption and Decryption in Arbitrary Modes (Breaking Compatibility)
Section 5.15. Performing File or Disk Encryption
Section 5.16. Using a High-Level, Error-Resistant Encryption and Decryption API
Section 5.17. Performing Block Cipher Setup (for CBC, CFB, OFB, and ECB Modes) in OpenSSL
Section 5.18. Using Variable Key-Length Ciphers in OpenSSL
Section 5.19. Disabling Cipher Padding in OpenSSL in CBC Mode
Section 5.20. Performing Additional Cipher Setup in OpenSSL
Section 5.21. Querying Cipher Configuration Properties in OpenSSL
Section 5.22. Performing Low-Level Encryption and Decryption with OpenSSL
Section 5.23. Setting Up and Using RC4
Section 5.24. Using One-Time Pads
Section 5.25. Using Symmetric Encryption with Microsoft's CryptoAPI
Section 5.26. Creating a CryptoAPI Key Object from Raw Key Data
Section 5.27. Extracting Raw Key Data from a CryptoAPI Key Object
Chapter 6. Hashes and Message Authentication
Section 6.1. Understanding the Basics of Hashes and MACs
Section 6.2. Deciding Whether to Support Multiple Message Digests or MACs
Section 6.3. Choosing a Cryptographic Hash Algorithm
Section 6.4. Choosing a Message Authentication Code
Section 6.5. Incrementally Hashing Data
Section 6.6. Hashing a Single String
Section 6.7. Using a Cryptographic Hash
Section 6.8. Using a Nonce to Protect Against Birthday Attacks
Section 6.9. Checking Message Integrity
Section 6.10. Using HMAC
Section 6.11. Using OMAC (a Simple Block Cipher-Based MAC)
Section 6.12. Using HMAC or OMAC with a Nonce
Section 6.13. Using a MAC That's Reasonably Fast in Software and Hardware
Section 6.14. Using a MAC That's Optimized for Software Speed
Section 6.15. Constructing a Hash Function from a Block Cipher
Section 6.16. Using a Block Cipher to Build a Full-Strength Hash Function
Section 6.17. Using Smaller MAC Tags
Section 6.18. Making Encryption and Message Integrity Work Together
Section 6.19. Making Your Own MAC
Section 6.20. Encrypting with a Hash Function
Section 6.21. Securely Authenticating a MAC (Thwarting Capture Replay Attacks)
Section 6.22. Parallelizing MACs
Chapter 7. Public Key Cryptography
Section 7.1. Determining When to Use Public Key Cryptography
Section 7.2. Selecting a Public Key Algorithm
Section 7.3. Selecting Public Key Sizes
Section 7.4. Manipulating Big Numbers
Section 7.5. Generating a Prime Number (Testing for Primality)
Section 7.6. Generating an RSA Key Pair
Section 7.7. Disentangling the Public and Private Keys in OpenSSL
Section 7.8. Converting Binary Strings to Integers for Use with RSA
Section 7.9. Converting Integers into Binary Strings for Use with RSA
Section 7.10. Performing Raw Encryption with an RSA Public Key
Section 7.11. Performing Raw Decryption Using an RSA Private Key
Section 7.12. Signing Data Using an RSA Private Key
Section 7.13. Verifying Signed Data Using an RSA Public Key
Section 7.14. Securely Signing and Encrypting with RSA
Section 7.15. Using the Digital Signature Algorithm (DSA)
Section 7.16. Representing Public Keys and Certificates in Binary (DER Encoding)
Section 7.17. Representing Keys and Certificates in Plaintext (PEM Encoding)
Chapter 8. Authentication and Key Exchange
Section 8.1. Choosing an Authentication Method
Section 8.2. Getting User and Group Information on Unix
Section 8.3. Getting User and Group Information on Windows
Section 8.4. Restricting Access Based on Hostname or IP Address
Section 8.5. Generating Random Passwords and Passphrases
Section 8.6. Testing the Strength of Passwords
Section 8.7. Prompting for a Password
Section 8.8. Throttling Failed Authentication Attempts
Section 8.9. Performing Password-Based Authentication with crypt( )
Section 8.10. Performing Password-Based Authentication with MD5-MCF
Section 8.11. Performing Password-Based Authentication with PBKDF2
Section 8.12. Authenticating with PAM
Section 8.13. Authenticating with Kerberos
Section 8.14. Authenticating with HTTP Cookies
Section 8.15. Performing Password-Based Authentication and Key Exchange
Section 8.16. Performing Authenticated Key Exchange Using RSA
Section 8.17. Using Basic Diffie-Hellman Key Agreement
Section 8.18. Using Diffie-Hellman and DSA Together
Section 8.19. Minimizing the Window of Vulnerability When Authenticating Without a PKI
Section 8.20. Providing Forward Secrecy in a Symmetric System
Section 8.21. Ensuring Forward Secrecy in a Public Key System
Section 8.22. Confirming Requests via Email
Chapter 9. Networking
Section 9.1. Creating an SSL Client
Section 9.2. Creating an SSL Server
Section 9.3. Using Session Caching to Make SSL Servers More Efficient
Section 9.4. Securing Web Communication on Windows Using the WinInet API
Section 9.5. Enabling SSL without Modifying Source Code
Section 9.6. Using Kerberos Encryption
Section 9.7. Performing Interprocess Communication Using Sockets
Section 9.8. Performing Authentication with Unix Domain Sockets
Section 9.9. Performing Session ID Management
Section 9.10. Securing Database Connections
Section 9.11. Using a Virtual Private Network to Secure Network Connections
Section 9.12. Building an Authenticated Secure Channel Without SSL
Chapter 10. Public Key Infrastructure
Section 10.1. Understanding Public Key Infrastructure (PKI)
Section 10.2. Obtaining a Certificate
Section 10.3. Using Root Certificates
Section 10.4. Understanding X.509 Certificate Verification Methodology
Section 10.5. Performing X.509 Certificate Verification with OpenSSL
Section 10.6. Performing X.509 Certificate Verification with CryptoAPI
Section 10.7. Verifying an SSL Peer's Certificate
Section 10.8. Adding Hostname Checking to Certificate Verification
Section 10.9. Using a Whitelist to Verify Certificates
Section 10.10. Obtaining Certificate Revocation Lists with OpenSSL
Section 10.11. Obtaining CRLs with CryptoAPI
Section 10.12. Checking Revocation Status via OCSP with OpenSSL
Chapter 11. Random Numbers
Section 11.1. Determining What Kind of Random Numbers to Use
Section 11.2. Using a Generic API for Randomness and Entropy
Section 11.3. Using the Standard Unix Randomness Infrastructure
Section 11.4. Using the Standard Windows Randomness Infrastructure
Section 11.5. Using an Application-Level Generator
Section 11.6. Reseeding a Pseudo-Random Number Generator
Section 11.7. Using an Entropy Gathering Daemon-Compatible Solution
Section 11.8. Getting Entropy or Pseudo-Randomness Using EGADS
Section 11.9. Using the OpenSSL Random Number API
Section 11.10. Getting Random Integers
Section 11.11. Getting a Random Integer in a Range
Section 11.12. Getting a Random Floating-Point Value with Uniform Distribution
Section 11.13. Getting Floating-Point Values with Nonuniform Distributions
Section 11.14. Getting a Random Printable ASCII String
Section 11.15. Shuffling Fairly
Section 11.16. Compressing Data with Entropy into a Fixed-Size Seed
Section 11.17. Getting Entropy at Startup
Section 11.18. Statistically Testing Random Numbers
Section 11.19. Performing Entropy Estimation and Management
Section 11.20. Gathering Entropy from the Keyboard
Section 11.21. Gathering Entropy from Mouse Events on Windows
Section 11.22. Gathering Entropy from Thread Timings
Section 11.23. Gathering Entropy from System State
Chapter 12. Anti-Tampering
Section 12.1. Understanding the Problem of Software Protection
Section 12.2. Detecting Modification
Section 12.3. Obfuscating Code
Section 12.4. Performing Bit and Byte Obfuscation
Section 12.5. Performing Constant Transforms on Variables
Section 12.6. Merging Scalar Variables
Section 12.7. Splitting Variables
Section 12.8. Disguising Boolean Values
Section 12.9. Using Function Pointers
Section 12.10. Restructuring Arrays
Section 12.11. Hiding Strings
Section 12.12. Detecting Debuggers
Section 12.13. Detecting Unix Debuggers
Section 12.14. Detecting Windows Debuggers
Section 12.15. Detecting SoftICE
Section 12.16. Countering Disassembly
Section 12.17. Using Self-Modifying Code
Chapter 13. Other Topics
Section 13.1. Performing Error Handling
Section 13.2. Erasing Data from Memory Securely
Section 13.3. Preventing Memory from Being Paged to Disk
Section 13.4. Using Variable Arguments Properly
Section 13.5. Performing Proper Signal Handling
Section 13.6. Protecting against Shatter Attacks on Windows
Section 13.7. Guarding Against Spawning Too Many Threads
Section 13.8. Guarding Against Creating Too Many Network Sockets
Section 13.9. Guarding Against Resource Starvation Attacks on Unix
Section 13.10. Guarding Against Resource Starvation Attacks on Windows
Section 13.11. Following Best Practices for Audit Logging
Colophon
Index
Copyright
Copyright © 2003 O'Reilly & Associates, Inc.
Printed in the United States of America.
Published by O'Reilly & Associates, Inc., 1005 Gravenstein Highway North, Sebastopol, CA 95472.
O'Reilly & Associates books may be purchased for educational, business, or sales promotional use.
Online editions are also available for most titles (http://safari.oreilly.com). For more information,
contact our corporate/institutional sales department: (800) 998-9938 or [email protected]
Nutshell Handbook, the Nutshell Handbook logo, and the O'Reilly logo are registered trademarks of
O'Reilly & Associates, Inc. Many of the designations used by manufacturers and sellers to distinguish
their products are claimed as trademarks. Where those designations appear in this book, and O'Reilly
& Associates, Inc. was aware of a trademark claim, the designations have been printed in caps or
initial caps. The association between the image of a crested porcupine and the topic of secure
programming with C and C++ is a trademark of O'Reilly & Associates, Inc.
While every precaution has been taken in the preparation of this book, the publisher and authors
assume no responsibility for errors or omissions, or for damages resulting from the use of the
information contained herein.
Foreword
There is a humorous, computing-related aphorism that goes like this: "There are 10 types of people:
those who understand binary, and those who don't." Besides being amusing to people who
understand number representation, this saying can be used to group people into four (or 100)
categories:
Those who will never quite get the meaning of the statement, even if it is explained to them
Those who need some explanation, but will eventually get the meaning
Those who have the background to grasp the meaning when they read it
Those who have the knowledge and understanding to not only see the statement as obvious,
but be able to come up with it independently on their own
There are parallels for these four categories in many different areas of endeavor. You can apply it to
art, to cooking, to architecture...or to writing software. I have been teaching aspects of software
engineering and security for over 20 years, and I have seen it up close. When it comes to writing
reliable software, there are four kinds of programmers:
Those who are constantly writing buggy code, no matter what
Those who can write reasonable code, given coaching and examples
Those who write good code most of the time, but who don't fully realize their limitations
Those who really understand the language, the machine architecture, software engineering, and
the application area, and who can write textbook code on a regular basis
The gap between the third category and the fourth may not seem like much to some readers, but
there are far fewer people in that last category than you might think. It's also the case that there are
lots of people in the third category who would claim they are in the fourth, but really aren't...similar
to the 70% of all licensed drivers who say they are in the top 50% of safe drivers. Being an objective
judge of one's own abilities is not always possible.
What compounds the problem for us all is that programmers are especially unlikely to realize (or are
unwilling to admit) their limits. There are levels and degrees of complexity when working with
computers and software that few people completely understand. However, programmers generally
hold a world view that they can write correct code all the time, and only occasionally do mistakes
occur, when in reality mistakes are commonplace in nearly everyone's code. As with the four
categories, or the drivers, or any other domain where skill and training are required, the experts with
real ability are fewer in number than those who believe they are expert. The result is software that
may be subtly (or catastrophically) incorrect.
A program with serious flaws may compile properly, and work with obvious inputs. This helps
reinforce the view that the code is correct. If something later exposes a flaw, many programmers will
say that a "bug" somehow "got into the code." Or maybe "it's a computer problem." Neither is
candid. Instead, whoever designed and built the system made mistakes. As a profession, we are
unwilling to take responsibility when we code things incorrectly. Is it any wonder that a recent NIST
study estimated that industry in the United States alone is spending $60 billion a year patching and
customizing badly-written software? Is it a surprise that there are thousands of security patches per
year for common software platforms? We've seen estimates that go as high as $1.5 trillion in
damages per year worldwide for security problems alone, and simple crashes and errors may be
more than 10 times as much. These are not rare flaws causing problems. There is a real crisis in
producing quality software.
The reality is that if we truly face up to the situation, we might reassess some conventional beliefs.
For instance, it is not true that a system is more secure because we can patch the source code when
a flaw is discovered. A system is secure or it is not; there is no "more secure." You can't say a car is
safer because you can replace the fenders yourself after the brakes give out and it goes over a cliff,
either. A system is secure if there are no flaws that lead to a violation of policy. Being able to install
the latest patch to the latest bad code doesn't make a system safer. If anything, after we've done it a
few times, it should perhaps reduce our confidence in the quality of the software.
An honest view of programming might also cause us to pay more attention to design: to capturing
requirements and developing specifications. Too often we end up with code that is put together
without understanding the needs (and the pitfalls) of the environment where it will be used. The
result is software that misbehaves when someone runs it in a different environment, or with
unexpected input. There's a saying that has been attributed to Brian Kernighan, but which appears to
have first been written down by W. D. Young, W.E. Boebert, and R.Y. Kain in 1985: "A program that
has not been specified cannot be incorrect; it can only be surprising." Most of the security patches
issued today are issued to eliminate surprises because there are no specifications for the underlying
code. As a profession, we write too much surprising code.
I could go on, but I hope my points are clear: there are some real problems in the way software is
being produced, and those problems lead to some serious (and expensive) problems. However,
problem-free software and absolute security are almost always beyond our reach in any significant
software project, so the next best thing is to identify and reduce the risks. Proven approaches to
reduce these risks include using established methods of software engineering, exercising care in
design and development, reusing proven software, and thinking about how to handle potential errors.
This is the process of assurance: of building trust in our systems. Assurance needs to be built in
rather than asserted after the software is finished.
That's why this book is so valuable. It can help people write correct, robust software the first time
and avoid many of the surprises. The material in this book can help you provide a network connection
with end-to-end security, as well as help you eliminate the need to patch the code because you didn't
add enough entropy to key generation, or you failed to change the UID/GID values in the correct
order. Using this code you can get the environment set correctly, the signals checked, and the file
descriptors the way you need them. And along the way, you can read a clear, cogent description
about what needs to be set and why in each case. Add in some good design and careful testing, and
a lot of the surprises go away.
Are all the snippets of code in this book correct? Well, correct for what? There are many other things
that go into writing reliable code, and they depend on the context. The code in this book will only get
you partway to your goal of good code. As with any cookbook, you may need to adjust the portions
or add a little extra seasoning to match your overall menu. But before you do that, be sure you
understand the implications! The authors of this book have tried to anticipate most of the
circumstances where you would use their code, and their instructions can help you avoid the most
obvious problems (and many subtle ones). However, you also need to build the rest of the code
properly, and run it on a well-administered system. (For that, you might want to check out some of
the other O'Reilly books, such as Secure Coding by Mark Graff and Kenneth van Wyk, and Practical
Unix and Internet Security by Simson Garfinkel, Gene Spafford, and Alan Schwartz.)
So, let's return to those four categories of programmers. This book isn't likely to help the group of
people who are perpetually unclear on the concepts, but it is unlikely to hurt them. It will do a lot to
help the people who need guidance and examples, because it contains the text as well as the code.
The people who write good software most of the time could learn a lot by reading this book, and
using the examples as starting points. And the experts are the ones who will readily adopt this code
(with, perhaps, some small adaptations); expert coders know that reuse of trusted components is a
key method of avoiding mistakes. Whichever category of programmer you think you are in, you will
probably benefit from reading this book and using the code.
Maybe if enough people catch on to what it means to write reliable code, and they start using
references such as this book, we can all start saying "There are 10 kinds of computer programmers:
those who write code that breaks, and those who read O'Reilly books."
-Gene Spafford, June 2003
Preface
We don't think we need to tell you that writing secure software is incredibly difficult, even for the
experts. We're not going to waste any time trying to convince you to start thinking about
security; we assume you're already doing that.
Our goal here is to provide you with a rich set of code samples that you can use to help secure the C
and C++ programs you write, for both Unix[1] and Windows environments.
[1]
We know Linux is not a true Unix, but we will lump it in there throughout this book for the sake of
convenience.
There are already several other books out there on the topic of writing secure software. Many of
them are quite good, but they universally focus on the fundamentals, not code. That is, they cover
basic secure programming principles, and they usually explain how to design for security and perform
risk assessments. Nevertheless, none of them show you by example how to do such things as SSL-enable your applications properly, which can be surprisingly difficult.
Fundamental software security skills are important, and everybody should master them. But, in this
book, we assume that you already have the basics under your belt. We do talk about design
considerations, but we do so compactly, focusing instead on getting the implementation details
correct. If you need a more in-depth treatment of basic design principles, there are now several good
books on this topic, including Building Secure Software (Addison Wesley). In addition, on this book's
web site, we provide links to background resources that are available on the Internet.
More Than Just a Book
There is no way we could cover all the topics we wanted to cover in a reasonable number of pages. In
this book, we've had to focus on the recipes and technologies we thought would be most universally
applicable. In addition, we've had to focus on the C programming language, with some quick forays
into C++ when important, and a bit of assembly when there's no other way.
We hope this book will do well enough that we'll be able to produce versions for other programming
languages. Until then, we are going to solve both of the aforementioned problems at once with our
web site, http://www.secureprogramming.com, which you can also get to from the book's web page
on the O'Reilly site (http://oreilly.com/catalog/secureprgckbk/). Not only can you find errata there,
but you can also find and submit secure programming recipes that are not in the book. We will put on
the site recipes that we validate to be good. The goal of the site is to be a living, breathing resource
that can evolve as time progresses.
We Can't Do It All
There are plenty of things that people may find to criticize about this book. It's too broad a topic to
make a perfect book (that's the motivation for the web site, actually). Although we believe that this
book is likely to help you a great deal, we do want to address some specific issues so at least you'll
know what you're getting if you buy this book:
This book is implementation-focused.
You're not likely to build secure software if you don't know how to design software to be secure
from the get-go. We know that well, and we discuss it at great length in the book Building
Secure Software. On the other hand, it's at least as easy to have a good design that results in
an insecure implementation, particularly when C is the programming language you're using.
Not only do our implementation-level solutions incorporate good design principles, but we also
discuss plenty of issues that will affect your designs as well as your implementations. The world
needs to know both how to design and how to implement with security in mind. We focus on
the implementation so that you'll do a better job of it. Nonetheless, we certainly recommend
that you read a book that thoroughly covers design before you read this book.
This book doesn't cover C++ well enough.
C++ programmers may grumble that we don't use any C++ specific idioms. For the most part,
the advice we give applies to both languages, but giving all the examples in C makes them
more applicable, because practitioners in both languages can still use them. On the rare
occasion that there are things to note that are specific to C++, we certainly try to do so;
examples include our discussions of buffer overflows and the use of exception handling to
prevent leaving programs in an insecure state. Over time, our coverage of C++ will improve on
the book's web site, but, until then, C++ programmers should still find this book relevant.
This book doesn't always force you to do the secure thing.
Some people would rather we take the approach of showing you one right way to do the few
things you should be doing in your applications. For example, we could simply cover ways to
create a secure channel, instead of talking about all the different low-level cryptographic
primitives and the many ways to use them. We do provide a lot of high-level solutions that
we'd strongly prefer you use. On the other hand, we have consulted on so many real-world
systems that we know all too well that some people need to trade off the absolute best security
possible for some other requirement. The whole security game is about risk mitigation, and it's
up to you to decide what your acceptable levels of risk are. We have tried to accommodate
people who may have nonstandard requirements, and to teach those people the risks involved
in what they're doing. If we simply provide high-level solutions, many people won't use them,
and will continue to build their own ad hoc solutions without adequate guidance.
This book could be friendlier to Windows developers.
In general, we cover the native Win32 API, rather than the variety of other API sets that
Microsoft offers, such as ATL and MFC. It would simply be infeasible to cover all of them, so
we've opted to cover the one that everything else builds on. We're sorry if you have to go to a
lower-level API than you might like if you want to use our code, but at least this way the
recipes are more widely applicable.
Much of the code that we present in the book will work on both Unix and Windows with little or
no modification. In these cases, we've favored traditional Unix naming conventions. The
naming conventions may feel foreign, but the bottom line is that no matter what platform
you're writing code for, naming conventions are a matter of personal preference.
If you thumb through the table of contents, you'll quickly find that this book contains a
considerable amount of material relating to cryptography. Where it was reasonable to do so,
we've covered CryptoAPI for Windows, but on the whole, OpenSSL gets far better coverage. It
is our experience that CryptoAPI is not nearly as full-featured as OpenSSL in many respects.
Further, some of the built-in Windows APIs for things such as SSL are far more complex than
we felt was reasonable to cover. Security is something that is difficult to get right even with a
good API to work with; an overly complex and underdocumented API certainly doesn't help the
situation.
We've tried our best to give Unix and Windows equivalent coverage. However, for some topic
areas, one platform may receive more in-depth attention. Generally, this is because of a
specific strength or weakness in the platform. We do believe both Windows and Unix
programmers can benefit from the material contained in this book.
There will still be security problems in code despite this book.
We have done our best to give you the tools you need to make your code a lot better. But even
security gurus occasionally manage to write code with much bigger risks than anticipated. You
should expect that it may happen to you, too, no matter what you know about security. One
caveat: you should not use the code in this book as if it were a code library you can simply link
against. You really need to read the text and understand the problems our code is built to avoid
to make sure that you actually use our code in the way it was intended. This is no different
from any other API, where you really should RTFM thoroughly before coding if you want to
have a chance of getting things right.
Despite the shortcomings some readers may find, we think this book has a great deal to offer. In
addition, we will do the best job we can to supplement this book on the Web in hopes of making the
material even better.
Organization of This Book
Because this book is a cookbook, the text is not presented in tutorial style; it is a comprehensive
reference, filled with code that meets common security needs. We do not intend for this book to be
read straight through. Instead, we expect that you will consult this book when you need it, just to
pick out the information and code that you need.
To that end, here is a strategy for getting the most out of this book:
Each recipe is named in some detail. Browse through the table of contents and through the list
of supplemental recipes on the book's web site.
Before reading appropriate recipes, take a look at the chapter introduction and the first few
recipes in the chapter for fundamental background on the topic.
Sometimes, we offer a general recipe providing an overview of possible solutions to a problem,
and then more specific recipes for each solution. For example, we have a generic recipe on
buffer overflows that helps you determine which technology is best for your application; then
there are recipes covering specific technologies that couldn't have been covered concisely in the
overview.
If particular concepts are unclear, look them up in the glossary, which is available on the book's
web site.
Throughout each recipe, we detail potential "gotchas" that you should consider, so be sure to
read recipes in their entirety.
The book is divided into 13 chapters:
Chapter 1, Safe Initialization, provides recipes for making sure your programs are in a secure state
on startup and when calling out to other programs.
Chapter 2, Access Control, shows how to manipulate files and directories in a secure manner. We
demonstrate both the Unix permissions model and the Windows access control lists used to protect
files and other resources.
Chapter 3, Input Validation, teaches you how to protect your programs from malicious user input. In
this chapter, we demonstrate techniques for preventing things like buffer overflow problems, cross-site scripting attacks, format string errors, and SQL-injection attacks.
Chapter 4, Symmetric Cryptography Fundamentals, covers basic encoding and storage issues that
are often helpful in traditional encryption.
Chapter 5, Symmetric Encryption, shows how to choose and use symmetric encryption primitives
such as AES, the Advanced Encryption Standard.
Chapter 6, Hashes and Message Authentication, focuses on ensuring data integrity using message
authentication codes.
Chapter 7, Public Key Cryptography, teaches you how to use basic public key algorithms such as
RSA.
Chapter 8, Authentication and Key Exchange, shows you how to manipulate login credentials. We focus on implementing
password-based systems as securely as possible, because this is what most people want to use. Here
we also cover a wide variety of technologies, including PAM and Kerberos.
Chapter 9, Networking, provides code for securing your network connections. We discuss SSL and
TLS, and also describe more lightweight protocols for when you do not want to set up a public key
infrastructure. We strongly encourage you to come here before you go to the cryptography chapters,
because it is exceedingly difficult to build a secure network protocol from parts.
Chapter 10, Public Key Infrastructure, is largely a supplement for Chapter 9 for when you are using a
public key infrastructure (PKI), as well as when you are using the SSL/TLS protocol. In this chapter,
we demonstrate best practices for using a PKI properly. For example, we show how to determine
whether certificates have expired or are otherwise invalid.
Chapter 11, Random Numbers, describes how to get secure random data and turn such data into an
efficient and secure stream of pseudo-random numbers.
Chapter 12, Anti-Tampering, gives you the foundations necessary to start protecting your software
against reverse engineering. There are no absolute solutions in this area, but if you are willing to put
a lot of effort into it, you can make reverse engineering significantly more difficult.
Chapter 13, Other Topics, contains a potpourri of topics that did not fit into other chapters, such as
erasing secrets from memory properly, writing a secure signal handler, and preventing common
attacks against the Windows messaging system.
In addition, our web site contains a glossary providing a comprehensive listing of the many security-related terms used throughout this book, complete with concise definitions.
Recipe Compatibility
Most of the recipes in this book are written to work on both Unix and Windows platforms. In some
cases, however, we have provided different versions for these platforms. In the individual recipes,
we've noted any such issues. For convenience, Table P-1 lists those recipes that are specific to one
particular platform. Note also that in a few cases, recipes work only on particular variants of Unix.
Table P-1. Platform-specific recipes

Recipe   System     Recipe   System
1.1      Unix       8.2      Unix
1.2      Windows    8.3      Windows
1.3      Unix       8.6      Unix
1.4      Unix       8.9      Unix
1.5      Unix       8.13     Unix
1.6      Unix       9.5      Windows
1.7      Unix       9.9      Unix[2]
1.8      Windows    10.6     Windows
1.9      Unix       10.11    Windows
1.5      Unix       11.3     Unix
2.1      Unix       11.4     Windows
2.2      Windows    11.7     Unix
2.3      Unix       11.21    Windows
2.7      Unix       12.13    Unix
2.9      Unix       12.14    Windows
2.10     Windows    12.15    Windows
2.12     Unix       12.17    Unix[3]
2.13     FreeBSD    13.5     Unix
5.25     Windows    13.6     Windows
5.26     Windows    13.9     Unix
5.26     Windows    13.10    Windows

[2] This recipe works for FreeBSD, Linux, and NetBSD. It does not work for Darwin, OpenBSD, and Solaris.
[3] This recipe works for FreeBSD, Linux, NetBSD, OpenBSD, and Solaris. It does not work for Darwin.
Conventions Used in This Book
The following typographical conventions are used in this book:
Italic
Is used for filenames, directory names, and URLs. It is also used for emphasis and for the first
use of a technical term.
Constant width
Is used for code examples. It is also used for functions, arguments, structures, environment
variables, data types, and values.
Indicates a tip, suggestion, or general note.
Indicates a warning or caution.
Comments and Questions
We have tested and verified the information in this book to the best of our ability, but you may find
that we have made mistakes.
If you find problems with the book or have technical questions, please begin by visiting our web site
to see whether your concerns are addressed:
http://www.secureprogramming.com
As mentioned earlier, we keep an updated list of known errors in the book on that page, along with
new recipes. You can also submit your own recipes or suggestions for new recipes on that page.
If you do not find what you're looking for on our web site, feel free to contact us by sending email to:
[email protected]
You may also contact O'Reilly directly with questions or concerns:
O'Reilly & Associates
1005 Gravenstein Highway North
Sebastopol, CA 95472
(800) 998-9938 (in the United States or Canada)
(707) 829-0515 (international/local)
(707) 829-0104 (fax)
To ask technical questions or comment on the book, send email to:
[email protected]
The O'Reilly web site for the book lists errata and any plans for future editions. You can access this
page at:
http://www.oreilly.com/catalog/secureprgckbk
For information about other books and O'Reilly in general, see the O'Reilly web site:
http://www.oreilly.com
Acknowledgments
This book is all the better for its reviewers, including Seth Arnold, Theo de Raadt, Erik Fichtner, Bob
Fleck, Simson Garfinkel, Russ Housley, Mike Howard, Douglas Kilpatrick, Tadayoshi Kohno, John
Regehr, Ray Schneider, Alan Schwartz, Fred Tarabay, Rodney Thayer, David Wagner, Scott Walters,
and Robert Zigweid. In addition, we would like to single out Tom O'Connor for his Herculean efforts in
review and detailed comments.
Zakk Girouard did a lot of background work for us on material in Chapter 1, Chapter 2, Chapter 3,
and Chapter 8, and wrote some text for us. We're very grateful, and, dude, we're sorry we didn't
make it to your winter solstice party; we tried!
We'd also like to thank the wonderful staff at O'Reilly, particularly our editor, Debby Russell. They
were all extraordinarily accommodating, and it was a pleasure working with them. In fact, this
project was originally O'Reilly's idea. Sue Miller, our first editor at O'Reilly, initially suggested a
Cryptography Cookbook that we were happy to do, and it evolved from there. Thanks for tapping us
to write it. Thanks as well to Jon Orwant, who helped in the initial stages of the project.
Many thanks to Gene Spafford for contributing a wonderful foreword to this book and for his many
contributions to the field.
Matt Mackall lent us his expertise, helping us to write Recipe 11.19 and providing good feedback on
the rest of Chapter 11.
Chapter 12 was written "on the clock," by Secure Software staff, thanks to a contract from the Air
Force Research Labs. Martin Stytz and Dawn Ross were responsible for the contract on the Air Force
side, and they were a pleasure to work with. Eric Fedel, Zachary Girouard, and Paolo Soto were part
of the technical work on this effort, and Kaye Kirsch provided (fantastic) administrative support.
Thanks to everyone at Secure Software for supporting this book, including Admiral Guy Curtis, Kaye
Kirsch, and Peter Thimmesch. In addition, we'd like to thank Bill Coleman for being an all-around cool
guy, even though he 12:10'd much of our caffeine supply and our stash of late-night snacks.
Finally, we'd like to thank Strong Bad for teaching us how to type up a book while wearing boxing
gloves.
John Viega: Thanks to Crispin Cowan, Jeremy Epstein, Eric Fedel, Bob Fleck, Larry Hiller, Russ
Housley, Tetsu Iwata, Tadayoshi Kohno, Ben Laurie, David McGrew, Rodney Thayer, David Wagner,
Doug Whiting, and Jason Wright for conversations that ended up improving this book, either directly
or indirectly. Thanks also to my good friend Paul Wouters for hosting the book's web site. And, as
always, thanks to my family for their unflagging support. My daughters Emily and Molly deserve
special thanks, because time I spend writing is often time I don't get to spend with them. Of course,
if they were given a choice in the matter, this book probably wouldn't exist....
Over the years I've been lucky to have a number of excellent mentors. Thanks to Matt Conway, Russ
Housley, Gary McGraw, Paul Reynolds, Greg Stein, and Peter Thimmesch-you were/are all excellent
at the role.
I'd also like to thank Matt Messier for the awesome job he did on the book. I'm sorry it was so much
more work than it was intended to be!
Finally, I would like to thank sugar-free Red Bull and Diet Dr. Pepper for keeping me awake to write.
Narcolepsy is a pain.
Matt Messier: I would like to thank Jim Archer, Mike Bilow, Eric Fedel, Bob Fleck, Brian Gannon, Larry
Hiller, Fred Tarabay, Steve Wells, and the Rabble Babble Crew (Ellen, Brad, Gina, and Michael
especially) for moral support, and for listening to me ramble about whatever I happened to be writing
about at the time, regardless of how much or how little sense I was making. An extra special "thank
you" to my parents, without whom I would never be writing these words.
Thanks also to John Viega for pulling me in to work on this book, and for consistently pushing to
make it as great as I believe it is. John, it's been a pleasure working with you.
Finally, a big thanks goes out to Red Bull and to Peter's wonderful contribution of the espresso
machine in the kitchen that got me going every morning.
Chapter 1. Safe Initialization
Robust initialization of a program is important from a security standpoint, because the more parts of
the program's environment that can be validated (e.g., input, privileges, system parameters) before
any critical code runs, the better you can minimize the risks of many types of exploits. In addition,
setting a variety of operating parameters to a known state will help thwart attackers who run a
program in a hostile environment, hoping to exploit some assumption in the program regarding an
external resource that the program accesses (either directly or indirectly). This chapter outlines some
of these potential problems, and suggests solutions that work towards reducing the associated risks.
1.1 Sanitizing the Environment
1.1.1 Problem
Attackers can often control the value of important environment variables, sometimes even
remotely; for example, in CGI scripts, where invocation data is passed through environment
variables.
You need to make sure that an attacker does not set environment variables to malicious values.
1.1.2 Solution
Many programs and libraries, including the shared library loader on both Unix and Windows systems,
depend on environment variable settings. Because environment variables are inherited from the
parent process when a program is executed, an attacker can easily sabotage variables, causing your
program to behave in an unexpected and even insecure manner.
Typically, Unix systems are considerably more dependent on environment variables than are
Windows systems. In fact, the only scenario common to both Unix and Windows is that there is an
environment variable defining the path that the system should search to find an executable or shared
library (although differently named variables are used on each platform). On Windows, one
environment variable controls the search path for finding both executables and shared libraries. On
Unix, these are controlled by separate environment variables. Generally, you should not specify a
filename and then rely on these variables for determining the full path. Instead, you should always
use absolute paths to known locations.[1]
[1]
Note that the shared library environment variable can be relatively benign on modern Unix-based operating
systems, because the environment variable will get ignored when a program that can change permissions (i.e.,
a setuid program) is invoked. Nonetheless, it is better to be safe than sorry!
Certain variables expected to be present in the environment can cause insecure program behavior if
they are missing or improperly set. Make sure, therefore, that you never fully purge the environment
and leave it empty. Instead, variables that should exist should be forced to sane values or, at the
very least, treated as highly suspect and examined closely before they're used. Remove any unknown
variables from the environment altogether.
1.1.3 Discussion
The standard C runtime library defines a global variable,[2] environ, as a NULL-terminated array of
strings, where each string in the array is of the form "name=value".
[2]
The use of the term "variable" can quickly become confusing because C defines variables and the
environment defines variables. In this recipe, when we are referring to a C variable, we simply say "variable,"
and when we are referring to an environment variable, we say "environment variable."
Most systems do not declare the variable in any standard header file, Linux being the notable
exception, providing a declaration in unistd.h. You can gain access to the variable by including the
following extern statement in your code:
extern char **environ;
Several functions defined in stdlib.h, such as getenv( ) and putenv( ), provide access to
environment variables, and they all operate on this variable. You can therefore make changes to the
contents of the array or even build a new array and assign it to the variable.
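As a quick illustration (a minimal sketch of our own, not part of the recipe's code), the following program walks environ and prints each "name=value" string; it assumes a Unix-style C runtime where the extern declaration shown above is sufficient:

#include <stdio.h>

extern char **environ;

int main(void) {
  char **p;

  /* environ is a NULL-terminated array of "name=value" strings. */
  for (p = environ;  p && *p;  p++)
    printf("%s\n", *p);
  return 0;
}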
This variable also exists in the standard C runtime library on Windows; however, the C runtime on
Windows is not as tightly bound to the operating system as it is on Unix. Directly manipulating the
environ variable on Windows will not necessarily produce the same effects as it will on Unix; in the
majority of Windows programs, the C runtime is never used at all, instead favoring the Win32 API to
perform the same functions as those provided by the C runtime. Because of this, and because of
Windows' lack of dependence on environment variables, we do not recommend using the code in this
recipe on Windows. It simply does not apply. However, we do recommend that you at least skim the
textual content of this recipe so that you're aware of potential pitfalls that could affect you on
Windows.
On a Unix system, if you invoke the command printenv at a shell prompt, you'll likely see a sizable
list of environment variables as a result. Many of the environment variables you will see are set by
whichever shell you're using (i.e., bash or tcsh). You should never use nor trust any of the
environment variables that are set by the shell. In addition, a malicious user may be able to set other
environment variables.
In most cases, the information contained in the environment variables set by the shell can be
determined by much more reliable means. For example, most shells set the HOME environment
variable, which is intended to be the user's home directory. It's much more reliable to call getuid( )
to determine who the user is, and then call getpwuid( ) to get the user's password file record, which
will contain the user's home directory. For example:
#include <sys/types.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <pwd.h>

int main(int argc, char *argv[ ]) {
  uid_t         uid;
  struct passwd *pwd;

  uid = getuid( );
  printf("User's UID is %d.\n", (int)uid);
  if (!(pwd = getpwuid(uid))) {
    printf("Unable to get user's password file record!\n");
    endpwent( );
    return 1;
  }
  printf("User's home directory is %s\n", pwd->pw_dir);
  endpwent( );
  return 0;
}
The code above is not thread-safe. Be sure multiple threads do not try to
manipulate the password database at the same time.
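If your program is multithreaded, the usual workaround is a reentrant lookup. The sketch below is an illustration of our own (not from the recipe); it assumes a POSIX getpwuid_r( ) and uses an illustrative fixed-size buffer:

#include <sys/types.h>
#include <stdio.h>
#include <pwd.h>

/* Thread-safe lookup: getpwuid_r( ) fills a caller-supplied buffer instead of
 * returning a pointer to static storage shared by all threads.
 */
static int print_home_dir(uid_t uid) {
  char          buf[4096];  /* illustrative; sysconf(_SC_GETPW_R_SIZE_MAX) gives a better size */
  struct passwd pwd, *result;

  if (getpwuid_r(uid, &pwd, buf, sizeof(buf), &result) != 0 || !result)
    return -1;
  printf("User's home directory is %s\n", pwd.pw_dir);
  return 0;
}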
In many cases, it is reasonably safe to throw away most of the environment variables that your
program will inherit from its parent process, but you should make it a point to be aware of any
environment variables that will be used by code you're using, including the operating system's
dynamic loader and the standard C runtime library. In particular, dynamic loaders on ELF-based Unix
systems (among the Unix variants we're explicitly supporting in this book, Darwin is the major
exception here because it does not use ELF (Executable and Linking Format) for its executable
format) and most standard implementations of malloc( ) all recognize a wide variety of
environment variables that control their behavior.
In most cases, you should never be doing anything in your programs that will make use of the PATH
environment variable. Circumstances do exist in which it may be reasonable to do so, but make sure
to weigh your options carefully beforehand. Indeed, you should consider carefully whether you should
be using any environment variable in your programs. Regardless, if you launch external programs
from within your program, you may not have control over what the external programs do, so you
should take care to provide any external programs you launch with a sane and secure environment.
In particular, the two environment variables IFS and PATH should always be forced to sane values.
The IFS environment variable is somewhat obscure, but it is used by many shells to determine which
character separates command-line arguments. Modern Unix shells use a reasonable default value for
IFS if it is not already set. Nonetheless, you should defensively assume that the shell does nothing of
the sort. Therefore, instead of simply deleting the IFS environment variable, set it to something sane,
such as a space, tab, and newline character.
The PATH environment variable is used by the shell and some of the exec*( ) family of standard C
functions to locate an executable if a path is not explicitly specified. The search path should never
include relative paths, especially the current directory as denoted by a single period. To be safe, you
should always force the setting of the PATH environment variable to _PATH_STDPATH, which is defined
in paths.h. This value is what the shell normally uses to initialize the variable, but an attacker or
naïve user could change it later. The definition of _PATH_STDPATH differs from platform to platform,
so you should generally always use that value so that you get the right standard paths for the system
your program is running on.
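If all you need is to force these two variables, rather than rebuild the entire environment as the function later in this recipe does, a minimal sketch might use the POSIX setenv( ) function; the helper name below is our own:

#include <stdlib.h>
#include <paths.h>

/* Force IFS and PATH to sane values before any external program is spawned. */
static void force_ifs_and_path(void) {
  setenv("IFS", " \t\n", 1);         /* space, tab, and newline */
  setenv("PATH", _PATH_STDPATH, 1);  /* the system's standard search path */
}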
Finally, the TZ environment variable denotes the time zone that the program should use, when
relevant. Because users may not be in the same time zone as the machine (which will use a default
whenever the variable is not set), it is a good idea to preserve this variable if it is present. Most
systems will get along fine without it, but some will not function properly unless it is set. Note also
that this variable is generally used by the OS, not the application; if you do use it at the application
level, make sure to perform proper input validation to protect against problems such as buffer overflow.
Any other environment variables that are defined should be removed unless you know, for some
reason, that you need the variable to be set. For any environment variables you preserve, be sure to
treat them as untrusted user input. You may be expecting them to be set to reasonable values (and
in most cases, they probably will be), but never assume they are. If for some reason you're writing
CGI code in C, the list of environment variables passed from the web server to your program can be
somewhat large, but these are largely trustworthy unless an attacker somehow manages to wedge
another program between the web server and your program.
Of particular interest among environment variables commonly passed from a web server to CGI
scripts are any environment variables whose names begin with HTTP_ and those listed in Table 1-1.
Table 1-1. Environment variables commonly passed from web servers to CGI scripts

Environment variable name   Comments
AUTH_TYPE                   If authentication was required to make the request, this contains
                            the authentication type that was used, usually "BASIC".
CONTENT_LENGTH              The number of bytes of content, as specified by the client.
CONTENT_TYPE                The MIME type of the content sent by the client.
GATEWAY_INTERFACE           The version of the CGI specification with which the server complies.
PATH_INFO                   Extra path information from the URL.
PATH_TRANSLATED             Extra path information from the URL, translated by the server.
QUERY_STRING                The portion of the URL following the question mark.
REMOTE_ADDR                 The IP address of the remote client in dotted decimal form.
REMOTE_HOST                 The host name of the remote client.
REMOTE_IDENT                If RFC1413 identification was used, this contains the user name
                            that was retrieved from the remote identification server.
REMOTE_USER                 If authentication was required to make the request, this contains
                            the user name that was authenticated.
REQUEST_METHOD              The method used to make the current request, usually either "GET"
                            or "POST".
SCRIPT_NAME                 The name of the script that is running, canonicalized to the root
                            of the web site's document tree (e.g., DocumentRoot in Apache).
SERVER_NAME                 The host name or IP address of the server.
SERVER_PORT                 The port on which the server is running.
SERVER_PROTOCOL             The protocol used to make the request, typically "HTTP/1.0" or
                            "HTTP/1.1".
SERVER_SOFTWARE             The name and version of the server.
The code presented in this section defines a function called spc_sanitize_environment( ) that will
build a new environment with the IFS and PATH environment variables set to sane values, and with
the TZ environment variable preserved from the original environment if it is present. You can also
specify a list of environment variables to preserve from the original in addition to the TZ environment
variable.
The first thing that spc_sanitize_environment( ) does is determine how much memory it will need
to allocate to build the new environment. If the memory it needs cannot be allocated, the function
will call abort( ) to terminate the program immediately. Otherwise, it will then build the new
environment and replace the old environ pointer with a pointer to the newly allocated one. Note that
the memory is allocated in one chunk rather than in smaller pieces for the individual strings. While
this is not strictly necessary (and it does not provide any specific security benefit), it's faster and
places less strain on memory allocation. Note, however, that you should be performing this operation
early in your program, so heap fragmentation shouldn't be much of an issue.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <paths.h>

extern char **environ;

/* These arrays are both NULL-terminated. */
static char *spc_restricted_environ[ ] = {
  "IFS= \t\n",
  "PATH=" _PATH_STDPATH,
  0
};

static char *spc_preserve_environ[ ] = {
  "TZ",
  0
};

void spc_sanitize_environment(int preservec, char **preservev) {
  int    i;
  char   **new_environ, *ptr, *value, *var;
  size_t arr_size = 1, arr_ptr = 0, len, new_size = 0;

  for (i = 0;  (var = spc_restricted_environ[i]) != 0;  i++) {
    new_size += strlen(var) + 1;
    arr_size++;
  }
  for (i = 0;  (var = spc_preserve_environ[i]) != 0;  i++) {
    if (!(value = getenv(var))) continue;
    new_size += strlen(var) + strlen(value) + 2; /* include the '=' */
    arr_size++;
  }
  if (preservec && preservev) {
    for (i = 0;  i < preservec && (var = preservev[i]) != 0;  i++) {
      if (!(value = getenv(var))) continue;
      new_size += strlen(var) + strlen(value) + 2; /* include the '=' */
      arr_size++;
    }
  }

  new_size += (arr_size * sizeof(char *));
  if (!(new_environ = (char **)malloc(new_size))) abort( );
  new_environ[arr_size - 1] = 0;

  ptr = (char *)new_environ + (arr_size * sizeof(char *));
  for (i = 0;  (var = spc_restricted_environ[i]) != 0;  i++) {
    new_environ[arr_ptr++] = ptr;
    len = strlen(var);
    memcpy(ptr, var, len + 1);
    ptr += len + 1;
  }
  for (i = 0;  (var = spc_preserve_environ[i]) != 0;  i++) {
    if (!(value = getenv(var))) continue;
    new_environ[arr_ptr++] = ptr;
    len = strlen(var);
    memcpy(ptr, var, len);
    *(ptr + len) = '=';
    memcpy(ptr + len + 1, value, strlen(value) + 1);
    ptr += len + strlen(value) + 2; /* include the '=' */
  }
  if (preservec && preservev) {
    for (i = 0;  i < preservec && (var = preservev[i]) != 0;  i++) {
      if (!(value = getenv(var))) continue;
      new_environ[arr_ptr++] = ptr;
      len = strlen(var);
      memcpy(ptr, var, len);
      *(ptr + len) = '=';
      memcpy(ptr + len + 1, value, strlen(value) + 1);
      ptr += len + strlen(value) + 2; /* include the '=' */
    }
  }

  environ = new_environ;
}
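As a quick illustration of how you might call spc_sanitize_environment( ), the following fragment (not part of the recipe itself) preserves two additional variables before executing an external program; the variable names and program path are purely examples:

#include <unistd.h>

extern char **environ;

void spc_sanitize_environment(int preservec, char **preservev);

int main(void) {
  char *preserve[ ] = { "HOME", "TERM" };   /* illustrative names only */
  char *args[ ]     = { "id", 0 };

  spc_sanitize_environment(2, preserve);
  /* From here on, the process sees only IFS, PATH, TZ, HOME, and TERM */
  execve("/usr/bin/id", args, environ);
  return 127;   /* only reached if execve( ) fails */
}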
1.1.4 See Also
Recipe 1.7, Recipe 1.8
[ Team LiB ]
[ Team LiB ]
1.2 Restricting Privileges on Windows
1.2.1 Problem
Your Windows program runs with elevated privileges, such as Administrator or Local System, but it
does not require all the privileges granted to the user account under which it's running. Your program
never needs to perform certain actions that may be dangerous if users with elevated privileges run it
and an attacker manages to compromise the program.
1.2.2 Solution
When a user logs into the system or the service control manager starts a service, a token is created
that contains information about the user logging in or the user under which the service is running.
The token contains a list of all of the groups to which the user belongs (the user and each group in
the list is represented by a Security ID or SID), as well as a set of privileges that any thread running
with the token has. The set of privileges is initialized from the privileges assigned by the system
administrator to the user and the groups to which the user belongs.
Beginning with Windows 2000, it is possible to create a restricted token and force threads to run
using that token. Once a restricted token has been applied to a running thread, any restrictions
imposed by the restricted token cannot be lifted; however, it is possible to revert the thread back to
its original unrestricted token. With restricted tokens, it's possible to remove privileges, restrict the
SIDs that are used in access checking, and deny SIDs access. Restricted tokens are most useful
when combined with the CreateProcessAsUser( ) API to create a new process with a
restricted token that cannot be reverted to a more permissive token.
Beginning with Windows .NET Server 2003, it is possible to permanently remove privileges from a
process's token. Once the privileges have been removed, they cannot be added back. Any new
processes created by a process running with a modified token will inherit the modified token;
therefore, the same restrictions imposed upon the parent process are also imposed upon the child
process. Note that modifying a token is quite different from creating a restricted token. In particular,
only privileges can be removed; SIDs can be neither restricted nor denied.
1.2.3 Discussion
Tokens contain a list of SIDs, composed of the user's SID and one SID for each group of which the
user is a member. SIDs are assigned by the system when users and groups are created. In addition
to the SIDs, tokens also contain a list of restricted SIDs. When access checks are performed and the
token contains a list of restricted SIDs, the intersection of the two lists of SIDs contained in the token
is used to perform the access check. Finally, tokens also contain a list of privileges. Privileges define
specific access rights. For example, for a process to use the Win32 debugging API, the process's
token must contain the SeDebugPrivilege privilege.
The primary list of SIDs contained in a token cannot be modified. The token is created for a particular
user, and the token must always contain the user's SID along with the SIDs for each group of which
the user is a member. However, each SID in the primary list can be marked with a "deny" attribute,
which causes access to be denied when an access control list (ACL) contains a SID that is marked as
"deny" in the active token.
1.2.3.1 Creating restricted tokens
Using the CreateRestrictedToken( ) API, a restricted token can be created from an existing token.
The resulting token can then be used to create a new process or to set an impersonation token for a
thread. In the former case, the restricted token becomes the newly created process's primary token;
in the latter case, the thread can revert to its primary token, effectively making the restrictions
imposed by the restricted token useful for little more than helping to prevent accidents.
CreateRestrictedToken( ) requires a large number of arguments, and it may seem an intimidating
function to use, but with some explanation and examples, it's not actually all that difficult. The
function has the following signature:
BOOL CreateRestrictedToken(HANDLE ExistingTokenHandle, DWORD Flags,
DWORD DisableSidCount, PSID_AND_ATTRIBUTES SidsToDisable,
DWORD DeletePrivilegeCount, PLUID_AND_ATTRIBUTES PrivilegesToDelete,
DWORD RestrictedSidCount, PSID_AND_ATTRIBUTES SidsToRestrict,
PHANDLE NewTokenHandle);
This function has the following arguments:
ExistingTokenHandle
Handle to an existing token. An existing token handle can be obtained via a call to either
OpenProcessToken( ) or OpenThreadToken( ). The token may be either a primary or a
restricted token. In the latter case, the token may be obtained from an earlier call to
CreateRestrictedToken( ). The existing token handle must have been opened or created
with TOKEN_DUPLICATE access.
Flags
May be specified as 0, or as a combination of DISABLE_MAX_PRIVILEGE and SANDBOX_INERT. If
DISABLE_MAX_PRIVILEGE is used, all privileges in the new token are disabled, and the two
arguments DeletePrivilegeCount and PrivilegesToDelete are ignored. The SANDBOX_INERT flag
has no special meaning other than that it is stored in the token and can later be queried using
GetTokenInformation( ).
DisableSidCount
Number of elements in the list SidsToDisable. May be specified as 0 if there are no SIDs to be
disabled. Disabling a SID is the same as enabling the SID's "deny" attribute.
SidsToDisable
List of SIDs for which the "deny" attribute is to be enabled. May be specified as NULL if no SIDs
are to have the "deny" attribute enabled. See below for information on the
SID_AND_ATTRIBUTES structure.
DeletePrivilegeCount
Number of elements in the list PrivilegesToDelete. May be specified as 0 if there are no
privileges to be deleted.
PrivilegesToDelete
List of privileges to be deleted from the token. May be specified as NULL if no privileges are to
be deleted. See below for information on the LUID_AND_ATTRIBUTES structure.
RestrictedSidCount
Number of elements in the list SidsToRestrict. May be specified as 0 if there are no restricted
SIDs to be added.
SidsToRestrict
List of SIDs to restrict. If the existing token is a restricted token that already has restricted
SIDs, the resulting token will have a list of restricted SIDs that is the intersection of the
existing token's list and this list. May be specified as NULL if no restricted SIDs are to be added
to the new token.
NewTokenHandle
Pointer to a HANDLE that will receive the handle to the newly created token.
The function OpenProcessToken( ) will obtain a handle to the process's primary token, while
OpenThreadToken( ) will obtain a handle to the calling thread's impersonation token. Both functions
have a similar signature, though their arguments are treated slightly differently:
BOOL OpenProcessToken(HANDLE hProcess, DWORD dwDesiredAccess, PHANDLE phToken);
BOOL OpenThreadToken(HANDLE hThread, DWORD dwDesiredAccess, BOOL bOpenAsSelf,
PHANDLE phToken);
These functions have the following arguments:
hProcess
Handle to the current process, which is normally obtained via a call to GetCurrentProcess( ).
hThread
Handle to the current thread, which is normally obtained via a call to GetCurrentThread( ).
dwDesiredAccess
Bit mask of the types of access desired for the returned token handle. For creating restricted
tokens, this must always include TOKEN_DUPLICATE. If the restricted token being created will
be used as a primary token for a new process, you must include TOKEN_ASSIGN_PRIMARY;
otherwise, if the restricted token that will be created will be used as an impersonation token for
the thread, you must include TOKEN_IMPERSONATE.
bOpenAsSelf
Boolean flag that determines how the access check for retrieving the thread's token is
performed. If specified as FALSE, the access check uses the calling thread's permissions. If
specified as TRUE, the access check uses the calling process's permissions.
phToken
Pointer to a HANDLE that will receive the handle to the process's primary token or the thread's
impersonation token, depending on whether you're calling OpenProcessToken( ) or
OpenThreadToken( ).
Creating a new process with a restricted token is done by calling CreateProcessAsUser( ), which
works just as CreateProcess( ) does (see Recipe 1.8) except that it requires a token to be used as
the new process's primary token. Normally, CreateProcessAsUser( ) requires that the active token
have the SeAssignPrimaryTokenPrivilege privilege, but if a restricted token is used, that privilege
is not required. The following pseudo-code demonstrates the steps required to create a new process
with a restricted primary token:
HANDLE hProcessToken, hRestrictedToken;
/* First get a handle to the current process's primary token */
OpenProcessToken(GetCurrentProcess( ), TOKEN_DUPLICATE | TOKEN_ASSIGN_PRIMARY,
&hProcessToken);
/* Create a restricted token with all privileges removed */
CreateRestrictedToken(hProcessToken, DISABLE_MAX_PRIVILEGE, 0, 0, 0, 0, 0, 0,
&hRestrictedToken);
/* Create a new process using the restricted token */
CreateProcessAsUser(hRestrictedToken, ...);
/* Cleanup */
CloseHandle(hRestrictedToken);
CloseHandle(hProcessToken);
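For readers who want to see the call spelled out, here is a hedged, more complete sketch of the same sequence; the CreateProcessAsUser( ) arguments shown (and the program path in particular) are illustrative only and are not part of the recipe:

#include <windows.h>

BOOL SpcRunRestrictedChild(VOID) {
  HANDLE              hProcessToken, hRestrictedToken;
  STARTUPINFO         si;
  PROCESS_INFORMATION pi;
  BOOL                bResult;

  if (!OpenProcessToken(GetCurrentProcess( ),
                        TOKEN_DUPLICATE | TOKEN_ASSIGN_PRIMARY, &hProcessToken))
    return FALSE;
  if (!CreateRestrictedToken(hProcessToken, DISABLE_MAX_PRIVILEGE, 0, 0, 0, 0,
                             0, 0, &hRestrictedToken)) {
    CloseHandle(hProcessToken);
    return FALSE;
  }

  /* Launch an example worker program with the restricted token as its
   * primary token.  The path is hypothetical.
   */
  ZeroMemory(&si, sizeof(si));
  si.cb = sizeof(si);
  bResult = CreateProcessAsUser(hRestrictedToken, TEXT("C:\\Tools\\worker.exe"),
                                0, 0, 0, FALSE, 0, 0, 0, &si, &pi);
  if (bResult) {
    CloseHandle(pi.hThread);
    CloseHandle(pi.hProcess);
  }

  CloseHandle(hRestrictedToken);
  CloseHandle(hProcessToken);
  return bResult;
}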
Setting a thread's impersonation token requires a bit more work. Unless the calling thread is
impersonating, calling OpenThreadToken( ) will result in an error because the thread does not have
an impersonation token and thus is using the process's primary token. Likewise, calling
SetThreadToken( ) unless impersonating will also fail because a thread cannot have an
impersonation token if it's not impersonating.
If you want to restrict a thread's access rights temporarily, the easiest solution to the problem is to
force the thread to impersonate itself. When impersonation begins, the thread is assigned an
impersonation token, which can then be obtained via OpenThreadToken( ). A restricted token can be
created from the impersonation token, and the thread's impersonation token can then be replaced
with the new restricted token by calling SetThreadToken( ).
The following pseudo-code demonstrates the steps required to replace a thread's impersonation
token with a restricted one:
HANDLE hRestrictedToken, hThread, hThreadToken;
/* First begin impersonation */
ImpersonateSelf(SecurityImpersonation);
/* Get a handle to the current thread's impersonation token */
hThread = GetCurrentThread( );
OpenThreadToken(hThread, TOKEN_DUPLICATE | TOKEN_IMPERSONATE, TRUE, &hThreadToken);
/* Create a restricted token with all privileges removed */
CreateRestrictedToken(hThreadToken, DISABLE_MAX_PRIVILEGE, 0, 0, 0, 0, 0, 0,
&hRestrictedToken);
/* Set the thread's impersonation token to the new restricted token */
SetThreadToken(&hThread, hRestrictedToken);
/* ... perform work here */
/* Revert the thread's impersonation token back to its original */
SetThreadToken(&hThread, 0);
/* Stop impersonating */
RevertToSelf( );
/* Cleanup */
CloseHandle(hRestrictedToken);
CloseHandle(hThreadToken);
1.2.3.2 Modifying a process's primary token
Beginning with Windows .NET Server 2003, support for a new flag has been added to the function
AdjustTokenPrivileges( ); it allows a privilege to be removed from a token, rather than simply
disabled. Once the privilege has been removed, it cannot be added back to the token. In older
versions of Windows, privileges could only be enabled or disabled using AdjustTokenPrivileges( ),
and there was no way to remove privileges from a token without duplicating it. There is no way to
substitute another token for a process's primary token; the best you can do in older versions of
Windows is to use restricted impersonation tokens. AdjustTokenPrivileges( ) has the following
signature:
BOOL AdjustTokenPrivileges(HANDLE TokenHandle, BOOL DisableAllPrivileges,
PTOKEN_PRIVILEGES NewState, DWORD BufferLength,
PTOKEN_PRIVILEGES PreviousState, PDWORD ReturnLength);
This function has the following arguments:
TokenHandle
Handle to the token that is to have its privileges adjusted. The handle must have been opened
with TOKEN_ADJUST_PRIVILEGES access; in addition, if PreviousState is to be filled in, it must
have TOKEN_QUERY access.
DisableAllPrivileges
Boolean argument that specifies whether all privileges held by the token are to be disabled. If
specified as TRUE, all privileges are disabled, and the NewState argument is ignored. If
specified as FALSE, privileges are adjusted according to the information in the NewState
argument.
NewState
List of privileges that are to be adjusted, along with the adjustment that is to be made for
each. Privileges can be enabled, disabled, and removed. The TOKEN_PRIVILEGES structure
contains two fields: PrivilegeCount and Privileges. PrivilegeCount is simply a DWORD that
indicates how many elements are in the array that is the Privileges field. The Privileges
field is an array of LUID_AND_ATTRIBUTES structures, for which the Attributes field of each
element indicates how the privilege is to be adjusted. A value of 0 disables the privilege,
SE_PRIVILEGE_ENABLED enables it, and SE_PRIVILEGE_REMOVED removes the privilege. See
Section 1.2.3.4 later in this section for more information regarding these structures.
BufferLength
Length in bytes of the PreviousState buffer. May be 0 if PreviousState is NULL.
PreviousState
Buffer into which the state of the token's privileges prior to adjustment is stored. It may be
specified as NULL if the information is not required. If the buffer is not specified as NULL, the
token must have been opened with TOKEN_QUERY access.
ReturnLength
Pointer to an integer into which the number of bytes written into the PreviousState buffer will
be placed. May be specified as NULL if PreviousState is also NULL.
The following example code demonstrates how AdjustTokenPrivileges( ) can be used to remove
backup and restore privileges from a token:
#include <windows.h>
BOOL RemoveBackupAndRestorePrivileges(VOID) {
  BOOL              bResult;
  HANDLE            hProcess, hProcessToken;
  PTOKEN_PRIVILEGES pNewState;
/* Allocate a TOKEN_PRIVILEGES buffer to hold the privilege change information.
* Two privileges will be adjusted, so make sure there is room for two
* LUID_AND_ATTRIBUTES elements in the Privileges field of TOKEN_PRIVILEGES.
*/
pNewState = (PTOKEN_PRIVILEGES)LocalAlloc(LMEM_FIXED, sizeof(TOKEN_PRIVILEGES) +
(sizeof(LUID_AND_ATTRIBUTES) * 2));
if (!pNewState) return FALSE;
/* Add the two privileges that will be removed to the allocated buffer */
pNewState->PrivilegeCount = 2;
if (!LookupPrivilegeValue(0, SE_BACKUP_NAME, &pNewState->Privileges[0].Luid) ||
!LookupPrivilegeValue(0, SE_RESTORE_NAME, &pNewState->Privileges[1].Luid)) {
LocalFree(pNewState);
return FALSE;
}
pNewState->Privileges[0].Attributes = SE_PRIVILEGE_REMOVED;
pNewState->Privileges[1].Attributes = SE_PRIVILEGE_REMOVED;
/* Get a handle to the process's primary token. Request TOKEN_ADJUST_PRIVILEGES
* access so that we can adjust the privileges. No other privileges are req'd
* since we'll be removing the privileges and thus do not care about the previous
* state. TOKEN_QUERY access would be required in order to retrieve the previous
* state information.
*/
hProcess = GetCurrentProcess( );
if (!OpenProcessToken(hProcess, TOKEN_ADJUST_PRIVILEGES, &hProcessToken)) {
LocalFree(pNewState);
return FALSE;
}
/* Adjust the privileges, specifying FALSE for DisableAllPrivileges so that the
* NewState argument will be used instead. Don't request information regarding
* the token's previous state by specifying 0 for the last three arguments.
*/
bResult = AdjustTokenPrivileges(hProcessToken, FALSE, pNewState, 0, 0, 0);
/* Cleanup and return the success or failure of the adjustment */
CloseHandle(hProcessToken);
LocalFree(pNewState);
return bResult;
}
1.2.3.3 Working with SID_AND_ATTRIBUTES structures
A SID_AND_ATTRIBUTES structure contains two fields: Sid and Attributes. The Sid field is of type
PSID, which is a variable-sized object that should never be directly manipulated by application-level
code. The meaning of the Attributes field varies depending on the use of the structure. When a
SID_AND_ATTRIBUTES structure is being used for disabling SIDs (enabling the "deny" attribute), the
Attributes field is ignored. When a SID_AND_ATTRIBUTES structure is being used for restricting
SIDs, the Attributes field should always be set to 0. In both cases, it's best to set the Attributes
field to 0.
Initializing the Sid field of a SID_AND_ATTRIBUTES structure can be done in a number of ways, but
perhaps one of the most useful ways is to use LookupAccountName( ) to obtain the SID for a specific
user or group name. The following code demonstrates how to look up the SID for a name:
#include <windows.h>
PSID SpcLookupSidByName(LPCTSTR lpAccountName, PSID_NAME_USE peUse) {
  PSID         pSid;
  DWORD        cbSid, cchReferencedDomainName;
  LPTSTR       ReferencedDomainName;
  SID_NAME_USE eUse;

  cbSid = cchReferencedDomainName = 0;
  /* The first call is a sizing call; it fails with ERROR_INSUFFICIENT_BUFFER
   * and fills in the required buffer sizes.
   */
  if (!LookupAccountName(0, lpAccountName, 0, &cbSid, 0, &cchReferencedDomainName,
                         &eUse) && GetLastError( ) != ERROR_INSUFFICIENT_BUFFER)
    return 0;
  if (!(pSid = LocalAlloc(LMEM_FIXED, cbSid))) return 0;
  ReferencedDomainName = LocalAlloc(LMEM_FIXED,
                                    (cchReferencedDomainName + 1) * sizeof(TCHAR));
  if (!ReferencedDomainName) {
    LocalFree(pSid);
    return 0;
  }
  if (!LookupAccountName(0, lpAccountName, pSid, &cbSid, ReferencedDomainName,
                         &cchReferencedDomainName, &eUse)) {
    LocalFree(ReferencedDomainName);
    LocalFree(pSid);
    return 0;
  }
  LocalFree(ReferencedDomainName);
  if (peUse) *peUse = eUse;
  return pSid;
}
If the requested account name is found, a PSID object allocated via LocalAlloc( ) is returned;
otherwise, NULL is returned. If the second argument is specified as non-NULL, it will contain the type
of SID that was found. Because Windows uses SIDs for many different things other than simply users
and groups, the type could be one of many possibilities. If you're looking for a user, the type should
be SidTypeUser. If you're looking for a group, the type should be SidTypeGroup. Other possibilities
include SidTypeDomain, SidTypeAlias, SidTypeWellKnownGroup, SidTypeDeletedAccount,
SidTypeInvalid, SidTypeUnknown, and SidTypeComputer.
1.2.3.4 Working with LUID_AND_ATTRIBUTES structures
An LUID_AND_ATTRIBUTES structure contains two fields: Luid and Attributes. The Luid field is of
type LUID, which is an object that should never be directly manipulated by application-level code. The
meaning of the Attributes field varies depending on the use of the structure. When an
LUID_AND_ATTRIBUTES structure is being used for deleting privileges from a restricted token, the
Attributes field is ignored and should be set to 0. When an LUID_AND_ATTRIBUTES structure is
being used for adjusting privileges in a token, the Attributes field should be set to
SE_PRIVILEGE_ENABLED to enable the privilege, SE_PRIVILEGE_REMOVED to remove the privilege, or 0
to disable the privilege. The SE_PRIVILEGE_REMOVED attribute is not valid on Windows NT, Windows
2000, or Windows XP; it is a newly supported flag in Windows .NET Server 2003.
Initializing the Luid field of an LUID_AND_ATTRIBUTES structure is typically done using
LookupPrivilegeValue( ), which has the following signature:
BOOL LookupPrivilegeValue(LPCTSTR lpSystemName, LPCTSTR lpName, PLUID lpLuid);
This function has the following arguments:
lpSystemName
Name of the computer on which the privilege value's name is looked up. This is normally
specified as NULL, which indicates that only the local system should be searched.
lpName
Name of the privilege to look up. The Windows platform SDK header file winnt.h defines a
sizable number of privilege names as macros that expand to literal strings suitable for use
here. Each of these macros begins with SE_, which is followed by the name of the privilege. For
example, the SeBackupPrivilege privilege has a corresponding macro named
SE_BACKUP_NAME.
lpLuid
Pointer to a caller-allocated LUID object that will receive the LUID information if the lookup is
successful. LUID objects are a fixed size, so they may be allocated either dynamically or on the
stack.
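To tie the two structures together, here is a small, illustrative sketch (not part of the recipe) that builds a restricted token which marks one group SID with the "deny" attribute and deletes one privilege; the account name "Guests" and the SeShutdownPrivilege privilege are arbitrary examples:

#include <windows.h>

PSID SpcLookupSidByName(LPCTSTR lpAccountName, PSID_NAME_USE peUse);

BOOL SpcCreateExampleRestrictedToken(PHANDLE phNewToken) {
  BOOL                bResult;
  HANDLE              hProcessToken;
  SID_NAME_USE        eUse;
  SID_AND_ATTRIBUTES  DenySid;
  LUID_AND_ATTRIBUTES DelPriv;

  if (!OpenProcessToken(GetCurrentProcess( ), TOKEN_DUPLICATE, &hProcessToken))
    return FALSE;

  /* Mark the "Guests" group SID with the "deny" attribute; the Attributes
   * field is set to 0 as described above.
   */
  DenySid.Sid        = SpcLookupSidByName(TEXT("Guests"), &eUse);
  DenySid.Attributes = 0;

  /* Delete an example privilege from the new token */
  DelPriv.Attributes = 0;
  if (!DenySid.Sid ||
      !LookupPrivilegeValue(0, SE_SHUTDOWN_NAME, &DelPriv.Luid)) {
    if (DenySid.Sid) LocalFree(DenySid.Sid);
    CloseHandle(hProcessToken);
    return FALSE;
  }

  bResult = CreateRestrictedToken(hProcessToken, 0, 1, &DenySid, 1, &DelPriv,
                                  0, 0, phNewToken);

  LocalFree(DenySid.Sid);
  CloseHandle(hProcessToken);
  return bResult;
}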
1.2.4 See Also
Recipe 1.8
[ Team LiB ]
[ Team LiB ]
1.3 Dropping Privileges in setuid Programs
1.3.1 Problem
Your program runs setuid or setgid (see Section 1.3.3 for definitions), thus providing it with extra
privileges when it is executed. After the work requiring the extra privileges is done, those
privileges need to be dropped so that an attacker cannot leverage your program during an attack
that results in privilege elevation.
1.3.2 Solution
If your program must run setuid or setgid, make sure to use the privileges properly so that an
attacker cannot exploit other possible vulnerabilities in your program and gain these additional
privileges. You should perform whatever work requires the additional privileges as early in the
program as possible, and you should drop the extra privileges immediately after that work is done.
While many programmers may be aware of the need to drop privileges, many more are not. Worse,
those who do know to drop privileges rarely know how to do so properly and securely. Dropping
privileges is tricky business because the semantics of the system calls to manipulate IDs for
setuid/setgid vary from one Unix variant to another, sometimes only slightly, but often just enough
to make the code that works on one system fail on another.
On modern Unix systems, the extra privileges resulting from using the setuid or setgid bits on an
executable can be dropped either temporarily or permanently. It is best if your program can do what
it needs to with elevated privileges, then drop those privileges permanently, but that's not always
possible. If you must be able to restore the extra privileges, you will need to be especially careful in
your program to do everything possible to prevent an attacker from being able to take control of
those privileges. We strongly advise against dropping privileges only temporarily. You should do
everything possible to design your program such that it can drop privileges permanently as quickly as
possible. We do recognize that it's not always possible to do; the Unix passwd command is a perfect
example: the last thing it does is use its extra privileges to write the new password to the password
file, and it cannot do it any sooner.
1.3.3 Discussion
Before we can discuss how to drop privileges either temporarily or permanently, it's useful to have at
least a basic understanding of how setuid, setgid, and the privilege model in general work on Unix
systems. Because of space constraints and the complexity of it all, we're not able to delve very
deeply into the inner workings here. If you are interested in a more detailed discussion, we
recommend the paper "Setuid Demystified" by Hao Chen, David Wagner, and Drew Dean, which was
presented at the 11th USENIX Security Symposium in 2002 and is available at
http://www.cs.berkeley.edu/~daw/papers/setuid-usenix02.pdf.
On all Unix systems, each process has an effective user ID, a real user ID, an effective group ID, and
a real group ID. In addition, each process on most modern Unix systems also has a saved user ID
and a saved group ID.[3] All of the Unix variants that we cover in this book have saved user IDs, so
our discussion assumes that the sets of user and group IDs each have an effective ID, a real ID, and
a saved ID.
[3] Linux further complicates the already complex privilege model by adding a filesystem user ID and a
filesystem group ID, as well as POSIX capabilities. At this time, most systems do not actually make use of
POSIX capabilities, and the filesystem IDs are primarily maintained automatically by the kernel. If the filesystem
IDs are not explicitly modified by a process, they can be safely ignored, and they will behave properly. We won't
discuss them any further here.
Normally when a process is executed, the effective, real, and saved user and group IDs are all set to
the real user and group ID of the process's parent, respectively. However, when the setuid bit is set
on an executable, the effective and saved user IDs are set to the user ID that owns the file. Likewise,
when the setgid bit is set on an executable, the effective and saved group IDs are set to the group ID
that owns the file.
For the most part, all privilege checks performed by the operating system are done using the
effective user or effective group ID. The primary deviations from this rule are some of the system
calls used to manipulate a process's user and group IDs. In general, the effective user or group ID for
a process may be changed as long as the new ID is the same as either the real or the saved ID.
Taking all this into account, permanently dropping privileges involves ensuring that the effective, real,
and saved IDs are all the same value. Temporarily dropping privileges requires that the effective and
real IDs are the same value, and that the saved ID is unchanged so that the effective ID can later be
restored to the higher privilege. These rules apply to both group and user IDs.
One more issue needs to be addressed with regard to dropping privileges. In addition to the effective,
real, and saved group IDs of a process, a process also has ancillary groups. Ancillary groups are
inherited by a process from its parent process, and they can only be altered by a process with
superuser privileges. Therefore, if a process with superuser privileges is dropping these privileges, it
must also be sure to drop any ancillary groups it may have. This is achieved by callingsetgroups( )
with a single group, which is the real group ID for the process. Because thesetgroups( ) system
call is guarded by requiring the effective user ID of the process to be that of the superuser, it must
be done prior to dropping root privileges. Ancillary groups should be dropped regardless of whether
privileges are being dropped permanently or temporarily. In the case of a temporary privilege drop,
the process can restore the ancillary groups if necessary when elevated privileges are restored.
The first of two functions, spc_drop_privileges( ) drops any extra group or user privileges either
permanently or temporarily, depending on the value of its only argument. If a nonzero value is
passed, privileges will be dropped permanently; otherwise, the privilege drop is temporary. The
second function, spc_restore_privileges( ), restores privileges to what they were at the last call
to spc_drop_privileges( ). If either function encounters any problems in attempting to perform its
respective task, abort( ) is called, terminating the process immediately. If any manipulation of
privileges cannot complete successfully, it's safest to assume that the process is in an unknown state,
and you should not allow it to continue.
Recalling our earlier discussion regarding subtle differences in the semantics for changing a process's
group and user IDs, you'll notice that spc_drop_privileges( ) is littered with preprocessor
conditionals that test for the platform on which the code is being compiled. For the BSD-derived
platforms (Darwin, FreeBSD, NetBSD, and OpenBSD), dropping privileges involves a simple call to
setegid( ) or seteuid( ), followed by a call to either setgid( ) or setuid( ) if privileges are
being permanently dropped. The setgid( ) and setuid( ) system calls adjust the process's saved
group and user IDs, respectively, as well as the real group or user ID.
On Linux and Solaris, the setgid( ) and setuid( ) system calls do not alter the process's saved
group and user IDs in all cases. (In particular, if the effective ID is not that of the superuser, the saved ID is
not altered; otherwise, it is.) That means that these calls can't reliably be used to permanently drop
privileges. Instead, setregid( ) and setreuid( ) are used, which actually simplifies the process
except that these two system calls have different semantics on the BSD-derived platforms.
As discussed above, always drop group privileges before dropping user
privileges; otherwise, group privileges may not be able to be fully dropped.
#include <sys/param.h>
#include <sys/types.h>
#include <stdlib.h>
#include <unistd.h>

static int   orig_ngroups = -1;
static gid_t orig_gid = -1;
static uid_t orig_uid = -1;
static gid_t orig_groups[NGROUPS_MAX];

void spc_drop_privileges(int permanent) {
  gid_t newgid = getgid( ), oldgid = getegid( );
  uid_t newuid = getuid( ), olduid = geteuid( );

  if (!permanent) {
    /* Save information about the privileges that are being dropped so that they
     * can be restored later.
     */
    orig_gid = oldgid;
    orig_uid = olduid;
    orig_ngroups = getgroups(NGROUPS_MAX, orig_groups);
  }

  /* If root privileges are to be dropped, be sure to pare down the ancillary
   * groups for the process before doing anything else because the setgroups( )
   * system call requires root privileges.  Drop ancillary groups regardless of
   * whether privileges are being dropped temporarily or permanently.
   */
  if (!olduid) setgroups(1, &newgid);

  if (newgid != oldgid) {
#if !defined(linux)
    setegid(newgid);
    if (permanent && setgid(newgid) == -1) abort( );
#else
    if (setregid((permanent ? newgid : -1), newgid) == -1) abort( );
#endif
  }

  if (newuid != olduid) {
#if !defined(linux)
    seteuid(newuid);
    if (permanent && setuid(newuid) == -1) abort( );
#else
    if (setreuid((permanent ? newuid : -1), newuid) == -1) abort( );
#endif
  }

  /* verify that the changes were successful */
  if (permanent) {
    if (newgid != oldgid && (setegid(oldgid) != -1 || getegid( ) != newgid))
      abort( );
    if (newuid != olduid && (seteuid(olduid) != -1 || geteuid( ) != newuid))
      abort( );
  } else {
    if (newgid != oldgid && getegid( ) != newgid) abort( );
    if (newuid != olduid && geteuid( ) != newuid) abort( );
  }
}

void spc_restore_privileges(void) {
  if (geteuid( ) != orig_uid)
    if (seteuid(orig_uid) == -1 || geteuid( ) != orig_uid) abort( );
  if (getegid( ) != orig_gid)
    if (setegid(orig_gid) == -1 || getegid( ) != orig_gid) abort( );
  if (!orig_uid)
    setgroups(orig_ngroups, orig_groups);
}
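The following fragment is an illustrative sketch of how the function is typically used: do the privileged work first, then drop privileges permanently before touching anything an attacker can influence. The file name shown is hypothetical:

#include <stdio.h>

void spc_drop_privileges(int permanent);

int main(void) {
  FILE *f;

  /* Work requiring the setuid privileges, e.g., opening a protected file
   * (the path is an example only).
   */
  f = fopen("/etc/protected.conf", "r");

  spc_drop_privileges(1);   /* permanent drop; cannot be restored */

  /* The rest of the program runs with only the real user's privileges */
  if (f) fclose(f);
  return 0;
}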
1.3.4 See Also
"Setuid Demystified" by Hao Chen, David Wagner, and Drew Dean:
http://www.cs.berkeley.edu/~daw/papers/setuid-usenix02.pdf
Recipe 2.1
[ Team LiB ]
[ Team LiB ]
1.4 Limiting Risk with Privilege Separation
1.4.1 Problem
Your process runs with extra privileges granted by the setuid or setgid bits on the executable.
Because it requires those privileges at various times throughout its lifetime, it can't permanently drop
the extra privileges. You would like to limit the risk of those extra privileges being compromised in
the event of an attack.
1.4.2 Solution
When your program first initializes, create a Unix domain socket pair usingsocketpair( ), which will
create two endpoints of a connected unnamed socket. Fork the process using fork( ), drop the
extra privileges in the child process, and keep them in the parent process. Establish communication
between the parent and child processes. Whenever the child process needs to perform an operation
that requires the extra privileges held by the parent process, defer the operation to the parent.
The result is that the child performs the bulk of the program's work. The parent retains the extra
privileges and does nothing except communicate with the child and perform privileged operations on
its behalf.
If the privileged process opens files on behalf of the unprivileged process, you will need to use a Unix
domain socket, as opposed to an anonymous pipe or some other interprocess communication
mechanism. The reason is that only Unix domain sockets provide a means by which file descriptors
can be exchanged between the processes after the initial fork( ).
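Before looking at the discussion, here is a minimal sketch of the socketpair( )/fork( ) split described above; it assumes the spc_drop_privileges( ) function from Recipe 1.3, and it omits the request/response protocol and descriptor passing entirely:

#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

void spc_drop_privileges(int permanent);   /* from Recipe 1.3 */

static int priv_fd = -1;   /* child's end of the channel to the privileged parent */

int spc_priv_separate(void) {
  pid_t pid;
  int   sv[2];

  if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1) return -1;
  if ((pid = fork( )) == -1) {
    close(sv[0]);
    close(sv[1]);
    return -1;
  }
  if (pid) {
    /* Privileged parent: keep sv[0] and loop servicing requests from the
     * child (the loop is omitted in this sketch).
     */
    close(sv[1]);
    /* ... read requests from sv[0], do the privileged work, write results ... */
    _exit(0);
  }
  /* Unprivileged child: drop privileges permanently, keep sv[1] */
  close(sv[0]);
  spc_drop_privileges(1);
  priv_fd = sv[1];
  return 0;
}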
1.4.3 Discussion
In Recipe 1.3, we discussed setuid, setgid, and the importance of permanently dropping the extra
privileges resulting from their use as quickly as possible to minimize the window of vulnerability to a
privilege escalation attack. In many cases, the extra privileges are necessary for performing some
initialization or other small amount of work, such as binding a socket to a privileged port. In other
cases, however, the work requiring extra privileges cannot always be restricted to the beginning of
the program, thus requiring that the extra privileges be dropped only temporarily so that they can
later be restored when they're needed. Unfortunately, this means that an attacker who compromises
the program can also restore those privileges.
1.4.3.1 Privilege separation
One way to solve this problem is to use privilege separation. When privilege separation is employed,
one process is solely responsible for performing all privileged operations, and it does absolutely
nothing else. A second process is responsible for performing the remainder of the program's work,
which does not require any extra privileges. As illustrated in Figure 1-1, a bidirectional
communications channel exists between the two processes to allow the unprivileged process to send
requests to the privileged process and to receive the results.
Figure 1-1. Data flow when using privilege separation
Normally, the two processes are closely related. Usually they're the same program split during
initialization into two separate processes using fork( ). The original process retains its privileges
and enters a loop waiting to service requests from the child process. The child process starts by
permanently dropping the extra privileges inherited from the parent process and continues normally,
sending requests to the parent when it needs privileged operations to be performed.
By separating the process into privileged and unprivileged pieces, the risk of a privilege escalation
attack is significantly reduced. The risk is further reduced by the parent process refusing to perform
any operations that it knows the child does not need. For example, if the program never needs to
delete any files, the privileged process should refuse to service any requests to delete files. Because
the unprivileged child process undertakes most of the program's functionality, it stands the greatest
risk of compromise by an attacker, but because it has no extra privileges of its own, an attacker does
not stand to gain much from the compromise.
1.4.3.2 A privilege separation library: privman
NAI Labs has released a library that implements privilege separation on Unix with an easy-to-use
API. This library, called privman, can be obtained from http://opensource.nailabs.com/privman/. As
of this writing, the library is still in an alpha state and the API is subject to change, but it is quite
usable, and it provides a good generic framework from which to work.
A program using privman should include the privman.h header file and link to the privman library. As
part of the program's initialization, call the privman API function priv_init( ), which requires a
single argument specifying the name of the program. The program's name is used for log entries to
syslog (see Recipe 13.11 for a discussion of logging), as well as for the configuration file to use. The
priv_init( ) function should be called by the program with root privileges enabled, and it will take
care of splitting the program into two processes and adjusting privileges for each half appropriately.
The privman library uses configuration files to determine what operations the privileged half of a
program may perform on behalf of the unprivileged half of the same program. In addition, the
configuration file determines what user the unprivileged half of the program runs as, and what
directory is used in the call to chroot( ) in the unprivileged process (see Recipe 2.12). By default,
privman runs the unprivileged process as the user "nobody" and does a chroot( ) to the root
directory, but we strongly recommend that your program use a user specifically set up for it instead
of "nobody", and that you chroot( ) to a safe directory (see Recipe 2.4).
When the priv_init( ) function returns control to your program, your code will be running in the
unprivileged child process. The parent process retains its privileges, and control is never returned to
you. Instead, the parent process remains in a loop that responds to requests from the unprivileged
process to perform privileged operations.
The privman library provides a number of functions intended to replace standard C runtime functions
for performing privileged operations. When these functions are called, a request is sent to the
privileged process to perform the operation, the privileged process performs the operation, and the
results are returned to the calling process. The privman versions of the standard functions are named
with the prefix of priv_, but otherwise they have the same signature as the functions they replace.
For example, a call to fopen( ):
FILE *f = fopen("/etc/shadow", "r");
becomes a call to priv_fopen( ):
FILE *f = priv_fopen("/etc/shadow", "r");
The following code demonstrates calling priv_init( ) to initialize the privman library, which will split
the program into privileged and unprivileged halves:
#include <privman.h>
#include <string.h>
int main(int argc, char *argv[ ]) {
  char *progname;
/* Get the program name to pass to the priv_init( ) function, and call
* priv_init( ).
*/
if (!(progname = strrchr(argv[0], '/'))) progname = argv[0];
else progname++;
priv_init(progname);
/* Any code executed from here on out is running without any additional
* privileges afforded by the program running setuid root. This process
* is the child process created by the call in priv_init( ) to fork( ).
*/
return 0;
}
1.4.4 See Also
privman from NAI Labs: http://opensource.nailabs.com/privman/
Recipe 1.3, Recipe 1.7, Recipe 2.4, Recipe 2.12, Recipe 13.11
[ Team LiB ]
[ Team LiB ]
1.5 Managing File Descriptors Safely
1.5.1 Problem
When your program starts up, you want to make sure that only the standard stdin, stdout, and
stderr file descriptors are open, thus avoiding denial of service attacks and avoiding having an
attacker place untrusted files on special hardcoded file descriptors.
1.5.2 Solution
On Unix, use the function getdtablesize( ) to obtain the size of the process's file descriptor table.
For each file descriptor in the process's table, close the descriptors that are notstdin, stdout, or
stderr, which are always 0, 1, and 2, respectively. Test stdin, stdout, and stderr to ensure that
they're open using fstat( ) for each descriptor. If any one is not open, open /dev/null and associate
it with the descriptor. If the program is running setuid, stdin, stdout, and stderr should also be
closed if they're not associated with a tty and reopened using /dev/null.
On Windows, there is no way to determine what file handles are open, but the same issue with open
descriptors does not exist on Windows as it does on Unix.
1.5.3 Discussion
Normally, when a process is started, it inherits all open file descriptors from its parent. This can be a
problem because the size of the file descriptor table on Unix is typically a fixed size. The parent
process could therefore fill the file descriptor table with bogus files to deny your program any file
handles for opening its own files. The result is essentially a denial of service for your program.
When a new file is opened, a descriptor is assigned using the first available entry in the process's file
descriptor table. If stdin is not open, for example, the first file opened is assigned a file descriptor of
0, which is normally reserved for stdin. Similarly, if stdout is not open, file descriptor 1 is assigned
next, followed by stderr's file descriptor of 2 if it is not open.
The only file descriptors that should remain open when your program starts are thestdin, stdout,
and stderr descriptors. If the standard descriptors are not open, your program should open them
using /dev/null and leave them open. Otherwise, calls to functions like printf( ) can have
unexpected and potentially disastrous effects. Worse, the standard C library considers the standard
descriptors to be special, and some functions expect stderr to be properly opened for writing error
messages to. If your program opens a data file for writing and gets stderr's file descriptor, an error
message written to stderr will destroy your data file.
Particularly in a chroot( ) environment (see Recipe 2.12), the /dev/null
device may not be available (it can be made available if the environment is set
up properly). If it is not available, the proper thing for your program to do is to
refuse to run.
The potential for security vulnerabilities arising from file descriptors being managed improperly is
high in non-setuid programs. For setuid (especially setuid root) programs, the potential for problems
increases dramatically. The problem is so serious that some variants of Unix (OpenBSD, in particular)
will explicitly open stdin, stdout, and stderr from the execve( ) system call for a setuid process if
they're not already open.
The following function, spc_sanitize_files( ), first closes all open file descriptors that are not one
of the standard descriptors. Because there is no easy way to tell whether a descriptor is open,
close( ) is called for each one, and any error returned is ignored. Once all of the nonstandard
descriptors are closed, stdin, stdout, and stderr are checked to ensure that they are open. If any
one of them is not open, an attempt is made to open /dev/null. If /dev/null cannot be opened, the
program is terminated immediately.
#include <sys/types.h>
#include <limits.h>
#include <sys/stat.h>
#include <stdio.h>
#include <unistd.h>
#include <errno.h>
#include <paths.h>

#ifndef OPEN_MAX
#define OPEN_MAX 256
#endif

static int open_devnull(int fd) {
  FILE *f = 0;

  if (!fd) f = freopen(_PATH_DEVNULL, "rb", stdin);
  else if (fd == 1) f = freopen(_PATH_DEVNULL, "wb", stdout);
  else if (fd == 2) f = freopen(_PATH_DEVNULL, "wb", stderr);
  return (f && fileno(f) == fd);
}

void spc_sanitize_files(void) {
  int         fd, fds;
  struct stat st;

  /* Make sure all open descriptors other than the standard ones are closed */
  if ((fds = getdtablesize( )) == -1) fds = OPEN_MAX;
  for (fd = 3;  fd < fds;  fd++) close(fd);

  /* Verify that the standard descriptors are open.  If they're not, attempt to
   * open them using /dev/null.  If any are unsuccessful, abort.
   */
  for (fd = 0;  fd < 3;  fd++)
    if (fstat(fd, &st) == -1 && (errno != EBADF || !open_devnull(fd))) abort( );
}
[ Team LiB ]
[ Team LiB ]
1.6 Creating a Child Process Securely
1.6.1 Problem
Your program needs to create a child process either to perform work within the same program or,
more frequently, to execute another program.
1.6.2 Solution
On Unix, creating a child process is done by calling fork( ). When fork( ) completes successfully, a
nearly identical copy of the calling process is created as a new process. Most frequently, a new
program is immediately executed using one of the exec*( ) family of functions (see Recipe 1.7).
However, especially in the days before threading, it was common to use fork( ) to create separate
"threads" of execution within a program.[4]
[4] Note that we say "program" here rather than "process." When fork( ) completes, the same program is
running, but there are now two processes. The newly created process is a nearly identical copy of the original
process, but it is a copy; any action performed in one process does not affect the other. In a threaded
environment, each thread shares the same process, so all memory, file descriptors, signals, and so on are
shared.
If the newly created process is going to continue running the same program, any pseudo-random
number generators (PRNGs) must be reseeded so that the two processes will each yield different
random data as they continue to execute. In addition, any inherited file descriptors that are not
needed should be closed; they remain open in the other process because the new process only has a
copy of them.
Finally, if the original process had extra privileges from being executed as setuid or setgid, those
privileges will be inherited by the new process, and they should be dropped immediately if they are
not needed. In particular, if the new process is going to be used to execute a new program, privileges
should always be dropped so that the new program does not inherit privileges that it should not have.
1.6.3 Discussion
When fork( ) is used to create a new process, the new process is a nearly identical copy of the
original process. The only differences in the processes are the process ID, the parent process ID, and
the resource utilization counters, which are reset to zero in the new process. Execution in both
processes continues immediately after the return from fork( ). Each process can determine whether
it is the parent or the child by checking the return value from fork( ). In the parent or original
process, fork( ) returns the process ID of the new process, while 0 will be returned in the child
process.
It's important to remember that the new process is a copy of the original. The contents of the original
process's memory (including stack), file descriptor table, and any other process attributes are the
same in both processes, but they're not shared. Any changes to memory contents, file descriptors,
and so on are private to the process that is making them. In other words, if the new process changes
its file position pointer in an open file, the file position pointer for the same file in the original process
remains unchanged.
The fact that the new process is a copy of the original has important security considerations that are
often overlooked. For example, if a PRNG is seeded in the original process, it will be seeded identically
in the child process. This means that if both the original and new processes were to obtain random
data from the PRNG, they would both get the same random data (see Figure 1-2)! The solution to
this problem is to reseed the PRNG in one of the processes, or, preferably, both processes. By
reseeding the PRNG in both processes, neither process will have any knowledge of the other's PRNG
state. Be sure to do this in a thread-safe manner if your program can fork multiple processes.
Figure 1-2. Consequences of not reseeding PRNGs after calling fork( )
At the time of the call to fork( ), any open file descriptors in the original process will also be open in
the new process. If any of these descriptors are unnecessary, they should be closed; they will remain
open in the other process. Closing unnecessary file descriptors is especially important if one of the
processes is going to execute another program (see Recipe 1.5).
Finally, the new process also inherits its access rights from the original process. Normally this is not
an issue, but if the parent process had extra privileges because it was executed setuid or setgid, the
new process will also have the extra privileges. If the new process does not need these privileges,
they should be dropped immediately (see Recipe 1.3). Any extra privileges should be dropped
especially if one of the two processes is going to execute a new program.
The following function, spc_fork( ), is a wrapper around fork( ). As presented here, the code is
incomplete when using an application-level random number generator; it will require the appropriate
code to reseed whatever PRNG you're using. It assumes that the new child process is the process
that will be used to perform any work that does not require any extra privileges that the process may
have. It is rare that when a process is forked, the original process is used to execute another
program or the new process is used to continue primary execution of the program. In other words,
the new process is most often the worker process.
#include <sys/types.h>
#include <unistd.h>
pid_t spc_fork(void) {
  pid_t childpid;

  if ((childpid = fork( )) == -1) return -1;

  /* Reseed PRNGs in both the parent and the child */
  /* See Chapter 11 for examples */

  /* If this is the parent process, there's nothing more to do */
  if (childpid != 0) return childpid;

  /* This is the child process */
  spc_sanitize_files( );   /* Close all open files.  See Recipe 1.5 */
  spc_drop_privileges(1);  /* Permanently drop privileges.  See Recipe 1.3 */

  return 0;
}
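A short, illustrative example of using spc_fork( ) follows; the worker's actual task is omitted:

#include <sys/types.h>
#include <sys/wait.h>
#include <errno.h>
#include <unistd.h>

pid_t spc_fork(void);

int main(void) {
  int   status;
  pid_t pid;

  if ((pid = spc_fork( )) == -1) return 1;
  if (!pid) {
    /* Child: descriptors sanitized and privileges dropped by spc_fork( ) */
    /* ... do the program's real work here ... */
    _exit(0);
  }
  /* Parent: wait for the worker to terminate */
  while (waitpid(pid, &status, 0) == -1 && errno == EINTR);
  return 0;
}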
1.6.4 See Also
Recipe 1.3, Recipe 1.5, Recipe 1.7
[ Team LiB ]
[ Team LiB ]
1.7 Executing External Programs Securely
1.7.1 Problem
Your Unix program needs to execute another program.
1.7.2 Solution
On Unix, one of the exec*( ) family of functions is used to replace the current program within a
process with another program. Typically, when you're executing another program, the original program
continues to run while the new program is executed, thus requiring two processes to achieve the
desired effect. The exec*( ) functions do not create a new process. Instead, you must first use
fork( ) to create a new process, and then use one of the exec*( ) functions in the new process to
run the new program. See Recipe 1.6 for a discussion of using fork( ) securely.
1.7.3 Discussion
execve( ) is the system call used to load and begin execution of a new program. The other functions in
the exec*( ) family are wrappers around the execve( ) system call, and they are implemented in
user space in the standard C runtime library. When a new program is loaded and executed with
execve( ) , the new program replaces the old program within the same process. As part of the process
of loading the new program, the old program's address space is replaced with a new address space. File
descriptors that are marked to close on execute are closed; the new program inherits all others. All
other system-level properties are tied to the process, so the new program inherits them from the old
program. Such properties include the process ID, user IDs, group IDs, working and root directories, and
signal mask.
Table 1-2 lists the various exec*( ) wrappers around the execve( ) system call. Note that many of
these wrappers should not be used in secure code. In particular, never use the wrappers that are
named with a "p" suffix because they will search the environment to locate the file to be executed.
When executing external programs, you should always specify the full path to the file that you want to
execute. If the PATH environment variable is used to locate the file, the file that is found to execute may
not be the expected one.
Table 1-2. The exec*( ) family of functions

int execl(const char *path, char *arg, ...);
    The argument list is terminated by a NULL. The calling program's
    environment is passed on to the new program.

int execle(const char *path, char *arg, ...);
    The argument list is terminated by a NULL, and the environment pointer to
    use follows immediately.

int execlp(const char *file, char *arg, ...);
    The argument list is terminated by a NULL. The PATH environment variable
    is searched to locate the program to execute. The calling program's
    environment is passed on to the new program.

int exect(const char *path, const char *argv[ ], const char *envp[ ]);
    The same as execve( ), except that process tracing is enabled.

int execv(const char *path, const char *argv[ ]);
    The calling program's environment is passed on to the new program.

int execve(const char *path, const char *argv[ ], const char *envp[ ]);
    This is the main system call to load and execute a new program.

int execvp(const char *file, const char *argv[ ]);
    The PATH environment variable is searched to locate the program to
    execute. The calling program's environment is passed on to the new
    program.
The two easiest and safest functions to use are execv( ) and execve( ); the only difference between
the two is that execv( ) calls execve( ), passing environ for the environment pointer. If you have
already sanitized the environment (see Recipe 1.1), it's reasonable to call execv( ) without explicitly
specifying an environment to use. Otherwise, a new environment can be built and passed to
execve( ).
The argument lists for the functions are built just as they will be received by main( ). The first
element of the array is the name of the program that is running, and the last element of the array must
be a NULL. The environment is built in the same manner as described in Recipe 1.1. The first argument
to the two functions is the full path and filename of the executable file to load and execute.
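As a brief illustration (the program, arguments, and environment strings are examples only), building the vectors and calling execve( ) looks like this:

#include <unistd.h>

int main(void) {
  char *argv[ ] = { "ls", "-l", "/tmp", 0 };                 /* NULL-terminated */
  char *envp[ ] = { "PATH=/bin:/usr/bin", "IFS= \t\n", 0 };  /* NULL-terminated */

  /* Always use the full path to the executable */
  execve("/bin/ls", argv, envp);
  return 127;   /* only reached if execve( ) fails */
}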
As a courtesy to the new program, before executing it you should close any file descriptors that are
open unless there are descriptors that you intentionally want to pass along to it. Be sure to leave
stdin, stdout, and stderr open. (See Recipe 1.5 for a discussion of file descriptors.)
Finally, if your program was executed setuid or setgid and the extra privileges have not yet been
dropped, or they have been dropped only temporarily, you should drop them permanently before
executing the new program. Otherwise, the new program will inherit the extra privileges when it should
not. If you use the spc_fork( ) function from Recipe 1.6, the file descriptors and privileges will be
handled for you.
Another function provided by the standard C runtime library for executing programs is system( ). This
function hides the details of calling fork( ) and the appropriate exec*( ) function to execute the
program. There are two reasons why you should never use the system( ) function:
It uses the shell to launch the program.
It passes the command to execute to the shell, leaving the task of breaking up the command's
arguments to the shell.
The system( ) function works differently from the exec*( ) functions; instead of replacing the
currently executing program, it creates a new process with fork( ). The new process executes the
shell with execve( ) while the original process waits for the new process to terminate. The system( )
function therefore does not return control to the caller until the specified program has completed.
Yet another function, popen( ), works somewhat similarly to system( ). It also uses the shell to
launch the program, passing the command to execute to the shell and leaving the task of breaking up
the command's arguments to the shell. What it does differently is create an anonymous pipe that is
attached to either the new program's stdin or its stdout file descriptor. The new program's stderr file
descriptor is always inherited from the parent. In addition, it returns control to the caller immediately
with a FILE object connected to the created pipe so that the caller can communicate with the new
program. When communication with the new program is finished, you should call pclose( ) to clean up
the file descriptors and reap the child process created by the call to fork( ).
You should also avoid using popen( ) and its accompanying pclose( ) function, but popen( ) does
have utility that is worth duplicating in a secure fashion. The following implementation with a similar API
does not make use of the shell.
If you do wish to use either system( ) or popen( ) , be extremely careful. First, make sure that the
environment is properly set, so that there are no Trojan environment variables. Second, remember that
the command you're running will be run in a Unix shell. This means that you must ensure that there is
no way an attacker can pass malicious data to the shell command. If possible, pass in a fixed string
that the attacker cannot manipulate. If the user must be allowed to manipulate the input, only very
careful filtering will accomplish this securely. We recommend that you avoid this scenario at all costs.
The following code implements secure versions of popen( ) and pclose( ) using the spc_fork( )
code from Recipe 1.6. Our versions differ slightly in both interface and function, but not by too much.
The function spc_popen( ) requires the same arguments execve( ) does. In fact, the arguments are
passed directly to execve( ) without any modification. If the operation is successful, an SPC_PIPE
object is returned; otherwise, NULL is returned. When communication with the new program is
complete, call spc_pclose( ), passing the SPC_PIPE object returned by spc_popen( ) as its only
argument. If the new program has not yet terminated when spc_pclose( ) is called in the original
program, the call will block until the new program does terminate.
If spc_popen( ) is successful, the SPC_PIPE object it returns contains two FILE objects:
read_fd can be used to read data written by the new program to its stdout file descriptor.
write_fd can be used to write data to the new program for reading from its stdin file descriptor.
Unlike popen( ) , which in its most portable form is unidirectional, spc_popen( ) is bidirectional.
#include <stdio.h>
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>
#include <sys/wait.h>
typedef struct {
FILE *read_fd;
FILE *write_fd;
pid_t child_pid;
} SPC_PIPE;
SPC_PIPE *spc_popen(const char *path, char *const argv[], char *const envp[]) {
  int      stdin_pipe[2], stdout_pipe[2];
  SPC_PIPE *p;

  if (!(p = (SPC_PIPE *)malloc(sizeof(SPC_PIPE)))) return 0;
  p->read_fd = p->write_fd = 0;
  p->child_pid = -1;

  if (pipe(stdin_pipe) == -1) {
    free(p);
    return 0;
  }
  if (pipe(stdout_pipe) == -1) {
    close(stdin_pipe[1]);
    close(stdin_pipe[0]);
    free(p);
    return 0;
  }

  if (!(p->read_fd = fdopen(stdout_pipe[0], "r"))) {
    close(stdout_pipe[1]);
    close(stdout_pipe[0]);
    close(stdin_pipe[1]);
    close(stdin_pipe[0]);
    free(p);
    return 0;
  }
  if (!(p->write_fd = fdopen(stdin_pipe[1], "w"))) {
    fclose(p->read_fd);
    close(stdout_pipe[1]);
    close(stdin_pipe[1]);
    close(stdin_pipe[0]);
    free(p);
    return 0;
  }

  if ((p->child_pid = spc_fork()) == -1) {
    fclose(p->write_fd);
    fclose(p->read_fd);
    close(stdout_pipe[1]);
    close(stdin_pipe[0]);
    free(p);
    return 0;
  }

  if (!p->child_pid) {
    /* this is the child process */
    close(stdout_pipe[0]);
    close(stdin_pipe[1]);
    if (stdin_pipe[0] != 0) {
      dup2(stdin_pipe[0], 0);
      close(stdin_pipe[0]);
    }
    if (stdout_pipe[1] != 1) {
      dup2(stdout_pipe[1], 1);
      close(stdout_pipe[1]);
    }
    execve(path, argv, envp);
    exit(127);
  }

  close(stdout_pipe[1]);
  close(stdin_pipe[0]);
  return p;
}
int spc_pclose(SPC_PIPE *p) {
  int   status;
  pid_t pid = -1;   /* stays -1 if there is no child to wait for */

  if (p->child_pid != -1) {
    do {
      pid = waitpid(p->child_pid, &status, 0);
    } while (pid == -1 && errno == EINTR);
  }
  if (p->read_fd) fclose(p->read_fd);
  if (p->write_fd) fclose(p->write_fd);
  free(p);
  if (pid != -1 && WIFEXITED(status)) return WEXITSTATUS(status);
  else return (pid == -1 ? -1 : 0);
}
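As a usage sketch (the program, arguments, and environment below are purely illustrative and are our own, not part of the recipe), spc_popen( ) and spc_pclose( ) might be used as follows to run /bin/ls and read its output. The SPC_PIPE type and the functions defined above are assumed to be in scope.
#include <stdio.h>

int main(void) {
  char     line[1024];
  char     *argv[] = { "/bin/ls", "-l", 0 };
  char     *envp[] = { "PATH=/bin:/usr/bin", 0 };  /* minimal, sanitized environment */
  SPC_PIPE *p;

  /* Launch /bin/ls directly, without involving the shell */
  if (!(p = spc_popen("/bin/ls", argv, envp))) return 1;

  /* Read everything the child writes to its stdout */
  while (fgets(line, sizeof(line), p->read_fd))
    fputs(line, stdout);

  /* Reap the child and propagate its exit status */
  return spc_pclose(p);
}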
1.7.4 See Also
Recipe 1.1 , Recipe 1.5 , Recipe 1.6
1.8 Executing External Programs Securely
1.8.1 Problem
Your Windows program needs to execute another program.
1.8.2 Solution
On Windows, use the CreateProcess( ) API function to load and execute a new program.
Alternatively, use the CreateProcessAsUser( ) API function to load and execute a new program
with a primary access token other than the one in use by the current program.
1.8.3 Discussion
The Win32 API provides several functions for executing new programs. In the days of the Win16 API,
the proper way to execute a new program was to call WinExec( ). While this function still exists in
the Win32 API as a wrapper around CreateProcess( ) for compatibility reasons, its use is
deprecated, and new programs should call CreateProcess( ) directly instead.
A powerful but extremely dangerous API function that is popular among developers is ShellExecute(
). This function is implemented as a wrapper around CreateProcess( ), and it does exactly what
we're about to advise against doing with CreateProcess( ), but we're getting a bit ahead of
ourselves.
One of the reasons ShellExecute( ) is so popular is that virtually anything can be executed with the
API. If the file passed to ShellExecute( ) is not actually executable, the API will
search the registry looking for the right application to launch the file. For example, if you pass it a
filename with a .TXT extension, it will probably start Notepad with the specified file loaded.
While this can be an incredibly handy feature, it's also a disaster waiting to happen. Users can
configure their own file associations, and there is no guarantee that you'll get the expected behavior
when you execute a program this way. Another problem is that because users can configure their
own file associations, an attacker can do so as well, causing your program to end up doing something
completely unexpected and potentially disastrous.
The safest way to execute a new program is to use either CreateProcess( ) or
CreateProcessAsUser( ). These two functions share a very similar signature:
BOOL CreateProcess(LPCTSTR lpApplicationName, LPTSTR lpCommandLine,
LPSECURITY_ATTRIBUTES lpProcessAttributes,
LPSECURITY_ATTRIBUTES lpThreadAttributes, BOOL bInheritHandles,
DWORD dwCreationFlags, LPVOID lpEnvironment, LPCTSTR lpCurrentDirectory,
LPSTARTUPINFO lpStartupInfo, LPPROCESS_INFORMATION lpProcessInformation);
BOOL CreateProcessAsUser(HANDLE hToken, LPCTSTR lpApplicationName,
LPTSTR lpCommandLine, LPSECURITY_ATTRIBUTES lpProcessAttributes,
LPSECURITY_ATTRIBUTES lpThreadAttributes, BOOL bInheritHandles,
DWORD dwCreationFlags, LPVOID lpEnvironment, LPCTSTR lpCurrentDirectory,
LPSTARTUPINFO lpStartupInfo, LPPROCESS_INFORMATION lpProcessInformation);
The two most important arguments for the purposes of proper secure use of CreateProcess( ) or
CreateProcessAsUser( ) are lpApplicationName and lpCommandLine. All of the other arguments
are well documented in the Microsoft Platform SDK.
lpApplicationName
Name of the program to execute. The program may be specified as an absolute or relative
path, but you should never specify the program to execute in any way other than as a fully
qualified absolute path and filename. This argument may also be specified as NULL, in which
case the program to execute is determined from the lpCommandLine argument.
lpCommandLine
Any command-line arguments to pass to the new program. If there are no arguments to pass,
this argument may be specified as NULL, but lpApplicationName and lpCommandLine cannot
both be NULL. If lpApplicationName is specified as NULL, the program to execute is taken
from this argument. Everything up to the first space is interpreted as part of the filename of
the program to execute. If the filename to execute has a space in its name, it must be quoted.
If lpApplicationName is not specified as NULL, lpCommandLine should not contain the filename
to execute, but instead contain only the arguments to pass to the program on its command
line.
By far, the biggest mistake that developers make when using CreateProcess( ) or
CreateProcessAsUser( ) is to specify lpApplicationName as NULL and fail to enclose the program
name portion of lpCommandLine in quotes. As a rule, you should never specify lpApplicationName
as NULL. Always specify the filename of the program to execute in lpApplicationName rather than
letting Windows try to figure out what you mean from lpCommandLine.
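To make this advice concrete, here is a minimal sketch of the safe pattern (the paths, document name, and the SpcRunNotepad( ) wrapper are our own illustrative choices, not part of the recipe): lpApplicationName is an explicit, fully qualified path, and the program name inside lpCommandLine is quoted.
#include <windows.h>

BOOL SpcRunNotepad(void) {
  STARTUPINFO         si;
  PROCESS_INFORMATION pi;
  /* The command line lives in writable storage because CreateProcess( ) may modify it */
  TCHAR szCmdLine[] = TEXT("\"C:\\Windows\\notepad.exe\" C:\\temp\\readme.txt");

  ZeroMemory(&si, sizeof(si));
  si.cb = sizeof(si);
  if (!CreateProcess(TEXT("C:\\Windows\\notepad.exe"),  /* explicit, fully qualified name */
                     szCmdLine, 0, 0, FALSE, 0, 0, 0, &si, &pi))
    return FALSE;
  CloseHandle(pi.hThread);
  CloseHandle(pi.hProcess);
  return TRUE;
}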
1.9 Disabling Memory Dumps in the Event of a Crash
1.9.1 Problem
Your application stores potentially sensitive data in memory, and you want to prevent this data from
being written to disk if the program crashes, because local attackers might be able to examine a core
dump and use that information nefariously.
1.9.2 Solution
On Unix systems, use setrlimit( ) to set the RLIMIT_CORE resource to zero, which will prevent the
operating system from leaving behind a core file. On Windows, it is not possible to disable such
behavior, but there is equally no guarantee that a memory dump will be performed. A system-wide
setting that cannot be altered on a per-application basis controls what action Windows takes when an
application crashes.
A Windows feature called Dr. Watson, which is enabled by default, may cause the contents of a
process's address space to be written to disk in the event of a crash. If Microsoft Visual Studio is
installed, the settings that normally cause Dr. Watson to run are changed to run the Microsoft Visual
Studio debugger instead, and no dump will be generated. Other programs do similar things, so from
system to system, there's no telling what might happen if an application crashes.
Unfortunately, there is no way to prevent memory dumps on a per-application basis on Windows. The
settings for how to handle an application crash are system-wide, stored in the registry under
HKEY_LOCAL_MACHINE, and they require Administrator access to change them. Even if you're
reasonably certain Dr. Watson will be the handler on systems on which your program will be running,
there is no way you can disable its functionality on a per-application basis. On the other hand, any
dump that may be created by Dr. Watson is properly protected by ACLs that prevent any other user
from accessing it.
1.9.3 Discussion
On most Unix systems, a program that crashes will "dump core." The action of dumping core causes
an image of the program's committed memory at the time of the crash to be written out to a file on
disk, which can later be used for post-mortem debugging.
The problem with dumping core is that the program may contain potentially sensitive information
within its memory at the time the image is written to disk. Imagine a program that has just read in a
user's password, and then is forced to dump core before it has a chance to erase or otherwise
obfuscate the password in memory.
Because an attacker may be able to manipulate the program's runtime environment in such a way as
to cause it to dump core, and thus write any sensitive information to disk, you should try to prevent a
program from dumping core if there's any chance the attacker may be able to get read access to the
core file.
Generally, core files are written in such a way that the owner is the only person who can read and
modify them, but silly things often happen, such as lingering core files accidentally being made
world-readable by a recursive permissions change.
It's best to protect against core dumps as early in the program as possible, because if an attacker is
manipulating the program in a way that causes it to crash, you cannot know in advance what state
the program will be in when the attacker manages to force it to crash.
Process core dumping can be restricted on a per-application basis by using the resource limit
capabilities of most Unix systems. One of the standard limits that can be applied to a process is the
maximum core dump file size. This limit serves to protect against large (in terms of memory
consumption) programs that dump core and could potentially fill up all available disk space. Without
this limit in place, it would even be possible for an attacker who has discovered a way to remotely
cause a program to crash and dump core to fill up all available disk space on the server. Setting
the value of RLIMIT_CORE to 0 prevents the process from writing any memory dump to disk, instead
simply terminating the program when a fatal problem is encountered.
#include <sys/types.h>
#include <sys/time.h>
#include <sys/resource.h>
void spc_limit_core(void) {
struct rlimit rlim;
rlim.rlim_cur = rlim.rlim_max = 0;
setrlimit(RLIMIT_CORE, &rlim);
}
In addition to the RLIMIT_CORE limit, the setrlimit( ) function also allows
other per-process limits to be adjusted. We discuss these other limits in Recipe 13.9.
The advantage of disabling core dumps is that if your program has particularly sensitive information
residing in memory unencrypted (even transient data is at risk, because a skilled attacker could
potentially time the core dumps so that your program dumps core at precisely the right time), it will
not ever write this data to disk in a core dump. The primary disadvantage of this approach is that the
lack of a core file makes debugging program crashes very difficult after the fact. How big an issue
this is depends on program deployment and how bugs are tracked and fixed. A number of shells
provide an interface to the setrlimit( ) function via a built-in command. Users who want to
prevent core file generation can set the appropriate limit with the shell command, then run the
program.
However, for situations where data in memory is required to be protected, the application should limit
the core dumps directly via setrlimit( ) so that it becomes impossible to inadvertently run the
program with core dumps enabled. When core dumps are needed for debugging purposes, a safer
alternative is to allow core dumps only when the program has been compiled in "debug mode." This is
easily done by wrapping the setrlimit( ) call with the appropriate preprocessor conditional to
disable the code in debug mode and enable it otherwise.
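For instance, one possible arrangement (the SPC_DEBUG_BUILD macro is hypothetical; substitute whatever build flag your project already defines) compiles the limit away only in debug builds:
#include <sys/types.h>
#include <sys/time.h>
#include <sys/resource.h>

void spc_limit_core(void) {
#ifndef SPC_DEBUG_BUILD          /* hypothetical macro, defined only in debug builds */
  struct rlimit rlim;

  /* Forbid core files entirely in non-debug builds */
  rlim.rlim_cur = rlim.rlim_max = 0;
  setrlimit(RLIMIT_CORE, &rlim);
#endif
}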
Some Unix variants (Solaris, for example) allow the system administrator to control how core dumps
are handled on a system-wide basis. Some of the capabilities of these systems allow the
administrator to specify a directory where all core dumps will be placed. When this capability is
employed, the directory configured to hold the core dump files is typically owned by the superuser
and made unreadable to any other users. In addition, most systems force the permissions of a core
file so that it is only readable by the user the process was running as when it dumped core. However,
this is not a very robust solution, as many other exploits could possibly be used to read this file.
1.9.4 See Also
Recipe 13.9
Chapter 2. Access Control
Access control is a major issue for application developers. An application must always be sure to
protect its resources from unauthorized access. This requires properly setting permissions on created
files, allowing only authorized hosts to connect to any network ports, and properly handling privilege
elevation and surrendering. Applications must also defend against race conditions that may occur
when opening files, such as the Time of Check, Time of Use (TOCTOU) condition. The proper
approach to access control is a consistent, careful use of all APIs that access external resources. You
must minimize the time a program runs with privileges and perform only the bare minimum of
operations at a privileged level. When sensitive data is involved, it is your application's duty to
protect the user's data from unauthorized access; keep this in mind during all stages of development.
2.1 Understanding the Unix Access Control Model
2.1.1 Problem
You want to understand how access control works on Unix systems.
2.1.2 Solution
Unix traditionally uses a user ID-based access control system. Some newer variants implement
additional access control mechanisms, such as Linux's implementation of POSIX capabilities. Because
additional access control mechanisms vary greatly from system to system, we will discuss only the
basic user ID system in this recipe.
2.1.3 Discussion
Every process running on a Unix system has a user ID assigned to it. In reality, every process
actually has three user IDs assigned to it: an effective user ID, a real user ID, and a saved user
ID.[1] The effective user ID is the user ID used for most permission checks. The real user and saved
user IDs are used primarily for determining whether a process can legally change its effective user ID
(see Recipe 1.3).
[1] Saved user IDs may not be available on some very old Unix platforms, but are available on all modern Unixes.
In addition to user IDs, each process also has a group ID. As with user IDs, there are actually three
group IDs: an effective group ID, a real group ID, and a saved group ID. Processes may belong to
more than a single group. The operating system maintains a list of groups to which a process belongs
for each process. Group-based permission checks check the effective group ID as well as the
process's group list.
The operating system performs a series of tests to determine whether a process has permission to
access a particular file on the filesystem or some other resource (such as a semaphore or shared
memory segment). By far, the most common permission check performed is for file access.
When a process creates a file or some other resource, the operating system assigns a user ID and a
group ID as the owner of the file or resource. The user ID is assigned the process's effective user ID,
and the group ID is assigned the process's effective group ID.
To define the accessibility of a file or resource, each file or resource has three sets of three
permission bits assigned to it. For the owning user, the owning group, and everyone else (often
referred to as "world" or "other"), read, write, and execute permissions are stored.
If the process attempting to access a file or resource shares its effective user ID with the owning user
ID of the file or resource, the first set of permission bits is used. If the process shares its effective
group ID with the owning group ID of the file or resource, the second set of permission bits is used.
In addition, if the file or resource's group owner is in the process's group membership list, the second
set of permission bits is used. If neither the user ID nor the group ID match, the third set of bits is
used. User ownership always trumps group ownership.
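To make the order of these checks concrete, the following sketch (our own simplification, not actual kernel code) shows how a read-access decision could be expressed given a file's stat information; the superuser and the supplementary group list are deliberately omitted for brevity.
#include <sys/stat.h>
#include <unistd.h>
#include <stdbool.h>

/* Sketch: could the calling process read a file with this stat information? */
static bool can_read(const struct stat *st) {
  if (geteuid() == st->st_uid)  return (st->st_mode & S_IRUSR) != 0;  /* owner bits   */
  if (getegid() == st->st_gid)  return (st->st_mode & S_IRGRP) != 0;  /* group bits   */
  return (st->st_mode & S_IROTH) != 0;                                /* "other" bits */
}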
Files also have an additional set of bits: the sticky bit, the setuid bit, and the setgid bit. The sticky
and setgid bits are defined for directories; the setuid and setgid bits are defined for executable files;
and all three bits are ignored for any other type of file. In no case are all three bits defined to have
meaning for a single type of file.
2.1.3.1 The sticky bit
Under normal circumstances, a user may delete or rename any file in a directory to which the user has
write access, regardless of whether the user owns the file. Applying the sticky bit to a directory alters
this behavior: a user may delete or rename files in the directory only if the user owns the file and
additionally has write permission in the directory. It is common to see the sticky bit applied to
directories such as /tmp so that any user may create temporary files, but other users may not muck
with them.
Historically, application of the sticky bit to executable files also had meaning. Applying the sticky bit
to an executable file would cause the operating system to treat the executable in a special way by
keeping the executable image resident in memory once it was loaded, even after the image was no
longer in use. This optimization is no longer necessary because of faster hardware and widespread
support for and adoption of shared libraries. As a result, most modern Unix variants no longer honor
the sticky bit for executable files.
2.1.3.2 The setuid bit
Normally, when an executable file loads and runs, it runs with the effective user, real user, and saved
user IDs of the process that started it running. Under normal circumstances, all three of these user
IDs are the same value, which means that the process cannot adjust its user IDs unless the process
is running as the superuser.
If the setuid bit is set on an executable, this behavior changes significantly. Instead of inheriting or
maintaining the user IDs of the process that started it, the process's effective user and saved user
IDs will be adjusted to the user ID that owns the executable file. This works for any user ID, but the
most common use of setuid is to use the superuser ID, which grants the executable superuser
privileges regardless of the user that executes it.
Applying the setuid bit to an executable has serious security considerations and consequences. If
possible, avoid using setuid. Unfortunately, that is not always possible; Recipe 1.3 and Recipe 1.4
discuss the setuid bit and the safe handling of it in more detail.
2.1.3.3 The setgid bit
Applied to an executable file, the setgid bit behaves similarly to the setuid bit. Instead of altering the
assignment of user IDs, the setgid bit alters the assignment of group IDs. However, the same
semantics apply for group IDs as they do for user IDs with respect to initialization of a process's
group IDs when a new program starts.
Unlike the setuid bit, the setgid bit also has meaning when applied to a directory. Ordinarily, the
group owner of a newly created file is the same as the effective group ID of the process that creates
the file. However, when the setgid bit is set on the directory in which a new file is created, the group
owner of the newly created file will instead be the group owner of the directory. In addition, Linux will
set the setgid bit on directories created within a directory having the setgid bit set.
On systems that support mandatory locking, the setgid bit also has special meaning on
nonexecutable files. We discuss its meaning in the context of mandatory locking in Recipe 2.8.
2.1.4 See Also
Recipe 1.3, Recipe 1.4, Recipe 2.8
2.2 Understanding the Windows Access Control Model
2.2.1 Problem
You want to understand how access control works on Windows systems.
2.2.2 Solution
Versions of Windows before Windows NT have no access control whatsoever. Windows 95, Windows
98, and Windows ME are all intended to be single-user desktop operating systems and thus have no
need for access control. Windows NT, Windows 2000, Windows XP, and Windows Server 2003 all use
a system of access control lists (ACLs).
Most users do not understand the Windows access control model and generally regard it as being
overly complex. However, it is actually rather straightforward and easy to understand. Unfortunately,
from a programmer's perspective, the API for dealing with ACLs is not so easy to deal with.
In Section 2.2.3, we describe the Windows access control model from a high level. We do not provide
examples of using the API here, but other recipes throughout the book do provide such examples.
2.2.3 Discussion
All Windows resources, including files, the registry, synchronization primitives (e.g., mutexes and
events), and IPC mechanisms (e.g., pipes and mailslots), are accessed through objects, which may
be secured using ACLs. Every ACL contains a discretionary access control list (DACL) and a system
access control list (SACL). DACLs determine access rights to an object, and SACLs determine auditing
(e.g., logging) policy. In this recipe, we are concerned only with access rights, so we will discuss only
DACLs.
A DACL contains zero or more access control entries (ACEs). A DACL with no ACEs, said to be a NULL
DACL, is essentially the equivalent of granting full access to everyone, which is never a good idea. A
NULL DACL means anyone can do anything to the object. Not only does full access imply the ability to
read from or write to the object, it also implies the ability to take ownership of the object or modify
its DACL. In the hands of an attacker, the ability to take ownership of the object and modify its DACL
can result in denial of service attacks because the object should be accessible but no longer is.
An ACE (an ACL contains one or more ACEs) consists of three primary pieces of information: a
security ID (SID), an access right, and a boolean indicator of whether the ACE allows or denies the
access right to the entity identified by the ACE's SID. A SID uniquely identifies a user or group on a
system. The special SID, known as "Everyone" or "World", identifies all users and groups on the
system. All objects support a generic set of access rights, and some objects may define others
specific to their type. Table 2-1 lists the generic access rights. Finally, an ACE can either allow or
deny an access right.
Table 2-1. Generic access rights supported by all objects

Access right (C constant)   Description
DELETE                      The ability to delete the object
READ_CONTROL                The ability to read the object's security descriptor, not including its SACL
SYNCHRONIZE                 The ability for a thread to wait for the object to be put into the signaled state; not all objects support this functionality
WRITE_DAC                   The ability to modify the object's DACL
WRITE_OWNER                 The ability to set the object's owner
GENERIC_READ                The ability to read from or query the object
GENERIC_WRITE               The ability to write to or modify the object
GENERIC_EXECUTE             The ability to execute the object (applies primarily to files)
GENERIC_ALL                 Full control
When Windows consults an ACL to verify access to an object, it will always choose the best match.
That is, if a deny ACE for "Everyone" is found, and an allow ACE is then found for a specific user that
happens to be the current user, Windows will use the allow ACE. For example, suppose that the DACL
for a data file contains the following ACEs:
DENY GENERIC_ALL Everyone
This ACE prevents anyone except for the owner of the file from performing any action on the
file.
ALLOW GENERIC_WRITE Marketing
Anyone that is a member of the group "Marketing" will be allowed to write to the file because
this ACE explicitly allows that access right for that group.
ALLOW GENERIC_READ Everyone
This ACE grants read access to the file to everyone.
All objects are created with an owner. The owner of an object is ordinarily the user who created the
object; however, depending on the object's ACL, another user could possibly take ownership of the
object. The owner of an object always has full control of the object, regardless of what the object's
DACL says. Unfortunately, if an object is not sufficiently protected, an attacker can nefariously take
ownership of the object, rendering the rightful owner powerless to counter the attacker.
2.3 Determining Whether a User Has Access to a File on Unix
2.3.1 Problem
Your program is running with extra permissions because its executable has the setuid or setgid bit
set. You need to determine whether the user running the program will be able to access a file without
the extra privileges granted by setuid or setgid.
2.3.2 Solution
Temporarily drop privileges to the user and group for which access is to be checked. With the
process's privileges lowered, perform the access check, then restore privileges to what they were
before the check. See Recipe 1.3 for additional discussion of elevated privileges and how to drop and
restore them.
2.3.3 Discussion
It is always best to allow the operating system to do the bulk of the work of performing access
checks. The only way to do so is to manipulate the privileges under which the process is running.
Recipe 1.3 provides implementations for functions that temporarily drop privileges and then restore
them again.
When performing access checks on files, you need to be careful to avoid the types of race conditions
known as Time of Check, Time of Use (TOCTOU), which are illustrated in Figure 2-1 and Figure 2-2.
These race conditions occur when access is checked before opening a file. The most common way for
this to occur is to use the access( ) system call to verify access to a file, and then to use open( )
or fopen( ) to open the file if the return from access( ) indicates that access will be granted.
The problem is that between the time the access check via access( ) completes and the time open(
) begins (both system calls are atomic within the operating system kernel), there is a window of
vulnerability where an attacker can replace the file that is being operated upon. Let's say that a
program uses access( ) to check to see whether an attacker has write permissions to a particular
file, as shown in Figure 2-1. If that file is a symbolic link, access( ) will follow it, and report that the
attacker does indeed have write permissions for the underlying file. If the attacker can change the
symbolic link after the check occurs, but before the program starts using the file, pointing it to a file
he couldn't otherwise access, the privileged program will end up opening a file that it shouldn't, as
shown in Figure 2-2. The problem is that the privileged program can manipulate either file, so it can be
tricked into opening, on the user's behalf, a file that it shouldn't.
Figure 2-1. Stage 1 of a TOCTOU race condition: Time of Check
Figure 2-2. Stage 2 of a TOCTOU race condition: Time of Use
While such an attack might sound impossible to perform, attackers have many tricks to slow down a
program to make exploiting race conditions easier. Plus, even if an attacker can only exploit the race
condition every 1,000 times, generally the attack can be automated.
The best approach is to actually have the program take on the identity of the unprivileged user
before opening the file. That way, the correct access permission checks will happen automatically
when the file is opened. You need not even call access( ). After the file is opened, the program can
revert to its privileged state. For example, here's some pseudo-code that opens a file properly, using
the spc_drop_privileges( ) and spc_restore_privileges( ) functions from Recipe 1.3:
int fd;

/* Temporarily drop privileges */
spc_drop_privileges(0);

/* Open the file with the limited privileges */
fd = open("/some/file/that/needs/opening", O_RDWR);

/* Restore privileges */
spc_restore_privileges();

/* Check the return value from open() to see if the file was opened successfully */
if (fd == -1) {
  perror("open(\"/some/file/that/needs/opening\")");
  abort();
}
There are many other situations where security-critical race conditions occur, particularly in file
access. Basically, every time a condition is explicitly checked, one needs to make sure that the result
cannot have changed by the time that condition is acted upon.
2.4 Determining Whether a Directory Is Secure
2.4.1 Problem
Your application needs to store sensitive information on disk, and you want to ensure that the
directory used cannot be modified by any other entity on the system besides the current user and the
administrator. That is, you would like a directory where you can modify the contents at will, without
having to worry about future permission checks.
2.4.2 Solution
Check the entire directory tree above the one you intend to use for unsafe permissions. Specifically,
you are looking for the ability for users other than the owner and the superuser (the Administrator
account on Windows) to modify the directory. On Windows, the required directory traversal cannot be
done without introducing race conditions and a significant amount of complex path processing. The
best advice we can offer, therefore, is to consider home directories (typically x:\Documents and
Settings\User, where x is the boot drive and User is the user's account name) the safest directories.
Never consider using temporary directories to store files that may contain sensitive data.
2.4.3 Discussion
Storing sensitive data in files requires extra levels of protection to ensure that the data is not
compromised. An often overlooked aspect of protection is ensuring that the directories that contain
files (which, in turn, contain sensitive data) are safe from modification.
This may appear to be a simple matter of ensuring that the directory is protected against any other
users writing to it, but that is not enough. All the directories in the path must also be protected
against any other users writing to them. This means that the same user who will own the file
containing the sensitive data must also own all of the directories, and that the directories must all be
protected against modification by other users.
The reason for this is that when a directory is writable by a particular user, that user is able to
rename directories and files that reside within that directory. For example, suppose that you want to
store sensitive data in a file that will be placed into the directory/home/myhome/stuff/securestuff. If
the directory /home/myhome/stuff is writable by another user, that user could rename the directory
securestuff to something else. The result would be that your program would no longer be able to find
the file containing its sensitive data.
Even if the securestuff directory is owned by the user who owns the file containing the sensitive data,
and the permissions on the directory prevent other users from writing to it, the permissions that
matter are on the parent directory, /home/myhome/stuff. This same problem exists for every
directory in the path, right up to the root directory.
In this recipe we present a function, spc_is_safedir( ), for checking all of the directories in a path
specification on Unix. It traverses the directory tree from the bottom back up to the root, ensuring
that only the owner or the superuser has write access to each directory.
The spc_is_safedir( ) function requires a single argument specifying the directory to check. The
return value from the function is -1 if some kind of error occurs while attempting to verify the safety
of the path specification, 0 if the path specification is not safe, or 1 if the path specification is safe.
On Unix systems, a process has only one current directory; all threads within a
process share the same working directory. The code presented here changes
the working directory as it works; therefore, the code is not thread-safe!
#include <sys/types.h>
#include <sys/stat.h>
#include <dirent.h>
#include <fcntl.h>
#include <limits.h>
#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
int spc_is_safedir(const char *dir) {
  DIR         *fd, *start;
  int         rc = -1;
  char        new_dir[PATH_MAX + 1];
  uid_t       uid;
  struct stat f, l;

  if (!(start = opendir("."))) return -1;
  if (lstat(dir, &l) == -1) {
    closedir(start);
    return -1;
  }
  uid = geteuid();

  do {
    if (chdir(dir) == -1) break;
    if (!(fd = opendir("."))) break;
    if (fstat(dirfd(fd), &f) == -1) {
      closedir(fd);
      break;
    }
    closedir(fd);
    if (l.st_mode != f.st_mode || l.st_ino != f.st_ino || l.st_dev != f.st_dev)
      break;
    if ((f.st_mode & (S_IWOTH | S_IWGRP)) || (f.st_uid && f.st_uid != uid)) {
      rc = 0;
      break;
    }
    dir = "..";
    if (lstat(dir, &l) == -1) break;
    if (!getcwd(new_dir, PATH_MAX + 1)) break;
  } while (new_dir[1]); /* new_dir[0] will always be a slash */
  if (!new_dir[1]) rc = 1;

  fchdir(dirfd(start));
  closedir(start);
  return rc;
}
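As a quick illustration of how the function might be called (the path below is purely hypothetical), a caller could check a directory before storing anything sensitive in it:
#include <stdio.h>

int spc_is_safedir(const char *dir);   /* defined above */

int main(void) {
  const char *dir = "/home/myhome/stuff/securestuff";   /* hypothetical path */

  switch (spc_is_safedir(dir)) {
    case  1: printf("%s is safe to use.\n", dir);           break;
    case  0: printf("%s is NOT safe to use.\n", dir);       break;
    default: fprintf(stderr, "error checking %s\n", dir);   break;
  }
  return 0;
}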
2.5 Erasing Files Securely
2.5.1 Problem
You want to erase a file securely, preventing recovery of any data via "undelete" tools or any inspection of
the disk for data that has been left behind.
2.5.2 Solution
Write over the data in the file multiple times, varying the data written each time. You should write both
random and patterned data for maximum effectiveness.
2.5.3 Discussion
It is extremely difficult, if not outright impossible, to guarantee that the contents of
a file are completely unrecoverable on modern operating systems that offer logging
filesystems, virtual memory, and other such features.
Securely deleting files from disk is not as simple as issuing a system call to delete the file from the
filesystem. The first problem is that most delete operations do not do anything to the data; they merely
delete any underlying metadata that the filesystem uses to associate the file contents with the filename.
The storage space where the actual data is stored is then marked free and will be reclaimed whenever the
filesystem needs that space.
The result is that to truly erase the data, you need to overwrite it with nonsense before the filesystem
delete operation is performed. Many times, this overwriting is implemented by simply zeroing all the bytes
in the file. While this will certainly erase the file from the perspective of most conventional utilities, the fact
that most data is stored on magnetic media makes this more complicated.
More sophisticated tools can analyze the actual media and reveal the data that was previously stored on it.
This type of data recovery has a limit, however. If the data is sufficiently overwritten on the media, it does
become unrecoverable, masked by the new data that has overwritten it. A variety of factors, such as the
type of data written and the characteristics of the media, determine the point at which the interesting data
becomes unrecoverable.
A technique developed by Peter Gutmann provides an algorithm involving multiple passes of data written
to the disk to delete a file securely. The passes involve both specific patterns and random data written to
the disk. The paper detailing this technique is available from
http://www.cs.auckland.ac.nz/~pgut001/pubs/secure_del.html .
Unfortunately, many factors also work to thwart the feasibility of securely wiping the contents of a file.
Many modern operating systems employ complex filesystems that may cause several copies of any given
file to exist in some form at various different locations on the media. Other modern operating system
features such as virtual memory often work to defeat the goal of securely obliterating any traces of
sensitive data.
One of the worst things that can happen is that filesystem caching will turn multiple writes into a single
write operation. Calling fsync( ) on the file after each pass will generally cause the filesystem to flush
the contents of the file to disk, but on some platforms even that is not necessarily sufficient.
Doing a better job requires knowing about the operating system on which your code is running. For
example, you might be able to wait 10 minutes between passes, and ensure that the cached file has been
written to disk at least once in that time frame. Below, we provide an implementation of Peter Gutmann's
secure file-wiping algorithm, assuming fsync( ) is enough.
On Windows XP and Windows Server 2003, you can use the cipher command with
the /w flag to securely wipe unused portions of NTFS filesystems.
We provide three functions:
spc_fd_wipe( )
Overwrites the contents of a file identified by the specified file descriptor in accordance with
Gutmann's algorithm. If an error occurs while performing the wipe operation, the return value is -1;
otherwise, a successful operation returns zero.
spc_file_wipe( )
A wrapper around the first function, which uses a FILE object instead of a file descriptor. If an error
occurs while performing the wipe operation, the return value is -1; otherwise, a successful operation
returns zero.
SpcWipeFile( )
A Windows-specific function that uses the Win32 API for file access. It requires an open file handle as
its only argument and returns a boolean indicating success or failure.
Note that for all three functions, the file descriptor, FILE object, or file handle passed as an argument
must be open with write access to the file to be wiped; otherwise, the wiping functions will fail. As written,
these functions will probably not work very well on media other than disk because they are constantly
seeking back to the beginning of the file. Another issue that may arise is filesystem caching. All the writes
made to the file may not actually be written to the physical media.
#include <limits.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#define SPC_WIPE_BUFSIZE 4096
static int write_data(int fd, const void *buf, size_t nbytes) {
size_t towrite, written = 0;
ssize_t result;
do {
if (nbytes - written > SSIZE_MAX) towrite = SSIZE_MAX;
else towrite = nbytes - written;
if ((result = write(fd, (const char *)buf + written, towrite)) >= 0)
written += result;
else if (errno != EINTR) return 0;
} while (written < nbytes);
return 1;
}
static int random_pass(int fd, size_t nbytes) {
  size_t        towrite;
  unsigned char buf[SPC_WIPE_BUFSIZE];

  if (lseek(fd, 0, SEEK_SET) != 0) return -1;
  while (nbytes > 0) {
    towrite = (nbytes > sizeof(buf) ? sizeof(buf) : nbytes);
    spc_rand(buf, towrite);
    if (!write_data(fd, buf, towrite)) return -1;
    nbytes -= towrite;
  }
  fsync(fd);
  return 0;
}
static int pattern_pass(int fd, unsigned char *buf, size_t bufsz, size_t filesz) {
size_t towrite;
if (!bufsz || lseek(fd, 0, SEEK_SET) != 0) return -1;
while (filesz > 0) {
towrite = (filesz > bufsz ? bufsz : filesz);
if (!write_data(fd, buf, towrite)) return -1;
filesz -= towrite;
}
fsync(fd);
return 0;
}
int spc_fd_wipe(int fd) {
  int           count, i, pass, patternsz;
  struct stat   st;
  unsigned char buf[SPC_WIPE_BUFSIZE], *pattern;
  static unsigned char single_pats[16] = {
    0x00, 0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77,
    0x88, 0x99, 0xaa, 0xbb, 0xcc, 0xdd, 0xee, 0xff
  };
  static unsigned char triple_pats[6][3] = {
    { 0x92, 0x49, 0x24 }, { 0x49, 0x24, 0x92 }, { 0x24, 0x92, 0x49 },
    { 0x6d, 0xb6, 0xdb }, { 0xb6, 0xdb, 0x6d }, { 0xdb, 0x6d, 0xb6 }
  };

  if (fstat(fd, &st) == -1) return -1;
  if (!st.st_size) return 0;

  for (pass = 0; pass < 4; pass++)
    if (random_pass(fd, st.st_size) == -1) return -1;

  memset(buf, single_pats[5], sizeof(buf));
  if (pattern_pass(fd, buf, sizeof(buf), st.st_size) == -1) return -1;
  memset(buf, single_pats[10], sizeof(buf));
  if (pattern_pass(fd, buf, sizeof(buf), st.st_size) == -1) return -1;

  patternsz = sizeof(triple_pats[0]);
  for (pass = 0; pass < 3; pass++) {
    pattern = triple_pats[pass];
    count   = sizeof(buf) / patternsz;
    for (i = 0; i < count; i++)
      memcpy(buf + (i * patternsz), pattern, patternsz);
    if (pattern_pass(fd, buf, patternsz * count, st.st_size) == -1) return -1;
  }

  for (pass = 0; pass < sizeof(single_pats); pass++) {
    memset(buf, single_pats[pass], sizeof(buf));
    if (pattern_pass(fd, buf, sizeof(buf), st.st_size) == -1) return -1;
  }

  for (pass = 0; pass < sizeof(triple_pats) / patternsz; pass++) {
    pattern = triple_pats[pass];
    count   = sizeof(buf) / patternsz;
    for (i = 0; i < count; i++)
      memcpy(buf + (i * patternsz), pattern, patternsz);
    if (pattern_pass(fd, buf, patternsz * count, st.st_size) == -1) return -1;
  }

  for (pass = 0; pass < 4; pass++)
    if (random_pass(fd, st.st_size) == -1) return -1;
  return 0;
}
int spc_file_wipe(FILE *f) {
return spc_fd_wipe(fileno(f));
}
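As a usage sketch (the wrapper name and error handling are our own, not part of the recipe), wiping is normally followed by the actual deletion; note that the file must be opened with write access or the wipe will fail.
#include <stdio.h>

int spc_file_wipe(FILE *f);   /* defined above */

/* Wipe a file's contents in place, then remove it from the filesystem */
int spc_wipe_and_delete(const char *path) {
  FILE *f;

  if (!(f = fopen(path, "r+"))) return -1;   /* open with write access */
  if (spc_file_wipe(f) == -1) {
    fclose(f);
    return -1;
  }
  fclose(f);
  return (remove(path) == 0 ? 0 : -1);
}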
The Unix implementations should work on Windows systems using the standard C runtime API; however, it
is rare that the standard C runtime API is used on Windows. The following code implements SpcWipeFile( ),
which is virtually identical to the standard C version except that it uses only Win32 APIs for file access.
#include <windows.h>
#include <wincrypt.h>

#define SPC_WIPE_BUFSIZE 4096

static BOOL RandomPass(HANDLE hFile, HCRYPTPROV hProvider, DWORD dwFileSize) {
  BYTE  pbBuffer[SPC_WIPE_BUFSIZE];
  DWORD cbBuffer, cbTotalWritten, cbWritten;

  if (SetFilePointer(hFile, 0, 0, FILE_BEGIN) == 0xFFFFFFFF) return FALSE;
  while (dwFileSize > 0) {
    cbBuffer = (dwFileSize > sizeof(pbBuffer) ? sizeof(pbBuffer) : dwFileSize);
    if (!CryptGenRandom(hProvider, cbBuffer, pbBuffer)) return FALSE;
    /* loop until the whole buffer has been written */
    for (cbTotalWritten = 0; cbTotalWritten < cbBuffer; cbTotalWritten += cbWritten)
      if (!WriteFile(hFile, pbBuffer + cbTotalWritten, cbBuffer - cbTotalWritten,
                     &cbWritten, 0)) return FALSE;
    dwFileSize -= cbTotalWritten;
  }
  return TRUE;
}

static BOOL PatternPass(HANDLE hFile, BYTE *pbBuffer, DWORD cbBuffer, DWORD dwFileSize) {
  DWORD cbTotalWritten, cbWrite, cbWritten;

  if (!cbBuffer || SetFilePointer(hFile, 0, 0, FILE_BEGIN) == 0xFFFFFFFF) return FALSE;
  while (dwFileSize > 0) {
    cbWrite = (dwFileSize > cbBuffer ? cbBuffer : dwFileSize);
    /* loop until the whole chunk has been written */
    for (cbTotalWritten = 0; cbTotalWritten < cbWrite; cbTotalWritten += cbWritten)
      if (!WriteFile(hFile, pbBuffer + cbTotalWritten, cbWrite - cbTotalWritten,
                     &cbWritten, 0)) return FALSE;
    dwFileSize -= cbTotalWritten;
  }
  return TRUE;
}

BOOL SpcWipeFile(HANDLE hFile) {
  BYTE       pbBuffer[SPC_WIPE_BUFSIZE];
  DWORD      dwCount, dwFileSize, dwIndex, dwPass;
  HCRYPTPROV hProvider;
  static BYTE pbSinglePats[16] = {
    0x00, 0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77,
    0x88, 0x99, 0xaa, 0xbb, 0xcc, 0xdd, 0xee, 0xff
  };
  static BYTE pbTriplePats[6][3] = {
    { 0x92, 0x49, 0x24 }, { 0x49, 0x24, 0x92 }, { 0x24, 0x92, 0x49 },
    { 0x6d, 0xb6, 0xdb }, { 0xb6, 0xdb, 0x6d }, { 0xdb, 0x6d, 0xb6 }
  };
  static DWORD cbPattern = sizeof(pbTriplePats[0]);

  if ((dwFileSize = GetFileSize(hFile, 0)) == INVALID_FILE_SIZE) return FALSE;
  if (!dwFileSize) return TRUE;

  if (!CryptAcquireContext(&hProvider, 0, 0, 0, CRYPT_VERIFYCONTEXT))
    return FALSE;

  for (dwPass = 0; dwPass < 4; dwPass++)
    if (!RandomPass(hFile, hProvider, dwFileSize)) {
      CryptReleaseContext(hProvider, 0);
      return FALSE;
    }

  memset(pbBuffer, pbSinglePats[5], sizeof(pbBuffer));
  if (!PatternPass(hFile, pbBuffer, sizeof(pbBuffer), dwFileSize)) {
    CryptReleaseContext(hProvider, 0);
    return FALSE;
  }
  memset(pbBuffer, pbSinglePats[10], sizeof(pbBuffer));
  if (!PatternPass(hFile, pbBuffer, sizeof(pbBuffer), dwFileSize)) {
    CryptReleaseContext(hProvider, 0);
    return FALSE;
  }

  cbPattern = sizeof(pbTriplePats[0]);
  for (dwPass = 0; dwPass < 3; dwPass++) {
    dwCount = sizeof(pbBuffer) / cbPattern;
    for (dwIndex = 0; dwIndex < dwCount; dwIndex++)
      CopyMemory(pbBuffer + (dwIndex * cbPattern), pbTriplePats[dwPass], cbPattern);
    if (!PatternPass(hFile, pbBuffer, cbPattern * dwCount, dwFileSize)) {
      CryptReleaseContext(hProvider, 0);
      return FALSE;
    }
  }

  for (dwPass = 0; dwPass < sizeof(pbSinglePats); dwPass++) {
    memset(pbBuffer, pbSinglePats[dwPass], sizeof(pbBuffer));
    if (!PatternPass(hFile, pbBuffer, sizeof(pbBuffer), dwFileSize)) {
      CryptReleaseContext(hProvider, 0);
      return FALSE;
    }
  }

  for (dwPass = 0; dwPass < sizeof(pbTriplePats) / cbPattern; dwPass++) {
    dwCount = sizeof(pbBuffer) / cbPattern;
    for (dwIndex = 0; dwIndex < dwCount; dwIndex++)
      CopyMemory(pbBuffer + (dwIndex * cbPattern), pbTriplePats[dwPass], cbPattern);
    if (!PatternPass(hFile, pbBuffer, cbPattern * dwCount, dwFileSize)) {
      CryptReleaseContext(hProvider, 0);
      return FALSE;
    }
  }

  for (dwPass = 0; dwPass < 4; dwPass++)
    if (!RandomPass(hFile, hProvider, dwFileSize)) {
      CryptReleaseContext(hProvider, 0);
      return FALSE;
    }
  CryptReleaseContext(hProvider, 0);
  return TRUE;
}
2.5.4 See Also
"Secure Deletion of Data from Magnetic and Solid-State Memory" by Peter Gutmann:
http://www.cs.auckland.ac.nz/~pgut001/pubs/secure_del.html
2.6 Accessing File Information Securely
2.6.1 Problem
You need to access information about a file, such as its size or last modification date. In doing so, you
want to avoid the possibility of race conditions.
2.6.2 Solution
Use a secure directory, as described in Recipe 2.4. Alternatively, open the file and query the needed
information using the file handle. Do not use functions that operate on the name of the file, especially
if multiple queries are required for the same file or if you intend to open it based on the information
obtained from queries. Operating on filenames introduces the possibility of race conditions because
filenames can change between calls.
On Unix, use the fstat( ) function instead of the stat( ) function. Both functions return the same
information, but fstat( ) uses an open file descriptor while stat( ) uses a filename. Doing so
removes the possibility of a race condition, because the file to which the file descriptor points can
never change unless you reopen the file descriptor. When operating on just the filename, there is no
guarantee that the underlying file pointed to by the filename remains the same after the call to stat( ).
On Windows, use the function GetFileInformationByHandle( ) instead of functions like
FindFirstFile( ) or FindFirstFileEx( ). As with fstat( ) versus stat( ) on Unix (which are
also available on Windows if you're using the C runtime API), the primary difference between these
functions is that one uses a file handle while the others use filenames. If the only information you
need is the size of the file, you can use GetFileSize( ) instead of GetFileInformationByHandle(
).
2.6.3 Discussion
Accessing file information using filenames can lead to race conditions, particularly if multiple queries
are necessary or if you intend to open the file depending on information previously obtained. In
particular, if symbolic links are involved, an attacker could potentially change the file to which the link
points between queries or between the time information is queried and the time the file is actually
opened. This type of race condition, known as a Time of Check, Time of Use (TOCTOU) race
condition, was also discussed in Recipe 2.3.
In most cases, when you need information about a file, such as its size, you also have some intention
of opening the file and using it in some way. For example, if you're checking to see whether a file
exists before trying to create it, you might think to use stat( ) or FindFirstFile( ) first, and if
the function fails with an error indicating the file does not exist, create the file with creat( ) or
CreateFile( ). A better solution is to use open( ) with the O_CREAT and O_EXCL flags, or to use
CreateFile( ) with CREATE_NEW specified as the creation disposition.
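As a small sketch of both points (the function name and path handling are our own), the file can be created atomically and then queried through the descriptor rather than through its name:
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>

/* Create a file only if it does not already exist, then query its size
 * through the open descriptor so no filename-based race is possible */
int open_new_and_stat(const char *path, off_t *size) {
  int         fd;
  struct stat st;

  if ((fd = open(path, O_RDWR | O_CREAT | O_EXCL, 0600)) == -1)
    return -1;                      /* fails if the file already exists */
  if (fstat(fd, &st) == -1) {       /* operates on the descriptor, not the name */
    close(fd);
    return -1;
  }
  *size = st.st_size;
  return fd;
}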
2.6.4 See Also
Recipe 2.3
2.7 Restricting Access Permissions for New Files on Unix
2.7.1 Problem
You want to restrict the initial access permissions assigned to a file created by your program.
2.7.2 Solution
On Unix, the operating system stores a value known as the umask for each process, which it uses when
creating new files on behalf of the process. The umask is used to disable permission bits that may be
specified by the system call used to create files.
2.7.3 Discussion
Remember that umasks apply only on file or directory creation. Calls to chmod(
) and fchmod( ) are not modified by umask settings.
When a process creates a new file, it specifies the access permissions to assign the new file as a
parameter to the system call that creates the file. The operating system modifies the access
permissions by computing the intersection of the inverse of the umask and the permissions requested
by the process. The access permission bits that remain after the intersection is computed are what
the operating system actually uses for the new file. In other words, in the following example code, if
the variable requested_permissions contained the permissions passed to the operating system to
create a new file, the variable actual_permissions would be the actual permissions that the
operating system would use to create the file.
requested_permissions = 0666;
actual_permissions = requested_permissions & ~umask( );
A process inherits the value of its umask from its parent process when the process is created.
Normally, the shell sets a default umask of either 022 (disable group- and world-writable bits) or 02
(disable world-writable bits) when a user logs in, but users have free rein to change the umask as
they want. Many users are not even aware of the existence of umasks, never mind how to set them
appropriately. Therefore, the umask value as set by the user should never be trusted to be
appropriate.
When using the open( ) system call to create a new file, you can force more restrictive permissions
to be used than what the user's umask might allow, but the only way to create a file with less
restrictive permissions is either to modify the umask before creating the file or to usefchmod( ) to
change the permissions after the file is created.
In most cases, you'll be attempting to loosen restrictions, but consider what happens when fopen( )
is used to create a new file. The fopen( ) function provides no way to specify the permissions to use
for the new file, and it always uses 0666, which grants read and write access to the owning user, the
owning group, and everyone else. Again, the only way to modify this behavior is either to set the
umask before calling fopen( ) or to use fchmod( ) after the file is created.
Using fchmod( ) to change the permissions of a file after it is created is not a good idea because it
introduces a race condition. Between the time the file is created and the time the permissions are
modified, an attacker could possibly gain unauthorized access to the file. The proper solution is
therefore to modify the umask before creating the file.
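For example, a minimal sketch (the wrapper name is ours) of creating a file that is readable and writable only by its owner, even though fopen( ) always requests mode 0666, looks like this:
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>

FILE *spc_create_private_file(const char *path) {
  mode_t old_mask;
  FILE   *f;

  old_mask = umask(077);        /* deny all group and other permissions  */
  f = fopen(path, "w");         /* fopen( ) always requests mode 0666    */
  umask(old_mask);              /* restore the user's original umask     */
  return f;
}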
Properly using umasks in your program can be a bit complicated, but here are some general
guidelines:
If you are creating files that contain sensitive data, always create them readable and writable
by only the file owner, and deny access to group members and all other users.
Be aware that files that do not contain sensitive data may be readable by other users on the
system. If the user wants to stop this behavior, the umask can be set appropriately before
starting your program.
Avoid setting execute permissions on files, especially group and world execute. If your program
generates files that are meant to be executable, set the execute bit only for the file owner.
Create directories that may contain files used to store sensitive information such that only the
owner of the directory has read, write, and execute permissions for the directory. This allows
only the owner of the directory to enter the directory or view or change its contents, but no
other users can view or otherwise access the directory. (See the discussion of secure directories
in Recipe 2.4 for more information on the importance of this requirement.)
Create directories that are not intended to store sensitive files such that the owner has read,
write, and execute permissions, while group members and everyone else have only read and
execute permissions. If the user wants to stop this behavior, the umask can be set
appropriately before starting your program.
Do not rely on setting the umask to a "secure" value once at the beginning of the program and
then calling all file or directory creation functions with overly permissive file modes. Explicitly set
the mode of the file at the point of creation. There are two reasons to do this. First, it makes
the code clear; your intent concerning permissions is obvious. Second, if an attacker managed
to somehow reset the umask between your adjustment of the umask and any of your file
creation calls, you could potentially create sensitive files with wide-open permissions.
Modifying the umask programmatically is a simple matter of calling the function umask( ) with the
new mask. The return value will be the old umask value. The standard header file sys/stat.h
prototypes the umask( ) function, and it also contains definitions for a sizable set of macros that
map to the various permission bits. Table 2-2 lists the macros, their values in octal, and the
permission bit or bits to which each one corresponds.
Table 2-2. Macros for permission bits and their octal values

Macro     Octal value   Permission bit(s)
S_IRWXU   0700          Owner read, write, execute
S_IRUSR   0400          Owner read
S_IWUSR   0200          Owner write
S_IXUSR   0100          Owner execute
S_IRWXG   0070          Group read, write, execute
S_IRGRP   0040          Group read
S_IWGRP   0020          Group write
S_IXGRP   0010          Group execute
S_IRWXO   0007          Other/world read, write, execute
S_IROTH   0004          Other/world read
S_IWOTH   0002          Other/world write
S_IXOTH   0001          Other/world execute
umasks are a useful tool for users, allowing them to limit the amount of access others get to their
files. Your program should make every attempt to honor the users' wishes in this regard, but if extra
security is required for files that your application generates, you should always explicitly set this
permission yourself.
2.7.4 See Also
Recipe 2.4
2.8 Locking Files
2.8.1 Problem
You want to lock files (or portions of them) to prevent two or more processes from accessing them
simultaneously.
2.8.2 Solution
Two basic types of locks exist: advisory and mandatory. Unix supports both advisory and, to an
extremely limited extent, mandatory locks, while Windows supports only mandatory locks.
2.8.3 Discussion
In the following sections, we will look at the different issues for Unix and Windows.
2.8.3.1 Locking files on Unix
All modern Unix variants support advisory locks. An advisory lock is a lock in which the operating
system does not enforce the lock. Instead, programs sharing the same file must cooperate with each
other to ensure that locks are properly observed. From a security perspective, advisory locks are of
little use because any program is free to perform any action on a file regardless of the state of any
advisory locks that other programs may hold on the file.
Support for mandatory locks varies greatly from one Unix variant to another. Both Linux and Solaris
support mandatory locks, but Darwin, FreeBSD, NetBSD, and OpenBSD do not, even though they
export the interface used by Linux and Solaris to support them. On such systems, this interface
creates advisory locks.
Support for mandatory locking does not extend to NFS. In other words, both Linux and Solaris are
capable only of using mandatory locks on local filesystems. Further, Linux requires that filesystems
be mounted with support for mandatory locking, which is disabled by default. In the end, Solaris is
really the only Unix variant on which you can reasonably expect mandatory locking to work, and even
then, relying on mandatory locks is like playing with fire.
As if the story for mandatory locking on Unix were not bad enough already, it gets worse. To be able
to use mandatory locks on a file, the file must have the setgid bit enabled and the group execute bit
disabled in its permissions. Even if a process holds a mandatory lock on a file, another process may
remove the setgid bit from the file's permissions, which effectively turns the mandatory lock into an
advisory lock!
Essentially, there is no such thing as a mandatory lock on Unix.
Just to add more fuel to the fire, neither Solaris nor Linux fully or properly implement the System V
defined semantics for mandatory locks, and both systems differ in where they stray from the System
V definitions. The details of the differences are not important here. We strongly recommend that you
avoid the Unix mandatory lock debacle altogether. If you want to use advisory locking on Unix, then
we recommend using a standalone lock file, as described in Recipe 2.9.
2.8.3.2 Locking files on Windows
Where Unix falls flat on its face with respect to supporting file locking, Windows gets it right. Windows
supports only mandatory file locks, and it fully enforces them. If a process has a lock on a file or a
portion of a file, another process cannot mistakenly or maliciously steal that lock.
Windows provides four functions for locking and unlocking files. Two functions, LockFile( ) and
LockFileEx( ), are provided for engaging locks, and two functions, UnlockFile( ) and
UnlockFileEx( ), are provided for removing them.
Neither LockFile( ) nor UnlockFile( ) will return until the lock can be successfully obtained or
released, respectively. LockFileEx( ) and UnlockFileEx( ), however, can be called in such a way
that they will always return immediately, either returning failure or signalling an event object when
the requested operation completes.
Locks can be placed on a file in its entirety or on a portion of a file. A single file may have multiple
locks owned by multiple processes so long as none of the locks overlap. When removing a lock, you
must specify the exact portion of the file that was locked. For example, two locks covering contiguous
portions of a file may not be removed with a single unlock operation that spans the two locks.
When a lock is held on a file, closing the file does not necessarily remove the
lock. The behavior is actually undefined and may vary across different
filesystems and versions of Windows. Always make sure to remove any locks
on a file before closing it.
There are two types of locks on Windows:
Shared lock
This type of lock allows other processes to read from the locked portion of the file, while
denying all processes-including the process that obtained the lock-permission to write to the
locked portion of the file.
Exclusive lock
This type of lock denies other processes both read and write access to the locked portion of the
file, while allowing the locking process to read or write to the locked portion of the file.
Using LockFile( ) to obtain a lock always obtains an exclusive lock. However, LockFileEx( )
obtains a shared lock unless the flag LOCKFILE_EXCLUSIVE_LOCK is specified.
Here are the signatures for LockFile( ) and UnlockFile( ):
BOOL LockFile(HANDLE hFile, DWORD dwFileOffsetLow, DWORD dwFileOffsetHigh,
              DWORD nNumberOfBytesToLockLow, DWORD nNumberOfBytesToLockHigh);
BOOL UnlockFile(HANDLE hFile, DWORD dwFileOffsetLow, DWORD dwFileOffsetHigh,
                DWORD nNumberOfBytesToUnlockLow, DWORD nNumberOfBytesToUnlockHigh);
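As a rough sketch of how these calls fit together (UpdateLockedRegion( ) is a hypothetical helper,
not taken from the recipe), the following fragment exclusively locks the first 4,096 bytes of an open
file, works with that region, and then removes exactly the same lock before the handle is closed:
#include <windows.h>
BOOL UpdateLockedRegion(HANDLE hFile) {
  BOOL bResult = FALSE;
  /* Exclusively lock the first 4,096 bytes of the file. */
  if (!LockFile(hFile, 0, 0, 4096, 0)) return FALSE;
  /* ... read from or write to the locked region here ... */
  bResult = TRUE;
  /* Unlock exactly the region that was locked, before closing the file. */
  if (!UnlockFile(hFile, 0, 0, 4096, 0)) bResult = FALSE;
  return bResult;
}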
[ Team LiB ]
[ Team LiB ]
2.9 Synchronizing Resource Access Across Processes on
Unix
2.9.1 Problem
You want to ensure that two processes cannot simultaneously access the same resource, such as a
segment of shared memory.
2.9.2 Solution
Use a lock file to signal that you are accessing the resource.
2.9.3 Discussion
Using a lock file to synchronize access to shared resources is not as simple as it sounds. Suppose that
your program creates a lock file and then crashes. If this happens, the lock file will remain, and your
program (as well as any other program that attempted to obtain the lock) will fail until someone
manually removes the lock file. Obviously, this is undesirable. The solution is to store the process ID
of the process holding the lock in the lock file. Other processes attempting to obtain the lock can then
test to see whether the process holding the lock still exists. If it does not, the lock file is stale, it is
safe to remove, and you can make another attempt to obtain the lock.
Unfortunately, this solution is still not a perfect one. What happens if another process is assigned the
same ID as the one stored in the stale lock file? The answer to this question is simply that no process
can obtain the lock until the process with the stale ID terminates or someone manually removes the
lock file. Fortunately, this case should not be encountered frequently.
As a result of solving the stale lock problem, a new problem arises: there is now a race condition
between the time the check for the existence of the process holding the lock is performed and the
time the lock file is removed. The solution to this problem is to attempt to reopen the lock file after
writing the new one to make sure that the process ID in the lock file is the same as the locking
process's ID. If it is, the lock is successfully obtained.
The function presented below, spc_lock_file( ), requires a single argument: the name of the file
to be used as the lock file. You must store the lock file in a "safe" directory (see Recipe 2.4) on a local
filesystem. Network filesystems-versions of NFS older than Version 3 in particular-may not
necessarily support the O_EXCL flag to open( ). Further, because the ID of the process holding the
lock is stored in the lock file and process IDs are not shared across machines, testing for the
presence of the process holding the lock would be unreliable at best if the lock file were stored on a
network filesystem.
Three attempts are made to obtain the lock, with a pause of one second between attempts. If the
lock cannot be obtained, the return value from the function is 0. If some kind of error occurs in
attempting to obtain the lock, the return value is -1. If the lock is successfully obtained, the return
value is 1.
#include <sys/types.h>
#include <unistd.h>
#include <stdlib.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <errno.h>
#include <limits.h>
#include <signal.h>
static int read_data(int fd, void *buf, size_t nbytes) {
  size_t  toread, nread = 0;
  ssize_t result;

  do {
    if (nbytes - nread > SSIZE_MAX) toread = SSIZE_MAX;
    else toread = nbytes - nread;
    if ((result = read(fd, (char *)buf + nread, toread)) >= 0)
      nread += result;
    else if (errno != EINTR) return 0;
  } while (nread < nbytes);
  return 1;
}

static int write_data(int fd, const void *buf, size_t nbytes) {
  size_t  towrite, written = 0;
  ssize_t result;

  do {
    if (nbytes - written > SSIZE_MAX) towrite = SSIZE_MAX;
    else towrite = nbytes - written;
    if ((result = write(fd, (const char *)buf + written, towrite)) >= 0)
      written += result;
    else if (errno != EINTR) return 0;
  } while (written < nbytes);
  return 1;
}
The two functions read_data( ) and write_data( ) are helper functions that ensure that all the
requested data is read or written. If the system calls for reading or writing are interrupted by a
signal, they are retried. Because such a small amount of data is being read and written, the data
should all be written atomically, but all the data may not be read or written in a single call. These
helper functions also handle this case.
int spc_lock_file(const char *lfpath) {
  int   attempt, fd, result;
  pid_t pid;

  /* Try three times, if we fail that many times, we lose */
  for (attempt = 0;  attempt < 3;  attempt++) {
    if ((fd = open(lfpath, O_RDWR | O_CREAT | O_EXCL, S_IRWXU)) == -1) {
      if (errno != EEXIST) return -1;
      if ((fd = open(lfpath, O_RDONLY)) == -1) return -1;
      result = read_data(fd, &pid, sizeof(pid));
      close(fd);
      if (result) {
        if (pid == getpid()) return 1;
        if (kill(pid, 0) == -1) {
          if (errno != ESRCH) return -1;
          attempt--;
          unlink(lfpath);
          continue;
        }
      }
      sleep(1);
      continue;
    }
    pid = getpid();
    if (!write_data(fd, &pid, sizeof(pid))) {
      close(fd);
      return -1;
    }
    close(fd);
    attempt--;
  }

  /* If we've made it to here, three attempts have been made and the lock could
   * not be obtained.  Return an error code indicating failure to obtain the
   * requested lock.
   */
  return 0;
}
The first step in attempting to obtain the lock is to try to create the lock file. If this succeeds, the
caller's process ID is written to the file, the file is closed, and the loop is executed again. The loop
counter is decremented first to ensure that at least one more iteration will always occur. The next
time through the loop, creating the file should fail, but it won't necessarily do so: another process
attempting to get the lock at the same time may have decided the lock was stale and deleted the lock
file out from under this process. If this happens, the whole process begins again.
If the lock file cannot be created, the lock file is opened for reading, and the ID of the process holding
the lock is read from the file. The read is blocking, so if another process has begun to write out its ID,
the read will block until the other process is done. Another race condition here could be avoided by
performing a non-blocking read in a loop until all the data is read. A timeout could be applied to the
read operation to cause the incomplete lock to be treated as stale. This race condition will only occur
if a process creates the lock file without writing any data to it. This could be caused by an attacker, or
it could occur because the process is terminated at precisely the right time so that it doesn't get the
chance to write its ID to the lock file.
Once the process ID is read from the lock file, an attempt to send the process a signal of 0 is made.
If the signal cannot be sent because the process does not exist, the call tokill( ) will return failure,
and errno will be set to ESRCH. If this happens, the lock file is stale, and it can be removed. This is
where the race condition discussed earlier occurs. The lock file is removed, the attempt counter is
decremented, and the loop is restarted.
Between the time that kill( ) returns failure with an ESRCH error code and the time that unlink( )
is called to remove the lock file, another process could successfully delete the lock file and begin
creating a new one. If this happens, the process will successfully write its process ID to the now
deleted lock file and assume that it has the lock. It will not have the lock, though, because this
process will have deleted the lock file the other process was creating. For this reason, after the lock
file is created, the process must attempt to read the lock file and compare process IDs. If the process
ID in the lock file is the same as the process making the comparison, the lock was successfully
obtained.
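By way of illustration, a caller might use spc_lock_file( ) as in the following sketch. The lock file
path and the comments are placeholders, and releasing the lock by removing the lock file is simply the
natural counterpart of this scheme rather than something defined by the recipe:
int rc;

rc = spc_lock_file("/var/myapp/locks/shm.lock");
if (rc == -1) {
  /* An unexpected error occurred while trying to obtain the lock. */
} else if (rc == 0) {
  /* Three attempts were made and the lock is still held by another process. */
} else {
  /* We hold the lock; access the shared resource, then release the lock by
   * removing the lock file.
   */
  unlink("/var/myapp/locks/shm.lock");
}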
2.9.4 See Also
Recipe 2.4
[ Team LiB ]
[ Team LiB ]
2.10 Synchronizing Resource Access Across Processes
on Windows
2.10.1 Problem
You want to ensure that two processes cannot simultaneously access the same resource.
2.10.2 Solution
Use a named mutex (mutually exclusive lock) to synchronize access to the resource.
2.10.3 Discussion
Coordinating access to a shared resource between multiple processes on Windows is much simpler
and much more elegant than it is on Unix. For maximum portability on Unix, you must use a lock file
and make sure to avoid a number of possible race conditions to make lock files work properly. On
Windows, however, the use of named mutexes solves all the problems Unix has without introducing
new ones.
A named mutex is a synchronization object that works by allowing only a single thread to acquire a
lock at any given time. Mutexes can also exist without a name, in which case they are considered
anonymous. Access to an anonymous mutex can only be obtained by somehow acquiring a handle to
the object from the thread that created it. Anonymous mutexes are of no use to us in this recipe, so
we won't discuss them further.
Mutexes have a namespace much like that of a filesystem. The mutex namespace is separate from
namespaces used by all other objects. If two or more applications agree on a name for a mutex,
access to the mutex can always be obtained to use it for synchronizing access to a shared resource.
A mutex is created with a call to the CreateMutex( ) function. You will find it particularly useful in
this recipe that the mutex is created and a handle returned, or, if the mutex already exists, a handle
to the existing mutex is returned.
Once we have a handle to the mutex that will be used for synchronization, using it is a simple matter
of waiting for the mutex to enter the signaled state. When it does, we obtain the lock, and other
processes wait for us to release it. When we are finished using the resource, we simply release the
lock, which places the mutex into the signaled state.
If our program terminates abnormally while it holds the lock on the resource, the lock is released,
and the return from WaitForSingleObject( ) in the next process to obtain the lock is
WAIT_ABANDONED. We do not check for this condition in our code because the code is intended to be
used in such a way that abandoning the lock will not have any adverse effects. This is essentially the
same type of behavior as that in the Unix lock file code from Recipe 2.9, where it attempts to break
the lock if the process holding it terminates unexpectedly.
To obtain a lock, call SpcLockResource( ) with the name of the lock. If the lock is successfully
obtained, the return will be a handle to the lock; otherwise, the return will be NULL, and
GetLastError( ) can be used to determine what went wrong. When you're done with the lock,
release it by calling SpcUnlockResource( ) with the handle returned by SpcLockResource( ).
#include <windows.h>
HANDLE SpcLockResource(LPCTSTR lpName) {
  HANDLE hResourceLock;

  if (!lpName) {
    SetLastError(ERROR_INVALID_PARAMETER);
    return 0;
  }
  if (!(hResourceLock = CreateMutex(0, FALSE, lpName))) return 0;
  if (WaitForSingleObject(hResourceLock, INFINITE) == WAIT_FAILED) {
    CloseHandle(hResourceLock);
    return 0;
  }
  return hResourceLock;
}

BOOL SpcUnlockResource(HANDLE hResourceLock) {
  if (!ReleaseMutex(hResourceLock)) return FALSE;
  CloseHandle(hResourceLock);
  return TRUE;
}
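Usage is straightforward; the following is a minimal sketch, and the mutex name is arbitrary as long
as the cooperating processes agree on it:
HANDLE hLock;

if ((hLock = SpcLockResource(TEXT("SpcSharedMemoryLock"))) != 0) {
  /* ... access the shared resource here ... */
  SpcUnlockResource(hLock);
}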
2.10.4 See Also
Recipe 2.9
[ Team LiB ]
[ Team LiB ]
2.11 Creating Files for Temporary Use
2.11.1 Problem
You need to create a file to use as scratch space that may contain sensitive data.
2.11.2 Solution
Generate a random filename and attempt to create the file, failing if the file already exists. If the file
cannot be created because it already exists, repeat the process until it succeeds. If creating the file
fails for any other reason, abort the process.
2.11.3 Discussion
When creating temporary files, you should consider using a known-safe
directory to store them, as described in Recipe 2.4.
The need for temporary files is common. More often than not, other processes have no need to
access the temporary files you create, and especially if the files contain sensitive data, it is best to do
everything possible to ensure that other processes cannot access them. It is also important that
temporary files do not remain on the filesystem any longer than necessary. If the program creating
temporary files terminates unexpectedly before it cleans up the files, temporary directories often
become littered with files of no interest or value to anyone or anything. Worse, if the temporary files
contain sensitive data, they are suddenly both interesting and valuable to an attacker.
2.11.3.1 Temporary files on Unix
The best solution for creating a temporary file on Unix is to use themkstemp( ) function in the
standard C runtime library. This function generates a random filename,[2] attempts to create it, and
repeats the whole process until it is successful, thus guaranteeing that a unique file is created. The
file created by mkstemp( ) will be readable and writable by the owner, but not by anyone else.
[2] The filename may not be strongly random. An attacker might be able to predict the filename, but that is
generally okay.
To help further ensure that the file cannot be accessed by any other process, and to be sure that the
file will not be left behind by your program if it should terminate unexpectedly before being able to
delete it, the file can be deleted by name while it is open immediately after mkstemp( ) returns. Even
though the file has been deleted, you will still be able to read from and write to it because there is a
valid descriptor for the file. No other process will be able to open the file because a name will no
longer be associated with it. Once the last open descriptor to the file is closed, the file will no longer
be accessible.
Between the time that a file is created with mkstemp( ) and the time that
unlink( ) is called to delete the file, a window of opportunity exists where an
attacker could open the file before it can be deleted.
The mkstemp( ) function works by specifying a template from which a random filename can be
generated. From the end of the template, "X" characters are replaced with random characters. The
template is modified in place, so the specified buffer must be writable. The return value from
mkstemp( ) is -1 if an error occurs; otherwise, it is the file descriptor to the file that was created.
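Putting the pieces together, a minimal sketch of the create-then-unlink pattern looks like this. The
template prefix and the helper name make_scratch_file( ) are only examples; see Recipe 2.4 for
choosing a safe directory:
#include <stdlib.h>
#include <unistd.h>

int make_scratch_file(void) {
  int  fd;
  char path[] = "/tmp/scratch-XXXXXX";  /* mkstemp() replaces the X's in place. */

  if ((fd = mkstemp(path)) == -1) return -1;
  unlink(path);  /* The file disappears once the last open descriptor is closed. */
  return fd;
}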
2.11.3.2 Temporary files on Windows
The Win32 API does not contain a functional equivalent of the standard C mkstemp( ) function. The
Microsoft C Runtime implementation does not even provide support for the function, although it does
provide an implementation of mktemp( ). However, we strongly advise against using that function on
either Unix or Windows.
The Win32 API does provide a function, GetTempFileName( ), that will generate a temporary
filename, but that is all that it does; it does not open the file for you. Further, if asked to generate a
unique name itself, it will use the system time, which is highly predictable.
Instead, we recommend using GetTempPath( ) to obtain the current user's setting for the location to
place temporary files, and generating your own random filename using CryptoAPI or some other
cryptographically strong pseudo-random number generator. The code presented here uses the
spc_rand_range( ) function from Recipe 11.11. Refer to Chapter 11 for possible implementations of
random number generators.
The function SpcMakeTempFile( ) repeatedly generates a random temporary filename using a
cryptographically strong pseudo-random number generator and attempts to create the file. The
generated filename contains an absolute path specification to the user's temporary files directory. If
successful, the file is created, inheriting access permissions from that directory, which ordinarily will
prevent users other than the Administrator and the owner from gaining access to it. If
SpcMakeTempFile( ) is unable to create the file, the process begins anew. SpcMakeTempFile( ) will
not return until a file can be successfully created or some kind of fatal error occurs.
As arguments, SpcMakeTempFile( ) requires a preallocated writable buffer and the size of that
buffer in characters. The buffer will contain the filename used to successfully create the temporary
file, and the return value from the function will be a handle to the open file. If an error occurs, the
return value will be INVALID_HANDLE_VALUE, and GetLastError( ) can be used to obtain more
detailed error information.
#include <windows.h>
static LPTSTR lpszFilenameCharacters = TEXT("0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ");
static BOOL MakeTempFilename(LPTSTR lpszBuffer, DWORD dwBuffer) {
  int   i;
  DWORD dwCharacterRange, dwTempPathLength;
  TCHAR cCharacter;

  dwTempPathLength = GetTempPath(dwBuffer, lpszBuffer);
  if (!dwTempPathLength) return FALSE;
  if (++dwTempPathLength > dwBuffer || dwBuffer - dwTempPathLength < 12) {
    SetLastError(ERROR_INSUFFICIENT_BUFFER);
    return FALSE;
  }

  dwCharacterRange = lstrlen(lpszFilenameCharacters) - 1;
  for (i = 0;  i < 8;  i++) {
    cCharacter = lpszFilenameCharacters[spc_rand_range(0, dwCharacterRange)];
    lpszBuffer[dwTempPathLength++ - 1] = cCharacter;
  }
  lpszBuffer[dwTempPathLength++ - 1] = '.';
  lpszBuffer[dwTempPathLength++ - 1] = 'T';
  lpszBuffer[dwTempPathLength++ - 1] = 'M';
  lpszBuffer[dwTempPathLength++ - 1] = 'P';
  lpszBuffer[dwTempPathLength++ - 1] = 0;
  return TRUE;
}
HANDLE SpcMakeTempFile(LPTSTR lpszBuffer, DWORD dwBuffer) {
  HANDLE hFile;

  do {
    if (!MakeTempFilename(lpszBuffer, dwBuffer)) {
      hFile = INVALID_HANDLE_VALUE;
      break;
    }
    hFile = CreateFile(lpszBuffer, GENERIC_READ | GENERIC_WRITE,
                       FILE_SHARE_DELETE | FILE_SHARE_READ | FILE_SHARE_WRITE,
                       0, CREATE_NEW,
                       FILE_ATTRIBUTE_TEMPORARY | FILE_FLAG_DELETE_ON_CLOSE, 0);
    if (hFile == INVALID_HANDLE_VALUE && GetLastError() != ERROR_ALREADY_EXISTS)
      break;
  } while (hFile == INVALID_HANDLE_VALUE);
  return hFile;
}
2.11.4 See Also
Recipe 2.4, Recipe 11.11
[ Team LiB ]
[ Team LiB ]
2.12 Restricting Filesystem Access on Unix
2.12.1 Problem
You want to restrict your program's ability to access important parts of the filesystem.
2.12.2 Solution
Unix systems provide a system call known as chroot( ) that will restrict the process's access to the
filesystem. Specifically, chroot( ) alters a process's perception of the filesystem by changing its root
directory, which effectively prevents the process from accessing any part of the filesystem above the
new root directory.
2.12.3 Discussion
Normally, a process's root directory is the actual system root directory, which allows the process to
access any part of the filesystem. However, by using the chroot( ) system call, a process can alter
its view of the filesystem by changing its root directory to another directory within the filesystem.
Once the process's root directory has been changed once, it can only be made more restrictive. It is
not possible to change the process's root directory to another directory outside of its current view of
the filesystem.
Using chroot( ) is a simple way to increase security for processes that do not require access to the
filesystem outside of a directory or hierarchy of directories containing its data files. If an attacker is
somehow able to compromise the program and gain access to the filesystem, the potential for
damage (whether it is reading sensitive data or destroying data) is localized to the restricted
directory hierarchy imposed by altering the process's root directory.
Unfortunately, one often overlooked caveat applies to using chroot( ). The first time that chroot(
) is called, it does not necessarily alter the process's current directory, which means that until the
current directory is forcibly changed, it may still be possible to access areas of the filesystem outside
the new root directory structure. It is therefore imperative that the process calling chroot( )
immediately change its current directory to a directory within the new root directory structure. This is
easily accomplished as follows:
#include <unistd.h>
chroot("/new/root/directory");
chdir("/");
One final point regarding the use of chroot( ) is that the system call requires the calling process to
have superuser privileges.
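Because the call must be made with superuser privileges, a more defensive version of the snippet
above checks both calls and then gives up those privileges (see Recipe 1.3). The following is only a
sketch of one reasonable ordering, and enter_jail_dir( ) is a hypothetical helper:
#include <sys/types.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

void enter_jail_dir(const char *dir, uid_t unprivileged_uid) {
  if (chroot(dir) == -1 || chdir("/") == -1) {
    perror("chroot");
    exit(EXIT_FAILURE);
  }
  /* Once inside the new root, drop superuser privileges (see Recipe 1.3). */
  if (setuid(unprivileged_uid) == -1) {
    perror("setuid");
    exit(EXIT_FAILURE);
  }
}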
[ Team LiB ]
[ Team LiB ]
2.13 Restricting Filesystem and Network Access on
FreeBSD
2.13.1 Problem
Your program runs primarily (if not exclusively) on FreeBSD, and you want to impose restrictions on
your program's filesystem and network capabilities that are above and beyond what chroot( ) can
do. (See Recipe 2.12.)
2.13.2 Solution
FreeBSD implements a system call known as jail( ), which will "imprison" a process and its
descendants. It does all that chroot( ) does and more.
2.13.3 Discussion
Ordinarily, a jail is constructed on FreeBSD by the system administrator using the jail program, which
is essentially a wrapper around the jail( ) system call. (Discounting comments and blank lines, the
code is a mere 35 lines.) However, it is possible to use the jail( ) system call in your own
programs.
The FreeBSD jail does everything that chroot( ) does, and then some. It restricts much of the
superuser's normal abilities, and it restricts the IP address that programs running inside the jail may
use.
Creating a jail is as simple as filling in a data structure with the appropriate information and calling
jail( ). The same caveats that apply to chroot( ) also apply to jail( ) because jail( ) calls
chroot( ) internally. In particular, only the superuser may create a jail successfully.
Presently, the jail configuration structure contains only four fields: version, path, hostname, and
ip_number. The version field must be set to 0, and the path field is treated the same as chroot(
)'s argument is. The hostname field sets the hostname of the jail; however, it is possible to change it
from within the jail.
The ip_number field is the IP address to which processes running within the jail are restricted.
Processes within the jail will only be able to bind to this address regardless of what other IP
addresses are assigned to the system. In addition, all IP traffic emanating from processes within the
jail will be forced to use this address as its source.
The IP address assigned to a jail must be configured on the system; typically, it should be set up as
an alias rather than as the primary address for a network interface unless the network interface is
dedicated to the jail. For example, a system with two network interfaces may be configured to route
all traffic from processes outside the jail to one interface, and route all traffic from processes inside
the jail to the other.
2.13.4 See Also
Recipe 2.12
[ Team LiB ]
[ Team LiB ]
Chapter 3. Input Validation
Eavesdropping attacks are often easy to launch, but most people don't worry about them in their
applications. Instead, they tend to worry about what malicious things can be done on the machine on
which the application is running. Most people are far more worried about active attacks than they are
about passive attacks.
Pretty much every active attack out there is the result of some kind of input from an attacker. Secure
programming is largely about making sure that inputs from bad people do not do bad things. Indeed,
most of this book addresses how to deal with malicious inputs. For example, cryptography and a
strong authentication protocol can help prevent attackers from capturing someone else's login
credentials and sending those credentials as input to the program.
If this entire book focuses primarily on preventing malicious inputs, why do we have a chapter
specifically devoted to this topic? It's because this chapter is about one important class of defensive
techniques: input validation.
In this chapter, we assume that people are connected to our software, and that some of them may
send malicious data (even if we think there is a trusted client on the other end). One question we
really care about is this: "What does our application do with that data?" In particular, does the
program take data that should be untrusted and do something potentially security-critical with it?
More importantly, can any untrusted data be used to manipulate the application or the underlying
system in a way that has security implications?
[ Team LiB ]
[ Team LiB ]
3.1 Understanding Basic Data Validation Techniques
3.1.1 Problem
You have data coming into your application, and you would like to filter or reject data that might be
malicious.
3.1.2 Solution
Perform data validation at all levels whenever possible. At the very least, make sure data is filtered
on input.
Match constructs that are known to be valid and harmless. Reject anything else.
In addition, be sure to be skeptical about any data coming from a potentially insecure channel. In a
client-server architecture, for example, even if you wrote the client, the server should never assume
it is talking to a trusted client.
3.1.3 Discussion
Applications should not trust any external input. We have often seen situations in which people had a
custom client-server application and the application developer assumed that, because the client was
written in house by trusted, strong coders, there was nothing to worry about in terms of malicious
data being injected.
Those kinds of assumptions lead people to do things that turn out badly, such as embedding in a
client SQL queries or shell commands that get sent to a server and executed. In such a scenario, an
attacker who is good at reverse engineering can replace the SQL code in the client-side binary with
malicious SQL code (perhaps code that reads private records or deletes important data). The attacker
could also replace the actual client with a handcrafted client.
In many situations, an attacker who does not even have control over the client is nevertheless able
to inject malicious data. For example, he might inject bogus data into the network stream.
Cryptography can sometimes help, but even then, we have seen situations in which the attacker did
not need to send data that decrypted properly to cause a problem-for example, a buffer overflow
in the portion of an application that does the decryption.
You can regard input validation as a kind of access control mechanism. For example, you will
generally want to validate that the person on the other end of the connection has the right
credentials to perform the operations that she is requesting. However, when you're doing data
validation, most often you'll be worried about input that might do things that no user is supposed to
be able to do.
For example, an access control mechanism might determine whether a user has the right to use your
application to send email. If the user has that privilege, and your software calls out to the shell to
send email (which is generally a bad idea), the user should not be able to manipulate the data in such
a way that he can do anything other than send mail as intended.
Let's look at basic rules for proper data validation:
Assume all input is guilty until proven otherwise.
As we said earlier, you should never trust external input that comes from outside the trusted
base. In addition, you should be very skeptical about which components of the system are
trusted, even after you have authenticated the user on the other end!
Prefer rejecting data to filtering data.
If you determine that a piece of data might possibly be malicious, your best bet from a security
perspective is to assume that using the data will screw you up royally no matter what you do,
and act accordingly. In some environments, you might need to be able to handle arbitrary
data, in which case you will need to treat all input in a way that ensures everything is benign.
Avoid the latter situation if possible, because it is a lot harder to get right.
Perform data validation both at input points and at the component level.
One of the most important principles in computer security, defense in depth, states that you
should provide multiple defenses against a problem if a single defense may fail. This is
important in input validation. You can check the validity of data as it comes in from the
network, and you can check it right before you use the data in a manner that might possibly
have security implications. However, each one of these techniques alone is somewhat error-prone.
When you're checking input at the points where data arrives, be aware that components might
get ripped out and matched with code that does not do the proper checking, making the
components less robust than they should be. More importantly, it is often very difficult to
understand the context of the data well enough to make validation easy when
data is fresh from the network. That is, routines that read from a socket usually do not
understand anything about the state the application is in. Without such knowledge, input
routines can do only rudimentary filtering.
On the other hand, when you're checking input at the point before you use it, it's often easy to
forget to perform the check. Most of the time, you will want to make life easier by producing
your own wrapper API to do the filtering, but sometimes you might forget to call it or end up
calling it improperly. For example, many people try to use strncpy( ) to help prevent buffer
overflows, but it is easy to use this function in the wrong way, as we discuss in Recipe 3.3.
Do not accept commands from the user unless you parse them yourself.
Many data input problems involve the program's passing off data that came from an untrusted
source to some other entity that actually parses and acts on the data. If the component doing
the parsing has to trust its caller, bad things can happen if your software does not do the
proper checking. The best known example of this is the Unix command shell. Sometimes,
programs will accomplish tasks by using functions such as system( ) or popen( ) that invoke
a shell (which is often a bad idea by itself; see Recipe 1.7). (We'll look at the shell input
problem later in this chapter.) Another popular example is the database query using the SQL
language. (We'll discuss input validation problems with SQL in Recipe 3.11.)
Beware of special commands, characters, and quoting.
One obvious thing to do when using a command language such as the Unix shell or SQL is to
construct commands in trusted software, instead of allowing users to send commands that get
proxied. However, there is another "gotcha" here. Suppose that you provide users the ability to
search a database for a word. When the user gives you that word, you may be inclined to
concatenate it to your SQL command. If you do not validate the input, the user might be able
to run other commands.
Consider what happens if you have a server application that, among other things, can send
email. Suppose that the email address comes from an untrusted client. If the email address is
placed into a buffer using a format string like "/bin/mail %s < /tmp/email", what happens if the
user submits the following email address: "[email protected]; cat /etc/passwd | mail
[email protected]"?
Make policy decisions based on a "default deny" rule.
There are two different approaches to data filtering. With the first, known as whitelisting, you
accept input as valid only if it meets specific criteria. Otherwise, you reject it. If you do this, the
major thing you need to worry about is whether the rules that define your whitelist are actually
correct!
With the other approach, known as blacklisting, you reject only those things that are known to
be bad. It is much easier to get your policy wrong when you take this approach.
For example, if you really want to invoke a mail program by calling a shell, you might take a
whitelist approach in which you allow only well-formed email addresses, as discussed in Recipe
3.9. Or you might use a slightly more liberal (less exact) whitelist policy in which you only allow
letters, digits, the @ sign, and periods (a minimal filter along those lines is sketched after this list).
With a blacklist approach, you might try to block out every character that might be leveraged
in an attack. It is hard to be sure that you are not missing something here, particularly if you
try to consider every single operational environment in which your software may be deployed.
For example, if calling out to a shell, you may find all the special characters for the bash shell
and check for those, but leave people using tcsh (or something unusual) open to attack.
You can look for a quoting mechanism, but know how to use it properly.
Sometimes, you really do need to be able to accept arbitrary data from an untrusted source
and use that data in a security-critical way. For example, you might want to be able to put
arbitrary contents from arbitrary documents into a database. In such a case, you might look
for some kind of quoting mechanism. For example, you can usually stick untrusted data in
single quotes in such an environment.
However, you need to be aware of ways in which an attacker can leave the quoted
environment, and you must actively make sure that the attacker does not try to use them. For
example, what happens if the attacker puts a single quote in the data? Will that end the
quoting, allowing the rest of the attacker's data to do malicious things? If there are such
escapes, you should check for them. In this particular example, you might be able to replace
quotes in the attacker's data with a backslash followed by a quote.
When designing your own quoting mechanisms, do not allow escapes.
Following from the previous point, if you need to filter data instead of rejecting potentially
harmful data, it is useful to provide functions that properly quote an arbitrary piece of data for
you. For example, you might have a function that quotes a string for a database, ensuring that
the input will always be interpreted as a single string and nothing more. Such a function would
put quotes around the string and additionally escape anything that could thwart the
surrounding quotes (such as a nested quote).
The better you understand the data, the better you can filter it.
Rough heuristics like "accept the following characters" do not always work well for data
validation. Even if you filter out all bad characters, are the resulting combinations of benign
characters a problem? For example, if you pass untrusted data through a shell, do you want to
take the risk that an attacker might be able to ignore metacharacters but still do some damage
by throwing in a well-placed shell keyword?
The best way to ensure that data is not bad is to do your very best to understand the data and
the context in which that data will be used. Therefore, even if you're passing data on to some
other component, if you need to trust the data before you send it, you should parse it as
accurately as possible. Moreover, in situations where you cannot be accurate, at least be
conservative, and assume that the data is malicious.
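To make the whitelist idea concrete, here is a minimal sketch of the "letters, digits, the @ sign, and
periods" policy mentioned above. spc_email_isclean( ) is a hypothetical helper, and a real
application should prefer the fuller validation described in Recipe 3.9:
#include <ctype.h>

/* Return 1 if every character in the string is on the whitelist, 0 otherwise. */
int spc_email_isclean(const char *addr) {
  const char *p;

  if (!addr || !*addr) return 0;
  for (p = addr;  *p;  p++)
    if (!isalnum((unsigned char)*p) && *p != '@' && *p != '.') return 0;
  return 1;
}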
3.1.4 See Also
Recipe 1.7, Recipe 3.3, Recipe 3.9, Recipe 3.11
[ Team LiB ]
[ Team LiB ]
3.2 Preventing Attacks on Formatting Functions
3.2.1 Problem
You use functions such as printf( ) or syslog( ) in your program, and you want to ensure that
you use them in such a way that an attacker cannot coerce them into behaving in ways that you do
not intend.
3.2.2 Solution
Functions such as the printf( ) family of functions provide a flexible and powerful way to format
data easily. Unfortunately, they can be extremely dangerous as well. Following the guidelines outlined
in Section 3.2.3 will allow you to easily avert many of the problems with these functions.
3.2.3 Discussion
The printf( ) family of functions-and other functions that use them, such as syslog( ) on Unix
systems-all require an argument that specifies a format, as well as a variable number of additional
arguments that are substituted at various locations in the format string to produce formatted output.
The functions come in two major varieties:
Those that output to a file (printf( ) outputs to stdout)
Those that output to a string
Both can be dangerous, but the latter variety is significantly more so.
The format string is copied, character by character, until a percent (%) symbol is encountered. The
characters that immediately follow the percent symbol determine what will be output in their place.
For each substitution in the format string, the next argument in the variable argument list is used.
Because of the way that variable-sized argument lists work in C (see Recipe 13.4), the functions
assume that the number of arguments present in the argument list is equal to the number of
substitutions required by the format string. The GCC compiler in particular will recognize calls to the
functions in the printf( ) family, and it will emit warnings if it detects data type mismatches or an
incorrect number of arguments in the variable argument list.
If you adhere to the following guidelines when using theprintf( ) family of functions, you can be
reasonably certain that you are using the functions safely:
Beware of the "%n" substitution.
All but one of the substitutions recognized by the printf( ) family of functions use arguments
from the variable argument list as data to be substituted into the output. The lone exception is
"%n", which writes the number of bytes written to the output buffer or file into the memory
location pointed to by the next argument in the argument list.
While the "%n" substitution has its place, few programmers are aware of it and its implications.
In particular, if external input is used for the format string, an attacker can embed a "%n"
substitution into the format string to overwrite portions of the stack. The real problem occurs
when all of the arguments in the variable argument list have been exhausted. Because
arguments are passed on the stack in C, the formatting function will write into the stack.
To combat malicious uses of "%n", Immunix has produced a set of patches for glibc 2.2 (the
standard C runtime library for Linux) known as FormatGuard. The patches take advantage of a
GCC compiler extension that allows the preprocessor to distinguish between macros having the
same name, but different numbers of arguments. FormatGuard essentially consists of a large
set of macros for the syslog( ), printf( ), fprintf( ), sprintf( ), and snprintf( )
functions; the macros call safe versions of the respective functions. The safe functions count
the number of substitutions in the format string, and ensure that the proper number of
arguments has been supplied.
Do not use a string from an external source directly as the format specification.
Strings obtained from an external source may contain unexpected percent symbols in them,
causing the formatting function to attempt to substitute arguments that do not exist. If you
need simply to output the string str (to stdout using printf( ), for example), do the
following:
printf("%s", str);
Following this rule to the letter is not always desirable. In particular, your program may need to
obtain format strings from a data file as a consequence of internationalization requirements. The
format strings will vary to some extent depending on the language in use, but they should always
have identical substitutions.
When using vsprintf( ) or sprintf( ) to output to a string, be very careful of using the "%s"
substitution without specifying a precision.
The vsprintf( ) and sprintf( ) functions both assume an infinite amount of space is
available in the buffer into which they write their output. It is especially common to use these
functions with a statically allocated output buffer. If a string substitution is made without
specifying the precision, and that string comes from an external source, there is a good chance
that an attacker may attempt to overflow the static buffer by forcing a string that is too long to
be written into the output buffer. (See Recipe 3.3 for a discussion of buffer overflows.)
One solution is to check the length of the string to be substituted into the output before using it
with vsprintf( ) or sprintf( ). Unfortunately, this solution is error-prone, especially later in
your program's life when another programmer has to make a change to the size of the buffer
or the format string, necessitating a change to the check.
A better solution is to use a precision modifier in the format string. For example, if no more
than 12 characters from a string should ever be substituted into the output, use "%.12s"
instead of simply "%s". The advantage to this solution is that it is part of the formatting
function call; thus, it is less likely to be overlooked in the event of a later change to the format
string.
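In concrete terms (a sketch only; untrusted_username stands in for externally supplied data, and
snprintf( ), discussed next, is still the better tool):
char msg[32];

/* At most 12 characters of the username are substituted, so the output always fits. */
sprintf(msg, "Welcome back, %.12s!", untrusted_username);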
Avoid using vsprintf( ) and sprintf( ). Use vsnprintf( ) and snprintf( ) or vasprintf( )
and asprintf( ) instead. Alternatively, use a secure string library such as SafeStr (see Recipe
3.4).
The functions vsprintf( ) and sprintf( ) assume that the buffer into which they write their
output is large enough to hold it all. This is never a safe assumption to make and frequently
leads to buffer overflow vulnerabilities. (See Recipe 3.3.)
The functions vasprintf( ) and asprintf( ) dynamically allocate a buffer to hold the
formatted output that is exactly the required size. There are two problems with these
functions, however. The first is that they're not portable. Most modern BSD derivatives
(Darwin, FreeBSD, NetBSD, and OpenBSD) have them, as does Linux. Unfortunately, older
Unix systems and Windows do not. The other problem is that they're slower because they need
to make two passes over the format string, one to calculate the required buffer size, and the
other to actually produce output in the allocated buffer.
The functions vsnprintf( ) and snprintf( ) are just as fast as vsprintf( ) and sprintf(
), but like vasprintf( ) and asprintf( ), they are not yet portable. They are defined in the
C99 standard for C, and they typically enjoy the same availability asvasprintf( ) and
asprintf( ). They both require an additional argument that specifies the length of the output
buffer, and they will never write more data into the buffer than will fit, including the NULL
terminating character.
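A small sketch of the difference in practice (str stands in for externally supplied data): the
snprintf( ) call below can never write past the end of buf and always NULL-terminates it, whereas
the equivalent sprintf( ) call would overflow the buffer if str were long enough:
char buf[64];

/* Safe: at most sizeof(buf) - 1 characters plus the NULL terminator are written. */
snprintf(buf, sizeof(buf), "user=%s", str);

/* Unsafe: sprintf(buf, "user=%s", str); overflows when str is too long. */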
3.2.4 See Also
FormatGuard from Immunix: http://www.immunix.org/formatguard.html
Recipe 3.3, Recipe 13.4
[ Team LiB ]
[ Team LiB ]
3.3 Preventing Buffer Overflows
3.3.1 Problem
C and C++ do not perform array bounds checking, which turns out to be a security-critical issue,
particularly in handling strings. The risks increase even more dramatically when user-controlled data
is on the program stack (i.e., is a local variable).
3.3.2 Solution
There are many solutions to this problem, but none are satisfying in every situation. You may want to
rely on operational protections such as StackGuard from Immunix, use a library for safe string
handling, or even use a different programming language.
3.3.3 Discussion
Buffer overflows get a lot of attention in the technical world, partially because they constitute one of
the largest classes of security problems in code, but also because they have been around for a long
time and are easy to get rid of, yet still are a huge problem.
Buffer overflows are generally very easy for a C or C++ programmer to understand. An experienced
programmer has invariably written off the end of an array, or indexed into the wrong memory
because she improperly checked the value of the index variable.
Because we assume that you are a C or C++ programmer, we won't insult your intelligence by
explaining buffer overflows to you. If you do not already understand the concept, you can consult
many other software security books, including Building Secure Software by John Viega and Gary
McGraw (Addison Wesley). In this recipe, we won't even focus so much on why buffer overflows are
such a big deal (other resources can help you understand that if you're insatiably curious). Instead,
we'll focus on state-of-the-art strategies for mitigating these problems.
3.3.3.1 String handling
Most languages do not have buffer overflow problems at all, because they ensure that writes to
memory are always in bounds. This can sometimes be done at compile time, but generally it is done
dynamically, right before data gets written. The C and C++ philosophy is different-you are given the
ability to eke out more speed, even if it means that you risk shooting yourself in the foot.
Unfortunately, in C and C++, it is not only possible to overflow buffers but also easy, particularly
when dealing with strings. The problem is that C strings are not high-level data types; they are
arrays of characters. The major consequence of this nonabstraction is that the language does not
manage the length of strings; you have to do it yourself. The only time C ever cares about the length
of a string is in the standard library, and the length is not related to the allocated size at all-instead,
it is delimited by a 0-valued (NULL) byte. Needless to say, this can be extremely error-prone.
One of the simplest examples is the ANSI C standard library function,gets( ):
char *gets(char *str);
This function reads data from the standard input device into the memory pointed to by str until
there is a newline or until the end of file is reached. It then returns a pointer to the buffer. In
addition, the function NULL-terminates the buffer.
The problem with this function is that, no matter how big the buffer is, an attacker can always stick
more data into the buffer than it is designed to hold, simply by avoiding the newline.
If the buffer in question is a local variable or otherwise lives on the program stack, then the attacker
can often force the program to execute arbitrary code by overwriting important data on the stack.
This is called a stack-smashing attack. Even when the buffer is heap-allocated (that is, it is allocated
with malloc( ) or new), a buffer overflow can be security-critical if an attacker can write over critical
data that happens to be in nearby memory.
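The usual remedy is to avoid gets( ) entirely and read lines with a function that takes the buffer
size, such as fgets( ). A minimal sketch:
#include <stdio.h>
#include <string.h>

char line[128];

if (fgets(line, sizeof(line), stdin)) {
  line[strcspn(line, "\n")] = 0;   /* Strip the newline, if one was read. */
  /* ... use line ... */
}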
There are plenty of other places where it is easy to overflow strings. Pretty much any time you
perform an operation that writes to a "string," there is room for a problem. One famous example is
strcpy( ):
char *strcpy(char *dst, const char *src);
This function copies bytes from the address indicated by src into the buffer pointed to by dst, up to
and including the first NULL byte in src. Then it returns dst. No effort is made to ensure that the dst
buffer is big enough to hold the contents of the src buffer. Because the language does not track
allocated sizes, there is no way for the function to do so.
To help alleviate the problems with functions like strcpy( ) that have no way of determining
whether the destination buffer is big enough to hold the result from their respective operations, there
are also functions like strncpy( ):
char *strncpy(char *dst, const char *src, size_t len);
The strncpy( ) function is certainly an improvement over strcpy( ), but there are still problems
with it. Most notably, if the source buffer contains more data than the limit imposed by the len
argument, the destination buffer will not be NULL-terminated. This means the programmer must
ensure the destination buffer is NULL-terminated. Unfortunately, the programmer often forgets to do
so; there are two reasons for this failure:
It's an additional step for what should be a simple operation.
Many programmers do not realize that the destination buffer may not be NULL-terminated.
The problems with strncpy( ) are further complicated by the fact that a similar function, strncat(
), treats its length-limiting argument in a completely different manner. The difference in behavior
serves only to confuse programmers, and more often than not, mistakes are made. Certainly, we
recommend using strncpy( ) over using strcpy( ); however, there are better solutions.
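If you do use strncpy( ), remember to terminate the destination explicitly; a minimal sketch of the
idiom (src stands in for the source string):
char dst[64];

strncpy(dst, src, sizeof(dst) - 1);
dst[sizeof(dst) - 1] = 0;   /* strncpy() will not do this if src is too long. */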
OpenBSD 2.4 introduced two new functions, strlcpy( ) and strlcat( ), that are consistent in their
behavior, and they provide an indication back to the caller of how much space in the destination
buffer would be required to successfully complete their respective operations without truncating the
results. For both functions, the length limit indicates the maximum size of the destination buffer, and
the destination buffer is always NULL-terminated, even if the destination buffer must be truncated.
Unfortunately, strlcpy( ) and strlcat( ) are not available on all platforms; at present, they seem
to be available only on Darwin, FreeBSD, NetBSD, and OpenBSD. Fortunately, they are easy to
implement yourself-but you don't have to, because we provide implementations here:
#include <sys/types.h>
#include <string.h>
size_t strlcpy(char *dst, const char *src, size_t size) {
  char       *dstptr = dst;
  size_t     tocopy  = size;
  const char *srcptr = src;

  if (tocopy && --tocopy) {
    do {
      if (!(*dstptr++ = *srcptr++)) break;
    } while (--tocopy);
  }
  if (!tocopy) {
    if (size) *dstptr = 0;
    while (*srcptr++);
  }
  return (srcptr - src - 1);
}

size_t strlcat(char *dst, const char *src, size_t size) {
  char       *dstptr = dst;
  size_t     dstlen, tocopy = size;
  const char *srcptr = src;

  while (tocopy-- && *dstptr) dstptr++;
  dstlen = dstptr - dst;
  if (!(tocopy = size - dstlen)) return (dstlen + strlen(src));
  while (*srcptr) {
    if (tocopy != 1) {
      *dstptr++ = *srcptr;
      tocopy--;
    }
    srcptr++;
  }
  *dstptr = 0;
  return (dstlen + (srcptr - src));
}
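Using them is straightforward; because the return value is the length the result would have had, it
also doubles as a truncation check. A sketch (untrusted_path stands in for externally supplied
data):
char buf[256];

if (strlcpy(buf, untrusted_path, sizeof(buf)) >= sizeof(buf)) {
  /* The source did not fit; treat this as an error rather than silently truncating. */
}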
As part of its security push, Microsoft has developed a new set of string-handling functions for C and
C++ that are defined in the header file strsafe.h. The new functions handle both ANSI and Unicode
character sets, and each function is available in byte count and character count versions. For more
information regarding using strsafe.h functions in your Windows programs, visit the Microsoft
Developer's Network (MSDN) reference for strsafe.h.
All of the string-handling improvements we've discussed so far operate using traditional C-style
NULL-terminated strings. While strlcat( ), strlcpy( ), and Microsoft's new string-handling functions are
vast improvements over the traditional C string-handling functions, they all still require diligence on
the part of the programmer to maintain information regarding the allocated size of destination
buffers.
An alternative to using traditional C-style strings is to use the SafeStr library, which is available from
http://www.zork.org/safestr/. The library is a safe string implementation that provides a new, high-level
data type for strings, tracks accounting information for strings, and performs many other
operations. For interoperability purposes, SafeStr strings can be passed to C string functions, as long
as those functions use the string in a read-only manner. (We discuss SafeStr in some detail in Recipe
3.4.)
Finally, applications that transfer strings across a network should consider including a string's length
along with the string itself, rather than requiring the recipient to rely on finding the NULL-terminating
character to determine the length of the string. If the length of the string is known up front, the
recipient can allocate a buffer of the proper size up front and read the appropriate amount of data
into it. The alternative is to read byte-by-byte, looking for the NULL-terminator, and possibly
repeatedly resizing the buffer. Dan J. Bernstein has defined a convention called Netstrings
(http://cr.yp.to/proto/netstrings.txt) for encoding the length of a string along with the string. This protocol
simply has you send the length of the string represented in ASCII, then a colon, then the string itself,
then a trailing comma. For example, if you were to send the string "Hello, World!" over a network,
you would send:
13:Hello, World!,
Note that the Netstrings representation does not include the NULL-terminator, as that is really part of
the machine-specific representation of a string, and is not necessary on the network.
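A minimal sketch of producing a netstring in C follows. spc_netstring_encode( ) is a hypothetical
helper, and the caller must supply an output buffer large enough for the length prefix, the colon, the
data, and the trailing comma:
#include <stdio.h>
#include <string.h>

/* Write the netstring encoding of (data, len) into out; returns the number of
 * bytes written (excluding the NULL terminator), or -1 if out is too small.
 */
int spc_netstring_encode(char *out, size_t outlen, const char *data, size_t len) {
  int prefix;

  prefix = snprintf(out, outlen, "%zu:", len);
  if (prefix < 0 || (size_t)prefix + len + 1 >= outlen) return -1;
  memcpy(out + prefix, data, len);
  out[prefix + len] = ',';
  out[prefix + len + 1] = 0;
  return prefix + (int)len + 1;
}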
3.3.3.2 Using C++
When using C++, you generally have a lot less to worry about when using the standard C++ string
library, std::string. This library is designed in such a way that buffer overflows are less likely.
Standard I/O using the stream operators (>> and <<) is safe when using the standard C++ string
type.
However, buffer overflows when using strings in C++ are not out of the question. First, the
programmer may choose to use old fashioned C API functions, which work fine in C++ but are just as
risky as they are in C. Second, while C++ usually throws an out_of_range exception when an
operation would overflow a buffer, there are two cases where it doesn't.
The first problem area occurs when using the subscript operator, []. This operator doesn't perform
bounds checking for you, so be careful with it.
The second problem area occurs when using C-style strings with the C++ standard library. C-style
strings are always a risk, because even C++ doesn't know how much memory is allocated to a string.
Consider the following C++ program:
#include <iostream.h>
// WARNING: This code has a buffer overflow in it.
int main(int argc, char *argv[]) {
char buf[12];
cin >> buf;
cout << "You said... " << buf << endl;
}
If you compile the above program without optimization, then you run it, typing in more than 11
printable ASCII characters (remember that C++ will add a NULL to the end of the string), the
program will either crash or print out more characters than buf can store. Those extra characters get
written past the end of buf.
Also, when indexing a C-style string through C++, C++ always assumes that the indexing is valid,
even if it isn't.
Another problem occurs when converting C++-style strings to C-style strings. If you use
string::c_str() to do the conversion, you will get a properly NULL-terminated C-style string.
However, if you use string::data(), which writes the string directly into an array (returning a
pointer to the array), you will get a buffer that is not NULL-terminated. That is, the only difference
between c_str() and data() is that c_str() adds a trailing NULL.
One final point with regard to C++ is that there are plenty of applications not using the standard
string library, that are instead using third-party libraries. Such libraries are of varying quality when it
comes to security. We recommend using the standard library if at all possible. Otherwise, be careful
in understanding the semantics of the library you do use, and the possibilities for buffer overflow.
3.3.3.3 Stack protection technologies
In C and C++, memory for local variables is allocated on the stack. In addition, information
pertaining to the control flow of a program is also maintained on the stack. If an array is allocated on
the stack, and that array is overrun, an attacker can overwrite the control flow information that is
also stored on the stack. As we mentioned earlier, this type of attack is often referred to as a stack-smashing attack.
Recognizing the gravity of stack-smashing attacks, several technologies have been developed that
attempt to protect programs against them. These technologies take various approaches. Some are
implemented in the compiler (such as Microsoft's /GS compiler flag, Immunix's StackGuard, and IBM's ProPolice), while others
are dynamic runtime solutions (such as Avaya Labs's LibSafe).
All of the compiler-based solutions work in much the same way, although there are some differences
in the implementations. They work by placing a "canary" (which is typically some random value) on
the stack between the control flow information and the local variables. The code that is normally
generated by the compiler to return from the function is modified to check the value of the canary on
the stack, and if it is not what it is supposed to be, the program is terminated immediately.
The idea behind using a canary is that an attacker attempting to mount a stack-smashing attack will
have to overwrite the canary to overwrite the control flow information. By choosing a random value
for the canary, the attacker cannot know what it is and thus be able to include it in the data used to
"smash" the stack.
When a program is distributed in source form, the developer of the program cannot enforce the use
of StackGuard or ProPolice because they are both nonstandard extensions to the GCC compiler. It is
the responsibility of the person compiling the program to make use of one of these technologies. On
the other hand, although it is rare for Windows programs to be distributed in source form, the /GS
compiler flag is a standard part of the Microsoft Visual C++ compiler, and the program's build scripts
(whether they are Makefiles, DevStudio project files, or something else entirely) can enforce the use
of the flag.
For Linux systems, Avaya Labs' LibSafe technology is not implemented as a compiler extension, but
instead takes advantage of a feature of the dynamic loader that causes a dynamic library to be
preloaded with every executable. Using LibSafe does not require the source code for the programs it
protects, and it can be deployed on a system-wide basis.
LibSafe replaces the implementation of several standard functions that are known to be vulnerable to
buffer overflows, such as gets( ), strcpy( ), and scanf( ). The replacement implementations
attempt to compute the maximum possible size of a statically allocated buffer used as a destination
buffer for writing using a GCC built-in function that returns the address of the frame pointer. That
address is normally the first piece of information on the stack after local variables. If an attempt is
made to write more than the estimated size of the buffer, the program is terminated.
Unfortunately, there are several problems with the approach taken by LibSafe. First, it cannot
accurately compute the size of a buffer; the best it can do is limit the size of the buffer to the
difference between the start of the buffer and the frame pointer. Second, LibSafe's protections will
not work with programs that were compiled using the -fomit-frame-pointer flag to GCC, an
optimization that causes the compiler not to put a frame pointer on the stack. Although relatively
useless, this is a popular optimization for programmers to employ. Finally, LibSafe will not work on
setuid binaries without static linking or a similar trick.
In addition to providing protection against conventional stack-smashing attacks, the newest versions
of LibSafe also provide some protection against format-string attacks (see Recipe 3.2). The format-string protection also requires access to the frame pointer because it attempts to filter out arguments
that are not pointers into the heap or the local variables on the stack.
3.3.4 See Also
MSDN reference for strsafe.h: http://msdn.microsoft.com/library/en-us/winui/winui/windowsuserinterface/resources/strings/usingstrsafefunctions.asp
SafeStr from Zork: http://www.zork.org/safestr/
StackGuard from Immunix: http://www.immunix.org/stackguard.html
ProPolice from IBM: http://www.trl.ibm.com/projects/security/ssp/
LibSafe from Avaya Labs: http://www.research.avayalabs.com/project/libsafe/
Netstrings by Dan J. Bernstein: http://cr.yp.to/proto/netstrings.txt
Recipe 3.2, Recipe 3.4
3.4 Using the SafeStr Library
3.4.1 Problem
You want an alternative to using the standard C string-manipulation functions to help avoid buffer
overflows (see Recipe 3.3), format-string problems (see Recipe 3.2), and the use of unchecked
external input.
3.4.2 Solution
Use the SafeStr library, which is available from http://www.zork.org/safestr/.
3.4.3 Discussion
The SafeStr library provides an implementation of dynamically sizable strings in C. In addition, the
library also performs reference counting and accounting of the allocated and actual sizes of each
string. Any attempt to increase the actual size of a string beyond its allocated size causes the library
to increase the allocated size of the string to a size at least as large. Because strings managed by
SafeStr ("safe strings") are dynamically sized, safe strings are not a source of potential buffer
overflows. (See Recipe 3.3.)
Safe strings use the type safestr_t, which can actually be cast to the normal C-style string type,
char *, though we strongly recommend against doing so where it can be avoided. In fact, the only
time you should ever cast a safe string to a normal C-style string is for read-only purposes. This is
also the only reason why the safestr_t type was designed in a way that allows casting to normal C-style strings.
Casting a safe string to a normal C-style string and modifying it using C-style
string-manipulation functions or other means defeats the protections and
accounting afforded by the SafeStr library.
The SafeStr library provides a rich set of API functions to manipulate the strings it manages. The
large number of functions prohibits us from enumerating them all here, but note that the library
comes with complete documentation in the form of Unix man pages, HTML, and PDF. Table 3-1 lists
the functions that have C equivalents, along with those equivalents.
Table 3-1. SafeStr API functions and equivalents for normal C strings

    SafeStr function          C function
    safestr_append( )         strcat( )
    safestr_nappend( )        strncat( )
    safestr_find( )           strstr( )
    safestr_copy( )           strcpy( )
    safestr_ncopy( )          strncpy( )
    safestr_compare( )        strcmp( )
    safestr_ncompare( )       strncmp( )
    safestr_length( )         strlen( )
    safestr_sprintf( )        sprintf( )
    safestr_vsprintf( )       vsprintf( )
You can typically create safe strings in any of the following three ways:
SAFESTR_ALLOC( )
Allocates a resizable string with an initial allocation size in bytes as specified by its only
argument. The string returned will be an empty string (actual size zero). Normally the size
allocated for a string will be larger than the actual size of the string. The library rounds
memory allocations up, so if you know that you will need a large string, it is worth allocating it
with a large initial allocation size up front to avoid reallocations as the actual string length
grows.
SAFESTR_CREATE( )
Creates a resizable string from the normal C-style string passed as its only argument. This is
normally the appropriate way to convert a C-style string to a safe string.
SAFESTR_TEMP( )
Creates a temporary resizable string from the normal C-style string passed as its only
argument. SAFESTR_CREATE( ) and SAFESTR_TEMP( ) behave similarly, except that a string
created by SAFESTR_TEMP( ) will be automatically destroyed by the next SafeStr function that
uses it. The only exception is safestr_reference( ), which increments the reference count
on the string, allowing it to survive until safestr_release( ) or safestr_free( ) is called to
decrement the string's reference count.
People are sometimes confused about when actually to use SAFESTR_TEMP( ), as well as how to use
it properly. Use SAFESTR_TEMP( ) when you need to pass a constant string as an argument to a
function that is expecting a safestr_t. A perfect example of such a case would be
safestr_sprintf( ), which has the following signature:
int safestr_sprintf(safestr_t *output, safestr_t *fmt, ...);
The string that specifies the format must be a safe string, but because you should always use
constant strings for the format specification (see Recipe 3.2), you should use SAFESTR_TEMP( ). The
alternative is to use SAFESTR_CREATE( ) to create the string before calling safestr_sprintf( ),
and free it immediately afterward with safestr_free( ).
int       i = 42;
safestr_t fmt, output;

output = SAFESTR_ALLOC(1);

/* Instead of doing this: */
fmt = SAFESTR_CREATE("The value of i is %d.\n");
safestr_sprintf(&output, fmt, i);
safestr_free(fmt);

/* You can do this: */
safestr_sprintf(&output, SAFESTR_TEMP("The value of i is %d.\n"), i);
When using temporary strings, remember that the temporary string will be destroyed automatically
after a call to any SafeStr API function except safestr_reference( ), which will increment the
string's reference count. If a temporary string's reference count is incremented, the string will then
survive any number of API calls until its reference count is decremented to the extent that it will be
destroyed. The API functions safestr_release( ) and safestr_free( ) may be used
interchangeably to decrement a string's reference count.
For example, if you are writing a function that accepts a safestr_t as an argument (which may or
may not be passed as a temporary string) and you will be performing multiple operations on the
string, you should increment the string's reference count before operating on it, and decrement it
again when you are finished. This will ensure that the string is not prematurely destroyed if a
temporary string is passed in to the function.
void some_function(safestr_t *base, safestr_t extra) {
  safestr_reference(extra);
  if (safestr_length(*base) + safestr_length(extra) < 17)
    safestr_append(base, extra);
  safestr_release(extra);
}
In this example, if you omitted the calls to safestr_reference( ) and safestr_release( ), and if
extra was a temporary string, the call to safestr_length( ) would cause the string to be
destroyed. As a result, the safestr_append( ) call would then be operating on an invalid safestr_t
if the combined length of base and extra were less than 17.
Finally, the SafeStr library also tracks the trustworthiness of strings. A string can be either trusted or
untrusted. Operations that combine strings result in untrusted strings if any one of the strings
involved in the combination is untrusted; otherwise, the result is trusted. There are few places in
SafeStr's API where the trustworthiness of a string is tested, but the function safestr_istrusted( )
allows you to test strings yourself.
The strings that result from using SAFESTR_CREATE( ) or SAFESTR_TEMP( ) are untrusted. You can
use SAFESTR_TEMP_TRUSTED( ) to create temporary strings that are trusted. The trustworthiness of
an existing string can be altered using safestr_trust( ) to make it trusted or safestr_untrust( )
to make it untrusted.
The main reason to track the trustworthiness of a string is to monitor the flow of external inputs.
Safe strings created from external data should initially be untrusted. If you later verify the contents
of a string, ensuring that it contains nothing dangerous, you can then mark the string as trusted.
Whenever you need to use a string to perform some potentially dangerous operation (for example,
using a string in a command-line argument to an external program), check the trustworthiness of the
string before you use it, and fail appropriately if the string is untrusted.
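As a rough sketch of how that flow might look, the following fragment tracks trust on an externally supplied string before acting on it. The header name safestr.h, the exact argument types of safestr_trust( ) and safestr_istrusted( ), and the validate_command( ) and run_command( ) helpers are assumptions made for this example, not details taken from the SafeStr documentation.
#include <safestr.h>   /* assumed header name for the SafeStr library */

void spc_handle_command(const char *raw_input) {
  safestr_t cmd;

  /* Strings built from external data start out untrusted. */
  cmd = SAFESTR_CREATE(raw_input);

  /* validate_command( ) is a hypothetical application-specific check. */
  if (validate_command(cmd)) safestr_trust(cmd);

  /* Refuse to act on anything that has not been explicitly validated. */
  if (!safestr_istrusted(cmd)) {
    safestr_free(cmd);
    return;
  }
  run_command(cmd);   /* hypothetical */
  safestr_free(cmd);
}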
3.4.4 See Also
SafeStr: http://www.zork.org/safestr/
Recipe 3.2, Recipe 3.3
3.5 Preventing Integer Coercion and Wrap-Around Problems
3.5.1 Problem
When using integer values, it is possible to make values go out of range in ways that are not obvious.
In some cases, improperly validated integer values can lead to security problems, particularly when
data gets truncated or when it is converted from a signed value to an unsigned value or vice versa.
Unfortunately, such conversions often happen behind your back.
3.5.2 Solution
Unfortunately, integer coercion and wrap-around problems currently require you to be diligent.
Best practices for such problems require that you validate any coercion that takes place. To do this,
you need to understand the semantics of the library functions you use well enough to know when
they may implicitly cast data.
In addition, you should explicitly check for cases where integer data may wrap around. It is
particularly important to perform wrap-around checks immediately before using data.
3.5.3 Discussion
Integer type problems are often quite subtle. As a result, they are very difficult to avoid and very
difficult to catch unless you are exceedingly careful. There are several different ways that these
problems can manifest themselves, but they always boil down to a type mismatch. In the following
subsections, we'll illustrate the various classes of integer type errors with examples.
3.5.3.1 Signed-to-unsigned coercion
Many API functions take only positive values, and programmers often take advantage of that fact.
For example, consider the following code excerpt:
if (x < MAX_SIZE) {
  if (!(ptr = (unsigned char *)malloc(x))) abort( );
} else {
  /* Handle the error condition ... */
}
We might test against MAX_SIZE to protect against denial of service problems where an attacker
causes us to allocate a large amount of memory. At first glance, the previous code seems to protect
against that. Indeed, some people will worry about what happens in the case where someone tries to
malloc( ) a negative number of bytes.
It turns out that malloc( )'s argument is of type size_t, which is an unsigned type. As a result, any
negative numbers are converted to positive numbers. Therefore, we do not have to worry about
allocating a negative number of bytes; it cannot happen.
However, the previous code may still not work correctly. The key to its correct operation is the data
type of x. If x is some signed data type, such as an int, and is a negative value, we will end up
allocating a large amount of data. For example, if an attacker manages to set x to -1, the call to
malloc( ) will try to allocate 4,294,967,295 bytes on most platforms, because the hexadecimal
value of that number (0xFFFFFFFF) is the same as the hexadecimal representation of a signed 32-bit -1.
There are a few ways to alleviate this particular problem:
You can make sure never to use signed data types. Unfortunately, that is not very
practical, particularly when you are using API functions that take both signed and unsigned
values. If you try to ensure that all your data is always unsigned, you might end up with an
unsigned-to-signed conversion problem when you call a library function that takes a regular int
instead of an unsigned int or a size_t.
You can check to make sure x is not negative while it is still signed. There is nothing wrong with
this solution. Basically, you are always assuming the worst (that the data may be cast), and it
might not be.
You can cast x to a size_t before you do your testing. This is a good strategy for those who
prefer testing data as close as possible to the state in which it is going to be used, to prevent an
unanticipated change in the meantime. Of course, the cast to an unsigned value might be
unanticipated by the many programmers out there who do not know that size_t is an unsigned
data type. For those people, the second solution makes more sense.
No matter what solution you prefer, you will need to be diligent about conversions that might apply
to your data when you perform your bounds checking.
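For example, the second approach might look like the following sketch, which validates the value while it is still signed and only then converts it; the function name and the MAX_SIZE value are ours for illustration.
#include <stdlib.h>

#define MAX_SIZE 1024   /* plays the same role as MAX_SIZE in the earlier snippet */

unsigned char *spc_alloc_checked(int x) {
  /* Reject negative values before any implicit conversion to size_t can occur. */
  if (x < 0 || x >= MAX_SIZE) return 0;
  return (unsigned char *)malloc((size_t)x);
}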
3.5.3.2 Unsigned-to-signed coercion
Problems may also occur when an unsigned value gets converted to a signed value. For example,
consider the following code:
#include <stdio.h>
#include <string.h>

int main(int argc, char *argv[ ]) {
  char         foo[ ] = "abcdefghij";
  char         *p = foo + 4;
  unsigned int x = 0xffffffff;

  if (p + x > p + strlen(p)) {
    printf("Buffer overflow!\n");
    return -1;
  }
  printf("%s\n", p + x);
  return 0;
}
The poor programmer who wrote this code is properly trying to prevent reads past the high end of
p, but he probably did not realize that the pointers are signed. Because x is -1 once it is cast to a
signed value, the result of p + x will be the byte of memory immediately preceding the address to
which p points.
While this code is a contrived example, this is still a very real problem. For example, say you have an
array of fixed-size records. The program might wish to write arbitrary data into a record where the
user supplies the record number, and the program might calculate the memory address of the item
of interest dynamically by multiplying the record number by the size of a record, and then adding
that to the address at which the records begin. Generally, programmers will make sure the item
index is not too high, but they may not realize that the index might be too low!
In addition, it is good to remember that array accesses are rewritten as pointer arithmetic. For
example, arr[x] can index memory before the start of your array if x is less than 0 once converted
to a signed integer.
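A sketch of the record-lookup scenario just described, with both the low and the high bound checked before any pointer arithmetic is performed; the record type and the function name are invented for the example.
#include <stddef.h>
#include <string.h>

typedef struct {
  char data[128];
} record_t;                       /* hypothetical fixed-size record */

/* Returns 0 on success, -1 if the index is out of range. */
int spc_write_record(record_t *records, size_t nrecords, int index,
                     const record_t *value) {
  if (index < 0 || (size_t)index >= nrecords) return -1;   /* too low or too high */
  memcpy(&records[index], value, sizeof(record_t));
  return 0;
}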
3.5.3.3 Size mismatches
You may also encounter problems when an integer type of one size gets converted to an integer type
of another size. For example, suppose that you store an unsigned 64-bit quantity in x, then pass x to
an operation that takes an unsigned 32-bit quantity. In C, the upper 32 bits will get truncated.
Therefore, if you need to check for overflow, you had better do it before the cast happens!
Conversely, when there is an implicit coercion from a small value to a large value, remember that the
sign bit will probably extend out, which may not be intended. That is, when C converts a signed value
to a different-sized signed value, it does not simply start treating the same bits as a signed value.
When growing a number, C will make sure that it retains the same value it once had, even if the
binary representation is different. When shrinking the value, C may truncate, but even if it does, the
sign will be the same as it was before truncation, which may result in an unexpected binary
representation.
For example, you might have a string declared as a char *, then want to treat the bytes as integers.
Consider the following code:
#include <stdio.h>

int main(int argc, char *argv[ ]) {
  int x = 0;

  if (argc > 1) x += argv[1][0];
  printf("%d\n", x);
}
If argv[1][0] happens to be 0xFF, x will end up -1 instead of 255! Even if you declare x to be an
unsigned int, you will still end up with x being 0xFFFFFFFF instead of the desired 0xFF, because C
converts size before sign. That is, a char will get sign-extended into an int before being coerced into
an unsigned int.
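If the intent is to treat the byte as a value between 0 and 255, cast it to unsigned char before the arithmetic, as in this variation on the example above:
#include <stdio.h>

int main(int argc, char *argv[ ]) {
  int x = 0;

  /* The cast prevents sign extension, so 0xFF yields 255 rather than -1. */
  if (argc > 1) x += (unsigned char)argv[1][0];
  printf("%d\n", x);
}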
3.5.3.4 Wrap-around
A very similar problem (with the same remediation strategy as those described in previous
subsections) occurs when a variable wraps around. For example, when you add 1 to the maximum
unsigned value, you will get zero. When you add 1 to the maximum signed value, you will get the
minimum possible signed value.
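For unsigned arithmetic, you can detect an impending wrap-around before it happens by comparing against the type's maximum value. A minimal sketch (the function name is ours):
#include <limits.h>

/* Returns 1 and stores a + b in *sum if the addition does not wrap; otherwise
 * returns 0 and leaves *sum untouched.
 */
int spc_add_uint(unsigned int a, unsigned int b, unsigned int *sum) {
  if (a > UINT_MAX - b) return 0;   /* a + b would wrap around */
  *sum = a + b;
  return 1;
}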
This problem often crops up when using a high-precision clock. For example, some people use a 32-bit real-time clock, then check to see if one event occurs before another by testing the clock. Of
course, if the clock rolls over (a millisecond clock that uses an unsigned 32-bit value will wrap around
every 49.71 days or so), the result of your test is likely to be wrong!
In any case, you should be keeping track of wrap-arounds and taking appropriate measures when
they occur. Often, when you're using a real-time clock, you can simply use a clock with more
precision. For example, recent x86 chips offer the RDTSC instruction, which provides 64 bits of
precision. (See Recipe 4.14.)
3.5.4 See Also
Recipe 4.14
3.6 Using Environment Variables Securely
3.6.1 Problem
You need to obtain the value of, alter the value of, or delete an environment variable.
3.6.2 Solution
A process inherits its environment variables from its parent process. While the parent process most
often will not do anything to tarnish the environment passed on to its children, your program's
environment variables are still external inputs, and you must therefore treat them as such.
The process that parents your own process could be a malicious process that has manipulated the
environment in an attempt to confuse your program and exploit that confusion to nefarious ends. As
much as possible, it is best to avoid depending on the environment, but we recognize that is not
always possible.
3.6.3 Discussion
In the following subsections, we'll look at obtaining the value of an environment variable as well as
changing and deleting environment variables.
3.6.3.1 Obtaining the value of an environment variable
The normal means by which you obtain the value of an environment variable is by calling getenv( )
with the name of the environment variable whose value is to be retrieved. The problem with getenv( )
is that it simply returns a pointer into the environment, rather than returning a copy of the
environment variable's value.
If you do not immediately make a copy of the value returned by getenv( ), but instead store the
pointer somewhere for later use, you could end up with a dangling pointer or a different value
altogether, if the environment is modified between the time that you called getenv( ) and the time
you use the pointer it returns.
There is a race condition between the time you call getenv( ) and the time you
make your copy of the value. Be careful to manipulate the process environment
from only a single thread at a time.
Never make any assumptions about the length or the contents of an environment variable's value. It
can be extremely dangerous to simply copy the value into a statically allocated buffer or even a
dynamically allocated buffer that was not allocated based on the actual size of the environment
variable's value. Always compute the size of the environment variable's value yourself, and
dynamically allocate a buffer to hold the copy.
Another problem with environment variables is that a malicious program could manipulate the
environment so that two or more environment variables with the same name exist in your process's
environment. It is easy to detect this situation, but it usually is not worth concerning yourself with it.
Most, if not all, implementations of getenv( ) will always return the first occurrence of an
environment variable.
As a convenience, you can use the function spc_getenv( ), shown in the following code, to obtain
the value of an environment variable. It will return a copy of the environment variable's value
allocated with strdup( ), which means that you will be responsible for freeing the memory with
free( ).
#include <stdlib.h>
#include <string.h>
char *spc_getenv(const char *name) {
  char *value;

  if (!(value = getenv(name))) return 0;
  return strdup(value);
}
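Using spc_getenv( ) then looks like this; remember that the caller owns the returned copy and must free it:
#include <stdio.h>
#include <stdlib.h>

int main(void) {
  char *editor;

  if (!(editor = spc_getenv("EDITOR"))) {
    printf("EDITOR is not set.\n");
    return 1;
  }
  printf("EDITOR is %s\n", editor);
  free(editor);   /* spc_getenv( ) allocated the copy with strdup( ) */
  return 0;
}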
3.6.3.2 Changing the value of an environment variable
The standard C runtime function putenv( ) is normally used to modify the value of an environment
variable. In some implementations, putenv( ) can even be used to delete environment variables,
but this behavior is nonstandard and therefore is not portable. If you have sanitized the environment
as described in Recipe 1.1, and particularly if you use the code in that recipe, using putenv( ) could
cause problems because of the way that code manages the memory allocated to the environment.
We recommend that you avoid using the putenv( ) function altogether.
Another reason to avoid putenv( ) is that an attacker could have manipulated the environment
before spawning your process, in such a way that two or more environment variables share the same
name. You want to make certain that changing the value of an environment variable actually changes
it. If you use the code from Recipe 1.1, you can be reasonably certain that there is only one
environment variable for each name.
Instead of using putenv( ) to modify the value of an environment variable, use spc_putenv( ),
shown in the following code. It will properly handle an environment as the code in Recipe 1.1 builds
it, as well as an unaltered environment. In addition to modifying the value of an environment
variable, spc_putenv( ) is also capable of adding new environment variables.
We have not copied putenv( )'s signature with spc_putenv( ). If you use putenv( ), you must
pass it a string of the form "NAME=VALUE". If you use spc_putenv( ), you must pass it two strings;
the first string is the name of the environment variable to modify or add, and the second is the value
to assign to the environment variable. If an error occurs, spc_putenv( ) will return 0; otherwise, it
will return 1.
Note that the following code is not thread-safe. You need to explicitly avoid the possibility of
manipulating the environment from two separate threads at the same time.
#include <stdlib.h>
#include <string.h>
static int spc_environ;
int spc_putenv(const char *name, const char *value) {
  int    added = 0, envc, i, matches = 0;
  char   *envptr, **new_environ;
  size_t envsz = 0, len, namelen, valuelen;
  extern char **environ;

  /* First compute the amount of memory required for the new environment.
   * Any existing entries for this name (there may be duplicates) will be
   * dropped and replaced with a single new "name=value" entry.
   */
  namelen  = strlen(name);
  valuelen = strlen(value);
  for (envc = 0;  environ[envc];  envc++) {
    if (!strncmp(environ[envc], name, namelen) && environ[envc][namelen] == '=')
      matches++;
    else
      envsz += strlen(environ[envc]) + 1;
  }
  envsz += (namelen + 1 + valuelen + 1);   /* room for "name=value" plus its NUL */
  envc   = envc - matches + 1;             /* number of entries in the new environment */

  /* allocate a single block holding the pointer array followed by the strings */
  envsz += (sizeof(char *) * (envc + 1));
  if (!(new_environ = (char **)malloc(envsz))) return 0;
  envptr = (char *)new_environ + (sizeof(char *) * (envc + 1));

  /* copy the old environment into the new environment, replacing the first
   * occurrence of the named variable, dropping any duplicates of it, and
   * adding it at the end if it was not already present.
   */
  for (envc = i = 0;  environ[envc];  envc++) {
    if (!strncmp(environ[envc], name, namelen) && environ[envc][namelen] == '=') {
      if (added) continue;                 /* drop duplicate occurrences */
      new_environ[i++] = envptr;
      memcpy(envptr, name, namelen);
      envptr[namelen] = '=';
      memcpy(envptr + namelen + 1, value, valuelen + 1);
      envptr += (namelen + 1 + valuelen + 1);
      added = 1;
    } else {
      new_environ[i++] = envptr;
      len = strlen(environ[envc]);
      memcpy(envptr, environ[envc], len + 1);
      envptr += (len + 1);
    }
  }
  if (!added) {
    new_environ[i++] = envptr;
    memcpy(envptr, name, namelen);
    envptr[namelen] = '=';
    memcpy(envptr + namelen + 1, value, valuelen + 1);
  }
  new_environ[i] = 0;

  /* possibly free the old environment, then replace it with the new one */
  if (spc_environ) free(environ);
  environ = new_environ;
  spc_environ = 1;

  return 1;
}
3.6.3.3 Deleting an environment variable
No method for deleting an environment variable is defined in any standard. Some implementations of
putenv( ) will delete environment variables if the assigned value is a zero-length string. Other
systems provide implementations of a function called unsetenv( ), but it is nonstandard and thus
nonportable.
None of these methods of deleting environment variables take into account the possibility that
multiple occurrences of the same environment variable may exist in the environment. Usually, only
the first occurrence will be deleted, rather than all of them. The result is that the environment
variable won't actually be deleted because getenv( ) will return the next occurrence of the
environment variable.
Especially if you use the code from Recipe 1.1 to sanitize the environment, or if you use the code
from the previous subsection, you should use spc_delenv( ) to delete an environment variable. The
following code for spc_delenv( ) depends on the static variable spc_environ declared at global
scope in the spc_putenv( ) code from the previous subsection; the two functions should share the
same instance of that variable.
Note that the following code is not thread-safe. You need to explicitly avoid the possibility of
manipulating the environment from two separate threads at the same time.
#include <stdlib.h>
#include <string.h>
int spc_delenv(const char *name) {
  int    del = 0, envc, i, idx = -1;
  size_t delsz = 0, envsz = 0, namelen;
  char   *envptr, **new_environ;
  extern int  spc_environ;
  extern char **environ;

  /* first compute the size of the new environment */
  namelen = strlen(name);
  for (envc = 0;  environ[envc];  envc++) {
    if (!strncmp(environ[envc], name, namelen) && environ[envc][namelen] == '=') {
      if (idx == -1) idx = envc;
      else {
        del++;
        delsz += strlen(environ[envc]) + 1;
      }
    }
    envsz += strlen(environ[envc]) + 1;
  }
  if (idx == -1) return 1;
  envc  -= del;    /* account for duplicate entries of the same name */
  envsz -= delsz;

  /* allocate memory for the new environment */
  envsz += (sizeof(char *) * (envc + 1));
  if (!(new_environ = (char **)malloc(envsz))) return 0;
  envptr = (char *)new_environ + (sizeof(char *) * (envc + 1));

  /* copy the old environment into the new environment, ignoring any
   * occurrences of the environment variable that we want to delete.
   */
  for (envc = i = 0;  environ[envc];  envc++) {
    if (envc == idx || (del && !strncmp(environ[envc], name, namelen) &&
                        environ[envc][namelen] == '=')) continue;
    new_environ[i++] = envptr;
    envsz = strlen(environ[envc]);
    memcpy(envptr, environ[envc], envsz + 1);
    envptr += (envsz + 1);
  }
  new_environ[i] = 0;

  /* possibly free the old environment, then replace it with the new one */
  if (spc_environ) free(environ);
  environ = new_environ;
  spc_environ = 1;

  return 1;
}
3.6.4 See Also
Recipe 1.1
3.7 Validating Filenames and Paths
3.7.1 Problem
You need to resolve the path of a file provided by a user to determine the actual file that it refers to
on the filesystem.
3.7.2 Solution
On Unix systems, use the function realpath( ) to resolve the canonical name of a file or path. On
Windows, use the function GetFullPathName( ) to resolve the canonical name of a file or path.
3.7.3 Discussion
You must be careful when making access decisions for a file. Taking relative pathnames and links into
account, it is possible for multiple filenames to refer to the same file. Failure to take this into account
when attempting to perform access checks based on filename can have severe consequences.
On the surface, resolving the canonical name of a file or path may appear to be a reasonably simple
task to undertake. However, many programmers fail to consider symbolic and hard links. On
Windows, links are possible, but they are not as serious an issue as they are on Unix because they
are much less frequently used.
Fortunately, most modern Unix systems provide, as part of the standard C runtime, a function called
realpath( ) that will properly resolve the canonical name of a file or path, taking relative paths and
links into account. Be careful when using realpath( ) because the function is not thread-safe, and
the resolved path is stored in a fixed-size buffer that must be at least MAXPATHLEN bytes in size.
The function realpath( ) is not thread-safe because it changes the current
directory as it resolves the path. On Unix, a process has a single current
directory, regardless of how many threads it has, so changing the current
directory in one thread will affect all other threads within the process.
The signature for realpath( ) is:
char *realpath(const char *pathname, char resolved_path[MAXPATHLEN]);
This function has the following arguments:
pathname
Path to be resolved.
resolved_path
Buffer into which the resolved path will be written. It must be at least MAXPATHLEN bytes in
size. realpath( ) will never write more than that into the buffer, including the NULL-terminating byte.
If the function fails for any reason, the return value will be NULL, and errno will contain an error code
indicating the reason for the failure. If the function is successful, a pointer to resolved_path will be
returned.
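For example, a small wrapper that resolves a path and hands back a dynamically allocated copy of the result might look like the following sketch; it assumes MAXPATHLEN is available from <sys/param.h>, which is the case on most Unix systems, and the function name is ours.
#include <sys/param.h>   /* for MAXPATHLEN */
#include <stdlib.h>
#include <string.h>

/* Returns a malloc( )'d copy of the canonical path, or 0 on failure. */
char *spc_resolve_path(const char *path) {
  char resolved[MAXPATHLEN];

  if (!realpath(path, resolved)) return 0;
  return strdup(resolved);
}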
On Windows, there is an equivalent function to realpath( ) called GetFullPathName( ). It will
resolve relative paths, link information, and even UNC (Microsoft's Universal Naming Convention)
names. The function is more flexible than its Unix counterpart in that it is thread-safe and provides
an interface to allow you to dynamically allocate enough memory to hold the resolved canonical path.
The signature for GetFullPathName( ) is:
DWORD GetFullPathName(LPCTSTR lpFileName, DWORD nBufferLength, LPTSTR lpBuffer,
                      LPTSTR *lpFilePart);
This function has the following arguments:
lpFileName
Path to be resolved.
nBufferLength
Size of the buffer, in characters, into which the resolved path will be written.
lpBuffer
Buffer into which the resolved path will be written.
lpFilePart
Pointer into lpBuffer that points to the filename portion of the resolved path.
GetFullPathName( ) will set this pointer on return if it is successful in resolving the path.
When you initially call GetFullPathName( ), you should specify NULL for lpBuffer, and 0 for
nBufferLength. When you do this, the return value from GetFullPathName( ) will be the number of
characters required to hold the resolved path. After you allocate the necessary buffer space, call
GetFullPathName( ) again with nBufferLength and lpBuffer filled in appropriately.
GetFullPathName( ) requires the length of the buffer to be specified in
characters, not bytes. Likewise, the return value from the function will be in
units of characters rather than bytes. When allocating memory for the buffer,
be sure to multiply the number of characters by sizeof(TCHAR).
If an error occurs in resolving the path, GetFullPathName( ) will return 0, and you can call
GetLastError( ) to determine the cause of the error; otherwise, it will return the number of
characters written into lpBuffer.
In the following example, SpcResolvePath( ) demonstrates how to use GetFullPathName( )
properly. If it is successful, it will return a dynamically allocated buffer that contains the resolved
path; otherwise, it will return NULL. The allocated buffer must be freed by calling LocalFree( ).
#include <windows.h>
LPTSTR SpcResolvePath(LPCTSTR lpFileName) {
DWORD dwLastError, nBufferLength;
LPTSTR lpBuffer, lpFilePart;
if (!(nBufferLength = GetFullPathName(lpFileName, 0, 0, &lpFilePart))) return 0;
if (!(lpBuffer = (LPTSTR)LocalAlloc(LMEM_FIXED, sizeof(TCHAR) * nBufferLength)))
return 0;
if (!GetFullPathName(lpFileName, nBufferLength, lpBuffer, &lpFilePart)) {
dwLastError = GetLastError( );
LocalFree(lpBuffer);
SetLastError(dwLastError);
return 0;
}
return lpBuffer;
}
3.8 Evaluating URL Encodings
3.8.1 Problem
You need to decode a Uniform Resource Locator (URL).
3.8.2 Solution
Iterate over the characters in the URL looking for a percent symbol followed by two hexadecimal
digits. When such a sequence is encountered, combine the hexadecimal digits to obtain the character
with which to replace the entire sequence. For example, in the ASCII character set, the letter "A" has
the value 0x41, which could be encoded as "%41".
3.8.3 Discussion
RFC 1738 defines the syntax for URLs. Section 2.2 of that document also defines the rules for
encoding characters in a URL. While some characters must always be encoded, any character may be
encoded. Essentially, this means that before you do anything with a URL, whether you need to parse
the URL into pieces (i.e., username, password, host, and so on), match portions of the URL against a
whitelist or blacklist, or something else entirely, you need to decode it.
The problem is that you must make certain that you never decode a URL that has already been
decoded; otherwise, you will be vulnerable to double-encoding attacks. Suppose that the URL
contains the sequence "%25%34%31". Decoded once, the result is "%41" because "%25" is the
encoding for the percent symbol, "%34" is the encoding for the number 4, and "%31" is the encoding
for the number 1. Decoded twice, the result is "A".
At first glance, this may seem harmless, but what if you were to decode repeatedly until there were
no more escaped characters? You would end up with certain sequences of characters that are
impossible to represent. The purpose of encoding in the first place is to allow the use of characters
that have special meaning or that cannot be represented visually.
Another potential problem with encoding that is limited primarily to C and C++ is that a NULL-terminator can be encoded anywhere in the URL. There are several approaches to dealing with this
problem. One is to treat the decoded string as a binary array rather than a C-style string; another is
to use the SafeStr library described in Recipe 3.4 because it gives no special significance to any one
character.
You can use the following spc_decode_url( ) function to decode a URL. It returns a dynamically
allocated copy of the URL in decoded form. The result will be NULL-terminated, so it may be treated
as a C-style string, but it may contain embedded NULLs as well. You can determine whether it
contains embedded NULLs by comparing the number of bytes spc_decode_url( ) indicates that it
returns with the result of calling strlen( ) on the decoded URL. If the URL contains embedded
NULLs, the result from strlen( ) will be less than the number of bytes indicated by
spc_decode_url( ).
#include <stdlib.h>
#include <string.h>
#include <ctype.h>
#define SPC_BASE16_TO_10(x) (((x) >= '0' && (x) <= '9') ? ((x) - '0') : \
(toupper((x)) - 'A' + 10))
char *spc_decode_url(const char *url, size_t *nbytes) {
  char       *out, *ptr;
  const char *c;

  if (!(out = ptr = strdup(url))) return 0;
  for (c = url;  *c;  c++) {
    if (*c != '%' || !isxdigit(c[1]) || !isxdigit(c[2])) *ptr++ = *c;
    else {
      *ptr++ = (SPC_BASE16_TO_10(c[1]) * 16) + (SPC_BASE16_TO_10(c[2]));
      c += 2;
    }
  }
  *ptr = 0;
  if (nbytes) *nbytes = (ptr - out);   /* does not include the NUL byte */
  return out;
}
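The embedded-NULL check described above can then be performed like this:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

void spc_check_url(const char *url) {
  char   *decoded;
  size_t nbytes;

  if (!(decoded = spc_decode_url(url, &nbytes))) return;   /* out of memory */
  if (strlen(decoded) != nbytes)
    printf("Rejecting URL with embedded NULL bytes\n");
  else
    printf("Decoded URL: %s\n", decoded);
  free(decoded);
}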
3.8.4 See Also
RFC 1738: Uniform Resource Locators (URL)
Recipe 3.4
3.9 Validating Email Addresses
3.9.1 Problem
Your program accepts an email address as input, and you need to verify that the supplied address is
valid.
3.9.2 Solution
Scan the email address supplied by the user, and validate it against the lexical rules set forth in RFC
822.
3.9.3 Discussion
RFC 822 defines the syntax for email addresses. Unfortunately, the syntax is complex, and it
supports several address formats that are no longer relevant. The fortunate thing is that if anyone
attempts to use one of these no-longer-relevant address formats, you can be reasonably certain they
are attempting to do something they are not supposed to do.
You can use the following spc_email_isvalid( ) function to check the format of an email address.
It will perform only a syntactical check and will not actually attempt to verify the authenticity of the
address by attempting to deliver mail to it or by performing any DNS lookups on the domain name
portion of the address.
The function only validates the actual email address and will not accept any associated data. For
example, it will fail to validate "Bob Bobson <bob@example.com>", but it will successfully validate
"bob@example.com". If the supplied email address is syntactically valid, spc_email_isvalid( ) will
return 1; otherwise, it will return 0.
Keep in mind that almost any character is legal in an email address if it is
properly quoted, so if you are passing an email address to something that may
be sensitive to certain characters or character sequences (such as a command
shell), you must be sure to properly escape those characters.
#include <string.h>
int spc_email_isvalid(const char *address) {
  int         count = 0;
  const char  *c, *domain;
  static char *rfc822_specials = "()<>@,;:\\\"[]";

  /* first we validate the name portion (name@domain) */
  for (c = address;  *c;  c++) {
    if (*c == '\"' && (c == address || *(c - 1) == '.' || *(c - 1) == '\"')) {
      while (*++c) {
        if (*c == '\"') break;
        if (*c == '\\' && (*++c == ' ')) continue;
        if (*c <= ' ' || *c >= 127) return 0;
      }
      if (!*c++) return 0;
      if (*c == '@') break;
      if (*c != '.') return 0;
      continue;
    }
    if (*c == '@') break;
    if (*c <= ' ' || *c >= 127) return 0;
    if (strchr(rfc822_specials, *c)) return 0;
  }
  if (c == address || *(c - 1) == '.') return 0;

  /* next we validate the domain portion (name@domain) */
  if (!*(domain = ++c)) return 0;
  do {
    if (*c == '.') {
      if (c == domain || *(c - 1) == '.') return 0;
      count++;
    }
    if (*c <= ' ' || *c >= 127) return 0;
    if (strchr(rfc822_specials, *c)) return 0;
  } while (*++c);

  return (count >= 1);
}
3.9.4 See Also
RFC 822: Standard for the Format of ARPA Internet Text Messages
3.10 Preventing Cross-Site Scripting
3.10.1 Problem
You are developing a web-based application, and you want to ensure that an attacker cannot exploit
it in an effort to steal information from the browsers of other people visiting the same site.
3.10.2 Solution
When you are generating HTML that must contain external input, be sure to escape that input so that
if it contains embedded HTML tags, the tags are not treated as HTML by the browser.
3.10.3 Discussion
Cross-site scripting attacks (often called CSS, but more frequently XSS in an effort to avoid confusion
with cascading style sheets) are a general class of attacks with a common root cause: insufficient
input validation. The goal of many cross-site scripting attacks is to steal information (usually the
contents of some specific cookie) from unsuspecting users. Other times, the goal is to get an
unsuspecting user to launch an attack on himself. These attacks are especially a problem for sites
that store sensitive information, such as login data or session IDs, in cookies. Cookie theft could allow
an attacker to hijack a session or glean other information that is intended to be private.
Consider, for example, a web-based message board, where many different people visit the site to
read the messages that other people have posted, and to post messages themselves. When someone
posts a new message to the board, if the message board software does not properly validate the
input, the message could contain malicious HTML that, when viewed by other people, performs some
unexpected action. Usually an attacker will attempt to embed some JavaScript code that steals
cookies, or something similar.
Often, an attacker has to go to greater lengths to exploit a cross-site scripting vulnerability; the
example described above is simplistic. An attacker can exploit any page that will include unescaped
user input, but usually the attacker has to trick the user into displaying that page somehow.
Attackers use many methods to accomplish this goal, such as fake pages that look like part of the
site from which the attacker wishes to steal cookies, or embedded links in innocent-looking email
messages.
It is not generally a good idea to allow users to embed HTML in any input accepted from them, but
many sites allow simple tags in some input, such as those that enable bold or italics on text.
Disallowing HTML altogether is the right solution in most cases, and it is the only solution that will
guarantee that cross-site scripting will be prevented. Other common attempts at a solution, such as
checking the referrer header for all requests (the referrer header is easily forged), do not work.
To disallow HTML in user input, you can do one of the following:
Refuse to accept anything that looks as if it may be HTML
Escape the special characters that enable a browser to interpret data as HTML
Attempting to recognize HTML and refuse it can be error-prone, unless you only look for the use of
the greater-than (>) and less-than (<) symbols. Trying to match tags that will not be allowed (i.e., a
blacklist) is not a good idea because it is difficult to do, and future revisions of HTML are likely to
introduce new tags. Instead, if you are going to allow some tags to pass through, you should take the
whitelist approach and only allow tags that you know are safe.
JavaScript code injection does not require a <script> tag; many other tags
can contain JavaScript code as well. For example, most tags support attributes
such as "onclick" and "onmouseover" that can contain JavaScript code.
The following spc_escape_html( ) function will replace occurrences of special HTML characters with
their escape sequences. For example, input that contains something like "<script>" will be replaced
with "&lt;script&gt;", which no browser should ever interpret as HTML.
Our function will escape most HTML tags, but it will also allow some through. Those that it allows
through are contained in a whitelist, and it will only allow them if the tags are used without any
attributes. In addition, the a (anchor) tag will be allowed with a heavily restricted href attribute. The
attribute must begin with "http://", and it must be the only attribute. The character set allowed in the
attribute's value is also heavily restricted, which means that not all necessarily valid URLs will
successfully make it through. In particular, if the URL contains "#", "?", or "&", which are certainly
valid and all have special meaning, the tag will not be allowed.
If you do not want to allow any HTML through at all, you can simply remove the call to
spc_allow_tag() in spc_escape_html(), and force all possible HTML to be properly escaped. In
many cases, this will actually be the behavior that you'll want.
spc_escape_html() will return a C-style string dynamically allocated with malloc(), which the caller
is responsible for deallocating with free(). If memory cannot be allocated, the return will be NULL. It
also expects a C-style string containing the text to filter as its only argument.
#include <stdlib.h>
#include <string.h>
#include <ctype.h>
/* These are HTML tags that do not take arguments.  We special-case the <a> tag
 * since it takes an argument.  We will allow the tag as-is, or we will allow a
 * closing tag (e.g., </p>).  Additionally, we process tags in a case-
 * insensitive way.  Only letters and numbers are allowed in tags we can allow.
 * Note that we do a linear search of the tags.  A binary search is more
 * efficient (log n time instead of linear), but more complex to implement.
 * The efficiency hit shouldn't matter in practice.
 */
static const char *allowed_formatters[ ] = {
  "b", "big", "blink", "i", "s", "small", "strike", "sub", "sup", "tt", "u",
  "abbr", "acronym", "cite", "code", "del", "dfn", "em", "ins", "kbd", "samp",
  "strong", "var", "dir", "li", "dl", "dd", "dt", "menu", "ol", "ul", "hr",
  "br", "p", "h1", "h2", "h3", "h4", "h5", "h6", "center", "bdo", "blockquote",
  "nobr", "plaintext", "pre", "q", "spacer",
  /* include "a" here so that </a> will work */
  "a"
};
#define SKIP_WHITESPACE(p) while (isspace(*p)) p++
static int spc_is_valid_link(const char *input) {
  static const char *href = "href";
  static const char *http = "http://";
  int quoted_string = 0, seen_whitespace = 0;

  if (!isspace(*input)) return 0;
  SKIP_WHITESPACE(input);
  if (strncasecmp(href, input, strlen(href))) return 0;
  input += strlen(href);
  SKIP_WHITESPACE(input);
  if (*input++ != '=') return 0;
  SKIP_WHITESPACE(input);
  if (*input == '"') {
    quoted_string = 1;
    input++;
  }
  if (strncasecmp(http, input, strlen(http))) return 0;
  for (input += strlen(http);  *input && *input != '>';  input++) {
    switch (*input) {
      case '.': case '/': case '-': case '_':
        break;
      case '"':
        if (!quoted_string) return 0;
        input++;                    /* step past the closing quote */
        SKIP_WHITESPACE(input);
        if (*input != '>') return 0;
        return 1;
      default:
        if (isspace(*input)) {
          if (seen_whitespace && !quoted_string) return 0;
          SKIP_WHITESPACE(input);
          seen_whitespace = 1;
          break;
        }
        if (!isalnum(*input)) return 0;
        break;
    }
  }
  return (*input && !quoted_string);
}
static int spc_allow_tag(const char *input) {
  size_t     i;
  const char *tmp;

  if (*input == 'a')
    return spc_is_valid_link(input + 1);
  if (*input == '/') {
    input++;
    SKIP_WHITESPACE(input);
  }
  for (i = 0;  i < sizeof(allowed_formatters) / sizeof(allowed_formatters[0]);  i++) {
    if (strncasecmp(allowed_formatters[i], input, strlen(allowed_formatters[i])))
      continue;
    tmp = input + strlen(allowed_formatters[i]);
    SKIP_WHITESPACE(tmp);
    if (*tmp == '>') return 1;
  }
  return 0;
}
/* Note: This interface expects a C-style NULL-terminated string. */
char *spc_escape_html(const char *input) {
  char       *output, *ptr;
  size_t     outputlen = 0;
  const char *c;

  /* This is a worst-case length calculation */
  for (c = input;  *c;  c++) {
    switch (*c) {
      case '<':  outputlen += 4;  break;   /* &lt;   */
      case '>':  outputlen += 4;  break;   /* &gt;   */
      case '&':  outputlen += 5;  break;   /* &amp;  */
      case '\"': outputlen += 6;  break;   /* &quot; */
      default:   outputlen += 1;  break;
    }
  }

  if (!(output = ptr = (char *)malloc(outputlen + 1))) return 0;
  for (c = input;  *c;  c++) {
    switch (*c) {
      case '<':
        if (!spc_allow_tag(c + 1)) {
          *ptr++ = '&';  *ptr++ = 'l';  *ptr++ = 't';  *ptr++ = ';';
          break;
        } else {
          do {
            *ptr++ = *c;
          } while (*++c != '>');
          *ptr++ = '>';
          break;
        }
      case '>':
        *ptr++ = '&';  *ptr++ = 'g';  *ptr++ = 't';  *ptr++ = ';';
        break;
      case '&':
        *ptr++ = '&';  *ptr++ = 'a';  *ptr++ = 'm';  *ptr++ = 'p';
        *ptr++ = ';';
        break;
      case '\"':
        *ptr++ = '&';  *ptr++ = 'q';  *ptr++ = 'u';  *ptr++ = 'o';
        *ptr++ = 't';  *ptr++ = ';';
        break;
      default:
        *ptr++ = *c;
        break;
    }
  }
  *ptr = 0;

  return output;
}
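A typical use is to filter user-supplied text immediately before it is written into a page, for example:
#include <stdio.h>
#include <stdlib.h>

void spc_emit_comment(FILE *out, const char *user_comment) {
  char *escaped;

  if (!(escaped = spc_escape_html(user_comment))) return;   /* out of memory */
  fprintf(out, "<p>%s</p>\n", escaped);
  free(escaped);
}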
3.11 Preventing SQL Injection Attacks
3.11.1 Problem
You are developing an application that interacts with a SQL database, and you need to defend against
SQL injection attacks.
3.11.2 Solution
SQL injection attacks are most common in web applications that use a database to store data, but
they can occur anywhere that a SQL command string is constructed from any type of input from a
user. Specifically, a SQL injection attack is mounted by inserting characters into the command string
that creates a compound command in a single string. For example, suppose a query string is created
with a WHERE clause that is constructed from user input. A proper command might be:
SELECT * FROM people WHERE first_name="frank";
If the value "frank" comes directly from user input and is not properly validated, an attacker could
include a closing double quote and a semicolon that would complete the SELECT command and allow
the attacker to append additional commands. For example:
SELECT * FROM people WHERE first_name="frank";
DROP TABLE people;
Obviously, the best way to avoid SQL injection attacks is to not create SQL command strings that
include any user input. In some small number of applications, this may be feasible, but more
frequently it is not. Avoid including user input in SQL commands as much as you can, but where it
cannot be avoided, you should escape dangerous characters.
3.11.3 Discussion
SQL injection attacks are really just general input validation problems. Unfortunately, there is no
perfect solution to preventing these types of attacks. Your best defense is to apply strict checking of
input, even going so far as to refuse questionable input rather than attempting to escape it, and hope
that that is a strong enough defense.
There are two main approaches that can be taken to avoid SQL injection attacks:
Restrict user input to the smallest character set possible, and refuse any input that contains
characters outside of that set.
In many cases, user input needs to be used in queries such as looking up a username or a
message number, or some other relatively simple piece of information. It is rare to need any
character in a user name other than the set of alphanumeric characters. Similarly, message
numbers or other similar identifiers can safely be restricted to digits.
With SQL, problems start to occur when symbol characters that have special meaning are
allowed. Examples of such characters are quotes (both double and single), semicolons, percent
symbols, hyphens, and underscores. Avoid these characters wherever possible; they are often
unnecessary, and allowing them at all just makes things more difficult for everyone except an
attacker.
Escape characters that have special significance to SQL command processors.
In SQL parlance, anything that is not a keyword or an identifier is a literal. Keywords are
portions of a SQL command such as SELECT or WHERE, and an identifier would typically be the
name of a table or the name of a field. In some cases, SQL syntax allows literals to appear
without enclosing quotes, but as a general rule you should always enclose literals with quotes.
Literals should always be enclosed in single quotes ('), but some SQL implementations allow
you to use either single or double quotes ("). Whichever you choose to use, always close the
literal with the same character with which you opened it.
Within literals, most characters are safe to leave unescaped, and in many cases, it is not
possible to escape them. Certainly, with whichever quoting character you choose to use with
your literals, you may need to allow that character inside the literal. Escaping quotes is done by
doubling up on the quote character. Other characters that should always be escaped are
control characters and the escape character itself (a backslash).
Finally, if you are using the LIKE keyword in a WHERE clause, you may wish to prevent input
from containing wildcard characters. In fact, it is a good idea to prevent wildcard characters in
most circumstances. Wildcard characters include the percent symbol, underscore, and square
brackets.
You can use the function spc_escape_sql( ), shown at the end of this section, to escape all of the
characters that we've mentioned. As a convenience (and partly due to necessity), the function will
also surround the escaped string with the quote character of your choice. The return from the
function will be the quoted and escaped version of the input string. If an error occurs (e.g., out of
memory, or an invalid quoting character chosen), the return will be NULL.
spc_escape_sql( ) requires three arguments:
input
The string that is to be escaped.
quote
The quote character to use. It must be either a single or double quote. Any other character will
cause spc_escape_sql( ) to return failure.
wildcards
If this argument is specified as 0, wildcard characters recognized by the LIKE operator in a
WHERE clause will not be escaped; otherwise, they will be. You should only escape wildcards
when you are going to be using the escaped string as the right-hand side for the LIKE
operator.
#include <stdlib.h>
#include <string.h>
char *spc_escape_sql(const char *input, char quote, int wildcards) {
  char       *out, *ptr;
  const char *c;

  /* If every character in the input needs to be escaped, the resulting string
   * would at most double in size.  Also, include room for the surrounding
   * quotes.
   */
  if (quote != '\'' && quote != '\"') return 0;
  if (!(out = ptr = (char *)malloc(strlen(input) * 2 + 2 + 1))) return 0;
  *ptr++ = quote;
  for (c = input;  *c;  c++) {
    switch (*c) {
      case '\'': case '\"':
        if (quote == *c) *ptr++ = *c;
        *ptr++ = *c;
        break;
      case '%': case '_': case '[': case ']':
        if (wildcards) *ptr++ = '\\';
        *ptr++ = *c;
        break;
      case '\\': *ptr++ = '\\'; *ptr++ = '\\'; break;
      case '\b': *ptr++ = '\\'; *ptr++ = 'b';  break;
      case '\n': *ptr++ = '\\'; *ptr++ = 'n';  break;
      case '\r': *ptr++ = '\\'; *ptr++ = 'r';  break;
      case '\t': *ptr++ = '\\'; *ptr++ = 't';  break;
      default:
        *ptr++ = *c;
        break;
    }
  }
  *ptr++ = quote;
  *ptr = 0;

  return out;
}
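For example, to build the query from earlier in this recipe around untrusted input; the function name and the extra 64 bytes reserved for the fixed SQL text are ours for illustration.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Returns a malloc( )'d query string, or 0 on failure.  Note that
 * spc_escape_sql( ) already adds the surrounding quotes.
 */
char *spc_build_query(const char *first_name) {
  char   *escaped, *query;
  size_t len;

  if (!(escaped = spc_escape_sql(first_name, '\'', 0))) return 0;
  len = strlen(escaped) + 64;
  if (!(query = (char *)malloc(len))) {
    free(escaped);
    return 0;
  }
  snprintf(query, len, "SELECT * FROM people WHERE first_name=%s;", escaped);
  free(escaped);
  return query;
}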
3.12 Detecting Illegal UTF-8 Characters
3.12.1 Problem
Your program accepts external input in UTF-8 encoding. You need to make sure that the UTF-8
encoding is valid.
3.12.2 Solution
Scan the input string for illegal UTF-8 sequences. If any illegal sequences are detected, reject the
input.
3.12.3 Discussion
UTF-8 is an encoding that is used to represent multibyte character sets in a way that is backward-compatible with single-byte character sets. Another advantage of UTF-8 is that it ensures there are
no NULL bytes in the data, with the exception of an actual NULL byte. Encodings such as Unicode's
UCS-2 may (and often do) contain NULL bytes as "padding" if they are treated as byte streams. For
example, the letter "A" is 0x41 in ASCII or UTF-8, but it is 0x0041 in UCS-2.
The first byte in a UTF-8 sequence determines the number of bytes that follow it to make up the
complete sequence. The number of upper bits set in the first byte minus one indicates the number of
bytes that follow. A bit that is never set immediately follows the count, and the remaining bits are
used as part of the character encoding. The bytes that follow the first byte will always have the upper
two bits set and unset, respectively; the remaining bits are combined with the encoding bits from the
other bytes in the sequence to compute the character. Table 3-2 lists the binary encodings for the
range of characters from 0x00000000 to 0x7FFFFFFF.
Table 3-2. UTF-8 encoding byte sequences

Byte range                   UTF-8 binary representation
0x00000000 - 0x0000007F      0bbbbbbb
0x00000080 - 0x000007FF      110bbbbb 10bbbbbb
0x00000800 - 0x0000FFFF      1110bbbb 10bbbbbb 10bbbbbb
0x00010000 - 0x001FFFFF      11110bbb 10bbbbbb 10bbbbbb 10bbbbbb
0x00200000 - 0x03FFFFFF      111110bb 10bbbbbb 10bbbbbb 10bbbbbb 10bbbbbb
0x04000000 - 0x7FFFFFFF      1111110b 10bbbbbb 10bbbbbb 10bbbbbb 10bbbbbb 10bbbbbb

The problem with UTF-8 encoding is that invalid sequences can be embedded in the data. The UTF-8
specification states that the only legal encoding for a character is the shortest sequence of bytes that
yields the correct value. Longer sequences may be able to produce the same value as a shorter
sequence, but they are not legal; such a longer sequence is called an overlong sequence.
The security issue posed by overlong sequences is that allowing them makes it significantly more
difficult to analyze a UTF-8 encoded string because multiple representations are possible for the same
character. It would be possible to recognize overlong sequences and convert them to the shortest
sequence, but we recommend against doing that because there may be other issues involved that
have not yet been discovered. We recommend that you reject any input that contains an overlong
sequence.
The following spc_utf8_isvalid( ) function will scan a string encoded in UTF-8 to verify that it
contains only valid sequences. It will return 1 if the string contains only legitimate encoding
sequences; otherwise, it will return 0.
int spc_utf8_isvalid(const unsigned char *input) {
  int nb, i;
  const unsigned char *c;

  for (c = input;  *c;  c += (nb + 1)) {
    if (!(*c & 0x80)) nb = 0;
    else if ((*c & 0xc0) == 0x80) return 0;
    else if ((*c & 0xe0) == 0xc0) nb = 1;
    else if ((*c & 0xf0) == 0xe0) nb = 2;
    else if ((*c & 0xf8) == 0xf0) nb = 3;
    else if ((*c & 0xfc) == 0xf8) nb = 4;
    else if ((*c & 0xfe) == 0xfc) nb = 5;
    else return 0;  /* 0xFE and 0xFF can never appear in valid UTF-8 */
    /* Every continuation byte must have the form 10bbbbbb */
    for (i = 1;  i <= nb;  i++)
      if ((*(c + i) & 0xc0) != 0x80) return 0;
  }
  return 1;
}
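For illustration only (this fragment is ours), here is how the function behaves on a well-formed two-byte sequence and on a stray continuation byte:

#include <stdio.h>

int main(void) {
  const unsigned char ok[]  = { 0xC3, 0xA9, 0x00 };  /* U+00E9, a valid two-byte sequence */
  const unsigned char bad[] = { 0xA9, 0x00 };        /* a lone continuation byte */

  printf("%d\n", spc_utf8_isvalid(ok));   /* prints 1 */
  printf("%d\n", spc_utf8_isvalid(bad));  /* prints 0 */
  return 0;
}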
3.13 Preventing File Descriptor Overflows When Using
select( )
3.13.1 Problem
Your program uses the select( ) system call to determine when sockets are ready for writing, have
data waiting to be read, or have an exceptional condition (e.g., out-of-band data has arrived). Using
select( ) requires the use of the fd_set data type, which typically entails the use of the FD_*( )
family of macros. In most implementations, FD_SET( ) and FD_CLR( ), in particular, are susceptible
to an array overrun.
3.13.2 Solution
Do not use the FD_*( ) family of macros. Instead, use the macros that are provided in this recipe.
The FD_SET( ) and FD_CLR( ) macros will modify an fd_set object without performing any bounds
checking. The macros we provide will do proper bounds checking.
3.13.3 Discussion
The select( ) system call is normally used to multiplex sockets. In a single-threaded environment,
select( ) allows you to build sets of socket descriptors for which you wish to wait for data to
become available or that you wish to have available to write data to. The fd_set data type is used to
hold a list of the socket descriptors, and several standard macros are used to manipulate objects of
this type.
Normally, fd_set is defined as a structure with a single member that is a statically allocated array of
long integers. Because socket descriptors are always numbered starting with 0 and ending with the
highest allowable descriptor, the array of integers in an fd_set is actually treated as a bitmask with a
one-to-one correspondence between bits and socket descriptors.
The size of the array in the fd_set structure is determined by the FD_SETSIZE macro. Most often,
the size of the array is sufficiently large to be able to handle any possible file descriptor, but the
problem is that most implementations of the FD_SET( ) and FD_CLR( ) macros (which are used to
set and clear socket descriptors in an fd_set object) do not perform any bounds checking and will
happily overrun the array if asked to do so.
If FD_SETSIZE is defined to be sufficiently large, why is this a problem? Consider the situation in
which a server program is compiled with FD_SETSIZE defined to be 256, which is normally the
maximum number of file and socket descriptors allowed in a Unix process. Everything works just fine
for a while, but eventually the number of allowed file descriptors is increased to 512 because 256 are
no longer enough for all the connections to the server. The increase in file descriptors could be done
externally by using setrlimit( ) before starting the server process (with the bash shell, the
command would be ulimit -n 512).
The proper way to deal with this problem is to allocate the array dynamically and ensure that
FD_SET( ) and FD_CLR( ) resize the array as necessary before modifying it. Unfortunately, to do
this, we need to create a new data type. We define the data type such that it can be safely cast to an
fd_set for passing it directly to select( ):
#include <stdlib.h>
#include <string.h>

typedef struct {
  long int *fds_bits;
  size_t   fds_size;
} SPC_FD_SET;
With a new data type defined, we can replace FD_SET( ), FD_CLR( ), FD_ISSET( ), and FD_ZERO(
), which are normally implemented as preprocessor macros. Instead, we will implement them as
functions because we need to do a little extra work, and it also helps ensure type safety:
void spc_fd_zero(SPC_FD_SET *fdset) {
  fdset->fds_bits = 0;
  fdset->fds_size = 0;
}

void spc_fd_set(int fd, SPC_FD_SET *fdset) {
  long   *tmp_bits;
  size_t new_size;

  if (fd < 0) return;
  if (fd >= fdset->fds_size) {
    new_size = sizeof(long) * (fd / sizeof(long) + 1);
    if (!(tmp_bits = (long *)realloc(fdset->fds_bits, new_size))) return;
    /* Zero the newly allocated region so that no stray bits are set. */
    memset((char *)tmp_bits + fdset->fds_size, 0, new_size - fdset->fds_size);
    fdset->fds_bits = tmp_bits;
    fdset->fds_size = new_size;
  }
  fdset->fds_bits[fd / sizeof(long)] |= (1 << (fd % sizeof(long)));
}

void spc_fd_clr(int fd, SPC_FD_SET *fdset) {
  long   *tmp_bits;
  size_t new_size;

  if (fd < 0) return;
  if (fd >= fdset->fds_size) {
    new_size = sizeof(long) * (fd / sizeof(long) + 1);
    if (!(tmp_bits = (long *)realloc(fdset->fds_bits, new_size))) return;
    memset((char *)tmp_bits + fdset->fds_size, 0, new_size - fdset->fds_size);
    fdset->fds_bits = tmp_bits;
    fdset->fds_size = new_size;
  }
  fdset->fds_bits[fd / sizeof(long)] &= ~(1 << (fd % sizeof(long)));
}

int spc_fd_isset(int fd, SPC_FD_SET *fdset) {
  if (fd < 0 || fd >= fdset->fds_size) return 0;
  return (fdset->fds_bits[fd / sizeof(long)] & (1 << (fd % sizeof(long))));
}

void spc_fd_free(SPC_FD_SET *fdset) {
  if (fdset->fds_bits) free(fdset->fds_bits);
}

int spc_fd_setsize(SPC_FD_SET *fdset) {
  return fdset->fds_size;
}
Notice that we've added two additional functions, spc_fd_free( ) and spc_fd_setsize( ). Because
we are now dynamically allocating the array, there must be some way to free it. The function
spc_fd_free( ) will only free the inner contents of the SPC_FD_SET object passed to it, leaving
management of the SPC_FD_SET object up to you; you may allocate these objects either statically or
dynamically. The other function, spc_fd_setsize( ), is a replacement for the FD_SETSIZE macro
that is normally used as the first argument to select( ), indicating the size of the FD_SET objects
passed as the next three arguments.
Finally, using the new code requires some minor changes to existing code that uses the standard
fd_set. Consider the following code example, where the variable client_count is a global variable
that represents the number of connected clients, and the variable client_fds is a global variable
that is an array of socket descriptors for each connected client:
void main_server_loop(int server_fd) {
  int    i;
  fd_set read_mask;

  for (;;) {
    FD_ZERO(&read_mask);
    FD_SET(server_fd, &read_mask);
    for (i = 0;  i < client_count;  i++) FD_SET(client_fds[i], &read_mask);
    select(FD_SETSIZE, &read_mask, 0, 0, 0);
    if (FD_ISSET(server_fd, &read_mask)) {
      /* Do something with the server_fd such as call accept( ) */
    }
    for (i = 0;  i < client_count;  i++)
      if (FD_ISSET(client_fds[i], &read_mask)) {
        /* Read some data from the client's socket descriptor */
      }
  }
}
The equivalent code using the SPC_FD_SET data type and the functions that operate on it would be:
void main_server_loop(int server_fd) {
  int        i;
  SPC_FD_SET read_mask;

  for (;;) {
    spc_fd_zero(&read_mask);
    spc_fd_set(server_fd, &read_mask);
    for (i = 0;  i < client_count;  i++) spc_fd_set(client_fds[i], &read_mask);
    select(spc_fd_setsize(&read_mask), (fd_set *)&read_mask, 0, 0, 0);
    if (spc_fd_isset(server_fd, &read_mask)) {
      /* Do something with the server_fd such as call accept( ) */
    }
    for (i = 0;  i < client_count;  i++)
      if (spc_fd_isset(client_fds[i], &read_mask)) {
        /* Read some data from the client's socket descriptor */
      }
    spc_fd_free(&read_mask);
  }
}
As you can see, the code that uses SPC_FD_SET is not all that different from the code that uses
fd_set. Naming issues aside, the only real differences are the need to cast the SPC_FD_SET object to
an fd_set object, and to call spc_fd_free( ).
3.13.4 See Also
Recipe 3.3
Chapter 4. Symmetric Cryptography
Fundamentals
Strong cryptography is a critical piece of information security that can be applied at many levels,
from data storage to network communication. One of the most common classes of security problems
people introduce is the misapplication of cryptography. It's an area that can look deceptively easy,
when in reality there are an overwhelming number of pitfalls. Moreover, it is likely that many classes
of cryptographic pitfalls are still unknown.
It doesn't help that cryptography is a huge topic, complete with its own subfields, such as public key
infrastructure (PKI). Many books cover the algorithmic basics; one example is Bruce Schneier's
classic, Applied Cryptography (John Wiley & Sons). Even that classic doesn't quite live up to its name,
however, as it focuses on the implementation of cryptographic primitives from the developer's point
of view and spends relatively little time discussing how to integrate cryptography into an application
securely. As a result, we have seen numerous examples of developers armed with a reasonable
understanding of cryptographic algorithms that they've picked up from that book, who then go on to
build their own cryptographic protocols into their applications, which are often insecure.
Over the next three chapters, we focus on the basics of symmetric cryptography. With symmetric
cryptography, any parties who wish to communicate securely must share a piece of secret
information. That shared secret (usually an encryption key) must be communicated over a secure
medium. In particular, sending the secret over the Internet is a bad idea, unless you're using some
sort of channel that is already secure, such as one properly secured using public key encryption
(which can be tough to do correctly in itself). In many cases, it's appropriate to use some type of
out-of-band medium for communication, such as a telephone or a piece of paper.
In these three chapters, we'll cover everything most developers need to use symmetric cryptography
effectively, up to the point when you need to choose an actual network protocol. Applying
cryptography on the network is covered in Chapter 9.
To ensure that you choose the right cryptographic protocols for your
application, you need an understanding of these basics. However, you'll very
rarely need to go all the way back to the primitive algorithms we discuss in
these chapters. Instead, you should focus on out-of-the-box protocols that are
believed to be cryptographically strong. While we therefore recommend that
you thoroughly understand the material in these chapters, we advise you to go
to the recipes in Chapter 9 to find something appropriate before you come here
and build something yourself. Don't fall into the same trap that many of Applied
Cryptography's readers have fallen into!
There are two classes of symmetric primitives, both of utmost importance. First are symmetric
encryption algorithms, which provide for data secrecy. Second are message authentication codes
(MACs), which can ensure that if someone tampers with data while in transit, the tampering will be
detected. Recently, a third class of primitives has started to appear: encryption modes that provide
for both data secrecy and message authentication. Such primitives can help make the application of
cryptography less prone to disastrous errors.
In this chapter, we will look at how to generate, represent, store, and distribute symmetric-key
material. In Chapter 5, we will look at encryption using block ciphers such as AES, and inChapter 6,
we will examine cryptographic hash functions (such as SHA1) and MACs.
Towards the end of this chapter, we do occasionally forward-reference
algorithms from the next two chapters. It may be a good idea to read Recipe
5.1 through Recipe 5.4 and Recipe 6.1 through Recipe 6.4 before reading
Recipe 4.10 through Recipe 4.14.
4.1 Representing Keys for Use in Cryptographic
Algorithms
4.1.1 Problem
You need to keep an internal representation of a symmetric key. You may want to save this key to
disk, pass it over a network, or use it in some other way.
4.1.2 Solution
Simply keep the key as an ordered array of bytes. For example:
/* When statically allocated */
unsigned char key[KEYLEN_BYTES];
/* When dynamically allocated */
unsigned char *key = (unsigned char *)malloc(KEYLEN_BYTES);
When you're done using a key, you should delete it securely to prevent local attackers from
recovering it from memory. (This is discussed in Recipe 13.2.)
4.1.3 Discussion
While keys in public key cryptography are represented as very large numbers (and often stored in
containers such as X.509 certificates), symmetric keys are always represented as a series of
consecutive bits. Algorithms operate on these binary representations.
Occasionally, people are tempted to use a single 64-bit unit to represent short keys (e.g., a long
long when using GCC on most platforms). Similarly, we've commonly seen people use an array of
word-size values. That's a bad idea because of byte-ordering issues. When representing integers, the
bytes of the integer may appear most significant byte first (big-endian) or least significant byte first
(little-endian). Figure 4-1 provides a visual illustration of the difference between big-endian and little-endian storage:
Figure 4-1. Big-endian versus little-endian
Endian-ness doesn't matter when performing integer operations, because the CPU implicitly knows
how integers are supposed to be represented and treats them appropriately. However, a problem
arises when we wish to treat a single integer or an array of integers as an array of bytes. Casting the
address of the first integer to be a pointer to char does not give the right results on a little-endian
machine, because the cast does not cause bytes to be swapped to their "natural" order. If you
absolutely always cast to an appropriate type, this may not be an issue if you don't move data
between architectures, but that would defeat any possible reason to use a bigger storage unit than a
single byte. For this reason, you should always represent key material as an array of one-byte
elements. If you do so, your code and the data will always be portable, even if you send the data
across the network.
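The following throwaway test program (ours, not part of the recipe) makes the problem concrete; the integer's bytes come out in a platform-dependent order, while the byte array does not:

#include <stdio.h>

int main(void) {
  unsigned int  word = 0x01020304;
  unsigned char bytes[4] = { 0x01, 0x02, 0x03, 0x04 };
  unsigned char *p = (unsigned char *)&word;
  int i;

  /* Prints 01 02 03 04 on a big-endian machine, 04 03 02 01 on a
   * little-endian machine. */
  for (i = 0;  i < 4;  i++) printf("%02X ", p[i]);
  putchar('\n');

  /* Prints 01 02 03 04 everywhere. */
  for (i = 0;  i < 4;  i++) printf("%02X ", bytes[i]);
  putchar('\n');
  return 0;
}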
You should also avoid using signed data types, simply to avoid potential printing oddities due to sign
extension. For example, let's say that you have a signed 32-bit value, 0xFF000000, and you want to
shift it right by one bit. You might expect the result 0x7F800000, but you'd actually get 0xFF800000,
because the sign bit gets shifted, and the result also maintains the same sign.[1]
[1]
To be clear on semantics, note that shifting right eight bits will always give the same result as shifting right
one bit eight times. That is, when shifting right an unsigned value, the leftmost bits always get filled in with
zeros. But with a signed value, they always get filled in with the original value of the most significant bit.
4.1.4 See Also
Recipe 3.2
4.2 Generating Random Symmetric Keys
4.2.1 Problem
You want to generate a secure symmetric key. You already have some mechanism for securely
transporting the key to anyone who needs it. You need the key to be as strong as the cipher you're
using, and you want the key to be absolutely independent of any other data in your system.
4.2.2 Solution
Use one of the recipes in Chapter 11 to collect a byte array of the necessary length filled with
entropy.
When you're done using a key, you should delete it securely to prevent local attackers from
recovering it from memory. This is discussed in Recipe 13.2.
4.2.3 Discussion
In Recipe 11.2, we present APIs for getting random data, including key material. We recommend
using the spc_keygen( ) function from that API. See that recipe for considerations on which function
to use.
To actually implement spc_keygen( ), use one of the techniques from Chapter 11. For example, you
may want to use the randomness infrastructure that is built into the operating system (seeRecipe
11.3 and Recipe 11.4), or you may want to collect your own entropy, particularly on an embedded
platform where the operating system provides no such services (seeRecipe 11.19 through Recipe
11.23).
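As a rough sketch only (this is not the implementation from Chapter 11, and the function name is ours), on a Unix system with a kernel-provided entropy device you might fill a key buffer like this:

#include <stdio.h>

/* Minimal Unix-only sketch: fill buf with len bytes from /dev/urandom.
 * Returns 1 on success, 0 on failure.  See Chapter 11 for a proper,
 * portable implementation of spc_keygen( ). */
int spc_keygen_sketch(unsigned char *buf, size_t len) {
  FILE   *f;
  size_t n;

  if (!(f = fopen("/dev/urandom", "rb"))) return 0;
  n = fread(buf, 1, len, f);
  fclose(f);
  return (n == len);
}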
In many cases, you may want to derive short-term keys from a single "master" key. See Recipe 4.11
for a discussion of how to do so.
Be conservative when choosing a symmetric key length. We recommend 128-bit symmetric keys.
(See Recipe 5.3.)
4.2.4 See Also
Recipe 4.11, Recipe 5.3, Recipe 11.2, Recipe 11.3, Recipe 11.4, Recipe 11.19, Recipe 11.20, Recipe
11.21, Recipe 11.22, Recipe 11.23, Recipe 13.2
4.3 Representing Binary Keys (or Other Raw Data) as
Hexadecimal
4.3.1 Problem
You want to print out keys in hexadecimal format, either for debugging or for easy communication.
4.3.2 Solution
The easiest way is to use the "%X" specifier in the printf() family of functions. In C++, you can set
the ios::hex flag on a stream object before outputting a value, then clear the flag afterward.
4.3.3 Discussion
Here is a function called spc_print_hex() that prints arbitrary data of a specified length in
formatted hexadecimal:
#include <stdio.h>
#include <string.h>

#define BYTES_PER_GROUP 4
#define GROUPS_PER_LINE 4

/* Don't change this */
#define BYTES_PER_LINE (BYTES_PER_GROUP * GROUPS_PER_LINE)

void spc_print_hex(char *prefix, unsigned char *str, int len) {
  unsigned long i, j, preflen = 0;

  if (prefix) {
    printf("%s", prefix);
    preflen = strlen(prefix);
  }
  for (i = 0;  i < len;  i++) {
    printf("%02X ", str[i]);
    if (((i % BYTES_PER_LINE) == (BYTES_PER_LINE - 1)) && ((i + 1) != len)) {
      putchar('\n');
      for (j = 0;  j < preflen;  j++) putchar(' ');
    }
    else if ((i % BYTES_PER_GROUP) == (BYTES_PER_GROUP - 1)) putchar(' ');
  }
  putchar('\n');
}
This function takes the following arguments:
prefix
String to be printed in front of the hexadecimal output. Subsequent lines of output are indented
appropriately.
str
String to be printed, in binary. It is represented as an unsigned char * to make the code
simpler. The caller will probably want to cast, or it can be easily rewritten to take a void *,
which would require this code to cast this argument to a byte-based type for the array indexing
to work correctly.
len
Number of bytes to print.
This function prints out bytes as two characters, and it pairs bytes in groups of four. It will also print
only 16 bytes per line. Modifying the appropriate preprocessor declarations at the top easily changes
those parameters.
Currently, this function writes to the standard output, but it can be modified to return a malloc( )'d
string quite easily using sprintf( ) and putc( ) instead of printf( ) and putchar( ).
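For example (the key variable here is hypothetical), dumping a 16-byte key might look like this:

unsigned char key[16];

/* ... fill key, for example with spc_keygen( ) from Recipe 11.2 ... */
spc_print_hex("Key: ", key, sizeof(key));

/* Output resembles:
 * Key: 0F1E2D3C 4B5A6978 8796A5B4 C3D2E1F0
 */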
In C++, you can print any data object in hexadecimal by setting the flag ios::hex using the setf( )
method on ostream objects (the unsetf( ) method can be used to clear flags). You might also
want the values to print in all uppercase, in which case you should set the ios::uppercase flag. If
you want a leading "0x" to print to denote hexadecimal, also set the flag ios::showbase. For
example:
cout.setf(ios::hex | ios::uppercase | ios::showbase);
cout << 1234 << endl;
cout.unsetf(ios::hex | ios::uppercase | ios::showbase);
4.4 Turning ASCII Hex Keys (or Other ASCII Hex Data) into
Binary
4.4.1 Problem
You have a key represented in ASCII that you'd like to convert into binary form. The string containing
the key is NULL-terminated.
4.4.2 Solution
The code listed in the following Section 4.4.3 parses an ASCII string that represents hexadecimal
data, and it returns a malloc( )'d buffer of the appropriate length. Note that the buffer will be half
the size of the input string, not counting the leading "0x" if it exists. The exception is when there is
whitespace. This function passes back the number of bytes written in its second parameter. If that
parameter is negative, an error occurred.
4.4.3 Discussion
The spc_hex2bin( ) function shown in this section converts an ASCII string into a binary string.
Spaces and tabs are ignored. A leading "0x" or "0X" is ignored. There are two cases in which this
function can fail. First, if it sees a non-hexadecimal digit, it assumes that the string is not in the right
format, and it returns NULL, setting the error parameter to ERR_NOT_HEX. Second, if there is an odd
number of hex digits in the string, it returns NULL, setting the error parameter to ERR_BAD_SIZE.
#include <string.h>
#include <stdlib.h>
#include <ctype.h>

#define ERR_NOT_HEX  -1
#define ERR_BAD_SIZE -2
#define ERR_NO_MEM   -3
unsigned char *spc_hex2bin(const unsigned char *input, size_t *l) {
  unsigned char       shift = 4, value = 0;
  unsigned char       *r, *ret;
  const unsigned char *p;

  if (!(r = ret = (unsigned char *)malloc(strlen((const char *)input) / 2))) {
    *l = ERR_NO_MEM;
    return 0;
  }
  for (p = input;  isspace(*p);  p++);
  if (p[0] == '0' && (p[1] == 'x' || p[1] == 'X')) p += 2;
  while (p[0]) {
    switch (p[0]) {
      case '0': case '1': case '2': case '3': case '4':
      case '5': case '6': case '7': case '8': case '9':
        value |= (*p++ - '0') << shift;
        break;
      case 'a': case 'b': case 'c':
      case 'd': case 'e': case 'f':
        value |= (*p++ - 'a' + 0xa) << shift;
        break;
      case 'A': case 'B': case 'C':
      case 'D': case 'E': case 'F':
        value |= (*p++ - 'A' + 0xa) << shift;
        break;
      default:
        /* Skip interior whitespace without disturbing the nibble count */
        if (isspace(p[0])) {
          p++;
          continue;
        }
        *l = ERR_NOT_HEX;
        free(ret);
        return 0;
    }
    if ((shift = (shift + 4) % 8) != 0) {
      *r++ = value;
      value = 0;
    }
  }
  if (!shift) {
    *l = ERR_BAD_SIZE;
    free(ret);
    return 0;
  }
  *l = (r - ret);
  return (unsigned char *)realloc(ret, *l);
}
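A short usage sketch (ours):

size_t        keylen;
unsigned char *key;

key = spc_hex2bin((const unsigned char *)"0x0123456789ABCDEF", &keylen);
if (!key) {
  /* Handle the error; the specific error code was written through keylen. */
}
/* Otherwise, keylen is 8 and key holds the bytes 0x01 0x23 ... 0xEF. */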
4.5 Performing Base64 Encoding
4.5.1 Problem
You want to represent binary data in as compact a textual representation as is reasonable, but the
data must be easy to encode and decode, and it must use printable text characters.
4.5.2 Solution
Base64 encoding encodes six bits of data at a time, meaning that every six bits of input map to one
character of output. The characters in the output will be a numeric digit, a letter (uppercase or
lowercase), a forward slash, a plus, or the equal sign (which is a special padding character).
Note that four output characters map exactly to three input characters. As a result, if the input string
isn't a multiple of three characters, you'll need to do some padding (explained in Section 4.5.3).
4.5.3 Discussion
The base64 alphabet takes 6-bit binary values representing numbers from 0 to 63 and maps them to
a set of printable ASCII characters. The values 0 through 25 map to the uppercase letters in order.
The values 26 through 51 map to the lowercase letters. Then come the decimal digits from 0 to 9,
and finally + and /.
If the length of the input string isn't a multiple of three bytes, the leftover bits are padded to a
multiple of six with zeros; then the last character is encoded. If only one byte would have been
needed in the input to make it a multiple of three, the pad character (=) is added to the end of the
string. Otherwise, two pad characters are added.
#include <stdlib.h>
static char b64table[64] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
"abcdefghijklmnopqrstuvwxyz"
"0123456789+/";
/* Accepts a binary buffer with an associated size.
* Returns a base64 encoded, NULL-terminated string.
*/
unsigned char *spc_base64_encode(unsigned char *input, size_t len, int wrap) {
  unsigned char *output, *p;
  size_t        i = 0, mod = len % 3, toalloc;

  /* Four output characters per full group of three input bytes, four more for
   * a partial group, room for newlines if wrapping, and the NUL terminator.
   */
  toalloc = (len / 3) * 4 + (mod ? 4 : 0) + 1;
  if (wrap) {
    toalloc += len / 57;
    if (len % 57) toalloc++;
  }
  p = output = (unsigned char *)malloc(toalloc);
  if (!p) return 0;
  while (i < len - mod) {
    *p++ = b64table[input[i++] >> 2];
    *p++ = b64table[((input[i - 1] << 4) | (input[i] >> 4)) & 0x3f];
    *p++ = b64table[((input[i] << 2) | (input[i + 1] >> 6)) & 0x3f];
    *p++ = b64table[input[i + 1] & 0x3f];
    i += 2;
    if (wrap && !(i % 57)) *p++ = '\n';
  }
  if (!mod) {
    if (wrap && i % 57) *p++ = '\n';
    *p = 0;
    return output;
  } else {
    *p++ = b64table[input[i++] >> 2];
    if (mod == 1) {
      *p++ = b64table[(input[i - 1] << 4) & 0x3f];
      *p++ = '=';
      *p++ = '=';
    } else {
      *p++ = b64table[((input[i - 1] << 4) | (input[i] >> 4)) & 0x3f];
      *p++ = b64table[(input[i] << 2) & 0x3f];
      *p++ = '=';
    }
    if (wrap) *p++ = '\n';
    *p = 0;
    return output;
  }
}
The public interface to the above code is the following:
unsigned char *spc_base64_encode(unsigned char *input, size_t len, int wrap);
The result is a NULL-terminated string allocated internally via malloc( ). Some protocols may expect
you to "wrap" base64-encoded data so that, when printed, it takes up less than 80 columns. If such
behavior is necessary, you can pass in a non-zero value for the final parameter, which will cause this
code to insert newlines once every 76 characters. In that case, the string will always end with a
newline (followed by the expected NULL-terminator).
If the call to malloc( ) fails because there is no memory, this function returns 0.
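For example (a short sketch of ours), encoding a small buffer without line wrapping:

unsigned char raw[5] = { 0xDE, 0xAD, 0xBE, 0xEF, 0x01 };
unsigned char *encoded;

if ((encoded = spc_base64_encode(raw, sizeof(raw), 0)) != 0) {
  /* encoded now holds the NULL-terminated string "3q2+7wE=" */
  free(encoded);
}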
4.5.4 See Also
Recipe 4.6
4.6 Performing Base64 Decoding
4.6.1 Problem
You have a base64-encoded string that you'd like to decode.
4.6.2 Solution
Use the inverse of the algorithm for encoding, presented inRecipe 4.5. This is most easily done via
table lookup, mapping each character in the input to six bits of output.
4.6.3 Discussion
Following is our code for decoding a base64-encoded string. We look at each byte separately,
mapping it to its associated 6-bit value. If the byte is NULL, we know that we've reached the end of
the string. If it represents a character not in the base64 set, we ignore it unless the strict
argument is non-zero, in which case we return an error.
The RFC that specifies this encoding says you should silently ignore any
unnecessary characters in the input stream. If you don't have to do so, we
recommend you don't, as this constitutes a covert channel in any protocol
using this encoding.
Note that we check to ensure strings are properly padded. If the string isn't properly padded or
otherwise terminates prematurely, we return an error.
#include <stdlib.h>
#include <string.h>
static char b64revtb[256] = {
  -3, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, /*0-15*/
  -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, /*16-31*/
  -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, 62, -1, -1, -1, 63, /*32-47*/
  52, 53, 54, 55, 56, 57, 58, 59, 60, 61, -1, -1, -1, -2, -1, -1, /*48-63*/
  -1,  0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, /*64-79*/
  15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, -1, -1, -1, -1, -1, /*80-95*/
  -1, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, /*96-111*/
  41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, -1, -1, -1, -1, -1, /*112-127*/
  -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, /*128-143*/
  -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, /*144-159*/
  -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, /*160-175*/
  -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, /*176-191*/
  -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, /*192-207*/
  -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, /*208-223*/
  -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, /*224-239*/
  -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1  /*240-255*/
};
static unsigned int raw_base64_decode(unsigned char *in, unsigned char *out,
                                      int strict, int *err) {
  unsigned int  result = 0, x;
  unsigned char buf[3], *p = in, pad = 0;

  *err = 0;
  while (!pad) {
    switch ((x = b64revtb[*p++])) {
      case -3: /* NULL TERMINATOR */
        if (((p - 1) - in) % 4) *err = 1;
        return result;
      case -2: /* PADDING CHARACTER. INVALID HERE */
        if (((p - 1) - in) % 4 < 2) {
          *err = 1;
          return result;
        } else if (((p - 1) - in) % 4 == 2) {
          /* Make sure there's appropriate padding */
          if (*p != '=') {
            *err = 1;
            return result;
          }
          buf[2] = 0;
          pad = 2;
          result++;
          break;
        } else {
          pad = 1;
          result += 2;
          break;
        }
        return result;
      case -1:
        if (strict) {
          *err = 2;
          return result;
        }
        break;
      default:
        switch (((p - 1) - in) % 4) {
          case 0:
            buf[0] = x << 2;
            break;
          case 1:
            buf[0] |= (x >> 4);
            buf[1] = x << 4;
            break;
          case 2:
            buf[1] |= (x >> 2);
            buf[2] = x << 6;
            break;
          case 3:
            buf[2] |= x;
            result += 3;
            for (x = 0;  x < 3 - pad;  x++) *out++ = buf[x];
            break;
        }
        break;
    }
  }
  for (x = 0;  x < 3 - pad;  x++) *out++ = buf[x];
  return result;
}
/* If err is non-zero on exit, then there was an incorrect padding error. We
* allocate enough space for all circumstances, but when there is padding, or
* there are characters outside the character set in the string (which we are
* supposed to ignore), then we end up allocating too much space. You can
* realloc( ) to the correct length if you wish.
*/
unsigned char *spc_base64_decode(unsigned char *buf, size_t *len, int strict,
int *err) {
unsigned char *outbuf;
outbuf = (unsigned char *)malloc(3 * (strlen((char *)buf) / 4 + 1));
if (!outbuf) {
*err = -3;
*len = 0;
return 0;
}
*len = raw_base64_decode(buf, outbuf, strict, err);
if (*err) {
free(outbuf);
*len = 0;
outbuf = 0;
}
return outbuf;
}
The public API to this code is:
unsigned char *spc_base64_decode(unsigned char *buf, size_t *len, int strict, int
*err);
The API assumes that buf is a NULL-terminated string. The len parameter is a pointer that receives
the length of the binary output. If there is an error, the memory pointed to by len will be 0, and the
value pointed to by err will be non-zero. The error will be 1 if there is a padding error, 2 if strict
checking was requested but a character outside the strict set is found, and -3 if malloc( ) fails.
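A short usage sketch (ours):

int           err;
size_t        rawlen;
unsigned char *raw;

raw = spc_base64_decode((unsigned char *)"3q2+7wE=", &rawlen, 1, &err);
if (!raw) {
  /* err is 1 for a padding error, 2 for an illegal character in strict
   * mode, or -3 if malloc( ) failed. */
} else {
  /* rawlen is 5 and raw holds the bytes 0xDE 0xAD 0xBE 0xEF 0x01. */
  free(raw);
}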
4.6.4 See Also
Recipe 4.5
4.7 Representing Keys (or Other Binary Data) as English
Text
4.7.1 Problem
You want to use an easy-to-read format for displaying keys (or fingerprints or some other interesting
binary data). English would work better than a hexadecimal representation because people's ability to
recognize the key as correct by sight will be better.
4.7.2 Solution
Map a particular number of bits to a dictionary of words. The dictionary should be of such a size that
an exact mapping to a number of bits is possible. That is, the dictionary should have a number of
entries that is a power of two.
4.7.3 Discussion
The spc_bin2words() function shown here converts a binary string of the specified number of bytes
into a string of English words. This function takes two arguments: str is the binary string to convert,
and len is the number of bytes to be converted.
#include <string.h>
#include <stdlib.h>
#include "wordlist.h"
#define BITS_IN_LIST 11
#define MAX_WORDLEN 4
/* len parameter is measured in bytes. Remaining bits padded with 0. */
unsigned char *spc_bin2words(const unsigned char *str, size_t len) {
  short         add_space = 0;
  size_t        i, leftbits, leftovers, scratch = 0, scratch_bits = 0;
  unsigned char *p, *res;

  res = (unsigned char *)malloc((len * 8 / BITS_IN_LIST + 1) * (MAX_WORDLEN + 1));
  if (!res) abort( );
  res[0] = 0;
  for (i = 0;  i < len;  i++) {
    leftovers = str[i];
    leftbits  = 8;
    while (leftbits) {
      if (scratch_bits + leftbits <= BITS_IN_LIST) {
        scratch |= (leftovers << (BITS_IN_LIST - leftbits - scratch_bits));
        scratch_bits += leftbits;
        leftbits = 0;
      } else {
        scratch |= (leftovers >> (leftbits - (BITS_IN_LIST - scratch_bits)));
        leftbits -= (BITS_IN_LIST - scratch_bits);
        leftovers &= ((1 << leftbits) - 1);
        scratch_bits = BITS_IN_LIST;
      }
      if (scratch_bits == BITS_IN_LIST) {
        p = (unsigned char *)words[scratch];
        /* The strcats are a bit inefficient because they start from the front of
         * the string each time.  But, they're less confusing, and these strings
         * should never get more than a few words long, so efficiency will
         * probably never be a real concern.
         */
        if (add_space) strcat((char *)res, " ");
        strcat((char *)res, (char *)p);
        scratch = scratch_bits = 0;
        add_space = 1;
      }
    }
  }
  if (scratch_bits) { /* Emit the final word */
    p = (unsigned char *)words[scratch];
    if (add_space) strcat((char *)res, " ");
    strcat((char *)res, (char *)p);
  }
  res = (unsigned char *)realloc(res, strlen((char *)res) + 1);
  if (!res) abort( ); /* realloc failed; should never happen, as size shrinks */
  return res;
}
To save space, the dictionary file (wordlist.h) is not provided here. Instead, you can find it on the
book's web site.
The previous code is subtly incompatible with the S/KEY dictionary because
their dictionary is not in alphabetical order. (S/KEY is an authentication system
using one-time passwords.) Be sure to use the right dictionary!
The code is written in such a way that you can use dictionaries of different sizes if you wish to encode
a different number of bits per word. Currently, the dictionary encodes 11 bits of data (by having
exactly 2^11 (2,048) words), where no word is more than 4 characters long. The web site also provides a
dictionary that encodes 13 bits of data, where no word is more than 6 letters long. The previous code
can be modified to use the larger dictionary simply by changing the two appropriate preprocessor
definitions at the top.
The algorithm takes 11 bits of the binary string, then finds the word that maps to the unique 11-bit
value. Note that it is rare for the number of bits represented by a single word to align exactly to a
byte. For example, if you were to encode a 2-byte binary string, those 16 bits would be encoded by 2
words, which could represent up to 22 bits. Therefore, there will usually be leftover bits. In the case
of 2 bytes, there are 6 leftover bits. The algorithm sets all leftover bits to 0.
Because of this padding scheme, the output doesn't always encode how many bytes were in the
input. For example, if the output is 6 words long, the input could have been either 7 or 8 bytes long.
Therefore, you need to manually truncate the output to the desired length.
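For example (a sketch of ours; it requires the wordlist.h dictionary from the book's web site):

unsigned char key[16];
unsigned char *text;

/* ... fill key with 16 bytes of key material ... */
text = spc_bin2words(key, sizeof(key));
/* text now holds 12 words, since 128 bits require 12 words at 11 bits each.
 * When converting back with spc_words2bin( ) (Recipe 4.8), truncate the
 * result to 16 bytes, because 12 words can encode up to 132 bits. */
free(text);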
4.7.4 See Also
Recipe 4.8
4.8 Converting Text Keys to Binary Keys
4.8.1 Problem
A user enters a textual representation of a key or other binary data (see Recipe 4.7). You need to
convert it to binary.
4.8.2 Solution
Parse out the words, then look them up in the dictionary to reconstruct the actual bits, as shown in
the code included in the next section.
4.8.3 Discussion
The spc_words2bin( ) function shown here uses the wordlist.h file provided on the book's web site, and it can be
changed as described in Recipe 4.7.
#include <stdlib.h>
#include <string.h>
#include <ctype.h>
#include "wordlist.h"

#define BITS_IN_LIST 11
#define MAX_WORDLEN  4
unsigned char *spc_words2bin(unsigned char *str, size_t *outlen) {
  int           cmp, i, ix, min, max;
  size_t        bitsinword, curbits, needed, reslen;
  unsigned char *p = str, *r, *res, word[MAX_WORDLEN + 1];

  curbits = reslen = *outlen = 0;
  if (!(r = res = (unsigned char *)malloc((strlen((char *)str) + 1) / 2)))
    return 0;
  memset(res, 0, (strlen((char *)str) + 1) / 2);
  for (;;) {
    while (isspace(*p)) p++;
    if (!*p) break;

    /* The +1 is because we expect to see a space or a NULL after each and every
     * word; otherwise, there's a syntax error.
     */
    for (i = 0;  i < MAX_WORDLEN + 1;  i++) {
      if (!*p || isspace(*p)) break;
      if (islower(*p)) word[i] = *p++ - ' ';
      else if (isupper(*p)) word[i] = *p++;
      else {
        free(res);
        return 0;
      }
    }
    if (i == MAX_WORDLEN + 1) {
      free(res);
      return 0;
    }
    word[i] = 0;

    min = 0;
    max = (1 << BITS_IN_LIST) - 1;
    do {
      if (max < min) {
        free(res);
        return 0; /* Word not in list! */
      }
      ix  = (max + min) / 2;
      cmp = strcmp((char *)word, (char *)words[ix]);
      if (cmp > 0) min = ix + 1;
      else if (cmp < 0) max = ix - 1;
    } while (cmp);

    bitsinword = BITS_IN_LIST;
    while (bitsinword) {
      needed = 8 - curbits;
      if (bitsinword <= needed) {
        *r |= (ix << (needed - bitsinword));
        curbits += bitsinword;
        bitsinword = 0;
      } else {
        *r |= (ix >> (bitsinword - needed));
        bitsinword -= needed;
        ix &= ((1 << bitsinword) - 1);
        curbits = 8;
      }
      if (curbits == 8) {
        curbits = 0;
        *++r = 0;
        reslen++;
      }
    }
  }
  if (curbits && *r) {
    free(res);
    return 0; /* Error, bad format, extra bits! */
  }
  *outlen = reslen;
  return (unsigned char *)realloc(res, reslen);
}
The inputs to the spc_words2bin( ) function are str, which is the English representation of the
binary string, and outlen, which is a pointer to how many bytes are in the output. The return value
is a binary string whose length is reported through outlen. Note that any bits encoded by the English words that don't compose
a full byte must be zero, but are otherwise ignored.
You must know a priori how many bytes you expect to get out of this function. For example, 6 words
might map to a 56-bit binary string or to a 64-bit binary string (5 words can encode at most 55 bits,
and 6 words encodes up to 66 bits).
4.8.4 See Also
Recipe 4.7
4.9 Using Salts, Nonces, and Initialization Vectors
4.9.1 Problem
You want to use an algorithm that requires a salt, a nonce or an initialization vector (IV). You need to
understand the differences among these three things and figure out how to select good specimens of
each.
4.9.2 Solution
There's a lot of terminology confusion, and the following Section 4.9.3 contains our take on it.
Basically, salts and IVs should be random, and nonces are usually sequential, potentially with a
random salt as a component, if there is room. With sequential nonces, you need to ensure that you
never repeat a single {key, nonce} pairing.
To get good random values, use a well-seeded, cryptographically strong pseudo-random number
generator (see the appropriate recipes in Chapter 11). Using that, get the necessary number of bits.
For salt, 64 bits is sufficient. For an IV, get one of the requisite size.
4.9.3 Discussion
Salts, nonces, and IVs are all one-time values used in cryptography that don't need to be secret, but
still lead to additional security. It is generally assumed that these values are visible to attackers, even
if it is sometimes possible to hide them. At the very least, the security of cryptographic algorithms
and protocols should not depend on the secrecy of such values.
We try to be consistent with respect to this terminology in the book. However,
in the real world, even among cryptographers there's a lot of inconsistency.
Therefore, be sure to follow the directions in the documentation for whatever
primitive you're using.
4.9.3.1 Salts
Salt is random data that helps protect against dictionary and other precomputation attacks.
Generally, salt is used in password-based systems and is concatenated to the front of a password
before processing. Password systems often use a one-way hash function to turn a password into an
"authenticator." In the simplest such system, if there were no salt, an attacker could build a
dictionary of common passwords and just look up the original password by authenticator.
The use of salt means that the attacker would have to produce a totally separate dictionary for every
possible salt value. If the salt is big enough, it essentially makes dictionary attacks infeasible.
However, the attacker can generally still try to guess every password without using a stronger
protocol. For a discussion of various password-based authentication technologies, seeRecipe 8.1.
If the salt isn't chosen at random, certain dictionaries will be more likely than others. For this reason,
salt is generally expected to be random.
Salt can be generated using the techniques discussed inChapter 11.
4.9.3.2 Nonces
Nonces[2] are bits of data often input to cryptographic protocols and algorithms, including many
message authentication codes and some encryption modes. Such values should only be used a single
time with any particular cryptographic key. In fact, reuse generally isn't prohibited, but the odds of
reuse need to be exceptionally low. That is, if you have a nonce that is very large compared to the
number of times you expect to use it (e.g., the nonce is 128 bits, and you don't expect to use it more
than 2^32 times), it is sufficient to choose nonces using a cryptographically strong pseudo-random
number generator.
[2]
In the UK, "nonce" is slang for a child sex offender. However, this term is widespread in the cryptographic
world, so we use it.
Sequential nonces have a few advantages over random nonces:
You can easily guarantee that nonces are not repeated. Note, though, that if the possible nonce
space is large, this is not a big concern.
Many protocols already send a unique sequence number for each packet, so one can save space
in transmitted messages.
The sequential ordering of nonces can be used to prevent replay attacks, but only if you actually
check to ensure that the nonce is always incrementing. That is, if each message has a nonce
attached to it, you can tell whether the message came in the right order, by looking at the
nonce and making sure its value is always incrementing.
However, randomness in a nonce helps prevent against classes of attacks that amortize work across
multiple keys in the same system.
We recommend that nonces have both a random portion and a sequential portion. Generally, the
most significant bytes should be random, and the final 6 to 8 bytes should be sequential. An 8-byte
counter can accommodate 2^64 messages without the counter's repeating, which should be more than
big enough for any system.
If you use both a nonce and a salt, you can select a single random part for each key you use. The
nonce on the whole has to be unique, but the salt can remain fixed for the lifetime of the key; the
counter ensures that the nonce is always unique. In such a nonce, the random part is said to be a
"salt." Generally, it's good to have four or more bytes of salt in a nonce.
If you decide to use only a random nonce, remember that the nonce needs to be changed after each
message, and you lose the ability to protect against capture-replay attacks.
The random portion of a nonce can be generated using the techniques discussed inChapter 11.
Generally, you will have a fixed-size buffer into which you place the nonce, and you will then set the
remaining bytes to zero, incrementing them after each message is sent. For example, if you have a
16-byte nonce with an 8-byte counter in the least significant bytes, you might use the following code:
/* This assumes a 16-byte nonce where the last 8 bytes represent the counter! */
void increment_nonce(unsigned char *nonce) {
if (!++nonce[15]) if (!++nonce[14]) if (!++nonce[13]) if (!++nonce[12])
if (!++nonce[11]) if (!++nonce[10]) if (!++nonce[9]) if (!++nonce[8]) {
/* If you get here, you're out of nonces. This really shouldn't happen
* with an 8-byte nonce, so often you'll see: if (!++nonce[9]) ++nonce[8];
*/
}
}
Note that this code could be more efficient if we did a 32-bit increment, but then there are endianness issues that make portability more difficult.
If sequential nonces are implemented correctly, they can help thwart capture-replay attacks (see Recipe 6.1).
4.9.3.3 Initialization vectors (IVs)
The term initialization vector (IV) is the most widely used and abused of the three terms we've been
discussing. IV and nonce are often used interchangeably. However, a careful definition does
differentiate between these two concepts. For our purposes, an IV is a nonce with an additional
requirement: it must be selected in a nonpredictable way. That is, the IV can't be sequential; it must
be random. One popular example in which a real IV is required for maximizing security is when using
the CBC encryption mode (see Recipe 5.6).
The big downside to an IV, as compared to a nonce, is that an IV does not afford protection against
capture-replay attacks (unless you're willing to remember every IV that has ever been used, which is
not a good solution). To ensure protection against such attacks when using an IV, the higher-level
protocol must have its own notion of sequence numbers that get checked in order.
Another downside is that there is generally more data to send. Systems that use sequential nonces
can often avoid sending the nonce, as it can be calculated from the sequence number already sent
with the message.
Initialization vectors can be generated using the techniques discussed in Chapter 11.
4.9.4 See Also
Chapter 11
Recipe 5.6, Recipe 8.1
4.10 Deriving Symmetric Keys from a Password
4.10.1 Problem
You do not want passwords to be stored on disk. Instead, you would like to convert a password into a
cryptographic key.
4.10.2 Solution
Use PBKDF2, the password-based key derivation function 2, specified in PKCS #5.[3]
[3]
This standard is available from RSA Security at http://www.rsasecurity.com/rsalabs/pkcs/pkcs-5/.
You can also use this recipe to derive keys from other keys. See Recipe 4.1 for
considerations; that recipe also discusses considerations for choosing good salt
values.
4.10.3 Discussion
Passwords can generally vary in length, whereas symmetric keys are almost always a fixed size.
Passwords may be vulnerable to guessing attacks, but ultimately we'd prefer symmetric keys not to
be as easily guessable.
The function spc_pbkdf2( ) in the following code is an implementation of PKCS #5, Version 2.0.
PKCS #5 stands for "Public Key Cryptography Standard #5," although there is nothing public-key-specific about this standard. The standard defines a way to turn a password into a symmetric key.
The name of the function stands for "password-based key derivation function 2," where the 2
indicates that the function implements Version 2.0 of PKCS #5.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <openssl/evp.h>
#include <openssl/hmac.h>
#include <sys/types.h>
#include <netinet/in.h>
#include <arpa/inet.h>  /* for htonl */

#ifdef WIN32
typedef unsigned __int64 spc_uint64_t;
#else
typedef unsigned long long spc_uint64_t;
#endif

/* This value needs to be the output size of your pseudo-random function (PRF)! */
#define PRF_OUT_LEN 20

/* This is an implementation of the PKCS#5 PBKDF2 PRF using HMAC-SHA1.  It
 * always gives 20-byte outputs.
 */
/* The first three functions are internal helper functions. */
static void pkcs5_initial_prf(unsigned char *p, size_t plen, unsigned char *salt,
                              size_t saltlen, size_t i, unsigned char *out,
                              size_t *outlen) {
  unsigned int swapped_i;  /* must be exactly 4 bytes when hashed */
  HMAC_CTX     ctx;

  HMAC_CTX_init(&ctx);
  HMAC_Init(&ctx, p, plen, EVP_sha1( ));
  HMAC_Update(&ctx, salt, saltlen);
  swapped_i = htonl(i);
  HMAC_Update(&ctx, (unsigned char *)&swapped_i, 4);
  HMAC_Final(&ctx, out, (unsigned int *)outlen);
}

/* The PRF doesn't *really* change in subsequent calls, but above we handled the
 * concatenation of the salt and i within the function, instead of external to it,
 * because the implementation is easier that way.
 */
static void pkcs5_subsequent_prf(unsigned char *p, size_t plen, unsigned char *v,
                                 size_t vlen, unsigned char *o, size_t *olen) {
  HMAC_CTX ctx;

  HMAC_CTX_init(&ctx);
  HMAC_Init(&ctx, p, plen, EVP_sha1( ));
  HMAC_Update(&ctx, v, vlen);
  HMAC_Final(&ctx, o, (unsigned int *)olen);
}

static void pkcs5_F(unsigned char *p, size_t plen, unsigned char *salt,
                    size_t saltlen, size_t ic, size_t bix, unsigned char *out) {
  size_t        i = 1, j, outlen;
  unsigned char ulast[PRF_OUT_LEN];

  memset(out, 0, PRF_OUT_LEN);
  pkcs5_initial_prf(p, plen, salt, saltlen, bix, ulast, &outlen);
  /* XOR together ic outputs of the PRF, as PKCS #5 requires. */
  while (i++ < ic) {
    for (j = 0;  j < PRF_OUT_LEN;  j++) out[j] ^= ulast[j];
    pkcs5_subsequent_prf(p, plen, ulast, PRF_OUT_LEN, ulast, &outlen);
  }
  for (j = 0;  j < PRF_OUT_LEN;  j++) out[j] ^= ulast[j];
}

void spc_pbkdf2(unsigned char *pw, unsigned int pwlen, char *salt,
                spc_uint64_t saltlen, unsigned int ic, unsigned char *dk,
                spc_uint64_t dklen) {
  unsigned long i, l, r;
  unsigned char final[PRF_OUT_LEN] = {0,};

  if (dklen > ((((spc_uint64_t)1) << 32) - 1) * PRF_OUT_LEN) {
    /* Call an error handler. */
    abort( );
  }
  l = dklen / PRF_OUT_LEN;
  r = dklen % PRF_OUT_LEN;
  for (i = 1;  i <= l;  i++)
    pkcs5_F(pw, pwlen, (unsigned char *)salt, saltlen, ic, i,
            dk + (i - 1) * PRF_OUT_LEN);
  if (r) {
    pkcs5_F(pw, pwlen, (unsigned char *)salt, saltlen, ic, i, final);
    for (l = 0;  l < r;  l++) *(dk + (i - 1) * PRF_OUT_LEN + l) = final[l];
  }
}
The spc_pbkdf2( ) function takes seven arguments:
pw
Password, represented as an arbitrary string of bytes.
pwlen
Number of bytes in the password.
salt
String that need not be private but should be unique to the user. The notion of salt is discussed
in Recipe 4.9.
saltlen
Number of bytes in the salt.
ic
"Iteration count," described in more detail later in this section. A good value is 10,000.
dk
Buffer into which the derived key will be placed.
dklen
Length of the desired derived key in bytes.
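For example (a sketch of ours; salt generation and storage are only hinted at here, and are discussed in Recipe 4.9), deriving a 16-byte key might look like this:

unsigned char key[16];
unsigned char salt[8];

/* ... fill salt with 8 random bytes and store it alongside whatever the
 * derived key protects, so that the key can be recomputed later ... */
spc_pbkdf2((unsigned char *)"correct battery staple", 22, (char *)salt,
           sizeof(salt), 10000, key, sizeof(key));
/* key now holds 16 bytes of key material derived from the password. */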
The Windows version of spc_pbkdf2( ) is called SpcPBKDF2( ). It has essentially the same
signature, though the names are slightly different because of Windows naming conventions. The
implementation uses CryptoAPI for HMAC-SHA1 and requires SpcGetExportableContext( ) and
SpcImportKeyData( ) from Recipe 5.26.
#include <windows.h>
#include <wincrypt.h>

/* This value needs to be the output size of your pseudo-random function (PRF)! */
#define PRF_OUT_LEN 20

/* This is an implementation of the PKCS#5 PBKDF2 PRF using HMAC-SHA1.  It
 * always gives 20-byte outputs.
 */
static HCRYPTHASH InitHMAC(HCRYPTPROV hProvider, HCRYPTKEY hKey, ALG_ID Algid) {
  HMAC_INFO  HMACInfo;
  HCRYPTHASH hHash;

  HMACInfo.HashAlgid     = Algid;
  HMACInfo.pbInnerString = HMACInfo.pbOuterString = 0;
  HMACInfo.cbInnerString = HMACInfo.cbOuterString = 0;

  if (!CryptCreateHash(hProvider, CALG_HMAC, hKey, 0, &hHash)) return 0;
  CryptSetHashParam(hHash, HP_HMAC_INFO, (BYTE *)&HMACInfo, 0);
  return hHash;
}

static void FinalHMAC(HCRYPTHASH hHash, BYTE *pbOut, DWORD *cbOut) {
  *cbOut = PRF_OUT_LEN;
  CryptGetHashParam(hHash, HP_HASHVAL, pbOut, cbOut, 0);
  CryptDestroyHash(hHash);
}

static DWORD SwapInt32(DWORD dwInt32) {
  __asm mov   eax, dwInt32
  __asm bswap eax
}

static BOOL PKCS5InitialPRF(HCRYPTPROV hProvider, HCRYPTKEY hKey,
                            BYTE *pbSalt, DWORD cbSalt, DWORD dwCounter,
                            BYTE *pbOut, DWORD *cbOut) {
  HCRYPTHASH hHash;

  if (!(hHash = InitHMAC(hProvider, hKey, CALG_SHA1))) return FALSE;
  CryptHashData(hHash, pbSalt, cbSalt, 0);
  dwCounter = SwapInt32(dwCounter);
  CryptHashData(hHash, (BYTE *)&dwCounter, sizeof(dwCounter), 0);
  FinalHMAC(hHash, pbOut, cbOut);
  return TRUE;
}

static BOOL PKCS5UpdatePRF(HCRYPTPROV hProvider, HCRYPTKEY hKey,
                           BYTE *pbSalt, DWORD cbSalt,
                           BYTE *pbOut, DWORD *cbOut) {
  HCRYPTHASH hHash;

  if (!(hHash = InitHMAC(hProvider, hKey, CALG_SHA1))) return FALSE;
  CryptHashData(hHash, pbSalt, cbSalt, 0);
  FinalHMAC(hHash, pbOut, cbOut);
  return TRUE;
}

static BOOL PKCS5FinalPRF(HCRYPTPROV hProvider, HCRYPTKEY hKey,
                          BYTE *pbSalt, DWORD cbSalt, DWORD dwIterations,
                          DWORD dwBlock, BYTE *pbOut) {
  BYTE  pbBuffer[PRF_OUT_LEN];
  DWORD cbBuffer, dwIndex, dwIteration = 1;

  SecureZeroMemory(pbOut, PRF_OUT_LEN);
  if (!(PKCS5InitialPRF(hProvider, hKey, pbSalt, cbSalt, dwBlock, pbBuffer,
                        &cbBuffer))) return FALSE;
  while (dwIteration++ < dwIterations) {
    for (dwIndex = 0;  dwIndex < PRF_OUT_LEN;  dwIndex++)
      pbOut[dwIndex] ^= pbBuffer[dwIndex];
    if (!(PKCS5UpdatePRF(hProvider, hKey, pbBuffer, PRF_OUT_LEN, pbBuffer,
                         &cbBuffer))) return FALSE;
  }
  for (dwIndex = 0;  dwIndex < PRF_OUT_LEN;  dwIndex++)
    pbOut[dwIndex] ^= pbBuffer[dwIndex];
  return TRUE;
}

BOOL SpcPBKDF2(BYTE *pbPassword, DWORD cbPassword, BYTE *pbSalt, DWORD cbSalt,
               DWORD dwIterations, BYTE *pbOut, DWORD cbOut) {
  BOOL       bResult = FALSE;
  BYTE       pbFinal[PRF_OUT_LEN];
  DWORD      dwBlock, dwBlockCount, dwLeftOver;
  HCRYPTKEY  hKey;
  HCRYPTPROV hProvider;

  if (cbOut > ((((__int64)1) << 32) - 1) * PRF_OUT_LEN) return FALSE;
  if (!(hProvider = SpcGetExportableContext( ))) return FALSE;
  if (!(hKey = SpcImportKeyData(hProvider, CALG_RC4, pbPassword, cbPassword))) {
    CryptReleaseContext(hProvider, 0);
    return FALSE;
  }

  dwBlockCount = cbOut / PRF_OUT_LEN;
  dwLeftOver   = cbOut % PRF_OUT_LEN;
  for (dwBlock = 1;  dwBlock <= dwBlockCount;  dwBlock++) {
    if (!PKCS5FinalPRF(hProvider, hKey, pbSalt, cbSalt, dwIterations, dwBlock,
                       pbOut + (dwBlock - 1) * PRF_OUT_LEN)) goto done;
  }
  if (dwLeftOver) {
    SecureZeroMemory(pbFinal, PRF_OUT_LEN);
    if (!PKCS5FinalPRF(hProvider, hKey, pbSalt, cbSalt, dwIterations, dwBlock,
                       pbFinal)) goto done;
    CopyMemory(pbOut + (dwBlock - 1) * PRF_OUT_LEN, pbFinal, dwLeftOver);
  }
  bResult = TRUE;

done:
  CryptDestroyKey(hKey);
  CryptReleaseContext(hProvider, 0);
  return bResult;
}
The salt is used to prevent against a dictionary attack. Without salt, a malicious system administrator
could easily figure out when a user has the same password as someone else, and he would be able to
precompute a huge dictionary of common passwords and look to see if the user's password is in that
list.
While salt is not expected to be private, it still must be chosen carefully. See Recipe 4.9 for more on
salt.
How Many Iterations?
To what value should you set the iteration count? The answer depends on the
environment in which you expect your software to be deployed. The basic idea is to
increase computational costs so that a brute-force attack with lots of high-end hardware is
as expensive as possible, but not to cause too noticeable a delay on the lowest-end box on
which you would wish to run legitimately.
Often, password computations such as these occur on a server. However, there are still
people out there who run servers on their 33 MHz machines. We personally believe that
people running on that kind of hardware should be able to tolerate a one-second delay, at
the very least when computing a password for a modern application. Usually, a human
waiting on the other end will be willing to tolerate an even longer wait as long as they
know why they are waiting. Two to three seconds isn't so bad.
With that guideline, we have timed our PKCS #5 implementation with some standard
input. Based on those timings, we think that 10,000 is good for most applications, and
5,000 is the lowest iteration count you should consider in this day and age. On a 33 MHz
machine, 10,000 iterations should take about 2.5 seconds to process. On a 1.67 GHz
machine, they take a mere 0.045 seconds. Even if your computation occurs on an
embedded processor, people will still be able to tolerate the delay.
The good thing is that it would take a single 1.67 GHz machine more than 6 years to
guess 2^32 passwords, when using PKCS #5 and 10,000 iterations. Therefore, if there
really is at least 32 bits of entropy in your password (which is very rare), you probably
won't have to worry about any attacker who has fewer than a hundred high-end machines
at his disposal, at least for a few years.
Expect governments that want your password to put together a few thousand boxes
complete with crypto acceleration, though!
Even with salt, password-guessing attacks are still possible. To prevent against this kind of attack,
PKCS #5 allows the specification of an iteration count, which basically causes an expensive portion of
the key derivation function to loop the specified number of times. The idea is to slow down the time it
takes to compute a single key from a password. If you make key derivation take a tenth of a second,
the user won't notice. However, if an attacker tries to carry out an exhaustive search of all possible
passwords, she will have to spend a tenth of a second for each password she wants to try, which will
make cracking even a weak password quite difficult. As we describe in the sidebar "How Many
Iterations?", we recommend an iteration count of 10,000.
The actual specification of the key derivation function can be found in Version 2.0 of the PKCS #5
standards document. In brief, we use a pseudo-random function using the password and salt to get
out as many bytes as we need, and we then take those outputs and feed them back into themselves
for each iteration.
There's no need to use HMAC-SHA1 in PKCS #5. Instead, you could use the Advanced Encryption
Standard (AES) as the underlying cryptographic primitive, substituting a hash function based on AES
for SHA1 (see Recipe 6.15 and Recipe 6.16).
4.10.4 See Also
RSA's PKCS #5 page: http://www.rsasecurity.com/rsalabs/pkcs/pkcs-5/
Recipe 4.9, Recipe 4.11, Recipe 5.26, Recipe 6.15, Recipe 6.16
4.11 Algorithmically Generating Symmetric Keys from
One Base Secret
4.11.1 Problem
You want to generate a key to use for a short time from a long-term secret (generally a key, but
perhaps a password). If a short-term key is compromised, it should be impossible to recover the base
secret. Multiple entities in the system should be able to compute the same derived key if they have
the right base secret.
For example, you might want to have a single long-term key and use it to create daily encryption
keys or session-specific keys.
4.11.2 Solution
Mix a base secret and any unique information you have available, passing them through a pseudo-random function (PRF), as discussed in the following section.
4.11.3 Discussion
The basic idea behind secure key derivation is to take a base secret and a unique identifier that
distinguishes the key to be derived (called a distinguisher) and pass those two items through a
pseudo-random function. The PRF acts very much like a cryptographic one-way hash from a
theoretical security point of view, and indeed, such a one-way hash is often good as a PRF.
There are many different ad hoc solutions for doing key derivation, ranging from the simple to the
complex. On the simple side of the spectrum, you can concatenate a base key with unique data and
pass the string through SHA1. On the complex side is the PBKDF2 function from PKCS #5 (described
in Recipe 4.10).
The simple SHA1 approach is perhaps too simple for general-purpose requirements. In particular,
there are cases where one might need a key that is larger than the SHA1 output length (e.g., if
you're using AES with 192-bit keys but are willing to have only 160 bits of strength). A general-purpose
hash function maps n bits to a fixed number of bits, whereas we would like a function
capable of mapping n bits to m bits.
PBKDF2 can be overkill. Its interface includes functionality to thwart password-guessing attacks,
which is unnecessary when deriving keys from secrets that were themselves randomly generated.
Fortunately, it is easy to build an n-bit to m-bit PRF that is secure for key derivation. The big difficulty
is often in selecting good distinguishers (i.e., information that differentiates parties). Generally, it is
okay to send differentiating information that one side does not already have and cannot compute in
the clear, because if an attacker tampers with the information in traffic, the two sides will not be able
to agree on a working key. (Of course, you do need to be prepared to handle such attacks.) Similarly,
it is okay to send a salt. See the sidebar "Distinguisher Selection" for a discussion.
Distinguisher Selection
The basic idea behind a distinguisher is that it must be unique.
If you want to create a particular derived key, we recommend that you string together in
a predetermined order any interesting information about that key, separating data items
with a unique separation character (i.e., not a character that would be valid in one of the
data items). You can use alternate formats, as long as your data representation is
unambiguous, in that each possible distinguisher is generated by a single, unique set of
information.
As an example, let's say you want to have a different session key that you change once a
day. You could then use the date as a unique distinguisher. If you want to change keys
every time there's a connection, the date is no longer unique. However, you could use the
date concatenated with the number of times a connection has been established on that
date. The two together constitute a unique value.
There are many potential data items you might want to include in a distinguisher, and
they do not have to be unique to be useful, as long as there is a guarantee that the
distinguisher itself is unique. Here is a list of some common data items you could use:
The encryption algorithm and any parameters for which the derived key will be used
The number of times the base key has been used, either overall or in the context of
other interesting data items
A unique identifier corresponding to an entity in the system, such as a username or
email address
The IP addresses of communicating parties
A timestamp, or at least the current date
The MAC address associated with the network interface being used
Any other session-specific information
In addition, to defend against any possible offline precomputation attacks, we
recommend you add to your differentiator a random salt of at least 64 bits, which you
then communicate to any other party that needs to derive the same key.
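As a purely illustrative sketch (the field names and separator here are hypothetical, not part of the
recipe), you might build such a distinguisher by concatenating the interesting fields with a separator
that cannot appear in any of the data items, along with the random salt:
#include <stdio.h>

/* Build a distinguisher of the form "alg|user|date|counter|salt_hex".
 * Returns the number of characters written, or -1 if the buffer is too
 * small. The '|' separator is assumed not to occur in any field.
 */
int build_distinguisher(char *buf, size_t buflen, const char *alg,
                        const char *user, const char *date,
                        unsigned long conn_count, const char *salt_hex) {
  int n = snprintf(buf, buflen, "%s|%s|%s|%lu|%s", alg, user, date,
                   conn_count, salt_hex);
  return (n < 0 || (size_t)n >= buflen) ? -1 : n;
}
Any party that needs to derive the same key must assemble exactly the same string, in exactly the
same order.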
The easiest way to get a solid solution that will resist potentially practical attacks is to use HMAC in
counter mode. (Other MACs are not as well suited for this task, because they tend not to handle
variable-length keys.) You can also use this solution if you want an all-block-cipher solution, because
you can use a construction to convert a block cipher into a hash function (see Recipe 6.15 and Recipe
6.16).
More specifically, key HMAC with the base secret. Then, for every block of output you need (where
the block size is the size of the HMAC output), MAC the distinguishers concatenated with a fixed-size
counter at the end. The counter should indicate the number of blocks of output previously processed.
The basic idea is to make sure that each MAC input is unique.
If the desired output length is not a multiple of the MAC output length, simply generate blocks until
you have sufficient bytes, then truncate.
The security level of this solution is limited by the minimum of the number of
bits of entropy in the base secret and the output size of the MAC. For example,
if you use a key with 256 bits of entropy, and you use HMAC-SHA1 to produce
a 256-bit derived key, never assume that you have more than 160 bits of
effective security (that is the output size of HMAC-SHA1).
Here is an example implementation of a PRF based on HMAC-SHA1, using the OpenSSL API for HMAC
(discussed in Recipe 6.10):
#include <sys/types.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <openssl/evp.h>
#include <openssl/hmac.h>

#define HMAC_OUT_LEN 20  /* SHA1 specific */

void spc_make_derived_key(unsigned char *base, size_t bl, unsigned char *dist,
                          size_t dl, unsigned char *out, size_t ol) {
  HMAC_CTX      c;
  unsigned int  tl, ctr = 0, nbo_ctr;  /* 32-bit counter, sent in network byte order */
  size_t        i;
  unsigned char last[HMAC_OUT_LEN];

  /* Produce one full MAC-sized block of output per iteration. */
  while (ol >= HMAC_OUT_LEN) {
    HMAC_Init(&c, base, bl, EVP_sha1());
    HMAC_Update(&c, dist, dl);
    nbo_ctr = htonl(ctr++);
    HMAC_Update(&c, (unsigned char *)&nbo_ctr, sizeof(nbo_ctr));
    HMAC_Final(&c, out, &tl);
    out += HMAC_OUT_LEN;
    ol  -= HMAC_OUT_LEN;
  }
  if (!ol) return;

  /* Generate one more block and truncate it to the remaining length. */
  HMAC_Init(&c, base, bl, EVP_sha1());
  HMAC_Update(&c, dist, dl);
  nbo_ctr = htonl(ctr);
  HMAC_Update(&c, (unsigned char *)&nbo_ctr, sizeof(nbo_ctr));
  HMAC_Final(&c, last, &tl);
  for (i = 0;  i < ol;  i++) out[i] = last[i];
}
Here is an example implementation of a PRF based on HMAC-SHA1, using the Windows CryptoAPI for
HMAC (discussed in Recipe 6.10). The code presented here also requires
SpcGetExportableContext( ) and SpcImportKeyData( ) from Recipe 5.26.
#include <windows.h>
#include <wincrypt.h>

#define HMAC_OUT_LEN 20  /* SHA1 specific */

static DWORD SwapInt32(DWORD dwInt32) {
  __asm mov   eax, dwInt32
  __asm bswap eax
}

BOOL SpcMakeDerivedKey(BYTE *pbBase, DWORD cbBase, BYTE *pbDist, DWORD cbDist,
                       BYTE *pbOut, DWORD cbOut) {
  BYTE       pbLast[HMAC_OUT_LEN];
  DWORD      cbData, dwCounter = 0, dwBigCounter;
  HCRYPTKEY  hKey;
  HMAC_INFO  HMACInfo;
  HCRYPTHASH hHash;
  HCRYPTPROV hProvider;

  if (!(hProvider = SpcGetExportableContext())) return FALSE;
  if (!(hKey = SpcImportKeyData(hProvider, CALG_RC4, pbBase, cbBase))) {
    CryptReleaseContext(hProvider, 0);
    return FALSE;
  }

  HMACInfo.HashAlgid     = CALG_SHA1;
  HMACInfo.pbInnerString = HMACInfo.pbOuterString = 0;
  HMACInfo.cbInnerString = HMACInfo.cbOuterString = 0;

  while (cbOut >= HMAC_OUT_LEN) {
    if (!CryptCreateHash(hProvider, CALG_HMAC, hKey, 0, &hHash)) {
      CryptDestroyKey(hKey);
      CryptReleaseContext(hProvider, 0);
      return FALSE;
    }
    CryptSetHashParam(hHash, HP_HMAC_INFO, (BYTE *)&HMACInfo, 0);
    CryptHashData(hHash, pbDist, cbDist, 0);
    dwBigCounter = SwapInt32(dwCounter++);
    CryptHashData(hHash, (BYTE *)&dwBigCounter, sizeof(dwBigCounter), 0);
    cbData = HMAC_OUT_LEN;
    CryptGetHashParam(hHash, HP_HASHVAL, pbOut, &cbData, 0);
    CryptDestroyHash(hHash);
    pbOut += HMAC_OUT_LEN;
    cbOut -= HMAC_OUT_LEN;
  }
  if (cbOut) {
    if (!CryptCreateHash(hProvider, CALG_HMAC, hKey, 0, &hHash)) {
      CryptDestroyKey(hKey);
      CryptReleaseContext(hProvider, 0);
      return FALSE;
    }
    CryptSetHashParam(hHash, HP_HMAC_INFO, (BYTE *)&HMACInfo, 0);
    CryptHashData(hHash, pbDist, cbDist, 0);
    dwBigCounter = SwapInt32(dwCounter);
    CryptHashData(hHash, (BYTE *)&dwBigCounter, sizeof(dwBigCounter), 0);
    cbData = HMAC_OUT_LEN;
    CryptGetHashParam(hHash, HP_HASHVAL, pbLast, &cbData, 0);
    CryptDestroyHash(hHash);
    CopyMemory(pbOut, pbLast, cbOut);
  }
  CryptDestroyKey(hKey);
  CryptReleaseContext(hProvider, 0);
  return TRUE;
}
Ultimately, if you have a well-specified constant set of distinguishers and a constant base secret
length, it is sufficient to replace HMAC by SHA1-hashing the concatenation of the key, distinguisher,
and counter.
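For instance, here is a minimal sketch of that simplification using OpenSSL's SHA1 interface; it
assumes your protocol fixes the base secret and distinguisher lengths and that 20 bytes of derived key
are enough:
#include <stddef.h>
#include <openssl/sha.h>

/* Derive a single 20-byte key as SHA1(base || distinguisher || counter).
 * Only appropriate when the field lengths are constant and well specified.
 */
void simple_derive(unsigned char *base, size_t bl, unsigned char *dist,
                   size_t dl, unsigned int counter, unsigned char out[20]) {
  SHA_CTX       ctx;
  unsigned char ctr[4];

  /* Encode the counter as 4 big-endian bytes so both sides agree. */
  ctr[0] = (counter >> 24) & 0xff;  ctr[1] = (counter >> 16) & 0xff;
  ctr[2] = (counter >>  8) & 0xff;  ctr[3] = counter & 0xff;

  SHA1_Init(&ctx);
  SHA1_Update(&ctx, base, bl);
  SHA1_Update(&ctx, dist, dl);
  SHA1_Update(&ctx, ctr, sizeof(ctr));
  SHA1_Final(out, &ctx);
}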
4.11.4 See Also
Recipe 4.10, Recipe 5.26, Recipe 6.10, Recipe 6.15, Recipe 6.16
4.12 Encrypting in a Single Reduced Character Set
4.12.1 Problem
You're storing data in a format in which particular characters are invalid. For example, you might be
using a database, and you'd like to encrypt all the fields, but the database does not support binary
strings. You want to avoid growing the message itself (sometimes database fields have length limits)
and thus want to avoid encoding binary data into a representation like base64.
4.12.2 Solution
Encrypt the data using a stream cipher (or a block cipher in a streaming mode). Do so in such a way
that you map each byte of output to a byte in the valid character set.
For example, let's say that your character set is the 64 characters consisting of all uppercase and
lowercase letters, the 10 numerical digits, the space, and the period. For each character, do the
following:
1. Map the input character to a number from 0 to 63.
2. Take a byte of output from the stream cipher and reduce it modulo 64.
3. Add the random byte and the character, reducing the result modulo 64.
4. The result will be a value from 0 to 63. Map it back into the desired character set.
Decryption uses the same keystream but reverses step 3: subtract the keystream byte from the
ciphertext character's number, modulo 64, then map the result back into the character set.
See Recipe 5.2 for a discussion of picking a streaming cipher solution. Generally, we recommend using
AES in CTR mode or the SNOW 2.0 stream cipher.
4.12.3 Discussion
If your character set is an 8-bit quantity per character (e.g., some subset of ASCII instead of Unicode
or something like that), the following code will work:
typedef struct {
  unsigned char *cset;
  int            csetlen;
  unsigned char  reverse[256];
  unsigned char  maxvalid;
} ENCMAP;

void setup_charset_map(ENCMAP *s, unsigned char *charset, int csetlen) {
  int i;

  s->cset    = charset;
  s->csetlen = csetlen;
  for (i = 0;  i < 256;  i++) s->reverse[i] = -1;   /* mark everything invalid */
  for (i = 0;  i < csetlen;  i++) s->reverse[charset[i]] = i;

  /* Keystream bytes above this value are discarded to avoid modulo bias. */
  s->maxvalid = 255 - (256 % csetlen);
}

void encrypt_within_charset(ENCMAP *s, unsigned char *in, long inlen,
                            unsigned char *out,
                            unsigned char (*keystream_byte)(void)) {
  long          i;
  unsigned char c;

  for (i = 0;  i < inlen;  i++) {
    do {
      c = (*keystream_byte)();
    } while (c > s->maxvalid);
    *out++ = s->cset[(s->reverse[*in++] + c) % s->csetlen];
  }
}

void decrypt_within_charset(ENCMAP *s, unsigned char *in, long inlen,
                            unsigned char *out,
                            unsigned char (*keystream_byte)(void)) {
  long          i;
  unsigned char c;

  for (i = 0;  i < inlen;  i++) {
    do {
      c = (*keystream_byte)();
    } while (c > s->maxvalid);
    /* Subtract the keystream value to invert the encryption step. */
    *out++ = s->cset[(s->reverse[*in++] + s->csetlen - (c % s->csetlen)) % s->csetlen];
  }
}
The function setup_charset_map( ) must be called once to set up a table that maps ASCII values into
an index of the valid subset of characters. The data type that stores the mapping data is ENCMAP. The
other two arguments are charset, a list of all characters in the valid subset, and csetlen, which
specifies the number of characters in that set.
Once the character map is set up, you can call encrypt_within_charset( ) or decrypt_within_charset( )
to encrypt or decrypt data, while staying within the specified character set. These functions have the
following arguments:
s
Pointer to the ENCMAP object.
in
Buffer containing the data to be encrypted or decrypted.
inlen
Length in bytes of the input buffer.
out
Buffer into which the encrypted or decrypted data is placed.
keystream_byte
Pointer to a callback function that should return a single byte of cryptographically strong
keystream.
This code needs to know how to get more bytes of keystream on demand, because some bytes of
keystream will be thrown away if they could potentially be leveraged in a statistical attack. Therefore,
the amount of keystream necessary is theoretically unbounded (though in practice it should never be
significantly more than twice the length of the input). As a result, we need to know how to invoke a
function that gives us new keystream instead of just passing in a buffer of static keystream.
It would be easy (and preferable) to extend this code example to use a cipher context object (keyed
and in a streaming mode) as a parameter instead of the function pointer. Then you could get the next
byte of keystream directly from the passed context object. If your crypto library does not give you
direct access to keystream, you can recover it by encrypting zero bytes: the ciphertext is then the
raw keystream.
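For example, here is one hypothetical way to supply the callback using OpenSSL's RC4 interface,
obtaining keystream by encrypting a zero byte (RC4 is used only for brevity; see Recipe 5.2 and
Recipe 5.23 before choosing it, and be sure the same keystream is regenerated for decryption):
#include <openssl/rc4.h>

static RC4_KEY spc_rc4_key;   /* key this once with RC4_set_key() before use */

/* Return one byte of keystream by encrypting a single zero byte. */
static unsigned char rc4_keystream_byte(void) {
  unsigned char zero = 0, out;

  RC4(&spc_rc4_key, 1, &zero, &out);
  return out;
}
You would then pass rc4_keystream_byte as the keystream_byte argument to encrypt_within_charset( )
or decrypt_within_charset( ).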
Remember to use a MAC anytime you encrypt, even though this expands your
message length. The MAC is almost always necessary for security! For
databases, you can always base64-encode the MAC output and stick it in another
field. (See Recipe 6.9 for how to MAC data securely.)
Note that decrypt_within_charset( ) mirrors encrypt_within_charset( ); the only difference is that it
subtracts each keystream value instead of adding it, so both sides must consume the keystream in
the same order.
The previous code can be adapted to fixed-size wide characters if you operate on appropriately sized
values, even though as written it operates on single bytes. It isn't useful, however, for
variable-width character sets. With such data, we recommend that you accept a solution that involves
message expansion, such as encrypting and then base64-encoding the result.
4.12.4 See Also
Recipe 5.2, Recipe 6.9
4.13 Managing Key Material Securely
4.13.1 Problem
You want to minimize the odds of someone getting your raw key material, particularly if they end up
with local access to the machine.
4.13.2 Solution
There are a number of things you can do to reduce these risks:
Securely erase keys as soon as you have finished using them. Use the spc_memzero( ) function
from Recipe 13.2.
When you need to store key material, password-protect it, preferably using a scheme that provides
both encryption and message integrity so that you can detect whether the encrypted key file is ever
modified. For example, you can use PBKDF2 (see Recipe 4.10) to generate a key from a
password and then use that key to encrypt using a mode that also provides integrity, such as
CWC (see Recipe 5.10). For secret keys in public key cryptosystems, use PEM-encoding, which
affords password protection (see Recipe 7.17).
Store differentiating information with your medium- or long-term symmetric keys to make sure
you don't reuse keys. (See Recipe 4.11.)
4.13.3 See Also
Recipe 4.10, Recipe 4.11, Recipe 5.10, Recipe 7.17, Recipe 13.2
4.14 Timing Cryptographic Primitives
4.14.1 Problem
You want to compare the efficiency of two similar cryptographic primitives and would like to ensure
that you do so in a fair manner.
4.14.2 Solution
Time operations by calculating how many cycles it takes to process each byte, so that you can
compare numbers across processors of different speeds more fairly.
Focus on the expected average case, best case, and worst case.
4.14.3 Discussion
When you're looking at timing information, you usually have one of two motivations: either you're
interested in comparing the performance of two algorithms, or you'd like to get a sense of how much
data you'll actually be able to pump through a particular machine.
Measuring bytes per second is a useful thing when you're comparing the performance of multiple
algorithms on a single box, but it gives no real indication of performance on other machines.
Therefore, cryptographers prefer to measure how many processor clock cycles it takes to process
each byte, because doing so allows for comparisons that are more widely applicable. For example,
such comparisons will generally hold fast on the same line of processors running at different speeds.
If you're directly comparing the speed of an algorithm on a 2GHz Pentium 4 against the published
speed of the same algorithm run on a 800 MHz Pentium 3, the first one will always be faster when
measured in bytes per second. However, if you convert the numbers from bytes per second to cycles
per byte, you'll see that, if you run the same implementation of an algorithm on a P3 and a P4, the
P3 will generally be faster by 25% or so, just because instructions on a P4 take longer to execute on
average than they do on a P3.
If you know the speed of an algorithm in bytes per second, you can calculate the number of cycles
per byte simply by dividing by the clock speed in hertz (giving you bytes per cycle) and taking the
reciprocal (getting cycles per byte). If you know the speed measured in gigabytes per second, you
can divide by the clock speed in gigahertz, then take the reciprocal. For example, you can process
data at 0.2 gigabytes per second on a 3 GHz CPU as follows:
.2/3 = 0.066666666666666666 (bytes processed per cycle)
1/0.066666666666666666 = 15.0 cycles per byte
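As a small illustrative helper (not part of the recipe itself), the same conversion can be expressed
directly in code:
/* Convert a throughput measured in bytes per second into cycles per byte,
 * given the clock speed in hertz. This is simply the reciprocal of bytes
 * processed per cycle.
 */
double cycles_per_byte(double bytes_per_second, double clock_hz) {
  return clock_hz / bytes_per_second;
}
For the numbers above, cycles_per_byte(0.2e9, 3.0e9) yields 15.0.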
For many different reasons, it can be fairly difficult to get timing numbers that are completely
accurate. Often, internal clocks that the programmer can read are somewhat asynchronous from the
core processor clock. More significantly, there's often significant overhead that can be included in
timing results, such as the cost of context switches and sometimes timing overhead.
Some CPUs, such as AMD's Athlon, are advertised such that the actual clock
speed is not obvious. For example, the Athlon 2000 runs at roughly 1666 MHz,
significantly less than the 2000 MHz one might suspect.
Generally, you'll want to find out how quickly a primitive or algorithm can process a fixed amount of
data, and you'd like to know how well it does that in a real-world environment. For that reason, you
generally shouldn't worry much about subtracting out things that aren't relevant to the underlying
algorithm, such as context switches and procedure call overhead. Instead, we recommend running
the algorithm many times and averaging the total time to give a good indication of overall
performance.
In the following sections we'll discuss timing basics, then look at the particulars of timing
cryptographic code.
4.14.3.1 Timing basics
You need to be able to record the current time with as much precision as possible. On a modern x86
machine, it's somewhat common to see people using inline assembly to call the RDTSC instruction
directly, which returns the number of clock cycles since boot as a 64-bit value. For example, here's
some inline assembly for GCC on 32-bit x86 platforms (only!) that reads the counter, placing it into a
64-bit unsigned long long that you pass in by address:
#define current_stamp(a) asm volatile("rdtsc" : "=a"(((unsigned int *)(a))[0]), \
                                                "=d"(((unsigned int *)(a))[1]))
The following program uses the above macro to return the number of ticks since boot:
#include <stdio.h>

int main(int argc, char *argv[]) {
  spc_uint64_t x;

  current_stamp(&x);
  printf("%lld ticks since boot (when I read the clock).\n", x);
  return 0;
}
RDTSC is fairly accurate; processor pipelining issues can make this technique a few cycles off, but
that is rarely a big deal.
On Windows machines, you can read the same thing using QueryPerformanceCounter( ), which
takes a pointer to a 64-bit integer (the LARGE_INTEGER or __int64 type).
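For instance, a minimal Windows sketch (reading the raw counter value rather than a cycle count)
might look like this:
#include <windows.h>
#include <stdio.h>

int main(void) {
  LARGE_INTEGER count, freq;

  QueryPerformanceCounter(&count);    /* current counter value        */
  QueryPerformanceFrequency(&freq);   /* counter ticks per second     */
  printf("%I64d ticks (%I64d ticks per second).\n",
         count.QuadPart, freq.QuadPart);
  return 0;
}
Depending on the hardware abstraction layer, this counter may or may not be the processor's
timestamp counter; QueryPerformanceFrequency( ) reports the rate it actually ticks at.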
You can get fairly accurate timing just by subtracting two subsequent calls to current_stamp( ). For
example, you can time how long an empty for loop with 10,000 iterations takes:
#include <stdio.h>

int main(int argc, char *argv[]) {
  spc_uint64_t start, finish, diff;
  volatile int i;

  current_stamp(&start);
  for (i = 0;  i < 10000;  i++);
  current_stamp(&finish);
  diff = finish - start;
  printf("That loop took %lld cycles.\n", diff);
  return 0;
}
On an Athlon XP, compiling with GCC 2.95.4, the previous code will consistently give 43-44 cycles
without optimization turned on and 37-38 cycles with optimization turned on. Generally, if i is
declared volatile, the compiler won't eliminate the loop, even when it can figure out that there are no
side effects.
Note that you can expect some minimal overhead in gathering the timestamp to begin with. You can
calculate the fixed timing overhead by timing nothing:
int main(int argc, char *argv[]) {
  spc_uint64_t start, finish, diff;

  current_stamp(&start);
  current_stamp(&finish);
  diff = finish - start;
  printf("Timing overhead takes %lld cycles.\n", diff);
  return 0;
}
On an Athlon XP, the overhead is usually reported as 0 cycles and occasionally as 1 cycle. This isn't
really accurate, because the two store operations in the first timestamp call take about 2 to 4 cycles.
The problem is largely due to pipelining and other complex architectural issues, and it is hard to work
around. You can explicitly introduce pipeline stalls, but we've found that doesn't always work as well
as expected. One thing to do is to time the processing of a large amount of data. Even then, you will
get variances in timing because of things not under your control, such as context switches. In short,
you can get within a few cycles of the truth, and beyond that you'll probably have to take some sort
of average.
A more portable but less accurate way of getting timing information on Unix-based platforms is to
ask the operating system for the clock using the gettimeofday( ) function. The resolution varies
depending on your underlying hardware, but it's usually very good. It can be implemented using
RDTSC but might have additional overhead. Nonetheless, on most operating systems,
gettimeofday( ) is very accurate.
Other Ways to Get the Time
On many machines, there are other ways to get the time. One way is to use the POSIX
times( ) function, which has the advantage that you can separate the time your process
spends in the kernel from the time spent in user space running your code. While times( )
is considered obsolete on many systems, getrusage( ) does the same thing.
Another alternative is the ISO C89 standard function, clock( ). However, other timers
we discuss generally provide resolution that is as good as or better than this function.
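For example, here is a small sketch using getrusage( ) to separate user and system time (the field
names are standard POSIX):
#include <stdio.h>
#include <sys/time.h>
#include <sys/resource.h>

int main(void) {
  struct rusage ru;

  if (getrusage(RUSAGE_SELF, &ru) == -1) return 1;
  printf("user time:   %ld.%06ld seconds\n",
         (long)ru.ru_utime.tv_sec, (long)ru.ru_utime.tv_usec);
  printf("system time: %ld.%06ld seconds\n",
         (long)ru.ru_stime.tv_sec, (long)ru.ru_stime.tv_usec);
  return 0;
}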
Here's a macro that will use gettimeofday( ) to put the number of microseconds since January 1,
1970 into an unsigned 64-bit integer (if your compiler does not support a 64-bit integer type, you'll
have to store the two 32-bit values separately and diff them properly; see below).
#include <sys/time.h>

#define current_time_as_int64(a) {                                   \
  struct timeval t;                                                   \
  gettimeofday(&t, 0);                                                \
  *(a) = ((spc_uint64_t)t.tv_sec * 1000000) + t.tv_usec;              \
}
Attackers can often force the worst-case performance for functionality with well-chosen inputs.
Therefore, you should always be sure to determine the worst-case performance characteristics of
whatever it is you're doing, and plan accordingly.
The gettimeofday( )-based macro does not compute the same thing the
RDTSC version does! The former returns the number of microseconds elapsed,
while the latter returns the number of cycles elapsed.
You'll usually be interested in the number of cycles elapsed rather than wall-clock time. Therefore,
you'll need to convert the result of the gettimeofday( ) call from microseconds to cycles. To perform
this conversion, multiply the elapsed microseconds by the clock speed in cycles per microsecond,
which is the clock speed in gigahertz times 1,000.
Because you care about elapsed time, you'll want to subtract the starting time from the ending time
to get the total running time, then convert that figure from microseconds to cycles.
Here's a function to do both, which requires you to define a constant with your clock speed in
gigahertz:
#define MY_GHZ 1.6666666666666667  /* We're using an Athlon XP 2000 */

spc_uint64_t get_cycle_count(spc_uint64_t start, spc_uint64_t end) {
  /* microseconds x (cycles per microsecond) = cycles */
  return (spc_uint64_t)((end - start) * MY_GHZ * 1000.0);
}
4.14.3.2 Timing cryptographic code
When timing cryptographic primitives, you'll generally want to know how many cycles it takes to
process a byte, on average. That's easy: just divide the number of cycles the operation takes by the
number of bytes you process. If you wish, you can remove overhead from the cycle count, such as
timing overhead or the cost of the timing loop.
One important thing to note about timing cryptographic code is that some types of algorithms have
different performance characteristics as they process more data. That is, they can be dominated by
per-message overhead costs for small message sizes. For example, most hash functions such as
SHA1 are significantly slower (per byte) for small messages than they are for large messages.
You need to figure out whether you care about optimal performance or average-case performance.
Most often, it will be the latter. For example, if you are comparing the speeds of SHA1 and some
other cryptographic hash function such as RIPEMD-160, you should ask yourself what range of
message sizes you expect to see and test for values sampled throughout that range.
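To make the procedure concrete, here is a hedged sketch (using the current_stamp( ) macro and
spc_uint64_t type from above, OpenSSL's one-shot SHA1( ) function, and an arbitrarily chosen buffer
size) that estimates cycles per byte for hashing a large message:
#include <stdio.h>
#include <stdlib.h>
#include <openssl/sha.h>

#define BUF_LEN (1024 * 1024)   /* 1 MB test message (arbitrary choice) */
#define RUNS    10              /* average over several runs            */

int main(void) {
  unsigned char *buf = calloc(1, BUF_LEN);
  unsigned char  md[SHA_DIGEST_LENGTH];
  spc_uint64_t   start, finish, total = 0;
  int            i;

  if (!buf) return 1;
  for (i = 0;  i < RUNS;  i++) {
    current_stamp(&start);
    SHA1(buf, BUF_LEN, md);
    current_stamp(&finish);
    total += finish - start;
  }
  printf("SHA1: about %.2f cycles per byte\n",
         (double)total / ((double)BUF_LEN * RUNS));
  free(buf);
  return 0;
}
For small messages, you would repeat the measurement at each message size of interest instead of
using one large buffer.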
Chapter 5. Symmetric Encryption
This chapter discusses the basics of symmetric encryption algorithms. Message integrity checking and
hash functions are covered in Chapter 6. The use of cryptography on a network is discussed in
Chapter 9.
Many of the recipes in this chapter are too low-level for general-purpose use.
We recommend that you first try to find what you need in Chapter 9 before
resorting to building solutions yourself using the recipes in this chapter. If you
do use these recipes, please be careful, read all of our warnings, and do
consider using the higher-level constructs we suggest.
5.1 Deciding Whether to Use Multiple Encryption
Algorithms
5.1.1 Problem
You need to figure out whether to support multiple encryption algorithms in your system.
5.1.2 Solution
There is no right answer. It depends on your needs, as we discuss in the following section.
5.1.3 Discussion
Clearly, if you need to support multiple encryption algorithms for standards compliance or legacy
support, you should do so. Beyond that, there are two schools of thought. The first school of thought
recommends that you support multiple algorithms to allow users to pick their favorite. The other
benefit of this approach is that if an algorithm turns out to be seriously broken, supporting multiple
algorithms can make it easier for users to switch.
However, the other school of thought points out that in reality, many users will never switch
algorithms, even if one is broken. Moreover, by supporting multiple algorithms, you risk adding more
complexity to your application, which can be detrimental. In addition, if there are multiple
interoperating implementations of a protocol you're creating, other developers often will implement
only their own preferred algorithms, potentially leading to major interoperability problems.
We personally prefer picking a single algorithm that will do a good enough job of meeting the needs
of all users. That way, the application is simpler to comprehend, and there are no interoperability
issues. If you choose well-regarded algorithms, the hope is that there won't be a break that actually
impacts end users. However, if there is such a break, you should make the algorithm easy to replace.
Many cryptographic APIs, such as the OpenSSL EVP interface (discussed in Recipe 5.17), provide an
interface to help out here.
5.1.4 See Also
Recipe 5.17
5.2 Figuring Out Which Encryption Algorithm Is Best
5.2.1 Problem
You need to figure out which encryption algorithm you should use.
5.2.2 Solution
Use something well regarded that fits your needs. We recommend AES for general-purpose use. If
you're willing to go against the grain and are paranoid, you can use Serpent, which isn't quite as fast
as AES but is believed to have a much higher security margin.
If you really feel that you need the fastest possible secure solution, consider theSNOW 2.0 stream
cipher, which currently looks very good. It appears to have a much better security margin than the
popular favorite, RC4, and is even faster. However, it is fairly new. If you're highly risk-averse, we
recommend AES or Serpent. Although popular, RC4 would never be the best available choice.
5.2.3 Discussion
Be sure to read this discussion carefully, as well as other related discussions.
While a strong encryption algorithm is a great foundation, there are many ways
to use strong encryption primitives in an insecure way.
There are two general types of ciphers:
Block ciphers
These work by encrypting a fixed-size chunk of data (a block). Data that isn't aligned to the
size of the block needs to be padded somehow. The same input always produces the same
output.
Stream ciphers
These work by generating a stream of pseudo-random data, then using XOR[1] to combine the
stream with the plaintext.
[1]
Or some other in-group operation, such as modular addition.
There are many different ways of using block ciphers; these are called block cipher modes. Selecting
a mode and using it properly is important to security. Many block cipher modes are designed to
produce a result that acts just like a stream cipher. Each block cipher mode has its advantages and
drawbacks. See Recipe 5.4 for information on selecting a mode.
Stream ciphers generally are used as designed. You don't hear people talking about stream cipher
modes. This class of ciphers can be made to act as block ciphers, but that generally destroys their
best property (their speed), so they are typically not used that way.
We recommend the use of only those ciphers that have been studied by the cryptographic community
and are held in wide regard.
There are a large number of symmetric encryption algorithms. However, unless you need a particular
algorithm for the sake of interoperability or standards, we recommend using one of a very small
number of well-regarded algorithms. AES, the Advanced Encryption Standard, is a great general-purpose
block cipher. It is among the fastest block ciphers, is extremely well studied, and is believed
to provide a high level of security. It can also use key lengths up to 256 bits.
AES has recently replaced Triple-DES (3DES), a variant of the original Data Encryption Standard
(DES), as the block cipher of choice, partially because of its status as a U.S. government standard,
and partially because of its widespread endorsement by leading cryptographers. However, Triple-DES
is still considered a very secure alternative to AES. In fact, in some ways it is a more conservative
solution, because it has been studied for many more years than has AES, and because AES is based
on a relatively new breed of block cipher that is far less understood than the traditional underpinnings
upon which Triple-DES is based.[2]
[2]
Most block ciphers are known as Feistel ciphers, a construction style dating back to the early 1970s. AES is a
Square cipher, which is a new style of block cipher construction, dating only to 1997.
Nonetheless, AES is widely believed to be able to resist any practical attack currently known that
could be launched against any block cipher. Today, many cryptographers would feel just as safe
using AES as they would using Triple-DES. In addition, AES always uses longer effective keys and is
capable of key sizes up to 256 bits, which should offer vastly more security than Triple-DES, with its
effective 112-bit keys.[3] (The actual key length can be either 128 or 192 bits, but not all of the bits
have an impact on security.) DES itself is, for all intents and purposes, insecure because of its short
key length. Finally, AES is faster than DES, and much faster than Triple-DES.
[3]
This assumes that a meet-in-the-middle attack is practical. Otherwise, the effective strength is 168 bits. In
practice, even 112 bits is enough.
Serpent is a block cipher that has received significant scrutiny and is believed to have a higher
security margin than AES. Some cryptographers worry that AES may be easy to break in 5 to 10
years because of its nontraditional nature and its simple algebraic structure. Serpent is significantly
more conservative in every way, but it is slower. Nonetheless, it's at least three times faster than
Triple-DES and is more than fast enough for all practical purposes.
Of course, because AES is a standard, you won't lose your job if AES turns out to be broken, whereas
you'll probably get in trouble if Serpent someday falls!
RC4 is the only widely used stream cipher. It is quite fast but difficult to use properly, because of a
major weakness in initialization (when using a key to initialize the cipher). In addition, while there is
no known practical attack against RC4, there are some theoretical problems that show this algorithm
to be far from optimal. In particular, RC4's output is fairly easy to distinguish from a true random
generator, which is a bad sign. (See Recipe 5.23 for information on how to use RC4 securely.)
SNOW is a new stream cipher that makes significant improvements on old principles. Besides the fact
that it's likely to be more secure than RC4, it is also faster: an optimized C version runs nearly twice
as fast for us as a good, optimized assembly implementation of RC4. It has also received a
fair amount of scrutiny, though not nearly as much as AES. Nothing significant has been found in it,
and even the minor theoretical issues in the first version were fixed, resulting in SNOW 2.0.
Table 5-1 shows some of the fastest noncommercial implementations for popular patent-free
algorithms we could find and run on our own x86-based hardware. (There may, of course, be faster
implementations out there.) Generally, the implementations were optimized assembly. Speeds are
measured in cycles per byte for the Pentium III, which should give a good indication of how the
algorithms perform in general.
On a 1 GHz machine, you would need an algorithm running at 1 cycle per byte to be able to encrypt
1 gigabyte per second. On a 3 GHz machine, you would only need the algorithm to run at 3 cycles
per byte. Some of the implementations listed in the table are therefore capable of handling gigabit
speeds fairly effortlessly on reasonable PC hardware.
Note that you won't generally quite get such speeds in practice as a result of overhead from cache
misses and other OS-level issues, but you may come within a cycle or two per byte.
Table 5-1. Noncommercial implementations for popular patent-free
encryption algorithms
Cipher      Key size                Speed[4]                        Implementation
AES         128 bits[5]             14.1 cpb in asm, 22.6 cpb in C  Brian Gladman's[6]
            Notes: The assembly version currently works only on Windows.
AES         128 bits                41.3 cpb                        OpenSSL
            Notes: This could be a heck of a lot better and should probably improve in the near
            future. Currently, we recommend Brian Gladman's C code instead. Perhaps OpenSSL will
            incorporate Brian's code soon!
Triple-DES  192 bits[7]             108.2 cpb                       OpenSSL
SNOW 2.0    128 or 256 bits         6.4 cpb                         Fast reference implementation[8]
            Notes: This implementation is written in C.
RC4         Up to 256 bits          10.7 cpb                        OpenSSL
            (usually 128 bits)
Serpent     128, 192, or 256 bits   35.6 cpb                        Fast reference implementation
            Notes: It gets a lot faster on 64-bit platforms and is at least as fast as AES in
            hardware.
Blowfish    Up to 256 bits          23.2 cpb                        OpenSSL
            (usually 128 bits)
[4] All timing values are best cases based on empirical testing and assume that the data being
processed is already in cache. Do not expect that you'll quite be able to match these speeds in
practice.
[5] AES supports 192-bit and 256-bit keys, but the algorithm then runs slower.
[6] http://fp.gladman.plus.com/AES/
[7] The effective strength of Triple-DES is theoretically no greater than 112 bits.
[8] Available from http://www.it.lth.se/cryptology/snow/
As we mentioned, we generally prefer AES (when used properly), which is not only a standard but
also is incredibly fast for a block cipher. It's not quite as fast as RC4, but it seems to have a far better
security margin. If speed does make a difference to you, you can choose SNOW 2.0, which is actually
faster than RC4. Or, in some environments, you can use an AES mode of operation that allows for
parallelization, which really isn't possible in an interoperable way using RC4. Particularly in hardware,
AES in counter mode can achieve much higher speeds than even SNOW can.
Clearly, Triple-DES isn't fast in the slightest; we have included it in Table 5-1 only to give you a point
of reference. In our opinion, you really shouldn't need to consider anything other than AES unless
you need interoperability, in which case performance is practically irrelevant anyway!
5.2.4 See Also
Brian Gladman's Cryptographic Technology page: http://fp.gladman.plus.com/AES/
OpenSSL home page: http://www.openssl.org/
SNOW home page: http://www.it.lth.se/cryptology/snow/
Serpent home page: http://www.cl.cam.ac.uk/~rja14/serpent.html
Recipe 5.4, Recipe 5.23
5.3 Selecting an Appropriate Key Length
5.3.1 Problem
You are using a cipher with a variable key length and need to decide which key length to use.
5.3.2 Solution
Strike a balance between long-term security needs and speed requirements. The weakest commonly
used key length we would recommend in practice would be Triple-DES keys (112 effective bits). For
almost all other algorithms worth considering, it is easy to use 128-bit keys, and you should do so.
Some would even recommend using a key size that's twice as big as the effective strength you'd like
(but this is unnecessary if you properly use a nonce when you encrypt; see Section 5.3.3).
5.3.3 Discussion
Some ciphers offer configurable key lengths. For example, AES allows 128-bit, 192-bit, or 256-bit
keys, whereas RC4 allows for many different sizes, but 40 bits and 128 bits are the common
configurations. The ease with which an attacker can perform a brute-force attack (trying out every
possible key) is based not only on key length, but also on the financial resources of the attacker. 56-bit
keys are trivial for a well-funded government to break, and even a person with access to a
reasonable array of modern desktop hardware can break 56-bit keys fairly quickly. Therefore, the
lifetime of 56-bit keys is unreasonable for any security needs. Unfortunately, there are still many
locations where 40-bit keys or 56-bit keys are used, because weak encryption used to be the
maximum level of encryption that could be exported from the United States.
Symmetric key length recommendations do not apply to public key lengths. See
Recipe 7.3 for public key length recommendations.
Supporting cryptographically weak configurations is a risky proposition. Not only are the people who
are legitimately using those configurations at risk, but unless you are extremely careful in your
protocol design, it is also possible that an attacker can force the negotiation of an insecure
configuration by acting as a "man in the middle" during the initial phases of a connection, before
full-fledged encryption begins. Such an attack is often known as a rollback attack, because the attacker
forces the communicating parties to use a known insecure version of the protocol. (We discuss how
to thwart such attacks in Recipe 10.7.)
In the real world, people try very hard to get to 80 bits of effective security, which we feel is the
minimum effective strength you should accept. Generally, 128 bits of effective security is considered
probably enough for all time, if the best attack that can be launched against a system is brute force.
However, even if using the right encryption mode, that still assumes no cryptographic weaknesses in
the cipher whatsoever.
In addition, depending on the way you use encryption, there are precomputation and collision attacks
that allow the attacker to do better than brute force. The general rule of thumb is that the effective
strength of a block cipher is actually half the key size, assuming the cipher has no known attacks that
are better than brute force.
However, if you use random data properly, you generally get a bit of security back for each bit of the
data (assuming it's truly random; see Recipe 11.1 for more discussion about this). The trick is using
such data properly. In CBC mode, generally the initialization vector for each message sent should be
random, and it will thwart these attacks. In most other modes, the initialization vector acts more like
a nonce, where it must be different for each message but doesn't have to be completely random. In
such cases, you can select a random value at key setup time, then construct per-message initializers
by combining the random value and a message counter.
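As a hedged illustration of that last point (the layout shown here is one reasonable choice, not a
standard), a 16-byte per-message initializer for a 128-bit block cipher might combine a 64-bit random
value chosen at key setup with a 64-bit message counter:
#include <string.h>

/* Fill a 16-byte initializer: 8 random bytes fixed at key setup time,
 * followed by a big-endian 64-bit message counter. The random half helps
 * thwart precomputation; the counter half guarantees per-message uniqueness.
 */
void make_per_message_iv(const unsigned char rand8[8],
                         unsigned long long msg_counter,
                         unsigned char iv[16]) {
  int i;

  memcpy(iv, rand8, 8);
  for (i = 0;  i < 8;  i++)
    iv[8 + i] = (unsigned char)(msg_counter >> (56 - 8 * i));
}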
In any event, with a 128-bit key, we strongly recommend that any system you build use a 64-bit
random value in some fashion to thwart such attacks.
Should you use key lengths greater than 128 bits, especially considering that so many algorithms
provide for them? For example, AES allows for 128-bit, 192-bit, and 256-bit keys. Longer key lengths
provide more security, yet for AES they are less efficient (in most other variable key length ciphers,
setup gets more expensive, but encryption does not). In several of our own benchmarks, 128-bit AES
is generally only about 33% faster than 256-bit AES. Also, 256-bit AES runs at least 50% faster than
Triple-DES does. When it was the de facto standard, Triple-DES was considered adequate for almost
all applications.
In the real world, 128 bits of security may be enough for all time, even considering that the ciphers
we use today are probably nowhere near as good as they could be. And if it ever becomes something
to worry about, it will be news on geek web sites like Slashdot. Basically, when the U.S. government
went through the AES standardization process, they were thinking ahead in asking for algorithms
capable of supporting 192-bit and 256-bit keys, just in case future advances like quantum computing
somehow reduce the effective key strength of symmetric algorithms.
Until there's a need for bigger keys, we recommend sticking with 128-bit keys when using AES, as
there is no reason to take the efficiency hit of a larger key size. We say this particularly because we
don't see anything on the horizon that is even a remote threat.
However, this advice assumes you're really getting 128 bits of effective strength. If you refuse to use
random data to thwart collision and precomputation attacks, it definitely makes sense to
move to larger key sizes to obtain your desired security margin.
5.3.4 See Also
Recipe 5.3, Recipe 7.3, Recipe 10.7, Recipe 11.1
5.4 Selecting a Cipher Mode
5.4.1 Problem
You need to use a low-level interface to encryption. You have chosen a block cipher and need to
select the mode in which to use that cipher.
5.4.2 Solution
There are various tradeoffs. For general-purpose use, we recommend CWC mode in conjunction with
AES, as we discuss in the following section. If you wish to do your own message authentication, we
recommend CTR mode, as long as you're careful with it.
5.4.3 Discussion
First, we should emphasize that you should use a low-level mode only if it is absolutely necessary,
because of the ease with which accidental security vulnerabilities can arise. For general-purpose use,
we recommend a high-level abstraction, such as that discussed in Recipe 5.16.
With that out of the way, we'll note that each cipher mode has its advantages and drawbacks.
Certain drawbacks are common to all of the popular cipher modes and should usually be solved at
another layer. In particular:
If a network attack destroys or modifies data in transit, any cipher mode that does not perform
integrity checking will, if the attacker does his job properly, fail to detect an error. The modes
we discuss that provide built-in integrity checking are CWC, CCM, and OCB.
When an attacker does tamper with a data stream by adding or truncating, most modes will be
completely unable to recover. In some limited circumstances, CFB mode can recover, but this
problem is nonetheless better solved at the protocol layer.
Especially when padding is not necessary, the ciphertext length gives away information about
the length of the original message, which can occasionally be useful to an attacker. This is a
covert channel, but one that most people choose to ignore. If you wish to eliminate risks with
regard to this problem, pad to a large length, even if padding is not needed. To get rid of the
risk completely, send fixed-size messages at regular intervals, whether or not there is "real"
data to send. Bogus messages to eliminate covert channels are called cover traffic.
Block ciphers leak information about the key as they get used. Some block cipher modes leak a
lot more information than others. In particular, CBC mode leaks a lot more information than
something like CTR mode.
If you do not use a cipher mode that provides built-in integrity checking, be
sure to use a MAC (message authentication code) whenever encrypting.
In the following sections, we'll go over the important properties of each of the most popular modes,
pointing out the tradeoffs involved with each (we'll avoid discussing the details of the modes here;
we'll do that in later recipes). Note that if a problem is listed for only a single cipher mode and goes
unmentioned elsewhere, it is not a problem for those other modes. For each of the modes we
discuss, speed is not a significant concern; the only thing that has a significant impact on
performance is the underlying block cipher.[9]
[9]
Integrity-aware modes will necessarily be slower than raw encryption modes, but CWC and OCB are faster
than combining an integrity primitive with a standard mode, and CCM is just as fast as doing so.
5.4.3.1 Electronic Code Book (ECB) mode
This mode simply breaks up a message into blocks and directly encrypts each block with the raw
encryption operation. It does not have any desirable security properties and should not be used
under any circumstances. We cover raw encryption as a building block for building other modes, but
we don't cover ECB itself because of its poor security properties.
ECB has been standardized by NIST (the U.S. National Institute of Standards and Technology).
The primary disadvantages of ECB mode are:
Encrypting a block of a fixed value always yields the same result, making ECB mode particularly
susceptible to dictionary attacks.
When encrypting more than one block and sending the results over an untrusted medium, it is
particularly easy to add or remove blocks without detection (that is, ECB is susceptible to
tampering, capture replay, and other problems). All other cipher modes that lack integrity
checking have similar problems, but ECB is particularly bad.
The inputs to the block cipher are never randomized because they are always exactly equal to
the corresponding block of plaintext.
Offline precomputation is feasible.
The mode does have certain advantages, but do note that other modes share these advantages:
Multiblock messages can be broken up, and the pieces encrypted in parallel.
Random access of messages is possible; the 1,024th block can be decrypted without decrypting
other data blocks.
However, the advantages of ECB do not warrant its use.
We do discuss how to use ECB to encrypt a block at a time in Recipe 5.5, when it is necessary in
implementing other cryptographic primitives.
5.4.3.2 Cipher Block Chaining (CBC) mode
CBC mode is a simple extension to ECB mode that adds a significant amount of security. CBC works
by breaking the message up into blocks, then using XOR to combine the ciphertext of the previous
block with the plaintext of the current block. The result is then encrypted in ECB mode. The very first
block of plaintext is XOR'd with an initialization vector (IV). The IV can be publicly known, and it must
be randomly selected for maximum security. Many people use sequential IVs or even fixed IVs, but
that is not at all recommended. For example, SSL has had security problems in the past when using
CBC without random IVs. Also note that if there are common initial strings, CBC mode can remain
susceptible to dictionary attacks if no IV or similar mechanism is used. As with ECB, padding is
required, unless messages are always block-aligned.
CBC has been standardized by NIST.
The primary disadvantages of CBC mode are:
Encryption cannot be parallelized (though decryption can be, and there are encryption
workarounds that break interoperability; see Recipe 5.14).
There is no possibility of offline precomputation.
Capture replay of entire or partial messages can be possible without additional consideration.
The mode requires an initial input that must be random. It is not sufficient to use a unique but
predictable value.
The mode leaks more information than is optimal. We wouldn't use it to output more than 2^40
blocks.
The primary advantage of CBC mode is that it captures the desirable properties of ECB mode,
while removing most of the drawbacks.
We discuss CBC mode in Recipe 5.6.
5.4.3.3 Counter (CTR) mode
Whereas ECB and CBC are block-based modes, counter (CTR) mode and the rest of the modes
described in this section simulate a stream cipher. That is, they use block-based encryption as an
underlying primitive to produce a pseudo-random stream of data, known as akeystream. The
plaintext is turned into ciphertext by XOR'ing it with the keystream.
CTR mode generates a block's worth of keystream by encrypting a counter using ECB mode. The
result of the encryption is a block of keystream. The counter is then incremented. Generally, the
counter being publicly known is acceptable, though it's always better to keep it a secret if possible.
The counter can start at a particular value, such as zero, or something chosen at random, and
increment by one every time. (The initial counter value is a nonce, which is subtly different from an
initialization vector; see Recipe 4.9.) Alternatively, the counter can be modified every time using a
deterministic pseudo-random number generator that doesn't repeat until all possible values are
generated. The only significant requirements are that the counter value never be repeated and that
both sides of an encryption channel know the order in which to use counters. In practice, part of the
counter is usually chosen randomly at keying time, and part is sequential. Both parts help thwart
particular kinds of risks.
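For a concrete picture (a bare sketch only; see Recipe 5.9 for a complete treatment, and note that
this omits the integrity checking discussed above), one block of CTR keystream can be produced with
OpenSSL's raw AES interface like this:
#include <openssl/aes.h>

/* Produce one 16-byte block of CTR keystream: encrypt the 16-byte counter
 * block (nonce plus counter) with the raw AES encryption operation, then
 * bump the counter portion for the next block.
 */
void ctr_keystream_block(AES_KEY *expanded_key, unsigned char ctr_block[16],
                         unsigned char keystream[16]) {
  int i;

  AES_encrypt(ctr_block, keystream, expanded_key);

  /* Increment the counter block as a big-endian integer. */
  for (i = 15;  i >= 0;  i--)
    if (++ctr_block[i]) break;
}
The caller is assumed to have keyed expanded_key with AES_set_encrypt_key( ) and to XOR the
resulting keystream with the plaintext.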
Despite being over 20 years old, CTR mode has only recently been standardized by NIST as part of
the AES standardization process.
The primary disadvantages of CTR mode are:
Flipping bits in the plaintext is very easy because flipping a ciphertext bit flips the corresponding
plaintext bit (this problem is shared with all stream cipher modes). As with other encryption
algorithms, message integrity checks are absolutely necessary for adequate security.
Reusing {key, counter} pairs is disastrous. Generally, if there is any significant risk of reusing a
{key, nonce} pair (e.g., across reboot), it is best to avoid ever reusing a single key across
multiple messages (or data streams). (See Recipe 4.11 for advice if you wish to use one base
secret and derive multiple secrets from it.)
CTR mode has inadequate security when using ciphers with 64-bit blocks, unless you use a large
random nonce and a small counter, which drastically limits the number of messages that can be
sent. For this reason, OCB is probably still preferable for such ciphers, but CTR is clearly better
for 128-bit block ciphers.
The primary advantages of CTR mode are:
The keystream can be precomputed.
The keystream computation can be done in parallel.
Random access into the keystream is possible. (The 1,024th byte can be decrypted with only a
single raw encryption operation.)
For ciphers where raw encryption and decryption require separate algorithms (particularly AES),
only a single algorithm is necessary. In such a case, the faster of the two algorithms can be
used (though you will get incompatible results if you use decryption where someone else uses
encryption).
CTR mode leaks incredibly little information about the key. After 2^64 encryptions, an attacker
would learn about a bit's worth of information on a 128-bit key.
CTR mode is old and simple, and its security properties are well understood. It has recently gained a
lot of favor in the cryptographic community over other solutions for using block ciphers in streaming
modes, particularly as the world moves to AES with its 128-bit blocks.
Many of the "better" modes that provide built-in integrity checking, such as CWC and CCM mode, use
CTR mode as a component because of its desirable properties.
We discuss CTR mode in Recipe 5.9.
5.4.3.4 Output Feedback (OFB) mode
OFB mode is another streaming mode, much like CTR mode. The keystream is generated by
continually encrypting the last block of keystream to produce the next block. The first block of
keystream is generated by encrypting a nonce. OFB mode shares many properties with CTR mode,
although CTR mode has additional benefits. Therefore, OFB mode is seeing less and less use these
days.
OFB mode has been standardized by NIST.
The primary disadvantages of OFB mode are:
Bit-flipping attacks are easy, as with any streaming mode. Again, integrity checks are a must.
Reusing a {key, nonce} pair is disastrous (but is easy to avoid). Generally, if there is any
significant risk of reusing a {key, nonce} pair (e.g., across reboot), it is best to avoid reusing a
single key across multiple messages or data streams. (See Recipe 4.11 for advice if you wish to
use one base secret and derive multiple secrets from it.)
Keystream computation cannot be done in parallel.
The primary advantages of OFB mode are:
Keystreams can be precomputed.
For ciphers where raw encryption and decryption operations require separate algorithms
(particularly AES), only a single algorithm is necessary. In such a case, the faster of the two
algorithms can be used (though you will get incompatible results if you use decryption where
someone else uses encryption).
It does not have nonce-size problems when used with 64-bit block ciphers.
When used properly, it leaks information at the same (slow) rate that CTR mode does.
We discuss OFB mode in Recipe 5.8.
5.4.3.5 Cipher Feedback (CFB) mode
CFB mode generally works similarly to OFB mode, except that in its most common configuration, it
produces keystream by always encrypting the last block of ciphertext, instead of the last block of
keystream.
CFB mode has been standardized by NIST.
The primary disadvantages of CFB mode are:
Bit-flipping attacks are easy, as with any streaming mode. Again, integrity checks are a must.
Reusing a {key, nonce} pair is disastrous (but is easy to avoid). Generally, if there is any
significant risk of reusing a {key, nonce} pair (e.g., across reboot), it is best to avoid reusing a
single key across multiple messages or data streams.
Encryption cannot be parallelized (though decryption can be).
The primary advantages of CFB mode are:
For ciphers where raw encryption and decryption operations require separate algorithms
(particularly AES), only a single algorithm is necessary. In such a case, the faster of the two
algorithms can be used.
A minor bit of precomputational work can be done in advance of receiving a block-sized element
of data, but this is not very significant compared to CTR mode or OFB mode.
It does not have nonce-size problems when used with 64-bit block ciphers.
These days, CFB mode is rarely used because CTR mode and OFB mode provide more advantages
with no additional drawbacks.
We discuss CFB mode in Recipe 5.7.
5.4.3.6 Carter-Wegman + CTR (CWC) mode
CWC mode is a high-level encryption mode that provides both encryption and built-in message
integrity, similar to CCM and OCB modes (discussed later).
CWC is a new mode, introduced by Tadayoshi Kohno, John Viega, and Doug Whiting. NIST is
currently considering CWC mode for standardization.
The primary disadvantages of CWC are:
The required nonce must never be reused (this is easy to avoid).
It isn't well suited for use with 64-bit block ciphers. It does work well with AES, of course.
The primary advantages of CWC mode are:
CWC ensures message integrity in addition to performing encryption.
The additional functionality requires minimal message expansion. (You would need to send the
same amount of data to perform integrity checking with any of the cipher modes described
earlier.)
CWC is parallelizable (hardware implementations can achieve speeds above 10 gigabits per
second).
CWC has provable security properties while using only a single block cipher key. This means
that under reasonable assumptions on the underlying block cipher, the mode provides excellent
secrecy and message integrity if the nonce is always unique.
CWC leverages all the good properties of CTR mode, such as being able to handle messages
without padding and being slow to leak information.
For ciphers where raw encryption and decryption operations require separate algorithms
(particularly AES), only a single algorithm is necessary. In such a case, the faster of the two
algorithms can be used (though you will get incompatible results if you use decryption where
someone else uses encryption).
We believe that the advantages of CWC mode make it more appealing for general-purpose use than
all other modes. However, the problem of repeating nonces is a serious one that developers often get
wrong. See Recipe 5.10, where we provide a high-level wrapper to CWC mode that is designed to
circumvent such problems.
5.4.3.7 Offset Codebook (OCB) mode
OCB mode is a patented encryption mode that you must license to use.[10] CWC offers similar
properties and is not restricted by patents.
[10]
At least one other patent also needs to be licensed to use this mode legally.
OCB is reasonably new. It was introduced by Phil Rogaway and is based on earlier work at IBM. Both
parties have patents covering this work, and a patent held by the University of Maryland also may
apply. OCB is not under consideration by any standards movements.
The primary disadvantages of OCB mode are:
It is restricted by patents.
The required nonce must never be reused (this is easy to avoid).
It isn't well suited for use with 64-bit block ciphers. It does work well with AES, of course.
The primary advantages of OCB mode are:
OCB ensures message integrity in addition to performing encryption.
The additional functionality requires minimal message expansion (you would need to send the
same amount of data to perform integrity checking with any of the previously mentioned cipher
modes).
OCB is fully parallelizable (hardware implementations can achieve speeds above 10 gigabits per
second).
OCB has provable security properties while using only a single block cipher key. This means that
under reasonable assumptions on the underlying block cipher, the mode provides excellent
secrecy and message integrity if the nonce is always unique.
Messages can be of arbitrary length (there is no need for block alignment).
For ciphers where raw encryption and decryption operations require separate algorithms
(particularly AES), only a single algorithm is necessary. In such a case, the faster of the two
algorithms can be used (though you will get incompatible results if you use decryption where
someone else uses encryption).
Because of its patent status and the availability of free alternatives with essentially identical
properties (particularly CWC mode), we recommend against using OCB mode. If you're interested in
using it anyway, see Phil Rogaway's OCB page at http://www.cs.ucdavis.edu/~rogaway/ocb/.
5.4.3.8 CTR plus CBC-MAC (CCM) mode
While OCB mode has appealing properties, its patent status makes it all but useless for most
applications. CCM is another alternative that provides many of the same properties, without any
patent encumbrance. There are some disadvantages of CCM mode, however:
While encryption and decryption can be parallelized, the message integrity check cannot be.
OCB and CWC both avoid this limitation.
In some applications, CCM can be nonoptimal because the length of the message must be
known before processing can begin.
The required nonce must never be reused (this is easy to avoid).
It isn't well suited to 64-bit block ciphers. It does work well with AES, of course.
CCM is also fairly new (more recent than OCB, but a bit older than CWC). It was introduced by Doug
Whiting, Russ Housley, and Niels Ferguson. NIST is currently considering it for standardization.
The primary advantages of CCM mode are:
CCM ensures message integrity in addition to performing encryption.
The message integrity functionality requires minimal message expansion (you would need to
send the same amount of data to perform integrity checking with any of the previously
mentioned cipher modes).
CCM has provable security properties while using only a single key. This means that under
reasonable assumptions on the underlying block cipher, the mode provides near-optimal secrecy
and message integrity if the required nonce is always unique.
CCM leverages most of the good properties of CTR mode, such as being able to handle
messages without padding and being slow to leak information.
For ciphers where raw encryption and decryption operations require separate algorithms
(particularly AES), only a single algorithm is necessary. In such a case, the faster of the two
algorithms can be used (though you will get incompatible results if you use decryption where
someone else uses encryption).
In this book, we focus on CWC mode instead of CCM mode because CWC mode offers additional
advantages, even though in many environments those advantages are minor. However, if you wish
to use CCM mode, we recommend that you grab an off-the-shelf implementation of it because the
mode is somewhat complex in comparison to standard modes. As of this writing, there are three free,
publicly available implementations of CCM mode:
The reference implementation: http://hifn.com/support/ccm.htm
The implementation from Secure Software: http://www.securesoftware.com/ccm.php
The implementation from Brian Gladman: http://fp.gladman.plus.com/AES/ccm.zip
5.4.4 See Also
CCM reference implementation: http://hifn.com/support/ccm.htm
CCM implementation from Secure Software: http://www.securesoftware.com/ccm.php
CCM implementation from Brian Gladman: http://fp.gladman.plus.com/AES/ccm.zip
CWC home page: http://www.zork.org/cwc/
OCB home page: http://www.cs.ucdavis.edu/~rogaway/ocb/
Recipe 4.9, Recipe 4.11, Recipe 5.5-Recipe 5.10, Recipe 5.14, Recipe 5.16
[ Team LiB ]
[ Team LiB ]
5.5 Using a Raw Block Cipher
5.5.1 Problem
You're trying to make one of our implementations for other block cipher modes work. They all use
raw encryption operations as a foundation, and you would like to understand how to plug in
third-party implementations.
5.5.2 Solution
Raw operations on block ciphers consist of three operations: key setup, encryption of a block, and
decryption of a block. In other recipes, we provide three macros that you need to implement to use
our code. In the discussion for this recipe, we'll look at several desirable bindings for these macros.
5.5.3 Discussion
Do not use raw encryption operations in your own designs! Such operations
should only be used as a fundamental building block by skilled cryptographers.
Raw block ciphers operate on fixed-size chunks of data. That size is called the block size. The input
and output are of this same fixed length. A block cipher also requires a key, which may be of a
different length than the block size. Sometimes an algorithm will allow variable-length keys, but the
block size is generally fixed.
Setting up a block cipher generally involves turning the raw key into a key schedule. Basically, the
key schedule is just a set of keys derived from the original key in a cipher-dependent manner. You
need to create the key schedule only once; it's good for every use of the underlying key because raw
encryption always gives the same result for any {key, input} pair (the same is true for decryption).
Once you have a key schedule, you can generally pass it, along with an input block, into the cipher
encryption function (or the decryption function) to get an output block.
To keep the example code as simple as possible, we've written it assuming you are going to want to
use one and only one cipher with it (though it's not so difficult to make the code work with multiple
ciphers).
To get the code in this book working, you need to define several macros:
SPC_BLOCK_SZ
Denotes the block size of the cipher in bytes.
SPC_KEY_SCHED
This macro must be an alias for the key schedule type that goes along with your cipher. This
value will be library-specific and can be implemented by typedef instead of through a macro.
Note that the key schedule type should be an array of bytes of some fixed size, so that we can
ask for the size of the key schedule using sizeof(SPC_KEY_SCHED).
SPC_ENCRYPT_INIT(sched, key, keybytes) and SPC_DECRYPT_INIT(sched, key, keybytes)
Both of these macros take a pointer to a key schedule to write into, the key used to derive that
schedule, and the number of bytes in that key. If you are using an algorithm with fixed-size
keys, you can ignore the third parameter. Note that once you've built a key schedule, you
shouldn't be able to tell the difference between different key lengths. In many
implementations, initializing for encryption and initializing for decryption are the same
operation.
SPC_DO_ENCRYPT(sched, in, out) and SPC_DO_DECRYPT(sched, in, out)
Both of these macros are expected to take a pointer to a key schedule and two pointers to
memory corresponding to the input block and the output block. Both blocks are expected to be
of size SPC_BLOCK_SZ.
In the following sections, we'll provide some bindings for these macros for Brian Gladman's AES
implementation and for the OpenSSL API. Unfortunately, we cannot use Microsoft's CryptoAPI
because it does not allow for exchanging symmetric encryption keys without encrypting them (see
Recipe 5.26 and Recipe 5.27 to see how to work around this limitation), and that would add
significant complexity to what we're trying to achieve with this recipe. In addition, AES is only
available in the .NET framework, which severely limits portability across various Windows versions.
(The .NET framework is available only for Windows XP and Windows .NET Server 2003.)
5.5.3.1 Brian Gladman's AES implementation
Brian Gladman has written the fastest freely available AES implementation to date. He has a version
in x86 assembly that works with Windows and a portable C version that is faster than the assembly
versions other people offer. It's available from his web page at http://fp.gladman.plus.com/AES/.
To bind his implementation to our macros, do the following:
#include "aes.h"
#define
typedef
#define
#define
#define
#define
SPC_BLOCK_SZ 16
aes_ctx SPC_KEY_SCHED;
SPC_ENCRYPT_INIT(sched, key, keybytes)
SPC_DECRYPT_INIT(sched, key, keybytes)
SPC_DO_ENCRYPT(sched, in, out)
SPC_DO_DECRYPT(sched, in, out)
aes_enc_key(key, keybytes, sched)
aes_dec_key(key, keybytes, sched)
aes_enc_block(in, out, sched)
aes_dec_block(in, out, sched)
5.5.3.2 OpenSSL block cipher implementations
Next, we'll provide implementations for these macros for all of the ciphers in OpenSSL 0.9.7. Note
that the block size for all of the algorithms listed in this section is 8 bytes, except for AES, which is
16.
Table 5-2 lists the block ciphers that OpenSSL exports, along with the header file you need to include
for each cipher and the associated type for the key schedule.
Table 5-2. Block ciphers supported by OpenSSL
Cipher              Header file           Key schedule type
AES                 openssl/aes.h         AES_KEY
Blowfish            openssl/blowfish.h    BF_KEY
CAST5               openssl/cast.h        CAST_KEY
DES                 openssl/des.h         DES_key_schedule
3-key Triple-DES    openssl/des.h         DES_EDE_KEY
2-key Triple-DES    openssl/des.h         DES_EDE_KEY
IDEA                openssl/idea.h        IDEA_KEY_SCHEDULE
RC2                 openssl/rc2.h         RC2_KEY
RC5                 openssl/rc5.h         RC5_32_KEY
Table 5-3 provides implementations of the SPC_ENCRYPT_INIT macro for each of the block ciphers
listed in Table 5-2.
Table 5-3. Implementations for the SPC_ENCRYPT_INIT macro for each
OpenSSL-supported block cipher
Cipher              OpenSSL-based SPC_ENCRYPT_INIT implementation
AES                 AES_set_encrypt_key(key, keybytes * 8, sched)
Blowfish            BF_set_key(sched, keybytes, key)
CAST5               CAST_set_key(sched, keybytes, key)
DES                 DES_set_key_unchecked((DES_cblock *)key, sched)
3-key Triple-DES    DES_set_key_unchecked((DES_cblock *)key, &sched->ks1);       \
                    DES_set_key_unchecked((DES_cblock *)(key + 8), &sched->ks2); \
                    DES_set_key_unchecked((DES_cblock *)(key + 16), &sched->ks3);
2-key Triple-DES    DES_set_key_unchecked((DES_cblock *)key, &sched->ks1);       \
                    DES_set_key_unchecked((DES_cblock *)(key + 8), &sched->ks2);
IDEA                idea_set_encrypt_key(key, sched);
RC2                 RC2_set_key(sched, keybytes, key, keybytes * 8);
RC5                 RC5_32_set_key(sched, keybytes, key, 12);
In most of the implementations in Table 5-3, SPC_DECRYPT_INIT will be the same as
SPC_ENCRYPT_INIT (you can define one to the other). The two exceptions are AES and IDEA. For
AES:
#define SPC_DECRYPT_INIT(sched, key, keybytes) \
AES_set_decrypt_key(key, keybytes * 8, sched)
For IDEA:
#define SPC_DECRYPT_INIT(sched, key, keybytes) { \
IDEA_KEY_SCHEDULE tmp;\
idea_set_encrypt_key(key, &tmp);\
idea_set_decrypt_key(&tmp, sched);\
}
Table 5-4 and Table 5-5 provide implementations of the SPC_DO_ENCRYPT and SPC_DO_DECRYPT
macros.
Table 5-4. Implementations for the SPC_DO_ENCRYPT macro for each OpenSSL-supported block cipher
Cipher              OpenSSL-based SPC_DO_ENCRYPT implementation
AES                 AES_encrypt(in, out, sched)
Blowfish            BF_ecb_encrypt(in, out, sched, 1)
CAST5               CAST_ecb_encrypt(in, out, sched, 1)
DES                 DES_ecb_encrypt(in, out, sched, 1)
3-key Triple-DES    DES_ecb3_encrypt((DES_cblock *)in, (DES_cblock *)out, \
                    &sched->ks1, &sched->ks2, &sched->ks3, 1);
2-key Triple-DES    DES_ecb3_encrypt((DES_cblock *)in, (DES_cblock *)out, \
                    &sched->ks1, &sched->ks2, &sched->ks1, 1);
IDEA                idea_ecb_encrypt(in, out, sched);
RC2                 RC2_ecb_encrypt(in, out, sched, 1);
RC5                 RC5_32_ecb_encrypt(in, out, sched, 1);
Table 5-5. Implementations for the SPC_DO_DECRYPT macro for each OpenSSL-supported block cipher
Cipher              OpenSSL-based SPC_DO_DECRYPT implementation
AES                 AES_decrypt(in, out, sched)
Blowfish            BF_ecb_encrypt(in, out, sched, 0)
CAST5               CAST_ecb_encrypt(in, out, sched, 0)
DES                 DES_ecb_encrypt(in, out, sched, 0)
3-key Triple-DES    DES_ecb3_encrypt((DES_cblock *)in, (DES_cblock *)out, \
                    &sched->ks1, &sched->ks2, &sched->ks3, 0);
2-key Triple-DES    DES_ecb3_encrypt((DES_cblock *)in, (DES_cblock *)out, \
                    &sched->ks1, &sched->ks2, &sched->ks1, 0);
IDEA                idea_ecb_encrypt(in, out, sched);
RC2                 RC2_ecb_encrypt(in, out, sched, 0);
RC5                 RC5_32_ecb_encrypt(in, out, sched, 0);
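For concreteness, here is one way the AES rows of Tables 5-2 through 5-5 might be collected into the
macro bindings described earlier. This is a sketch of ours using the standard OpenSSL AES calls;
adapt it for whichever cipher you actually use.
#include <openssl/aes.h>

#define SPC_BLOCK_SZ 16

typedef AES_KEY SPC_KEY_SCHED;

#define SPC_ENCRYPT_INIT(sched, key, keybytes) \
            AES_set_encrypt_key(key, keybytes * 8, sched)
#define SPC_DECRYPT_INIT(sched, key, keybytes) \
            AES_set_decrypt_key(key, keybytes * 8, sched)
#define SPC_DO_ENCRYPT(sched, in, out) AES_encrypt(in, out, sched)
#define SPC_DO_DECRYPT(sched, in, out) AES_decrypt(in, out, sched)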
5.5.4 See Also
Brian Gladman's AES page: http://fp.gladman.plus.com/AES/
OpenSSL home page: http://www.openssl.org/
Recipe 5.4, Recipe 5.26, Recipe 5.27.
[ Team LiB ]
[ Team LiB ]
5.6 Using a Generic CBC Mode Implementation
5.6.1 Problem
You want a more high-level interface for CBC mode than your library provides. Alternatively, you
want a portable CBC interface, or you have only a block cipher implementation and you would like to
use CBC mode.
5.6.2 Solution
CBC mode XORs each plaintext block with the previous output block before encrypting. The first block
is XOR'd with the IV. Many libraries provide a CBC implementation. If you need code that implements
CBC mode, you will find it in the following discussion.
5.6.3 Discussion
You should probably use a higher-level abstraction, such as the one discussed
in Recipe 5.16. Use a raw mode only when absolutely necessary, because there
is a huge potential for introducing a security vulnerability by accident. If you
still want to use CBC, be sure to use a message authentication code with it (see
Chapter 6).
CBC mode is a way to use a raw block cipher and, if used properly, it avoids all the security risks
associated with using the block cipher directly. CBC mode works on a message in blocks, where
blocks are a unit of data on which the underlying cipher operates. For example, AES uses 128-bit
blocks, whereas older ciphers such as DES almost universally use 64-bit blocks.
See Recipe 5.4 for a discussion of the advantages and disadvantages of this mode, as well as a
comparison to other cipher modes.
CBC mode works (as illustrated in Figure 5-1) by taking the ciphertext output for the previous block,
XOR'ing that with the plaintext for the current block, and encrypting the result with the raw block
cipher. The very first block of plaintext gets XOR'd with an initialization vector, which needs to be
randomly selected to ensure meeting security goals but which may be publicly known.
Many people use sequential IVs or even fixed IVs, but that is not at all
recommended. For example, SSL has had security problems in the past when
using CBC without random IVs. Also note that if there are common initial
strings, CBC mode can remain susceptible to dictionary attacks if no IV or
similar mechanism is used. As with ECB, padding is required unless messages
are always block-aligned.
Figure 5-1. CBC mode
Many libraries already come with an implementation of CBC mode for any ciphers they support.
Some don't, however. For example, you may only get an implementation of the raw block cipher
when you obtain reference code for a new cipher.
Generally, CBC mode requires padding. Because the cipher operates on block-sized quantities, it
needs to have a way of handling messages that do not break up evenly into block-sized parts. This is
done by adding padding to each message, as described in Recipe 5.11. Padding always adds to the
length of a message. If you wish to avoid message expansion, you have a couple of options. You can
ensure that your messages always have a length that is a multiple of the block size; in that case, you
can simply turn off padding. Otherwise, you have to use a different mode. See Recipe 5.4 for our
mode recommendations. If you're really a fan of CBC mode, you can support arbitrary-length
messages without message expansion using a modified version of CBC mode known as ciphertext
stealing or CTS mode. We do not discuss CTS mode in the book, but there is a recipe about it on this
book's web site.
Here, we present a reasonably optimized implementation of CBC mode that builds upon the raw block
cipher interface presented in Recipe 5.5. It also requires the spc_memset( ) function from Recipe
13.2.
5.6.3.1 The high-level API
This implementation has two APIs. The first API is the high-level API, which takes a message as input
and returns a dynamically allocated result. This API only deals with padded messages. If you want to
turn off cipher padding, you will need to use the incremental interface.
unsigned char *spc_cbc_encrypt(unsigned char *key, size_t kl, unsigned char *iv,
                               unsigned char *in, size_t il, size_t *ol);
unsigned char *spc_cbc_decrypt(unsigned char *key, size_t kl, unsigned char *iv,
                               unsigned char *in, size_t il, size_t *ol);
Both functions pass out the number of bytes in the result by writing to the memory pointed to by the
final argument. If decryption fails for some reason, spc_cbc_decrypt( ) will return 0. Such an error
means that the input was not a multiple of the block size, or that the padding was wrong.
These two functions erase the key from memory before exiting. You may want
to have them erase the plaintext as well.
Here's the implementation of the above interface:
#include <stdlib.h>
#include <string.h>
unsigned char *spc_cbc_encrypt(unsigned char *key, size_t kl, unsigned char *iv,
                               unsigned char *in, size_t il, size_t *ol) {
  SPC_CBC_CTX   ctx;
  size_t        tmp;
  unsigned char *result;

  if (!(result = (unsigned char *)malloc(((il / SPC_BLOCK_SZ) * SPC_BLOCK_SZ) +
                                         SPC_BLOCK_SZ))) return 0;
  spc_cbc_encrypt_init(&ctx, key, kl, iv);
  spc_cbc_encrypt_update(&ctx, in, il, result, &tmp);
  spc_cbc_encrypt_final(&ctx, result + tmp, ol);
  *ol += tmp;
  return result;
}

unsigned char *spc_cbc_decrypt(unsigned char *key, size_t kl, unsigned char *iv,
                               unsigned char *in, size_t il, size_t *ol) {
  int           success;
  size_t        tmp;
  SPC_CBC_CTX   ctx;
  unsigned char *result;

  if (!(result = (unsigned char *)malloc(il))) return 0;
  spc_cbc_decrypt_init(&ctx, key, kl, iv);
  spc_cbc_decrypt_update(&ctx, in, il, result, &tmp);
  if (!(success = spc_cbc_decrypt_final(&ctx, result + tmp, ol))) {
    *ol = 0;
    spc_memset(result, 0, il);
    free(result);
    return 0;
  }
  *ol += tmp;
  result = (unsigned char *)realloc(result, *ol);
  return result;
}
Note that this code depends on the SPC_CBC_CTX data type, as well as the incremental CBC interface,
neither of which we have yet discussed.
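As a quick usage sketch (ours, not part of the recipe), here is a round trip through the high-level
interface. The iv must be filled with a secure random value (see Recipe 4.9), and because both calls
wipe the key buffer they are given, we pass in a copy each time. The function and buffer names here
are ours.
void cbc_example(unsigned char *key, unsigned char *iv,
                 unsigned char *msg, size_t msglen) {
  size_t        ctlen, ptlen;
  unsigned char keycopy[16];     /* assumes a 16-byte (AES-128) key */
  unsigned char *ct, *pt;

  memcpy(keycopy, key, sizeof(keycopy));    /* spc_cbc_encrypt() wipes its key */
  if (!(ct = spc_cbc_encrypt(keycopy, sizeof(keycopy), iv, msg, msglen, &ctlen)))
    return;
  memcpy(keycopy, key, sizeof(keycopy));    /* ...and so does spc_cbc_decrypt() */
  if (!(pt = spc_cbc_decrypt(keycopy, sizeof(keycopy), iv, ct, ctlen, &ptlen))) {
    free(ct);                               /* wrong length or bad padding */
    return;
  }
  /* pt now holds ptlen ( = msglen) bytes equal to msg. */
  free(ct);
  free(pt);
}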
5.6.3.2 SPC_CBC_CTX data type
Let's look at the SPC_CBC_CTX data type. It's defined as:
typedef struct {
  SPC_KEY_SCHED ks;
  int           ix;
  int           pad;
  unsigned char iv[SPC_BLOCK_SZ];
  unsigned char ctbuf[SPC_BLOCK_SZ];
} SPC_CBC_CTX;
The ks field is an expanded version of the cipher key. The ix field is basically used to determine how
much data is needed before we have processed data that is a multiple of the block length. The pad
field specifies whether the API needs to add padding or should expect messages to be exactly
block-aligned. The iv field is used to store the initialization vector for the next block of encryption. The
ctbuf field is only used in decryption to cache ciphertext until we have enough to fill a block.
5.6.3.3 Incremental initialization
To begin encrypting or decrypting, we need to initialize the mode. Initialization is different for each
mode. Here are the functions for initializing an SPC_CBC_CTX object:
void spc_cbc_encrypt_init(SPC_CBC_CTX *ctx, unsigned char *key, size_t kl,
unsigned char *iv) {
SPC_ENCRYPT_INIT(&(ctx->ks), key, kl);
spc_memset(key, 0, kl);
memcpy(ctx->iv, iv, SPC_BLOCK_SZ);
ctx->ix = 0;
ctx->pad = 1;
}
void spc_cbc_decrypt_init(SPC_CBC_CTX *ctx, unsigned char *key, size_t kl,
unsigned char *iv) {
SPC_DECRYPT_INIT(&(ctx->ks), key, kl);
spc_memset(key, 0, kl);
memcpy(ctx->iv, iv, SPC_BLOCK_SZ);
ctx->ix = 0;
ctx->pad = 1;
}
These functions are identical, except that they call the appropriate method for keying, which may be
different depending on whether we're encrypting or decrypting. Both of these functions erase the key
that you pass in!
Note that the initialization vector (IV) must be selected randomly. You should also avoid encrypting
more than about 2^40 blocks of data using a single key. See Recipe 4.9 for more on initialization
vectors.
Now we can add data as we get it using the spc_cbc_encrypt_update( ) and
spc_cbc_decrypt_update( ) functions. These functions are particularly useful when a message
comes in pieces. You'll get the same results as if the message had come in all at once. When you
wish to finish encrypting or decrypting, you call spc_cbc_encrypt_final( ) or
spc_cbc_decrypt_final( ), as appropriate.
You're responsible for making sure the proper init, update, and final calls are
made, and that they do not happen out of order.
5.6.3.4 Incremental encrypting
The function spc_cbc_encrypt_update( ) has the following signature:
int spc_cbc_encrypt_update(SPC_CBC_CTX *ctx, unsigned char *in, size_t il,
unsigned char *out, size_t *ol);
This function has the following arguments:
ctx
Pointer to the SPC_CBC_CTX object associated with the current message.
in
Pointer to the plaintext data to be encrypted.
il
Number indicating how many bytes of plaintext are to be encrypted.
out
Pointer to a buffer where any incremental ciphertext output should be written.
ol
Pointer into which the number of ciphertext bytes written to the output buffer is placed. This
argument may be NULL, in which case the caller is already expected to know the length of the
output.
Our implementation of this function always returns 1, but a hardware-based
implementation might have an unexpected failure, so it's important to check
the return value!
This API is in the spirit of PKCS #11,[11] which provides a standard cryptographic interface to
hardware. We do this so that the above functions can have the bulk of their implementations
replaced with calls to PKCS #11-compliant hardware. Generally, PKCS #11 reverses the order of
input and output argument sets. Also, it does not securely wipe key material.
[11]
PKCS #11 is available from http://www.rsasecurity.com/rsalabs/pkcs/pkcs-11/.
Because this API is PKCS #11-compliant, it's somewhat more low-level than it
needs to be and therefore is a bit difficult to use properly. First, you need to be
sure that the output buffer is big enough to hold the input; otherwise, you will
have a buffer overflow. Second, you need to make sure the out argument
always points to the first unused byte in the output buffer; otherwise, you will
keep overwriting the same data every time spc_cbc_encrypt_update( )
outputs data.
If you are using padding and you know the length of the input message in advance, you can calculate
the output length easily. If the message is of a length that is an exact multiple of the block size, the
output message will be a block larger. Otherwise, the message will get as many bytes added to it as
necessary to make the input length a multiple of the block size. Using integer math, we can calculate
the output length as follows, where il is the input length:
((il / SPC_BLOCK_SZ) * SPC_BLOCK_SZ) + SPC_BLOCK_SZ
If we do not have the entire message at once, when using padding the easiest thing to do is to
assume there may be an additional block of output. That is, if you pass in 7 bytes, allocating 7 +
SPC_BLOCK_SZ is safe. If you wish to be a bit more precise, you can always add SPC_BLOCK_SZ bytes
to the input length, then reduce the number to the next block-aligned size. For example, if we have
an 8-byte block, and we call spc_cbc_encrypt_update( ) with 7 bytes, there is no way to get more
than 8 bytes of output, no matter how much data was buffered internally. Note that if no data was
buffered internally, we won't get any output!
Of course, you can exactly determine the amount of data to pass in if you are keeping track of how
many bytes are buffered at any given time (which you can do by looking at ctx->ix). If you do that,
add the buffered length to your input length. The amount of output is always the largest
block-aligned value less than or equal to this total length.
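If you would rather not repeat that arithmetic inline, a small helper such as the following captures
both bounds. This is our addition, not part of the recipe's API, and the names are ours.
/* Exact ciphertext length for a whole padded message of il bytes. */
size_t spc_cbc_padded_len(size_t il) {
  return ((il / SPC_BLOCK_SZ) * SPC_BLOCK_SZ) + SPC_BLOCK_SZ;
}

/* Output bound for a single spc_cbc_encrypt_update() call: the largest
 * block-aligned value less than or equal to the new input plus whatever
 * is already buffered (ctx->ix). */
size_t spc_cbc_update_bound(size_t il, size_t buffered) {
  return ((il + buffered) / SPC_BLOCK_SZ) * SPC_BLOCK_SZ;
}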
If you're not using padding, you will get a block of output for every block of input. To switch off
padding, you can call the following function, passing in 0 for the second argument:
void spc_cbc_set_padding(SPC_CBC_CTX *ctx, int pad) {
ctx->pad = pad;
}
Here's our implementation of spc_cbc_encrypt_update( ):
int spc_cbc_encrypt_update(SPC_CBC_CTX *ctx, unsigned char *in, size_t il,
unsigned char *out, size_t *ol) {
/* Keep a ptr to in, which we advance; we calculate ol by subtraction later. */
int           i;
unsigned char *start = out;
/* If we have leftovers, but not enough to fill a block, XOR them into the right
* places in the IV slot and return. It's not much stuff, so one byte at a time
* is fine.
*/
if (il < SPC_BLOCK_SZ-ctx->ix) {
while (il--) ctx->iv[ctx->ix++] ^= *in++;
if (ol) *ol = 0;
return 1;
}
/* If we did have leftovers, and we're here, fill up a block then output the
* ciphertext.
*/
if (ctx->ix) {
while (ctx->ix < SPC_BLOCK_SZ) --il, ctx->iv[ctx->ix++] ^= *in++;
SPC_DO_ENCRYPT(&(ctx->ks), ctx->iv, ctx->iv);
for (i = 0; i < SPC_BLOCK_SZ / sizeof(int); i++)
((unsigned int *)out)[i] = ((unsigned int *)(ctx->iv))[i];
out += SPC_BLOCK_SZ;
}
/* Operate on word-sized chunks, because it's easy to do so. You might gain a
* couple of cycles per loop by unrolling and getting rid of i if you know your
* word size a priori.
*/
while (il >= SPC_BLOCK_SZ) {
for (i = 0; i < SPC_BLOCK_SZ / sizeof(int); i++)
((unsigned int *)(ctx->iv))[i] ^= ((unsigned int *)in)[i];
SPC_DO_ENCRYPT(&(ctx->ks), ctx->iv, ctx->iv);
for (i = 0; i < SPC_BLOCK_SZ / sizeof(int); i++)
((unsigned int *)out)[i] = ((unsigned int *)(ctx->iv))[i];
out += SPC_BLOCK_SZ;
in += SPC_BLOCK_SZ;
il -= SPC_BLOCK_SZ;
}
/* Deal with leftovers... one byte at a time is fine. */
for (i = 0; i < il; i++) ctx->iv[i] ^= in[i];
ctx->ix = il;
if (ol) *ol = out-start;
return 1;
}
The following spc_cbc_encrypt_final( ) function outputs any remaining data and securely wipes
the key material in the context, along with all the intermediate state. If padding is on, it will output
one block. If padding is off, it won't output anything. If padding is off and the total length of the input
wasn't a multiple of the block size, spc_cbc_encrypt_final( ) will return 0. Otherwise, it will
always succeed.
int spc_cbc_encrypt_final(SPC_CBC_CTX *ctx, unsigned char *out, size_t *ol) {
int           ret;
unsigned char pad;
if (ctx->pad) {
pad = SPC_BLOCK_SZ - ctx->ix;
while (ctx->ix < SPC_BLOCK_SZ) ctx->iv[ctx->ix++] ^= pad;
SPC_DO_ENCRYPT(&(ctx->ks), ctx->iv, out);
spc_memset(ctx, 0, sizeof(SPC_CBC_CTX));
if(ol) *ol = SPC_BLOCK_SZ;
return 1;
}
if(ol) *ol = 0;
ret = !(ctx->ix);
spc_memset(ctx, 0, sizeof(SPC_CBC_CTX));
return ret;
}
This function has the following arguments:
ctx
Pointer to the SPC_CBC_CTX object being used for the current message.
out
Pointer to the output buffer, if any. It may be NULL when padding is disabled.
ol
The number of output bytes written to the output buffer is placed into this pointer. This
argument may be NULL, in which case the output length is not written.
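To make the buffer management concrete, here is a sketch (our example, not the recipe's) of
encrypting a message that arrives in two pieces. Note how the output pointer is advanced by the
reported length after every call; the function name is ours.
unsigned char *encrypt_two_pieces(unsigned char *key, size_t kl, unsigned char *iv,
                                  unsigned char *p1, size_t l1,
                                  unsigned char *p2, size_t l2, size_t *ctlen) {
  size_t        n, total = 0;
  SPC_CBC_CTX   ctx;
  unsigned char *ct;

  /* Worst case with padding on: every input byte plus one extra block. */
  if (!(ct = (unsigned char *)malloc(l1 + l2 + SPC_BLOCK_SZ))) return 0;
  spc_cbc_encrypt_init(&ctx, key, kl, iv);                 /* wipes key */
  spc_cbc_encrypt_update(&ctx, p1, l1, ct, &n);            total += n;
  spc_cbc_encrypt_update(&ctx, p2, l2, ct + total, &n);    total += n;
  spc_cbc_encrypt_final(&ctx, ct + total, &n);             total += n;
  *ctlen = total;
  return ct;
}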
5.6.3.5 Incremental decryption
The CBC decryption API is largely similar to the encryption API, with one major exception. When
encrypting, we can output a block of data every time we take in a block of data. When decrypting,
that's not possible. We can decrypt data, but until we know that a block isn't the final block, we can't
output it because part of the block may be padding. Of course, with padding turned off, that
restriction could go away, but our API acts the same with padding off, just to ensure consistent
behavior.
The spc_cbc_decrypt_update( ) function, shown later in this section, has the following signature:
int spc_cbc_decrypt_update(SPC_CBC_CTX *ctx, unsigned char *in, size_t il,
                           unsigned char *out, size_t *ol);
This function has the following arguments:
ctx
Pointer to the SPC_CBC_CTX object being used for the current message.
in
Pointer to the ciphertext input buffer.
il
Number of bytes contained in the ciphertext input buffer.
out
Pointer to a buffer where any incremental plaintext output should be written.
ol
Pointer into which the number of output bytes written to the output buffer is placed. This
argument may be NULL, in which case the output length is not written.
This function can output up to SPC_BLOCK_SZ - 1 bytes more than is input, depending on how much
data has previously been buffered.
int spc_cbc_decrypt_update(SPC_CBC_CTX *ctx, unsigned char *in, size_t il,
                           unsigned char *out, size_t *ol) {
  int           i;
  unsigned char *next_iv, *start = out;

  /* If there's not enough stuff to fit in ctbuf, dump it in there and return */
  if (il < SPC_BLOCK_SZ - ctx->ix) {
    while (il--) ctx->ctbuf[ctx->ix++] = *in++;
    if (ol) *ol = 0;
    return 1;
  }

  /* If there's stuff in ctbuf, fill it. */
  if (ctx->ix % SPC_BLOCK_SZ) {
    while (ctx->ix < SPC_BLOCK_SZ) {
      ctx->ctbuf[ctx->ix++] = *in++;
      --il;
    }
  }
  if (!il) {
    if (ol) *ol = 0;
    return 1;
  }

  /* If we get here, and the ctbuf is full, it can't be padding.  Spill it. */
  if (ctx->ix) {
    SPC_DO_DECRYPT(&(ctx->ks), ctx->ctbuf, out);
    for (i = 0;  i < SPC_BLOCK_SZ / sizeof(int);  i++) {
      ((int *)out)[i]     ^= ((int *)ctx->iv)[i];
      ((int *)ctx->iv)[i]  = ((int *)ctx->ctbuf)[i];
    }
    out += SPC_BLOCK_SZ;
  }
  if (il > SPC_BLOCK_SZ) {
    SPC_DO_DECRYPT(&(ctx->ks), in, out);
    for (i = 0;  i < SPC_BLOCK_SZ / sizeof(int);  i++)
      ((int *)out)[i] ^= ((int *)ctx->iv)[i];
    next_iv  = in;
    out     += SPC_BLOCK_SZ;
    in      += SPC_BLOCK_SZ;
    il      -= SPC_BLOCK_SZ;
  } else next_iv = ctx->iv;
  while (il > SPC_BLOCK_SZ) {
    SPC_DO_DECRYPT(&(ctx->ks), in, out);
    for (i = 0;  i < SPC_BLOCK_SZ / sizeof(int);  i++)
      ((int *)out)[i] ^= ((int *)next_iv)[i];
    next_iv  = in;
    out     += SPC_BLOCK_SZ;
    in      += SPC_BLOCK_SZ;
    il      -= SPC_BLOCK_SZ;
  }

  /* Store the IV. */
  for (i = 0;  i < SPC_BLOCK_SZ / sizeof(int);  i++)
    ((int *)ctx->iv)[i] = ((int *)next_iv)[i];

  ctx->ix = 0;
  while (il--) ctx->ctbuf[ctx->ix++] = *in++;
  if (ol) *ol = out - start;
  return 1;
}
Finalizing CBC-mode decryption is done with spc_cbc_decrypt_final( ), whose listing follows. This
function will return 1 if there are no problems or 0 if the total input length is not a multiple of the
block size or if padding is on and the padding is incorrect.
If the call is successful and padding is on, the function will write into the output buffer anywhere from
0 to SPC_BLOCK_SZ bytes. If padding is off, a successful function will always write SPC_BLOCK_SZ
bytes into the output buffer.
As with spc_cbc_encrypt_final( ), this function will securely erase the contents of the context
object before returning.
int spc_cbc_decrypt_final(SPC_CBC_CTX *ctx, unsigned char *out, size_t *ol) {
unsigned int i;
unsigned char pad;
if (ctx->ix != SPC_BLOCK_SZ) {
  int ok = (!ctx->ix && !ctx->pad);
  /* If there was no input, and there's no padding, then everything is OK.
   * Note that we compute the result before wiping the context. */
  if (ol) *ol = 0;
  spc_memset(&(ctx->ks), 0, sizeof(SPC_KEY_SCHED));
  spc_memset(ctx, 0, sizeof(SPC_CBC_CTX));
  return ok;
}
if (!ctx->pad) {
SPC_DO_DECRYPT(&(ctx->ks), ctx->ctbuf, out);
for (i = 0; i < SPC_BLOCK_SZ / sizeof(int); i++)
((int *)out)[i] ^= ((int *)ctx->iv)[i];
if (ol) *ol = SPC_BLOCK_SZ;
spc_memset(ctx, 0, sizeof(SPC_CBC_CTX));
return 1;
}
SPC_DO_DECRYPT(&(ctx->ks), ctx->ctbuf, ctx->ctbuf);
spc_memset(&(ctx->ks), 0, sizeof(SPC_KEY_SCHED));
for (i = 0; i < SPC_BLOCK_SZ / sizeof(int); i++)
((int *)ctx->ctbuf)[i] ^= ((int *)ctx->iv)[i];
pad = ctx->ctbuf[SPC_BLOCK_SZ - 1];
if (pad > SPC_BLOCK_SZ) {
if (ol) *ol = 0;
spc_memset(ctx, 0, sizeof(SPC_CBC_CTX));
return 0;
}
for (i = 1; i < pad; i++) {
if (ctx->ctbuf[SPC_BLOCK_SZ - 1 - i] != pad) {
if (ol) *ol = 0;
spc_memset(ctx, 0, sizeof(SPC_CBC_CTX));
return 0;
}
}
for (i = 0; i < SPC_BLOCK_SZ - pad; i++)
*out++ = ctx->ctbuf[i];
if (ol) *ol = SPC_BLOCK_SZ - pad;
spc_memset(ctx, 0, sizeof(SPC_CBC_CTX));
return 1;
}
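Decryption in pieces looks much the same. The following is a sketch of ours rather than part of the
recipe; because the plaintext is never longer than the ciphertext, a buffer of the total ciphertext
length is sufficient.
unsigned char *decrypt_two_pieces(unsigned char *key, size_t kl, unsigned char *iv,
                                  unsigned char *c1, size_t l1,
                                  unsigned char *c2, size_t l2, size_t *ptlen) {
  size_t        n, total = 0;
  SPC_CBC_CTX   ctx;
  unsigned char *pt;

  /* The plaintext is never longer than the ciphertext. */
  if (!(pt = (unsigned char *)malloc(l1 + l2))) return 0;
  spc_cbc_decrypt_init(&ctx, key, kl, iv);                 /* wipes key */
  spc_cbc_decrypt_update(&ctx, c1, l1, pt, &n);            total += n;
  spc_cbc_decrypt_update(&ctx, c2, l2, pt + total, &n);    total += n;
  if (!spc_cbc_decrypt_final(&ctx, pt + total, &n)) {      /* bad length or padding */
    free(pt);
    return 0;
  }
  *ptlen = total + n;
  return pt;
}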
5.6.4 See Also
PKCS #11 web page: http://www.rsasecurity.com/rsalabs/pkcs/pkcs-11/
Recipe 4.9, Recipe 5.4, Recipe 5.5, Recipe 5.11, Recipe 5.16, Recipe 13.2
[ Team LiB ]
[ Team LiB ]
5.7 Using a Generic CFB Mode Implementation
5.7.1 Problem
You want a more high-level interface for CFB mode than your library provides. Alternatively, you
want a portable CFB interface, or you have only a block cipher implementation and would like to use
CFB mode.
5.7.2 Solution
CFB mode generates keystream by encrypting a "state" buffer, which starts out being the nonce and
changes after each output, based on the actual outputted value.
Many libraries provide a CFB implementation. If you need code that implements this mode, you will
find it in the following Section 5.7.3.
5.7.3 Discussion
You should probably use a higher-level abstraction, such as the one discussed
in Recipe 5.16. Use a raw mode only when absolutely necessary, because there
is a huge potential for introducing a security vulnerability by accident. If you
still want to use CFB, be sure to use a message authentication code with it (see
Chapter 6).
CFB is a stream-based mode. Encryption occurs by XOR'ing the keystream bytes with the plaintext
bytes, as shown in Figure 5-2. The keystream is generated one block at a time, and it is always
dependent on the previous keystream block as well as the plaintext data XOR'd with the previous
keystream block.
CFB does this by keeping a "state" buffer, which is initially the nonce. As a block's worth of data gets
encrypted, the state buffer has some or all of its bits shifted out and ciphertext bits shifted in. The
amount of data shifted in before each encryption operation is the "feedback size," which is often the
block size of the cipher, meaning that the state function is always replaced by the ciphertext of the
previous block. See Figure 5-2 for a graphical view of CFB mode.
Figure 5-2. CFB mode
The block size of the cipher is important to CFB mode because keystream is produced in block-sized
chunks and therefore requires keeping track of block-sized portions of the ciphertext. CFB is
fundamentally a streaming mode, however, because the plaintext is encrypted simply by XOR'ing
with the CFB keystream.
In Recipe 5.4, we discuss the advantages and drawbacks of CFB and compare it to other popular
modes.
These days, CFB mode is rarely used because CTR and OFB modes (CTR mode in particular) provide
more advantages, with no additional drawbacks. Of course, we recommend a higher-level mode over
all of these, one that provides stronger security guarantees (for example, CWC or CCM mode).
Many libraries already come with an implementation of CFB mode for any ciphers they support.
However, some don't. For example, you may only get an implementation of the raw block cipher
when you obtain reference code for a new cipher.
In the following sections we present a reasonably optimized implementation of CFB mode that builds
upon the raw block cipher interface presented in Recipe 5.5. It also requires the spc_memset( )
function from Recipe 13.2.
This implementation is only for the case where the feedback size is equal to the
cipher block size. This is the most efficient mechanism and is no less secure
than other feedback sizes, so we strongly recommend this approach.
5.7.3.1 The high-level API
This implementation has two APIs. The first is a high-level API, which takes a message as input and
returns a dynamically allocated result.
unsigned char *spc_cfb_encrypt(unsigned char *key, size_t kl, unsigned char *nonce,
                               unsigned char *in, size_t il);
unsigned char *spc_cfb_decrypt(unsigned char *key, size_t kl, unsigned char *nonce,
                               unsigned char *in, size_t il);
Both of the previous functions output the same number of bytes as were input, unless a memory
allocation error occurs, in which case 0 is returned.
These two functions erase the key from memory before exiting. You may want
to have them erase the plaintext as well.
Here's the implementation of the interface:
#include <stdlib.h>
#include <string.h>
unsigned char *spc_cfb_encrypt(unsigned char *key, size_t kl, unsigned char *nonce,
unsigned char *in, size_t il) {
SPC_CFB_CTX   ctx;
unsigned char *out;
if (!(out = (unsigned char *)malloc(il))) return 0;
spc_cfb_init(&ctx, key, kl, nonce);
spc_cfb_encrypt_update(&ctx, in, il, out);
spc_cfb_final(&ctx);
return out;
}
unsigned char *spc_cfb_decrypt(unsigned char *key, size_t kl, unsigned char *nonce,
unsigned char *in, size_t il) {
SPC_CFB_CTX   ctx;
unsigned char *out;
if (!(out = (unsigned char *)malloc(il))) return 0;
spc_cfb_init(&ctx, key, kl, nonce);
spc_cfb_decrypt_update(&ctx, in, il, out);
spc_cfb_final(&ctx);
return out;
}
Note that this code depends on the SPC_CFB_CTX data type and the incremental CFB interface, both
discussed in the following sections.
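A brief usage sketch of ours: because spc_cfb_encrypt( ) erases the key buffer it is handed, pass in a
copy if you will need the key again, and never reuse a {key, nonce} pair. The wrapper name and the
key-copy buffer are our own.
unsigned char *cfb_encrypt_msg(unsigned char *key, size_t kl,
                               unsigned char *nonce, unsigned char *msg,
                               size_t msglen) {
  unsigned char keycopy[32];            /* large enough for any key we expect */

  if (kl > sizeof(keycopy)) return 0;
  memcpy(keycopy, key, kl);             /* the call below wipes keycopy */
  return spc_cfb_encrypt(keycopy, kl, nonce, msg, msglen);   /* 0 on malloc failure */
}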
5.7.3.2 The incremental API
Let's look at the SPC_CFB_CTX data type. It's defined as:
typedef struct {
  SPC_KEY_SCHED ks;
  int           ix;
  unsigned char nonce[SPC_BLOCK_SZ];
} SPC_CFB_CTX;
The ks field is an expanded version of the cipher key (block ciphers generally use a single key to
derive multiple keys for internal use). The ix field is used to determine how much keystream we
have buffered. The nonce field is really the buffer in which we store the input to the next encryption,
and it is the place where intermediate keystream bytes are stored.
To begin encrypting or decrypting, we need to initialize the mode. Initialization is the same operation
for both encryption and decryption:
void spc_cfb_init(SPC_CFB_CTX *ctx, unsigned char *key, size_t kl, unsigned char
*nonce) {
SPC_ENCRYPT_INIT(&(ctx->ks), key, kl);
spc_memset(key, 0, kl);
memcpy(ctx->nonce, nonce, SPC_BLOCK_SZ);
ctx->ix = 0;
}
Note again that we remove the key from memory during this operation.
Never use the same nonce (often called an IV in this context; see Recipe 4.9) twice with a single key.
To implement that recommendation effectively, never reuse a key. Alternatively, pick a random
starting IV each time you key, and never output more than about 2^40 blocks using a single key.
Now we can add data as we get it using the spc_cfb_encrypt_update( ) or
spc_cfb_decrypt_update( ) function, as appropriate. These functions are particularly useful when a
message may arrive in pieces. You'll get the same results as if it all arrived at once. When you want
to finish encrypting or decrypting, call spc_cfb_final( ).
You're responsible for making sure the proper init, update, and final calls are
made, and that they do not happen out of order.
The function spc_cfb_encrypt_update( ), which is shown later in this section, has the following
signature:
int spc_cfb_encrypt_update(SPC_CFB_CTX *ctx, unsigned char *in, size_t il,
unsigned char *out);
This function has the following arguments:
ctx
Pointer to the SPC_CFB_CTX object associated with the current message.
in
Pointer to the plaintext data to be encrypted.
il
Number of bytes of plaintext to be encrypted.
out
Pointer to the output buffer, which needs to be exactly as long as the input plaintext data.
Our implementation of this function always returns 1, but a hardware-based
implementation might have an unexpected failure, so it's important to check
the return value!
This API is in the spirit of PKCS #11, which provides a standard cryptographic interface to hardware.
We do this so that the above functions can have the bulk of their implementations replaced with calls
to PKCS #11-compliant hardware. PKCS #11 APIs generally pass out data explicitly indicating the
length of data outputted, while we ignore that because it will always be zero on failure or the size of
the input buffer on success. Also note that PKCS #11-based calls tend to order their arguments
differently from the way we do, and they will not generally wipe key material, as we do in our
initialization and finalization routines.
Because this API is developed with PKCS #11 in mind, it's somewhat more
low-level than it needs to be and therefore is a bit difficult to use properly. First,
you need to be sure the output buffer is big enough to hold the input;
otherwise, you will have a buffer overflow. Second, you need to make sure the
out argument always points to the first unused byte in the output buffer.
Otherwise, you will keep overwriting the same data every time
spc_cfb_encrypt_update( ) outputs.
Here's our implementation of spc_cfb_encrypt_update( ):
int spc_cfb_encrypt_update(SPC_CFB_CTX *ctx, unsigned char *in, size_t il,
unsigned char *out) {
int i;
if (ctx->ix) {
while (ctx->ix) {
if (!il--) return 1;
ctx->nonce[ctx->ix] = *out++ = *in++ ^ ctx->nonce[ctx->ix++];
ctx->ix %= SPC_BLOCK_SZ;
}
}
if (!il) return 1;
while (il >= SPC_BLOCK_SZ) {
SPC_DO_ENCRYPT(&(ctx->ks), ctx->nonce, ctx->nonce);
for (i = 0; i < SPC_BLOCK_SZ / sizeof(int); i++) {
((int *)ctx->nonce)[i] = ((int *)out)[i] = ((int *)in)[i] ^
((int *)ctx->nonce)[i];
}
il -= SPC_BLOCK_SZ;
in += SPC_BLOCK_SZ;
out += SPC_BLOCK_SZ;
}
SPC_DO_ENCRYPT(&(ctx->ks), ctx->nonce, ctx->nonce);
for (i = 0; i < il; i++)
ctx->nonce[ctx->ix] = *out++ = *in++ ^ ctx->nonce[ctx->ix++];
return 1;
}
Decryption has a similar API, but a different implementation:
int spc_cfb_decrypt_update(SPC_CFB_CTX *ctx, unsigned char *in, size_t il,
unsigned char *out) {
int i, x;
char c;
if (ctx->ix) {
while (ctx->ix) {
if (!il--) return 1;
c = *in;
*out++ = *in++ ^ ctx->nonce[ctx->ix];
ctx->nonce[ctx->ix++] = c;
ctx->ix %= SPC_BLOCK_SZ;
}
}
if (!il) return 1;
while (il >= SPC_BLOCK_SZ) {
SPC_DO_ENCRYPT(&(ctx->ks), ctx->nonce, ctx->nonce);
for (i = 0; i < SPC_BLOCK_SZ / sizeof(int); i++) {
x = ((int *)in)[i];
((int *)out)[i] = x ^ ((int *)ctx->nonce)[i];
((int *)ctx->nonce)[i] = x;
}
il -= SPC_BLOCK_SZ;
in += SPC_BLOCK_SZ;
out += SPC_BLOCK_SZ;
}
SPC_DO_ENCRYPT(&(ctx->ks), ctx->nonce, ctx->nonce);
for (i = 0; i < il; i++) {
c = *in;
*out++ = *in++ ^ ctx->nonce[ctx->ix];
ctx->nonce[ctx->ix++] = c;
}
return 1;
}
To finalize either encryption or decryption, use spc_cfb_final( ), which never needs to output
anything, because CFB is a streaming mode:
int spc_cfb_final(SPC_CFB_CTX *ctx) {
  spc_memset(ctx, 0, sizeof(SPC_CFB_CTX));
  return 1;
}
5.7.4 See Also
Recipe 4.9, Recipe 5.4, Recipe 5.5, Recipe 5.16, Recipe 13.2
[ Team LiB ]
[ Team LiB ]
5.8 Using a Generic OFB Mode Implementation
5.8.1 Problem
You want a more high-level interface for OFB mode than your library provides. Alternatively, you
want a portable OFB interface, or you have only a block cipher implementation and you would like to
use OFB mode.
5.8.2 Solution
OFB mode encrypts by generating keystream, then combining the keystream with the plaintext via
XOR. OFB generates keystream one block at a time. Each block of keystream is produced by
encrypting the previous block of keystream, except for the first block, which is generated by
encrypting the nonce.
Many libraries provide an OFB implementation. If you need code implementing this mode, you will
find it in the following Section 5.8.3.
5.8.3 Discussion
You should probably use a higher-level abstraction, such as the one discussed
in Recipe 5.16. Use a raw mode only when absolutely necessary, because there
is a huge potential for introducing a security vulnerability by accident. If you
still want to use OFB, be sure to use a message authentication code with it.
OFB mode is a stream-based mode. Encryption occurs by XOR'ing the keystream bytes with the
plaintext bytes, as shown in Figure 5-3. The keystream is generated one block at a time, by
encrypting the previous keystream block.[12] The first block is generated by encrypting the nonce.
[12]
As with CFB mode, the "feedback size" could conceivably be smaller than the block size, but such schemes
aren't secure.
Figure 5-3. OFB mode
This mode shares many properties with counter mode (CTR), but CTR mode has additional benefits.
OFB mode is therefore seeing less and less use these days. Of course, we recommend a higher-level
mode than both of these modes, one that provides stronger security guarantees (for example, CWC
or CCM mode).
In Recipe 5.4, we discuss the advantages and drawbacks of OFB and compare it to other popular
modes.
Many libraries already come with an implementation of OFB mode for any ciphers they support.
However, some don't. For example, you may only get an implementation of the raw block cipher
when you obtain reference code for a new cipher.
In the following sections we present a reasonably optimized implementation of OFB mode that builds
upon the raw block cipher interface presented in Recipe 5.5. It also requires the spc_memset( )
function from Recipe 13.2.
5.8.3.1 The high-level API
This implementation has two APIs. The first is a high-level API, which takes a message as input and
returns a dynamically allocated result.
unsigned char *spc_ofb_encrypt(unsigned char *key, size_t kl, unsigned char *nonce,
                               unsigned char *in, size_t il);
unsigned char *spc_ofb_decrypt(unsigned char *key, size_t kl, unsigned char *nonce,
                               unsigned char *in, size_t il);
Both of these functions output the same number of bytes as were input, unless a memory allocation
error occurs, in which case 0 is returned. The decryption routine is exactly the same as the
encryption routine and is implemented by macro.
These two functions also erase the key from memory before exiting. You may
want to have them erase the plaintext as well.
Here's the implementation of the interface:
#include <stdlib.h>
#include <string.h>
unsigned char *spc_ofb_encrypt(unsigned char *key, size_t kl, unsigned char *nonce,
unsigned char *in, size_t il) {
SPC_OFB_CTX   ctx;
unsigned char *out;
if (!(out = (unsigned char *)malloc(il))) return 0;
spc_ofb_init(&ctx, key, kl, nonce);
spc_ofb_update(&ctx, in, il, out);
spc_ofb_final(&ctx);
return out;
}
#define spc_ofb_decrypt spc_ofb_encrypt
Note that the previous code depends on the SPC_OFB_CTX data type and the incremental OFB
interface, both discussed in the following sections.
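Because decryption really is the same routine, a round trip looks like this (a sketch of ours; two
copies of the key are needed, since each call erases the key buffer it is given):
int ofb_roundtrip(unsigned char *k1, unsigned char *k2, size_t kl,
                  unsigned char *nonce, unsigned char *msg, size_t msglen) {
  int           ok;
  unsigned char *ct, *pt;

  if (!(ct = spc_ofb_encrypt(k1, kl, nonce, msg, msglen))) return 0;
  if (!(pt = spc_ofb_decrypt(k2, kl, nonce, ct, msglen))) { free(ct); return 0; }
  ok = !memcmp(pt, msg, msglen);        /* should always be true */
  free(ct);
  free(pt);
  return ok;
}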
5.8.3.2 The incremental API
Let's look at the SPC_OFB_CTX data type. It's defined as:
typedef struct {
  SPC_KEY_SCHED ks;
  int           ix;
  unsigned char nonce[SPC_BLOCK_SZ];
} SPC_OFB_CTX;
The ks field is an expanded version of the cipher key (block ciphers generally use a single key to
derive multiple keys for internal use). The ix field is used to determine how much of the last block of
keystream we have buffered (i.e., that hasn't been used yet). The nonce field is really the buffer in
which we store the current block of the keystream.
To begin encrypting or decrypting, we need to initialize the mode. Initialization is the same operation
for both encryption and decryption:
void spc_ofb_init(SPC_OFB_CTX *ctx, unsigned char *key, size_t kl, unsigned char
*nonce) {
SPC_ENCRYPT_INIT(&(ctx->ks), key, kl);
spc_memset(key, 0, kl);
memcpy(ctx->nonce, nonce, SPC_BLOCK_SZ);
ctx->ix = 0;
}
Note again that we remove the key from memory during this operation.
Never use the same nonce (often called an IV in this context) twice with a single key. Use a secure
random value or a counter. See Recipe 4.9 for more information on nonces.
Now we can add data as we get it using the spc_ofb_update( ) function. This function is particularly
useful when a message arrives in pieces. You'll get the same results as if it all arrived at once. When
you want to finish encrypting or decrypting, call spc_ofb_final( ).
You're responsible for making sure the init, update, and final calls do not
happen out of order.
The function spc_ofb_update( ) has the following signature:
int spc_ofb_update(SPC_OFB_CTX *ctx, unsigned char *in, size_t il, unsigned char *out);
This function has the following arguments:
ctx
Pointer to the SPC_OFB_CTX object associated with the current message.
in
Pointer to a buffer containing the data to be encrypted or decrypted.
il
Number of bytes contained in the input buffer.
out
Pointer to the output buffer, which needs to be exactly as long as the input buffer.
Our implementation of this function always returns 1, but a hardware-based
implementation might have an unexpected failure, so it's important to check
the return value!
This API is in the spirit of PKCS #11, which provides a standard cryptographic interface to hardware.
We do this so that the above functions can have the bulk of their implementations replaced with calls
to PKCS #11-compliant hardware. PKCS #11 APIs generally pass out data explicitly indicating the
length of data outputted, while we ignore that because it will always be zero on failure or the size of
the input buffer on success. Also note that PKCS #11-based calls tend to order their arguments
differently from the way we do, and they will not generally wipe key material, as we do in our
initialization and finalization routines.
Because this API is developed with PKCS #11 in mind, it's somewhat more
low-level than it needs to be, and therefore is a bit difficult to use properly. First,
you need to be sure the output buffer is big enough to hold the input;
otherwise, you will have a buffer overflow. Second, you need to make sure the
out argument always points to the first unused byte in the output buffer.
Otherwise, you will keep overwriting the same data every time
spc_ofb_update( ) outputs.
Here's our implementation of spc_ofb_update( ):
int spc_ofb_update(SPC_OFB_CTX *ctx, unsigned char *in, size_t il, unsigned char
*out) {
int i;
if (ctx->ix) {
while (ctx->ix) {
if (!il--) return 1;
*out++ = *in++ ^ ctx->nonce[ctx->ix++];
ctx->ix %= SPC_BLOCK_SZ;
}
}
if (!il) return 1;
while (il >= SPC_BLOCK_SZ) {
SPC_DO_ENCRYPT(&(ctx->ks), ctx->nonce, ctx->nonce);
for (i = 0; i < SPC_BLOCK_SZ / sizeof(int); i++)
((int *)out)[i] = ((int *)in)[i] ^ ((int *)ctx->nonce)[i];
il -= SPC_BLOCK_SZ;
in += SPC_BLOCK_SZ;
out += SPC_BLOCK_SZ;
}
SPC_DO_ENCRYPT(&(ctx->ks), ctx->nonce, ctx->nonce);
for (i = 0; i < il; i++) *out++ = *in++ ^ ctx->nonce[ctx->ix++];
return 1;
}
To finalize either encryption or decryption, use the spc_ofb_final( ) call, which never needs to
output anything, because OFB is a streaming mode:
int spc_ofb_final(SPC_OFB_CTX *ctx) {
  spc_memset(ctx, 0, sizeof(SPC_OFB_CTX));
  return 1;
}
5.8.4 See Also
Recipe 4.9, Recipe 5.4, Recipe 5.5, Recipe 5.16, Recipe 13.2
[ Team LiB ]
[ Team LiB ]
5.9 Using a Generic CTR Mode Implementation
5.9.1 Problem
You want to use counter (CTR) mode and your library doesn't provide an interface, or you want to
use a more high-level interface than your library provides. Alternatively, you would like a portable
CTR interface, or you have only a block cipher implementation and you would like to use CTR mode.
5.9.2 Solution
CTR mode encrypts by generating keystream, then combining the keystream with the plaintext via
XOR. This mode generates keystream one block at a time by encrypting plaintexts that are the same,
except for an ever-changing counter, as shown in Figure 5-4. Generally, the counter value starts at
zero and is incremented sequentially.
Figure 5-4. Counter (CTR) mode
Few libraries provide a CTR implementation, because it has only recently come into favor, despite the
fact that it is a very old mode with great properties. We provide code implementing this mode in the
following Section 5.9.3.
5.9.3 Discussion
You should probably use a higher-level abstraction, such as the one discussed
in Recipe 5.16. Use a raw mode only when absolutely necessary, because there
is a huge potential for introducing a security vulnerability by accident. If you still
want to use CTR mode, be sure to use a message authentication code with it.
CTR mode is a stream-based mode. Encryption occurs by XOR'ing the keystream bytes with the
plaintext bytes. The keystream is generated one block at a time by encrypting a plaintext block that
includes a counter value. Given a single key, the counter value must be unique for every encryption.
This mode has many benefits over the "standard" modes (e.g., ECB, CBC, CFB, and OFB). However,
we recommend a higher-level mode, one that provides stronger security guarantees (i.e., message
integrity detection), such as CWC or CCM modes. Most high-level modes use CTR mode as a
component.
In Recipe 5.4, we discuss the advantages and drawbacks of CTR mode and compare it to other
popular modes.
Like most other modes, CTR mode requires a nonce (often called an IV in this context). Most modes
use the nonce as an input to encryption, and thus require something the same size as the algorithm's
block length. With CTR mode, the input to encryption is generally the concatenation of the nonce and
a counter. The counter is usually at least 32 bits, depending on the maximum amount of data you
might want to encrypt with a single {key, nonce} pair. We recommend using a good random value
for the nonce.
In the following sections we present a reasonably optimized implementation of CTR mode that builds
upon the raw block cipher interface presented in Recipe 5.5. It also requires the spc_memset( )
function from Recipe 13.2. By default, we use a 6-byte counter, which leaves room for a nonce of
SPC_BLOCK_SZ - 6 bytes. With AES and other ciphers with 128-bit blocks, this is sufficient space.
CTR mode with 64-bit blocks is highly susceptible to birthday attacks unless
you use a large random portion to the nonce, which limits the message you can
send with a given key. In short, don't use CTR mode with 64-bit block ciphers.
5.9.3.1 The high-level API
This implementation has two APIs. The first is a high-level API, which takes a message as input and
returns a dynamically allocated result.
unsigned char *spc_ctr_encrypt(unsigned char *key, size_t kl, unsigned char *nonce,
                               unsigned char *in, size_t il);
unsigned char *spc_ctr_decrypt(unsigned char *key, size_t kl, unsigned char *nonce,
                               unsigned char *in, size_t il);
Both of the previous functions output the same number of bytes as were input, unless a memory
allocation error occurs, in which case 0 is returned. The decryption routine is exactly the same as the
encryption routine, and it is implemented by macro.
These two functions also erase the key from memory before exiting. You may
want to have them erase the plaintext as well.
Here's the implementation of the interface:
#include <stdlib.h>
#include <string.h>
unsigned char *spc_ctr_encrypt(unsigned char *key, size_t kl, unsigned char *nonce,
unsigned char *in, size_t il) {
SPC_CTR_CTX   ctx;
unsigned char *out;
if (!(out = (unsigned char *)malloc(il))) return 0;
spc_ctr_init(&ctx, key, kl, nonce);
spc_ctr_update(&ctx, in, il, out);
spc_ctr_final(&ctx);
return out;
}
#define spc_ctr_decrypt spc_ctr_encrypt
Note that this code depends on the SPC_CTR_CTX data type and the incremental CTR interface, both
discussed in the following sections. In particular, the nonce size varies depending on the value of the
SPC_CTR_BYTES macro (introduced in the next subsection).
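For example, with the default 6-byte counter and a 128-bit block cipher such as AES, the nonce you
supply is the first SPC_BLOCK_SZ - 6 = 10 bytes of the counter block. A one-shot call might look like
this (a sketch of ours, not part of the recipe; the wrapper name is hypothetical):
unsigned char *ctr_encrypt_msg(unsigned char *key16, unsigned char nonce10[10],
                               unsigned char *msg, size_t msglen) {
  unsigned char keycopy[16];

  memcpy(keycopy, key16, sizeof(keycopy));   /* spc_ctr_encrypt() wipes its key */
  return spc_ctr_encrypt(keycopy, sizeof(keycopy), nonce10, msg, msglen);
}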
5.9.3.2 The incremental API
Let's look at the SPC_CTR_CTX data type. It's defined as:
typedef struct {
  SPC_KEY_SCHED ks;
  int           ix;
  unsigned char ctr[SPC_BLOCK_SZ];
  unsigned char ksm[SPC_BLOCK_SZ];
} SPC_CTR_CTX;
The ks field is an expanded version of the cipher key (block ciphers generally use a single key to
derive multiple keys for internal use). The ix field is used to determine how much of the last block of
keystream we have buffered (i.e., that hasn't been used yet). The ctr block holds the plaintext used
to generate keystream blocks. Buffered keystream is held in ksm.
To begin encrypting or decrypting, you need to initialize the mode. Initialization is the same operation
for both encryption and decryption, and it depends on a statically defined value, SPC_CTR_BYTES,
which is used to compute the nonce size.
#define SPC_CTR_BYTES 6
void spc_ctr_init(SPC_CTR_CTX *ctx, unsigned char *key, size_t kl, unsigned char
*nonce) {
SPC_ENCRYPT_INIT(&(ctx->ks), key, kl);
spc_memset(key, 0, kl);
memcpy(ctx->ctr, nonce, SPC_BLOCK_SZ - SPC_CTR_BYTES);
spc_memset(ctx->ctr + SPC_BLOCK_SZ - SPC_CTR_BYTES, 0, SPC_CTR_BYTES);
ctx->ix = 0;
}
Note again that we remove the key from memory during this operation.
Now you can add data as you get it using the spc_ctr_update( ) function. This function is
particularly useful when a message arrives in pieces. You'll get the same results as if it all arrived at
once. When you want to finish encrypting or decrypting, call spc_ctr_final( ).
You're responsible for making sure the initialization, updating, and finalization
calls do not happen out of order.
The function spc_ctr_update( ) has the following signature:
int spc_ctr_update(SPC_CTR_CTX *ctx, unsigned char *in, size_t il, unsigned char *out);
This function has the following arguments:
ctx
Pointer to the SPC_CTR_CTX object associated with the current message.
in
Pointer to a buffer containing the data to be encrypted or decrypted.
il
Number of bytes contained by the input buffer.
out
Pointer to the output buffer, which needs to be exactly as long as the input buffer.
Our implementation of this function always returns 1, but a hardware-based
implementation might have an unexpected failure, so it's important to check
the return value!
This API is in the spirit of PKCS #11, which provides a standard cryptographic interface to hardware.
We do this so that the above functions can have the bulk of their implementations replaced with calls
to PKCS #11-compliant hardware. PKCS #11 APIs generally return an explicit count of the bytes
written to the output buffer; we omit that, because the count will always be zero on failure or the size
of the input buffer on success. Also note that PKCS #11-based calls tend to order their arguments
differently from the way we do, and they will not generally wipe key material, as we do in our
initialization and finalization routines.
Because this API is developed with PKCS #11 in mind, it's somewhat more low-level than it needs to
be, and therefore is a bit difficult to use properly. First, you need to be sure the output buffer is big
enough to hold the input; otherwise, you will have a buffer overflow. Second, you need to make sure
the out argument always points to the first unused byte in the output buffer. Otherwise, you will keep
overwriting the same data every time spc_ctr_update( ) outputs data.
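To make the pointer management concrete, here is a small sketch (ours, with placeholder names piece1, piece2, len1, len2, and outbuf, which must hold len1 + len2 bytes) of encrypting a message that arrives in two pieces:
SPC_CTR_CTX    ctx;
unsigned char *outp = outbuf;

spc_ctr_init(&ctx, key, keylen, nonce);
if (!spc_ctr_update(&ctx, piece1, len1, outp)) abort();
outp += len1;                         /* advance to the first unused output byte */
if (!spc_ctr_update(&ctx, piece2, len2, outp)) abort();
spc_ctr_final(&ctx);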
Here's our implementation of spc_ctr_update( ), along with a helper function:
static inline void ctr_increment(unsigned char *ctr) {
  /* The counter occupies the last SPC_CTR_BYTES bytes of the block (the nonce
   * fills the rest), so increment only that portion, treating it as a
   * big-endian counter.
   */
  unsigned char *x = ctr + SPC_BLOCK_SZ;
  while (x-- != ctr + SPC_BLOCK_SZ - SPC_CTR_BYTES) if (++(*x)) return;
}
int spc_ctr_update(SPC_CTR_CTX *ctx, unsigned char *in, size_t il, unsigned char
*out) {
int i;
if (ctx->ix) {
while (ctx->ix) {
if (!il--) return 1;
*out++ = *in++ ^ ctx->ksm[ctx->ix++];
ctx->ix %= SPC_BLOCK_SZ;
}
}
if (!il) return 1;
while (il >= SPC_BLOCK_SZ) {
SPC_DO_ENCRYPT(&(ctx->ks), ctx->ctr, out);
ctr_increment(ctx->ctr);
for (i = 0; i < SPC_BLOCK_SZ / sizeof(int); i++)
((int *)out)[i] ^= ((int *)in)[i];
il -= SPC_BLOCK_SZ;
in += SPC_BLOCK_SZ;
out += SPC_BLOCK_SZ;
}
SPC_DO_ENCRYPT(&(ctx->ks), ctx->ctr, ctx->ksm);
ctr_increment(ctx->ctr);
for (i = 0; i < il; i++)
*out++ = *in++ ^ ctx->ksm[ctx->ix++];
return 1;
}
To finalize either encryption or decryption, use the spc_ctr_final( ) call, which never needs to
output anything, because CTR is a streaming mode:
int spc_ctr_final(SPC_CTR_CTX *ctx) {
  /* Wipe the context structure itself, not the local pointer to it. */
  spc_memset(ctx, 0, sizeof(SPC_CTR_CTX));
  return 1;
}
5.9.4 See Also
Recipe 4.9, Recipe 5.4, Recipe 5.5, Recipe 5.16, Recipe 13.2
5.10 Using CWC Mode
5.10.1 Problem
You want to use CWC mode to get encryption and message integrity in a single mode.
5.10.2 Solution
Use the reference implementation available from http://www.zork.org/cwc/, or use Brian Gladman's
implementation, available from http://fp.gladman.plus.com/AES/cwc.zip.
5.10.3 Discussion
CWC mode is a mode of operation for providing both encryption and message integrity. This mode is
parallelizable, fast in both software and hardware (where it can achieve speeds of 10 gigabits per
second), unencumbered by patents, and provably secure to good bounds with standard assumptions.
(We compare CWC to other modes in Recipe 5.4.)
CWC mode is not simple to implement because it uses a universal hash function as a component that
is conceptually straightforward but somewhat complex to implement well. We therefore recommend
using an off-the-shelf implementation, such as the implementation on the official CWC web page
(http://www.zork.org/cwc/).
Here, we'll discuss how to use the distribution available from the CWC web page. This implementation
has a set of macros, similar to the macros we develop in Recipe 5.5, allowing you to bind the library to
any AES implementation. In particular, if you edit local_options.h, you need to do the following:
1. Set AES_KS_T to whatever you would use for SPC_KEY_SCHED (see Recipe 5.5).
2. Set CWC_AES_SETUP to whatever you would use for SPC_ENCRYPT_INIT (see Recipe 5.5).
3. Set CWC_AES_ENCRYPT to whatever you would use for SPC_DO_ENCRYPT (see Recipe 5.5).
Once those bindings are made, the Zork CWC implementation has a simple API that accepts an entire
message at once:
int cwc_init(cwc_t ctx[1], u_char key[ ], int keybits);
void cwc_encrypt_message(cwc_t ctx[1], u_char a[ ], u_int32 alen, u_char pt[ ],
u_int32 ptlen, u_char nonce[11], u_char output[ ]);
int cwc_decrypt_message(cwc_t ctx[1], u_char a[ ], u_int32 alen, u_char ct[ ],
u_int32 ctlen, u_char nonce[11], u_char output[ ]);
void cwc_cleanup(cwc_t ctx[1]);
If you have very large messages, this API insists that you buffer them before encrypting or
decrypting. That's not a fundamental limitation of CWC mode, but only of this implementation. A
future version of the implementation might change that, but do note that it would require partially
decrypting a message before the library could determine whether the message is authentic. The API
above does not decrypt if the message isn't authentic.
If you need to operate on very large messages, check out Brian Gladman's
CWC implementation, which works incrementally.
This API looks slightly different from the all-in-one APIs we've presented for other modes in this
chapter. It's actually closer to the incremental mode. The CWC mode has a notion of individual
messages. It is intended that each message be sent individually. You're expected to use a single key
for a large number of messages, but each message gets its own nonce. Generally, each message is
expected to be short but can be multiple gigabytes.
Note that encrypting a message grows the message by 16 bytes. The extra 16 bytes at the end are
used for ensuring the integrity of the message (it is effectively the result of a message authentication
code; see Chapter 6).
The previous API assumes that you have the entire message to encrypt or decrypt at once. In the
following discussion, we walk through each of these functions in turn.
The cwc_init( ) function allows us to initialize a CWC context object of type cwc_t that can be
reused across multiple messages. Generally, a single key will be used for an entire session. The first
argument is a pointer to the cwc_t object (the declaration as an array of one is a specification saying
that the pointer is only to a single object rather than to an array of objects). The second argument is
the AES key, which must be a buffer of 16, 24, or 32 bytes. The third argument specifies the number
of bits in the key (128, 192, or 256). The function fails if keybits is not a correct value.
The cwc_encrypt_message( ) function has the following arguments:
ctx
Pointer to the cwc_t context object.
a
Buffer containing optional data that you would like to authenticate, but that does not need to
be encrypted, such as plaintext headers in the HTTP protocol.
alen
Length of extra authentication data buffer, specified in bytes. It may be zero if there is no such
data.
pt
Buffer containing the plaintext you would like to encrypt and authenticate.
ptlen
Length of the plaintext buffer. It may be zero if there is no data to be encrypted.
nonce
Pointer to an 11-byte buffer, which must be unique for each message. (See Recipe 4.9 for hints
on nonce selection.)
output
Buffer into which the ciphertext is written. This buffer must always be at least ptlen + 16
bytes in size because the message grows by 16 bytes when the authentication value is added.
This function always succeeds. The cwc_decrypt_message( ) function, on the other hand, returns 1
on success, and 0 on failure. Failure occurs only if the message integrity check fails, meaning the
data has somehow changed since it was originally encrypted. This function has the following
arguments:
ctx
Pointer to the cwc_t context object.
a
Buffer containing optional data that you would like to authenticate, but that was not encrypted,
such as plaintext headers in the HTTP protocol.
alen
Length of extra authentication data buffer, specified in bytes. It may be zero if there is no such
data.
ct
Buffer containing the ciphertext you would like to authenticate and decrypt if it is valid.
ctlen
Length of the ciphertext buffer. It may be zero if there is no data to be decrypted.
nonce
Pointer to an 11-byte buffer, which must be unique for each message. (See Recipe 4.9 for hints
on nonce selection.)
output
Buffer into which the plaintext is written. This buffer must always be at least ctlen - 16 bytes
in size because the message shrinks by 16 bytes when the authentication value is removed.
The cwc_cleanup( ) function simply wipes the contents of the cwc context object passed into it.
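As a rough usage sketch (with placeholder buffers key, hdr, hdrlen, pt, ptlen, nonce, ct, and pt_out; ct must be ptlen + 16 bytes, and pt_out at least ptlen bytes), a single message might be protected and then verified like this:
cwc_t ctx[1];

if (!cwc_init(ctx, key, 128)) abort();              /* 128-, 192-, or 256-bit key */
cwc_encrypt_message(ctx, hdr, hdrlen, pt, ptlen, nonce, ct);
if (!cwc_decrypt_message(ctx, hdr, hdrlen, ct, ptlen + 16, nonce, pt_out))
  abort();                                          /* integrity check failed */
cwc_cleanup(ctx);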
5.10.4 See Also
CWC implementation from Brian Gladman: http://fp.gladman.plus.com/AES/cwc.zip
CWC home page: http://www.zork.org/cwc
Recipe 5.4, Recipe 5.5
5.11 Manually Adding and Checking Cipher Padding
5.11.1 Problem
You want to add padding to data manually, then check it manually when decrypting.
5.11.2 Solution
There are many subtle ways in which padding can go wrong, so use an off-the-shelf scheme, such as
PKCS block cipher padding.
5.11.3 Discussion
Padding is applied to plaintext; when decrypting, you must check for proper
padding of the resulting data to determine where the plaintext message
actually ends.
Generally, it is not a good idea to add padding yourself. If you're using a reasonably high-level
abstraction, padding will be handled for you. In addition, padding often isn't required, for example,
when using a stream cipher or one of many common block cipher modes (including CWC, CTR, CCM,
OFB, and CFB).
Because ECB mode really shouldn't be used for stream-based encryption, the only common case
where padding is actually interesting is when you're using CBC mode.
If you are in a situation where you do need padding, we recommend that you use a standard
scheme. There are many subtle things that can go wrong (although the most important requirement
is that padding always be unambiguous[13]), and there's no good reason to wing it.
[13]
Because of this, it's impossible to avoid adding data to the end of the message, even when the message is
block-aligned, at least if you want your padding scheme to work with arbitrary binary data.
The most widespread standard padding for block ciphers is called PKCS block padding. The goal of
PKCS block padding is that the last byte of the padded plaintext should unambiguously describe how
much padding was added to the message. PKCS padding sets every byte of padding to the number of
bytes of padding added. If the input is block-aligned, an entire block of padding is added. For
example, if four bytes of padding were needed, the proper padding would be:
0x04040404
If you're using a block cipher with 64-bit (8-byte) blocks, and the input is block-aligned, the padding
would be:
0x0808080808080808
Here's an example API for adding and removing padding:
void spc_add_padding(unsigned char *pad_goes_here, int ptlen, int bl) {
  /* The pad length is the distance to the next block boundary; if the input is
   * already block-aligned, a full block of padding is added.
   */
  int i, n = bl - (ptlen % bl);

  for (i = 0;  i < n;  i++) *(pad_goes_here + i) = (unsigned char)n;
}
int spc_remove_padding(unsigned char *lastblock, int bl) {
unsigned char i, n = lastblock[bl - 1];
unsigned char *p = lastblock + bl;
/* In your programs you should probably throw an exception or abort instead. */
if (n > bl || n <= 0) return -1;
for (i = n; i; i--) if (*--p != n) return -1;
return bl - n;
}
The spc_add_padding( ) function adds padding directly to a preallocated buffer called
pad_goes_here. The function takes as input the length of the plaintext and the block length of the
cipher. From that information, we figure out how many bytes to add, and we write the result into the
appropriate buffer.
The spc_remove_padding( ) function deals with unencrypted plaintext. As input, we pass it the final
block of plaintext, along with the block length of the cipher. The function looks at the last byte to see
how many padding bytes should be present. If the final byte is bigger than the block length or is less
than one, the padding is not in the right format, indicating a decryption error. Finally, we check to
see whether the padded bytes are all in the correct format. If everything is in order, the function will
return the number of valid bytes in the final block of data, which could be anything from zero to one
less than the block length.
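Putting the two functions together, a sketch of padding before encryption and checking after decryption might look as follows (buf, ptlen, and the block length bl are placeholders, and buf must have room for up to bl bytes of padding):
int padlen  = bl - (ptlen % bl);      /* bytes spc_add_padding( ) will write */
int datalen;

spc_add_padding(buf + ptlen, ptlen, bl);
/* ...encrypt ptlen + padlen bytes in CBC mode, transmit, decrypt... */
datalen = spc_remove_padding(buf + ptlen + padlen - bl, bl);
if (datalen < 0) abort();             /* bad padding; treat as a decryption error */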
5.12 Precomputing Keystream in OFB, CTR, CCM, or CWC
Modes (or with Stream Ciphers)
5.12.1 Problem
You want to save computational resources when data is actually flowing over a network by
precomputing keystream so that encryption or decryption will consist merely of XOR'ing data with the
precomputed keystream.
5.12.2 Solution
If your API has a function that performs keystream generation, use that. Otherwise, call the
encryption routine, passing in N bytes set to 0, where N is the number of bytes of keystream you
wish to precompute.
5.12.3 Discussion
Most cryptographic APIs do not have an explicit way to precompute keystream for cipher modes
where such precomputation makes sense. Fortunately, any byte XOR'd with zero returns the original
byte. Therefore, to recover the keystream, we can "encrypt" a string of zeros. Then, when we have
data that we really do wish to encrypt, we need only XOR that data with the stored keystream.
If you have the source for the encryption algorithm, you can remove the final XOR operation to
create a keystream-generating function. For example, the spc_ctr_update( ) function from Recipe
5.9 can be adapted easily into the following keystream generator:
int spc_ctr_keystream(SPC_CTR_CTX *ctx, size_t il, unsigned char *out) {
int i;
if (ctx->ix) {
while (ctx->ix) {
if (!il--) return 1;
*out++ = ctx->ksm[ctx->ix++];
ctx->ix %= SPC_BLOCK_SZ;
}
}
if (!il) return 1;
while (il >= SPC_BLOCK_SZ) {
SPC_DO_ENCRYPT(&(ctx->ks), ctx->ctr, out);
ctr_increment(ctx->ctr);
il -= SPC_BLOCK_SZ;
out += SPC_BLOCK_SZ;
}
SPC_DO_ENCRYPT(&(ctx->ks), ctx->ctr, ctx->ksm);
ctr_increment(ctx->ctr);
for (i = 0; i < il; i++) *out++ = ctx->ksm[ctx->ix++];
return 1;
}
Note that we simply remove the in argument along with the XOR operation whenever we write to the
output buffer.
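For example, a connection handler might fill a buffer with keystream while the line is idle and XOR it in as data arrives; this sketch (ours, with placeholder names) assumes the key and nonce are handled as in Recipe 5.9:
SPC_CTR_CTX   ctx;
unsigned char keystream[16384];
size_t        i;

spc_ctr_init(&ctx, key, keylen, nonce);
if (!spc_ctr_keystream(&ctx, sizeof(keystream), keystream)) abort();
/* Later, when msglen (<= sizeof(keystream)) bytes of plaintext arrive: */
for (i = 0;  i < msglen;  i++) ct[i] = msg[i] ^ keystream[i];
/* Track how much keystream has been consumed; never reuse any of it. */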
5.13 Parallelizing Encryption and Decryption in Modes
That Allow It (Without Breaking Compatibility)
5.13.1 Problem
You want to parallelize encryption, decryption, or keystream generation.
5.13.2 Solution
Only some cipher modes are naturally parallelizable in a way that doesn't break compatibility. In
particular, CTR mode is naturally parallizable, as are decryption with CBC and CFB. There are two
basic strategies: one is to treat the message in an interleaved fashion, and the other is to break it up
into a single chunk for each parallel process.
The first strategy is generally more practical. However, it is often difficult to make either technique
result in a speed gain when processing messages in software.
5.13.3 Discussion
Parallelizing encryption and decryption does not necessarily result in a speed
improvement. To provide any chance of a speedup, you'll certainly need to
ensure that multiple processors are working in parallel. Even in such an
environment, data sets may be too small to run faster when they are processed
in parallel.
Some cipher modes can have independent parts of the message operated upon independently. In
such cases, there is the potential for parallelization. For example, with CTR mode, the keystream is
computed in blocks, where each block of keystream is generated by encrypting a unique plaintext
block. Those blocks can be computed in any order.
In CBC, CFB, and OFB modes, encryption can't really be parallelized because the ciphertext for a
block is necessary to create the ciphertext for the next block; thus, we can't compute ciphertext out
of order. However, for CBC and CFB, when we decrypt, things are different. Because we only need
the ciphertext of a block to decrypt the next block, we can decrypt the next block before we decrypt
the first one.
There are two reasonable strategies for parallelizing the work. When a message shows up all at once,
you might divide it roughly into equal parts and handle each part separately. Alternatively, you can
take an interleaved approach, where alternating blocks are handled by different threads. That is, the
actual message is separated into two different plaintexts, as shown in Figure 5-5.
Figure 5-5. Encryption through interleaving
If done correctly, both approaches will result in the correct output. We generally prefer the
interleaving approach, because all threads can do work with just a little bit of data available. This is
particularly true in hardware, where buffers are small.
With a noninterleaving approach, you must wait at least until the length of the message is known,
which is often when all of the data is finally available. Even if the message length is known in
advance, you must wait for a large percentage of the data to show up before the second thread can
be launched.
Even the interleaved approach is a lot easier when the size of the message is known in advance
because it makes it easier to get the message all in one place. If you need the whole message to
come in before you know the length, parallelization may not be worthwhile, because in many cases,
waiting for an entire message to come in before beginning work can introduce enough latency to
thwart the benefits of parallelization.
If you aren't generally going to get an entire message all at once, but you are able to determine the
biggest message you might get, another reasonably easy approach is to allocate a result buffer big
enough to hold the largest possible message.
For the sake of simplicity, let's assume that the message arrives all at once and you might want to
process a message with two parallel threads. The following code provides an example API that can
handle CTR mode encryption and decryption in parallel (remember that encryption and decryption
are the same operation in CTR mode).
Because we assume the message is available up front, all of the information we need to operate on a
message is passed into the function spc_pctr_setup( ), which requires a context object (here, the
type is SPC_CTR2_CTX), the key, the key length in bytes, a nonce SPC_BLOCK_SZ - SPC_CTR_BYTES
in length, the input buffer, the length of the message, and the output buffer. This function does not
do any of the encryption and decryption, nor does it copy the input buffer anywhere.
To process the first block, as well as every second block after that, call spc_pctr_do_odd( ), passing
in a pointer to the context object. To process the second block and every second block after that, call
spc_pctr_do_even( ) in the same way. Nothing else is required, because the input and output buffers
used are the ones passed to the spc_pctr_setup( ) function. If you test, you'll notice that the results are
exactly the same as with the CTR mode implementation from Recipe 5.9.
This code requires the preliminaries from Recipe 5.5, as well as the spc_memset( ) function from
Recipe 13.2.
#include <stdlib.h>
#include <string.h>
typedef struct {
  SPC_KEY_SCHED ks;
  size_t        len;
  unsigned char ctr_odd[SPC_BLOCK_SZ];
  unsigned char ctr_even[SPC_BLOCK_SZ];
  unsigned char *inptr_odd;
  unsigned char *inptr_even;
  unsigned char *outptr_odd;
  unsigned char *outptr_even;
} SPC_CTR2_CTX;
static void pctr_increment(unsigned char *ctr) {
  /* As in Recipe 5.9, the counter is the last SPC_CTR_BYTES bytes of the block. */
  unsigned char *x = ctr + SPC_BLOCK_SZ;
  while (x-- != ctr + SPC_BLOCK_SZ - SPC_CTR_BYTES) if (++(*x)) return;
}
void spc_pctr_setup(SPC_CTR2_CTX *ctx, unsigned char *key, size_t kl,
                    unsigned char *nonce, unsigned char *in, size_t len,
                    unsigned char *out) {
  SPC_ENCRYPT_INIT(&(ctx->ks), key, kl);
  spc_memset(key, 0, kl);
  memcpy(ctx->ctr_odd, nonce, SPC_BLOCK_SZ - SPC_CTR_BYTES);
  spc_memset(ctx->ctr_odd + SPC_BLOCK_SZ - SPC_CTR_BYTES, 0, SPC_CTR_BYTES);
  memcpy(ctx->ctr_even, nonce, SPC_BLOCK_SZ - SPC_CTR_BYTES);
  spc_memset(ctx->ctr_even + SPC_BLOCK_SZ - SPC_CTR_BYTES, 0, SPC_CTR_BYTES);
  pctr_increment(ctx->ctr_even);
  ctx->inptr_odd   = in;
  ctx->inptr_even  = in + SPC_BLOCK_SZ;
  ctx->outptr_odd  = out;
  ctx->outptr_even = out + SPC_BLOCK_SZ;
  ctx->len         = len;
}
void spc_pctr_do_odd(SPC_CTR2_CTX *ctx) {
  size_t        i, j;
unsigned char final[SPC_BLOCK_SZ];
for (i = 0; i + SPC_BLOCK_SZ < ctx->len; i += 2 * SPC_BLOCK_SZ) {
SPC_DO_ENCRYPT(&(ctx->ks), ctx->ctr_odd, ctx->outptr_odd);
pctr_increment(ctx->ctr_odd);
pctr_increment(ctx->ctr_odd);
for (j = 0; j < SPC_BLOCK_SZ / sizeof(int); j++)
((int *)ctx->outptr_odd)[j] ^= ((int *)ctx->inptr_odd)[j];
ctx->outptr_odd += SPC_BLOCK_SZ * 2;
ctx->inptr_odd += SPC_BLOCK_SZ * 2;
}
if (i < ctx->len) {
SPC_DO_ENCRYPT(&(ctx->ks), ctx->ctr_odd, final);
for (j = 0; j < ctx->len - i; j++)
ctx->outptr_odd[j] = final[j] ^ ctx->inptr_odd[j];
}
}
void spc_pctr_do_even(SPC_CTR2_CTX *ctx) {
  size_t        i, j;
unsigned char final[SPC_BLOCK_SZ];
for (i = SPC_BLOCK_SZ; i + SPC_BLOCK_SZ < ctx->len; i += 2 * SPC_BLOCK_SZ) {
SPC_DO_ENCRYPT(&(ctx->ks), ctx->ctr_even, ctx->outptr_even);
pctr_increment(ctx->ctr_even);
pctr_increment(ctx->ctr_even);
for (j = 0; j < SPC_BLOCK_SZ / sizeof(int); j++)
((int *)ctx->outptr_even)[j] ^= ((int *)ctx->inptr_even)[j];
ctx->outptr_even += SPC_BLOCK_SZ * 2;
ctx->inptr_even += SPC_BLOCK_SZ * 2;
}
if (i < ctx->len) {
SPC_DO_ENCRYPT(&(ctx->ks), ctx->ctr_even, final);
for (j = 0; j < ctx->len - i; j++)
ctx->outptr_even[j] = final[j] ^ ctx->inptr_even[j];
}
}
int spc_pctr_final(SPC_CTR2_CTX *ctx) {
  /* Wipe the context structure itself, not the local pointer to it. */
  spc_memset(ctx, 0, sizeof(SPC_CTR2_CTX));
  return 1;
}
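To show how the two halves might actually be run in parallel, here is a sketch (ours, not part of the recipe) using POSIX threads; the key, nonce, message, and output buffers are supplied by the caller:
#include <pthread.h>

static void *odd_worker(void *arg)  { spc_pctr_do_odd((SPC_CTR2_CTX *)arg);  return 0; }
static void *even_worker(void *arg) { spc_pctr_do_even((SPC_CTR2_CTX *)arg); return 0; }

void encrypt_in_parallel(unsigned char *key, size_t kl, unsigned char *nonce,
                         unsigned char *msg, size_t msglen, unsigned char *out) {
  SPC_CTR2_CTX ctx;
  pthread_t    t1, t2;

  spc_pctr_setup(&ctx, key, kl, nonce, msg, msglen, out);
  /* The odd and even halves touch disjoint state, so no locking is needed. */
  pthread_create(&t1, 0, odd_worker,  &ctx);
  pthread_create(&t2, 0, even_worker, &ctx);
  pthread_join(t1, 0);
  pthread_join(t2, 0);
  spc_pctr_final(&ctx);
}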
5.13.4 See Also
Recipe 5.5, Recipe 5.9, Recipe 13.2
5.14 Parallelizing Encryption and Decryption in Arbitrary
Modes (Breaking Compatibility)
5.14.1 Problem
You are using a cipher mode that is not intrinsically parallelizable, but you have a large data set and
want to take advantage of multiple processors at your disposal.
5.14.2 Solution
Treat the data as multiple streams of interleaved data.
5.14.3 Discussion
Parallelizing encryption and decryption does not necessarily result in a speed
improvement. To provide any chance of a speedup, you will certainly need to
ensure that multiple processors are working in parallel. Even in such an
environment, data sets may be too small to run faster when they are processed
in parallel.
Recipe 5.13 demonstrates how to parallelize CTR mode encryption on a per-block level using a single
encryption context. Instead of having spc_pctr_do_even( ) and spc_pctr_do_odd( ) share a key
and nonce, you could use two separate encryption contexts. In such a case, there is no need to limit
your choice of mode to one that is intrinsically parallelizable. However, note that you won't get the
same results when using two separate contexts as you do when you use a single context, even if you
use the same key and IV or nonce (remembering that IV/nonce reuse is a bad idea, and that
certainly applies here).
One consideration is how much to interleave. There's no need to interleave on a block level. For
example, if you are using two parallel encryption contexts, you could encrypt the first 1,024 bytes of
data with the first context, then alternate every 1,024 bytes.
Generally, it is best to use a different key for each context. You can derive multiple keys from a single
base key, as shown in Recipe 4.11.
It's easiest to consider interleaving only at the plaintext level, particularly if you're using a block-based
mode, where padding will generally be added for each cipher context. In such a case, you would send
the encrypted data in multiple independent streams and reassemble it after decryption.
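As a rough illustration (our sketch, with a hypothetical 1,024-byte chunk size and caller-provided buffers), the plaintext can be split into two interleaved sub-streams, each of which is then padded, encrypted under its own independently keyed context, and sent as its own stream:
#include <string.h>

#define CHUNK 1024

void split_interleaved(unsigned char *in, size_t len,
                       unsigned char *stream0, size_t *len0,
                       unsigned char *stream1, size_t *len1) {
  size_t off = 0, n;
  int    which = 0;

  *len0 = *len1 = 0;
  while (off < len) {
    n = (len - off < CHUNK) ? len - off : CHUNK;
    if (!which) { memcpy(stream0 + *len0, in + off, n);  *len0 += n; }
    else        { memcpy(stream1 + *len1, in + off, n);  *len1 += n; }
    which = !which;
    off  += n;
  }
}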
5.14.4 See Also
Recipe 4.11, Recipe 5.13
5.15 Performing File or Disk Encryption
5.15.1 Problem
You want to encrypt a file or a disk.
5.15.2 Solution
If you're willing to use a nonce or an initialization vector, standard modes such as CBC and CTR are
acceptable. For file-at-a-time encryption, you can avoid the use of a nonce or IV altogether by using
the LION construction, described in Section 5.15.3.
Generally, keys will be generated from a password. For that, use PKCS #5, as discussed in Recipe
4.10.
5.15.3 Discussion
Disk encryption is usually done in fixed-size chunks at the operating system level. File encryption can
be performed in chunks so that random access to an encrypted file doesn't require decrypting the
entire file. This also has the benefit that part of a file can be changed without reencrypting the entire
file.
CBC mode is commonly used for this purpose, and it is used on chunks that are a multiple of the
block size of the underlying block cipher, so that padding is never necessary. This eliminates any
message expansion that one would generally expect with CBC mode.
However, when people are doing disk or file encryption with CBC mode, they often use a fixed
initialization vector. That's a bad idea because an initialization vector is expected to be random for
CBC mode to obtain its security goals. Using a fixed IV leads to dictionary-like attacks that can often
lead to recovering, at the very least, the beginning of a file.
Other modes that require only a nonce (not an initialization vector) tend to be streaming modes.
These fail miserably when used for disk encryption if the nonce does not change every single time the
contents associated with that nonce change.
Keys for disk encryption are generally created from a password. Such keys will
be only as strong as the password. See Recipe 4.10 for a discussion of turning a
password into a cryptographic key.
For example, if you're encrypting file-by-file in 8,192-byte chunks, you need a separate nonce for
each 8,192-byte chunk, and you need to select a new nonce every single time you want to protect a
modified version of that chunk. You cannot just make incremental changes, then reencrypt with the
same nonce.
In fact, even for modes where sequential nonces are possible, they really don't make much sense in
the context of file encryption. For example, some people think they can use just one CTR mode nonce
for the entire disk. But if you ever reuse the same piece of keystream, there are attacks. Therefore,
any time you change even a small piece of data, you will have to reencrypt the entire disk using a
different nonce to maintain security. Clearly, that isn't practical.
Therefore, no matter what mode you choose to use, you should choose random initial values.
Many people don't like IVs or nonces for file encryption because of storage space issues. They believe
they shouldn't "waste" space on storing an IV or nonce. When you're encrypting fixed-size chunks,
there are not any viable alternatives; if you want to ensure security, you must use an IV.
If you're willing to accept message expansion, you might want to consider a high-level mode such as
CWC, so that you can also incorporate integrity checks. In practice, integrity checks are usually
ignored on filesystems, though, and the filesystems trust that the operating system's access control
system will ensure integrity.
Actually, if you're willing to encrypt and decrypt on a per-file basis, where you cannot decrypt the file
in parts, you can actually get rid of the need for an initialization vector by using LION, which is a
construction that takes a stream cipher and hash function and turns them into a block cipher that has
an arbitrary block size. Essentially, LION turns those constructs into a single block cipher that has a
variable block length, and you use the cipher in ECB mode.
Throughout this book, we repeatedly advise against using raw block cipher operations for things like
file encryption. However, when the block size is always the same length as the message you want to
encrypt, ECB mode isn't so bad. The only problem is that, given a {key, plaintext} pair, an
unchanged file will always encrypt to the same value. Therefore, an attacker who has seen a
particular file encrypted once can find any unchanged versions of that file encrypted with the same
key. A single change in the file thwarts this problem, however. In practice, most people probably
won't be too concerned with this kind of problem.
Using raw block cipher operations with LION is useful only if the block size really is the size of the file.
You can't break the file up into 8,192-byte chunks or anything like that, which can have a negative
impact on performance, particularly as the file size gets larger.
Considering what we've discussed, something like CBC mode with a randomly chosen IV per block is
probably the best solution for pretty much any use, even if it does take up some additional disk
space. Nonetheless, we recognize that people may want to take an approach where they only need to
have a key, and no IV or nonce.
Therefore, we'll show you LION, built out of the RC4 implementation from Recipe 5.23 and SHA1 (see
Recipe 6.7). The structure of LION is shown in Figure 5-6.
While we cover RC4 because it is popular, we strongly recommend you use
SNOW 2.0 instead, because it seems to have a much more comfortable security
margin.
The one oddity of this technique is that files must be longer than the output size of the message
digest function (20 bytes in the case of SHA1). Therefore, if you have files that small, you will either
need to come up with a nonambiguous padding scheme, which is quite complicated to do securely, or
you'll need to abandon LION (either just for small messages or in general).
LION requires a key that is twice as long as the output size of the message digest function. As with
regular CBC-style encryption for files, if you're using a cipher that takes fixed-size keys, we expect
you'll generate a key of the appropriate length from a password.
Figure 5-6. The structure of LION
We also assume a SHA1 implementation with a very standard API. Here, we use an API that works
with OpenSSL, which should be easily adaptable to other libraries. To switch hash functions, replace
the SHA1 calls as appropriate, and change the value of HASH_SZ to be the digest size of the hash
function that you wish to use.
The function spc_lion_encrypt( ) encrypts its first argument, putting the result into the memory
pointed to by the second argument. The third argument specifies the size of the message, and the
last argument is the key. Again, note that the input size must be larger than the hash function's
output size.
The spc_lion_decrypt( ) function takes a similar argument set as spc_lion_encrypt( ), merely
performing the inverse operation.
#include <stdio.h>
#include <openssl/rc4.h>
#include <openssl/sha.h>
#define HASH_SZ   20
#define NUM_WORDS (HASH_SZ / sizeof(int))
void spc_lion_encrypt(char *in, char *out, size_t blklen, char *key) {
  int     i, tmp[NUM_WORDS];
RC4_KEY k;
/* Round 1: R = R ^ RC4(L ^ K1) */
for (i = 0; i < NUM_WORDS; i++)
tmp[i] = ((int *)in)[i] ^ ((int *)key)[i];
RC4_set_key(&k, HASH_SZ, (char *)tmp);
RC4(&k, blklen - HASH_SZ, in + HASH_SZ, out + HASH_SZ);
/* Round 2: L = L ^ SHA1(R) */
SHA1(out + HASH_SZ, blklen - HASH_SZ, out);
for (i = 0; i < NUM_WORDS; i++)
((int *)out)[i] ^= ((int *)in)[i];
/* Round 3: R = R ^ RC4(L ^ K2) */
for (i = 0; i < NUM_WORDS; i++)
tmp[i] = ((int *)out)[i] ^ ((int *)key)[i + NUM_WORDS];
RC4_set_key(&k, HASH_SZ, (char *)tmp);
RC4(&k, blklen - HASH_SZ, out + HASH_SZ, out + HASH_SZ);
}
void spc_lion_decrypt(char *in, char *out, size_t blklen, char *key) {
  int     i, tmp[NUM_WORDS];
RC4_KEY k;
for (i = 0; i < NUM_WORDS; i++)
tmp[i] = ((int *)in)[i] ^ ((int *)key)[i + NUM_WORDS];
RC4_set_key(&k, HASH_SZ, (char *)tmp);
RC4(&k, blklen - HASH_SZ, in + HASH_SZ, out + HASH_SZ);
SHA1(out + HASH_SZ, blklen - HASH_SZ, out);
for (i = 0; i < NUM_WORDS; i++) {
((int *)out)[i] ^= ((int *)in)[i];
tmp[i] = ((int *)out)[i] ^ ((int *)key)[i];
}
RC4_set_key(&k, HASH_SZ, (char *)tmp);
RC4(&k, blklen - HASH_SZ, out + HASH_SZ, out + HASH_SZ);
}
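For completeness, here is a small sketch (ours, with placeholder names filebuf, filelen, and key) of encrypting a whole file image that has been read into memory; key must be 2 * HASH_SZ bytes, and filelen must exceed HASH_SZ:
unsigned char *ct;

if (filelen <= HASH_SZ) abort();      /* LION cannot handle such small inputs */
if (!(ct = (unsigned char *)malloc(filelen))) abort();
spc_lion_encrypt((char *)filebuf, (char *)ct, filelen, (char *)key);
/* ...write the filelen bytes of ct to disk; spc_lion_decrypt( ) with the same
 * key recovers the original contents...
 */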
5.15.4 See Also
Recipe 4.10, Recipe 5.23, Recipe 6.7
5.16 Using a High-Level, Error-Resistant Encryption and
Decryption API
5.16.1 Problem
You want to do encryption or decryption without the hassle of worrying about choosing an encryption
algorithm, performing an integrity check, managing a nonce, and so on.
5.16.2 Solution
Use the following "Encryption Queue" implementation, which relies on the reference CWC mode
implementation (discussed in Recipe 5.10) and the key derivation function from Recipe 4.11.
5.16.3 Discussion
Be sure to take into account the fact that functions in this API can fail,
particularly the decryption functions. If a decryption function fails, you need to
fail gracefully. In Recipe 9.12, we discuss many issues that help ensure robust
network communication that we don't cover here.
This recipe provides an easy-to-use interface to symmetric encryption. The two ends of
communication must set up cipher queues in exactly the same configuration. Thereafter, they can
exchange messages easily until the queues are destroyed.
This code relies on the reference CWC implementation discussed in Recipe 5.10. We use CWC mode
because it gives us both encryption and integrity checking using a single key with a minimum of fuss.
We add a new data type, SPC_CIPHERQ, which is responsible for keeping track of queue state. Here's
the declaration of the SPC_CIPHERQ data type:
typedef struct {
  cwc_t         ctx;
unsigned char nonce[SPC_BLOCK_SZ];
} SPC_CIPHERQ;
SPC_CIPHERQ objects are initialized by calling spc_cipherq_setup( ), which requires the code from
Recipe 5.5, as well as an implementation of the randomness API discussed in Recipe 11.2:
#include <stdlib.h>
#include <string.h>
#include <cwc.h>
#define MAX_KEY_LEN (32)   /* 256 bits */
size_t spc_cipherq_setup(SPC_CIPHERQ *q, unsigned char *basekey, size_t keylen,
                         size_t keyuses) {
  unsigned char dk[MAX_KEY_LEN];
  unsigned char salt[5];

  spc_rand(salt, 5);
  spc_make_derived_key(basekey, keylen, salt, 5, 1, dk, keylen);
  if (!cwc_init(&(q->ctx), dk, keylen * 8)) return 0;
  memcpy(q->nonce, salt, 5);
  spc_memset(q->nonce + 5, 0, sizeof(q->nonce) - 5);  /* counter starts at zero */
  spc_memset(dk, 0, keylen);                          /* wipe the derived key too */
  spc_memset(basekey, 0, keylen);
  return keyuses + 1;
}
The function has the following arguments:
q
SPC_CIPHERQ context object.
basekey
Shared key used by both ends of communication (the "base key" that will be used to derive
session keys).
keylen
Length of the shared key in bytes, which must be 16, 24, or 32.
keyuses
Indicates how many times the current key has been used to initialize an SPC_CIPHERQ object. If
you are going to reuse keys, it is important that this argument be used properly.
On error, spc_cipherq_setup() returns 0. Otherwise, it returns the next value
it would expect to receive for the keyuses argument. Be sure to save this value
if you ever plan to reuse keys.
Note also that basekey is erased upon successful initialization.
Every time you initialize an SPC_CIPHERQ object, a key specifically for use with that queue instance is
generated, using the basekey and the keyuses arguments. To derive the key, we use the key
derivation function discussed in Recipe 4.11. Note that this is useful when two parties share a long-term key that they wish to keep reusing. However, if you exchange a session key at connection
establishment (i.e., using one of the techniques from Chapter 8), the key derivation step is
unnecessary, because reusing {key, nonce} pairs is already incredibly unlikely in such a situation.
Both communicating parties must initialize their queue with identical parameters.
When you're done with a queue, you should deallocate internally allocated memory by calling
spc_cipherq_cleanup( ):
void spc_cipherq_cleanup(SPC_CIPHERQ *q) {
spc_memset(q, 0, sizeof(SPC_CIPHERQ));
}
Here are implementations of the encryption and decryption operations (including a helper function),
both of which return a newly allocated buffer containing the results of the appropriate operation:
static void increment_counter(SPC_CIPHERQ *q) {
if (!++q->nonce[10]) if (!++q->nonce[9]) if (!++q->nonce[8]) if (!++q->nonce[7])
if (!++q->nonce[6]) ++q->nonce[5];
}
unsigned char *spc_cipherq_encrypt(SPC_CIPHERQ *q, unsigned char *m, size_t mlen,
size_t *ol) {
unsigned char *ret;
if (!(ret = (unsigned char *)malloc(mlen + 16))) {
if (ol) *ol = 0;
return 0;
}
cwc_encrypt(&(q->ctx), 0, 0, m, mlen, q->nonce, ret);
increment_counter(q);
if (ol) *ol = mlen + 16;
return ret;
}
unsigned char *spc_cipherq_decrypt(SPC_CIPHERQ *q, unsigned char *m, size_t mlen,
size_t *ol) {
unsigned char *ret;
if (!(ret = (unsigned char *)malloc(mlen - 16))) {
if (ol) *ol = 0;
return 0;
}
if (!cwc_decrypt(&(q->ctx), 0, 0, m, mlen, q->nonce, ret)) {
free(ret);
if (ol) *ol = 0;
return 0;
}
increment_counter(q);
if (ol) *ol = mlen - 16;
return ret;
}
The functions spc_cipherq_encrypt( ) and spc_cipherq_decrypt( ) each take four arguments:
q
SPC_CIPHERQ object to use for encryption or decryption.
m
Message to be encrypted or decrypted.
mlen
Length of the message to be encrypted or decrypted, in bytes.
ol
The number of bytes returned from the encryption or decryption operation is stored in this
integer pointer. This may be NULL if you don't need the information. The number of bytes
returned will always be the message length plus 16 bytes for encryption, or the message length
minus 16 bytes for decryption.
These functions don't check for counter rollover because you can use this API to send over 250 trillion
messages with a single key, which should be adequate for any use.
Instead of using such a large counter, it is a good idea to use only five bytes for
the counter and initialize the rest with a random salt value. The random salt
helps prevent against a class of problems in which the attacker amortizes the
cost of an attack by targeting a large number of possible keys at once. In
Recipe 9.12, we show a similar construction that uses both a salt and a counter
in the nonce.
If you do think you might send more messages under a single key, be sure to rekey in time. (This
scheme is set up to handle at least four trillion keyings with a single base key.)
In the previous code, the nonces are separately managed by both parties in the communication. They
each increment by one when appropriate, and will fail to decrypt a message with the wrong nonce.
Thus, this solution prevents capture replay attacks and detects message drops or message
reordering, all as a result of implicit message numbering. Some people like explicit message
numbering and would send at least a message number, if not the entire nonce, with each message
(though you should always compare against the previous nonce to make sure it's increasing). In
addition, if there's a random portion to the nonce as we suggested above, the random portion needs
to be communicated to both parties. In Recipe 9.12, we send the nonce explicitly with each message,
which helps communicate the portion randomly selected at connection setup time.
It's possible to mix and match calls to spc_cipherq_encrypt( ) and spc_cipherq_decrypt( )
using a single context. However, if you want to use this API in this manner, do so only if the
communicating parties send messages in lockstep. If parties can communicate asynchronously (that
is, without taking turns), there is the possibility for a race condition in which the SPC_CIPHERQ states
on each side of the communication get out of sync, which will needlessly cause decryption operations
to fail.
If you need to perform asynchronous communication with an infrastructure like this, you could use
two SPC_CIPHERQ instances, one where the client encrypts messages for the server to decrypt, and
another where the server encrypts messages for the client to decrypt.
The choice you need to make is whether each SPC_CIPHERQ object should be keyed separately or
should share the same key. Sharing the same key is possible, as long as you ensure that the same
{key, nonce} pair is never reused. The way to do this is to manage two sets of nonces that can never
collide. Generally, you do this by setting the high bit of the nonce buffer to 1 in one context and 0 in
another context.
Here's a function that takes an existing context that has been set up, but not otherwise used, and
turns it into two contexts with the same key:
void spc_cipherq_async_setup(SPC_CIPHERQ *q1, SPC_CIPHERQ *q2) {
memcpy(q2, q1, sizeof(SPC_CIPHERQ));
q1->nonce[0] &= 0x7f; /* The upper bit of q1's nonce is always 0. */
q2->nonce[0] |= 0x80; /* The upper bit of q2's nonce is always 1. */
}
We show a similar trick in which we use only one abstraction in Recipe 9.12.
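Here is a brief sketch (ours, with placeholder names) of the sender's side of a session; basekey, keylen, keyuses, msg, and msglen come from the caller, and the receiver must build an identically configured queue to decrypt:
SPC_CIPHERQ    q;
size_t         next_uses, ctlen;
unsigned char *ct;

if (!(next_uses = spc_cipherq_setup(&q, basekey, keylen, keyuses))) abort();
ct = spc_cipherq_encrypt(&q, msg, msglen, &ctlen);
if (!ct) abort();
/* ...transmit ct (ctlen bytes); remember next_uses if basekey will be reused... */
spc_cipherq_cleanup(&q);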
5.16.4 See Also
Recipe 4.11, Recipe 5.5, Recipe 5.10, Recipe 9.12, Recipe 11.2
5.17 Performing Block Cipher Setup (for CBC, CFB, OFB,
and ECB Modes) in OpenSSL
5.17.1 Problem
You need to set up a cipher so that you can perform encryption and/or decryption operations in CBC,
CFB, OFB, or ECB mode.
5.17.2 Solution
Here are the steps you need to perform for cipher setup in OpenSSL, using their high-level API:
1. Make sure your code includes openssl/evp.h and links to libcrypto (-lcrypto ).
2. Decide which algorithm and mode you want to use, looking up the mode in Table 5-6 to determine
which function instantiates an OpenSSL object representing that mode. Note that OpenSSL
provides only a CTR mode implementation for AES. See Recipe 5.9for more on CTR mode.
3. Instantiate a cipher context (type EVP_CIPHER_CTX ).
4. Pass a pointer to the cipher context to EVP_CIPHER_CTX_init( ) to initialize memory properly.
5. Choose an IV or nonce, if appropriate to the mode (all except ECB).
6. Initialize the mode by calling EVP_EncryptInit_ex( ) or EVP_DecryptInit_ex( ) , as
appropriate:
int EVP_EncryptInit_ex(EVP_CIPHER_CTX *ctx, const EVP_CIPHER *type, ENGINE *engine,
                       unsigned char *key, unsigned char *ivornonce);
int EVP_DecryptInit_ex(EVP_CIPHER_CTX *ctx, const EVP_CIPHER *type, ENGINE *engine,
                       unsigned char *key, unsigned char *ivornonce);
7. If desired, perform any additional configuration the cipher may allow (see Recipe 5.20).
5.17.3 Discussion
Use the raw OpenSSL API only when absolutely necessary because there is a huge
potential for introducing a security vulnerability by accident. For general-purpose
use, we recommend a high-level abstraction, such as that discussed in Recipe
5.16 .
The OpenSSL EVP API is a reasonably high-level interface to a multitude of cryptographic primitives. It
attempts to abstract out most algorithm dependencies, so that algorithms are easy to swap.[14]
[14]
EVP stands for "envelope."
The EVP_EncryptInit_ex( ) and EVP_DecryptInit_ex( ) functions set up a cipher context object to
be used for further operations; both take the same arguments, which provide all the information
necessary before encryption or decryption can begin:
ctx
Pointer to an EVP_CIPHER_CTX object, which stores cipher state across calls.
type
Pointer to an EVP_CIPHER object, which represents the cipher configuration to use (see the later
discussion).
engine
Pointer to an ENGINE object representing the actual implementation to use. For example, if you
want to use hardware acceleration, you can pass in an ENGINE object that represents your
cryptographic accelerator.
key
Pointer to the encryption key to be used.
ivornonce
Pointer to an initialization vector or nonce, if appropriate (use NULL otherwise). For CBC, CFB, and
OFB modes, the initialization vector or nonce is always the same size as the block size of the
cipher, which is often different from the key size of the cipher.
There are also deprecated versions of these calls, EVP_EncryptInit( ) and EVP_DecryptInit( ) , that
are the same except that they do not take the engine argument, and they use only the built-in software
implementation.
Calling a function that returns an EVP_CIPHER object will cause the cipher's implementation to load
dynamically and place information about the algorithm into an internal table if it has not yet done so.
Alternatively, you can load all possible symmetric ciphers at once with a call to the function
OpenSSL_add_all_ciphers( ) , or all ciphers and message digest algorithms with a call to the function
OpenSSL_add_all_algorithms( ) (neither function takes any arguments). For algorithms that have
been loaded, you can retrieve pointers to their objects by name using the EVP_get_cipherbyname( )
function, which takes a single parameter of type char * , representing the desired cipher configuration.
Table 5-6 summarizes the possible functions that can load ciphers (if necessary) and return EVP_CIPHER
objects. The table also shows the strings that can be used to look up loaded ciphers.
As noted in Recipe 5.2, we personally recommend AES-based solutions, or (of the
ciphers OpenSSL offers) Triple-DES if AES is not appropriate. If you use other
algorithms, be sure to research them thoroughly.
Table 5-6. Cipher instantiation reference

Cipher             Key strength / actual size (if different)    Cipher mode   Call for EVP_CIPHER object   Cipher lookup string
AES                128 bits                                      ECB           EVP_aes_128_ecb( )           aes-128-ecb
AES                128 bits                                      CBC           EVP_aes_128_cbc( )           aes-128-cbc
AES                128 bits                                      CFB           EVP_aes_128_cfb( )           aes-128-cfb
AES                128 bits                                      OFB           EVP_aes_128_ofb( )           aes-128-ofb
AES                192 bits                                      ECB           EVP_aes_192_ecb( )           aes-192-ecb
AES                192 bits                                      CBC           EVP_aes_192_cbc( )           aes-192-cbc
AES                192 bits                                      CFB           EVP_aes_192_cfb( )           aes-192-cfb
AES                192 bits                                      OFB           EVP_aes_192_ofb( )           aes-192-ofb
AES                256 bits                                      ECB           EVP_aes_256_ecb( )           aes-256-ecb
AES                256 bits                                      CBC           EVP_aes_256_cbc( )           aes-256-cbc
AES                256 bits                                      CFB           EVP_aes_256_cfb( )           aes-256-cfb
AES                256 bits                                      OFB           EVP_aes_256_ofb( )           aes-256-ofb
Blowfish           128 bits                                      ECB           EVP_bf_ecb( )                bf-ecb
Blowfish           128 bits                                      CBC           EVP_bf_cbc( )                bf-cbc
Blowfish           128 bits                                      CFB           EVP_bf_cfb( )                bf-cfb
Blowfish           128 bits                                      OFB           EVP_bf_ofb( )                bf-ofb
CAST5              128 bits                                      ECB           EVP_cast_ecb( )              cast-ecb
CAST5              128 bits                                      CBC           EVP_cast_cbc( )              cast-cbc
CAST5              128 bits                                      CFB           EVP_cast_cfb( )              cast-cfb
CAST5              128 bits                                      OFB           EVP_cast_ofb( )              cast-ofb
DES                Effective: 56 bits; actual: 64 bits           ECB           EVP_des_ecb( )               des-ecb
DES                Effective: 56 bits; actual: 64 bits           CBC           EVP_des_cbc( )               des-cbc
DES                Effective: 56 bits; actual: 64 bits           CFB           EVP_des_cfb( )               des-cfb
DES                Effective: 56 bits; actual: 64 bits           OFB           EVP_des_ofb( )               des-ofb
DESX               Effective[15]: 120 bits; actual: 128 bits     CBC           EVP_desx_cbc( )              desx
3-key Triple-DES   Effective: 112 bits; actual: 192 bits         ECB           EVP_des_ede3( )              des-ede3
3-key Triple-DES   Effective: 112 bits; actual: 192 bits         CBC           EVP_des_ede3_cbc( )          des-ede3-cbc
3-key Triple-DES   Effective: 112 bits; actual: 192 bits         CFB           EVP_des_ede3_cfb( )          des-ede3-cfb
3-key Triple-DES   Effective: 112 bits; actual: 192 bits         OFB           EVP_des_ede3_ofb( )          des-ede3-ofb
2-key Triple-DES   Effective: 112 bits; actual: 128 bits         ECB           EVP_des_ede( )               des-ede
2-key Triple-DES   Effective: 112 bits; actual: 128 bits         CBC           EVP_des_ede_cbc( )           des-ede-cbc
2-key Triple-DES   Effective: 112 bits; actual: 128 bits         CFB           EVP_des_ede_cfb( )           des-ede-cfb
2-key Triple-DES   Effective: 112 bits; actual: 128 bits         OFB           EVP_des_ede_ofb( )           des-ede-ofb
IDEA               128 bits                                      ECB           EVP_idea_ecb( )              idea-ecb
IDEA               128 bits                                      CBC           EVP_idea_cbc( )              idea-cbc
IDEA               128 bits                                      CFB           EVP_idea_cfb( )              idea-cfb
IDEA               128 bits                                      OFB           EVP_idea_ofb( )              idea-ofb
RC2™               128 bits                                      ECB           EVP_rc2_ecb( )               rc2-ecb
RC2™               128 bits                                      CBC           EVP_rc2_cbc( )               rc2-cbc
RC2™               128 bits                                      CFB           EVP_rc2_cfb( )               rc2-cfb
RC2™               128 bits                                      OFB           EVP_rc2_ofb( )               rc2-ofb
RC4™               40 bits                                       n/a           EVP_rc4_40( )                rc4-40
RC4™               128 bits                                      n/a           EVP_rc4( )                   rc4
RC5™               128 bits                                      ECB           EVP_rc5_32_16_12_ecb( )      rc5-ecb
RC5™               128 bits                                      CBC           EVP_rc5_32_16_12_cbc( )      rc5-cbc
RC5™               128 bits                                      CFB           EVP_rc5_32_16_12_cfb( )      rc5-cfb
RC5™               128 bits                                      OFB           EVP_rc5_32_16_12_ofb( )      rc5-ofb
[15]
There are known plaintext attacks against DESX that reduce the effective strength to 60 bits, but these are
generally considered infeasible.
For stream-based modes (CFB and OFB), encryption and decryption are identical operations. Therefore,
EVP_EncryptInit_ex( ) and EVP_DecryptInit_ex( ) are interchangeable in these cases.
While RC4 can be set up using these instructions, you must be very careful to set
it up securely. We discuss how to do so in Recipe 5.23.
Here is an example of setting up an encryption context using 128-bit AES in CBC mode:
#include <openssl/evp.h>
#include <openssl/rand.h>
/* key must be of size EVP_MAX_KEY_LENGTH.
* iv must be of size EVP_MAX_IV_LENGTH.
*/
EVP_CIPHER_CTX *sample_setup(unsigned char *key, unsigned char *iv) {
EVP_CIPHER_CTX *ctx;
  /* This uses the OpenSSL PRNG. See Recipe 11.9. */
RAND_bytes(key, EVP_MAX_KEY_LENGTH);
RAND_bytes(iv, EVP_MAX_IV_LENGTH);
if (!(ctx = (EVP_CIPHER_CTX *)malloc(sizeof(EVP_CIPHER_CTX)))) return 0;
EVP_CIPHER_CTX_init(ctx);
EVP_EncryptInit_ex(ctx, EVP_aes_128_cbc( ), 0, key, iv);
return ctx;
}
This example selects a key and initialization vector at random. Both of these items need to be
communicated to any party that needs to decrypt the data. The caller therefore needs to be able to
recover this information. In this example, we handle this by having the caller pass in allocated memory,
which we fill with the new key and IV. The caller can then communicate them to the other party in
whatever manner is appropriate.
Note that to make replacing algorithms easier, we always create keys and initialization vectors of the
maximum possible length, using macros defined in the openssl/evp.h header file.
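If you prefer to select the cipher by its lookup string at runtime, a decryption context can be set up in much the same way; the following sketch is ours and not part of the recipe:
#include <stdlib.h>
#include <openssl/evp.h>

EVP_CIPHER_CTX *decrypt_setup_by_name(const char *name, unsigned char *key,
                                      unsigned char *iv) {
  const EVP_CIPHER *cipher;
  EVP_CIPHER_CTX   *ctx;

  OpenSSL_add_all_ciphers();                 /* load the cipher lookup table */
  if (!(cipher = EVP_get_cipherbyname(name))) return 0;
  if (!(ctx = (EVP_CIPHER_CTX *)malloc(sizeof(EVP_CIPHER_CTX)))) return 0;
  EVP_CIPHER_CTX_init(ctx);
  EVP_DecryptInit_ex(ctx, cipher, 0, key, iv);
  return ctx;
}
For example, decrypt_setup_by_name("aes-128-cbc", key, iv) selects the same configuration as the encryption example above.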
5.17.4 See Also
Recipe 5.2, Recipe 5.9, Recipe 5.16, Recipe 5.18, Recipe 5.20, Recipe 5.23
5.18 Using Variable Key-Length Ciphers in OpenSSL
5.18.1 Problem
You're using a cipher with an adjustable key length, yet OpenSSL provides no default cipher
configuration for your desired key length.
5.18.2 Solution
Initialize the cipher without a key, call EVP_CIPHER_CTX_set_key_length( ) to set the appropriate
key length, then set the key.
5.18.3 Discussion
Many of the ciphers supported by OpenSSL support variable key lengths. Whereas some, such as
AES, have an available call for each possible key length, others (in particular, RC4) allow for nearly
arbitrary byte-aligned keys. Table 5-7 lists ciphers supported by OpenSSL, and the varying key
lengths those ciphers can support.
Table 5-7. Variable key sizes

Cipher     OpenSSL-supported key sizes   Algorithm's possible key sizes
AES        128, 192, and 256 bits        128, 192, and 256 bits
Blowfish   Up to 256 bits                Up to 448 bits
CAST5      40-128 bits                   40-128 bits
RC2        Up to 256 bits                Up to 1,024 bits
RC4        Up to 256 bits                Up to 2,048 bits
RC5        Up to 256 bits                Up to 2,040 bits
While RC2, RC4, and RC5 support absurdly high key lengths, it really is overkill to use more than a
256-bit symmetric key. There is not likely to be any greater security, only less efficiency. Therefore,
OpenSSL puts a hard limit of 256 bits on key sizes.
When calling the OpenSSL cipher initialization functions, you can set to NULL any value you do not
want to provide immediately. If the cipher requires data you have not yet provided, clearly
encryption will not work properly.
Therefore, we can choose a cipher using EVP_EncryptInit_ex( ) without specifying a key, then set
the key size using EVP_CIPHER_CTX_set_key_length( ), which takes two arguments: the first is the
context initialized by the call to EVP_EncryptInit_ex( ), and the second is the new key length in
bytes.
Finally, we can set the key by calling EVP_EncryptInit_ex( ) again, passing in the context and any
new data, along with NULL for any parameters we've already set. For example, the following code
would set up a 256-bit version of Blowfish in CBC mode:
#include <openssl/evp.h>
EVP_CIPHER_CTX *blowfish_256_cbc_setup(char *key, char *iv) {
EVP_CIPHER_CTX *ctx;
if (!(ctx = (EVP_CIPHER_CTX *)malloc(sizeof(EVP_CIPHER_CTX)))) return 0;
EVP_CIPHER_CTX_init(ctx);
/* Uses 128-bit keys by default. We pass in NULLs for the parameters that we'll
* fill in after properly setting the key length.
*/
EVP_EncryptInit_ex(ctx, EVP_bf_cbc( ), 0, 0, 0);
EVP_CIPHER_CTX_set_key_length(ctx, 32);
EVP_EncryptInit_ex(ctx, 0, 0, key, iv);
return ctx;
}
5.19 Disabling Cipher Padding in OpenSSL in CBC Mode
5.19.1 Problem
You're encrypting in CBC or ECB mode, and the length of your data to encrypt is always a multiple of
the block size. You would like to avoid padding because it adds an extra, unnecessary block of
output.
5.19.2 Solution
OpenSSL has a function that can turn padding on and off for a context object:
int EVP_CIPHER_CTX_set_padding(EVP_CIPHER_CTX *ctx, int pad);
5.19.3 Discussion
Particularly when you are implementing another encryption mode, you may always be operating on
block-sized chunks, and it can be inconvenient to deal with padding. Alternatively, some odd protocol
may require a nonstandard padding scheme that causes you to pad the data manually before
encryption (and to remove the pad manually after decryption).
The second argument of this function should be zero to turn padding off, and non-zero to turn it on.
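For example, assuming a context initialized as in Recipe 5.17 and input that is always block-aligned, padding can be switched off with a single call:
EVP_CIPHER_CTX_set_padding(ctx, 0);   /* disable padding for this context */
/* ...a later call with a non-zero second argument would turn it back on... */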
5.20 Performing Additional Cipher Setup in OpenSSL
5.20.1 Problem
Using OpenSSL, you want to adjust a configurable parameter of a cipher other than the key length.
5.20.2 Solution
OpenSSL provides an obtuse, ioctl()-style API for setting uncommon cipher parameters on a
context object:
int EVP_CIPHER_CTX_ctrl(EVP_CIPHER_CTX *ctx, int type, int arg, void *ptr);
5.20.3 Discussion
OpenSSL doesn't provide much flexibility in adjusting cipher characteristics. For example, the three
AES configurations are three specific instantiations of a cipher calledRijndael, which has nine
different configurations. However, OpenSSL supports only the three standard ones.
Nevertheless, there are two cases in which OpenSSL does allow for configurability. In the first case, it
allows for setting the "effective key bits" in RC2. As a result, the RC2 key is crippled so that it is only
as strong as the effective size set. We feel that this functionality is completely useless.
In the second case, OpenSSL allows you to set the number of rounds used internally by the RC5
algorithm. By default, RC5 uses 12 rounds. And while the algorithm itself can accept a variable number
of rounds, OpenSSL allows you to set the number only to 8, 12, or 16.
The function EVP_CIPHER_CTX_ctrl( ) can be used to set or query either of these values, given a
cipher of the appropriate type. This function has the following arguments:
ctx
Pointer to the cipher context to be modified.
type
Value indicating which operation to perform (more on this a little later).
arg
Numerical value to set, if appropriate (it is otherwise ignored).
ptr
Pointer to an integer for querying the numerical value of a property, if appropriate (the result is
placed in the integer being pointed to).
The type argument can be one of the four macros defined in openssl/evp.h:
EVP_CTRL_GET_RC2_KEY_BITS
EVP_CTRL_SET_RC2_KEY_BITS
EVP_CTRL_GET_RC5_ROUNDS
EVP_CTRL_SET_RC5_ROUNDS
For example, to set an RC5 context to use 16 rounds:
EVP_CIPHER_CTX_ctrl(ctx, EVP_CTRL_SET_RC5_ROUNDS, 16, NULL);
To query the number of rounds, putting the result into an integer namedr:
EVP_CIPHER_CTX_ctrl(ctx, EVP_CTRL_GET_RC5_ROUNDS, 0, &r);
5.21 Querying Cipher Configuration Properties in
OpenSSL
5.21.1 Problem
You want to get information about a particular cipher context in OpenSSL.
5.21.2 Solution
For most properties, OpenSSL provides macros for accessing them. For other things, we can access
the members of the cipher context structure directly.
To get the actual object representing the cipher:
EVP_CIPHER *EVP_CIPHER_CTX_cipher(EVP_CIPHER_CTX *ctx);
To get the block size of the cipher:
int EVP_CIPHER_CTX_block_size(EVP_CIPHER_CTX *ctx);
To get the key length of the cipher:
int EVP_CIPHER_CTX_key_length(EVP_CIPHER_CTX *ctx);
To get the length of the initialization vector:
int EVP_CIPHER_CTX_iv_length(EVP_CIPHER_CTX *ctx);
To get the cipher mode being used:
int EVP_CIPHER_CTX_mode(EVP_CIPHER_CTX *ctx);
To see if automatic padding is disabled:
int pad = (ctx->flags & EVP_CIPH_NO_PADDING);
To see if we are encrypting or decrypting:
int encr = (ctx->encrypt);
To retrieve the original initialization vector:
char *iv = (ctx->oiv);
5.21.3 Discussion
The EVP_CIPHER_CTX_cipher( ) function is actually implemented as a macro that returns a pointer
to an object of type EVP_CIPHER. The cipher itself can be queried, but interesting queries can also be made on the
context object through appropriate macros.
All functions returning lengths return them in bytes.
The EVP_CIPHER_CTX_mode( ) function returns one of the following predefined values:
EVP_CIPH_ECB_MODE
EVP_CIPH_CBC_MODE
EVP_CIPH_CFB_MODE
EVP_CIPH_OFB_MODE
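As an illustration (a hypothetical helper, not from OpenSSL), the macros above can be combined to
dump the basic properties of an initialized context:
#include <stdio.h>
#include <openssl/evp.h>
/* Sketch: assumes ctx was previously initialized with EVP_EncryptInit_ex( )
 * or EVP_DecryptInit_ex( ).  All lengths are reported in bytes.
 */
void spc_print_cipher_info(EVP_CIPHER_CTX *ctx) {
  printf("block size : %d\n", EVP_CIPHER_CTX_block_size(ctx));
  printf("key length : %d\n", EVP_CIPHER_CTX_key_length(ctx));
  printf("IV length  : %d\n", EVP_CIPHER_CTX_iv_length(ctx));
  printf("mode       : %d\n", EVP_CIPHER_CTX_mode(ctx));
  printf("direction  : %s\n", ctx->encrypt ? "encrypt" : "decrypt");
}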
5.22 Performing Low-Level Encryption and Decryption
with OpenSSL
5.22.1 Problem
You have set up your cipher and want to perform encryption and decryption.
5.22.2 Solution
Use the following suite of functions:
int EVP_EncryptUpdate(EVP_CIPHER_CTX *ctx, unsigned char *out, int *outl,
unsigned char *in, int inl);
int EVP_EncryptFinal_ex(EVP_CIPHER_CTX *ctx, unsigned char *out, int *outl);
int EVP_DecryptUpdate(EVP_CIPHER_CTX *ctx, unsigned char *out, int *outl,
unsigned char *in, int inl);
int EVP_DecryptFinal_ex(EVP_CIPHER_CTX *ctx, unsigned char *out, int *outl);
5.22.3 Discussion
As a reminder, use a raw mode only if you really know what you're doing. For
general-purpose use, we recommend a high-level abstraction, such as that
discussed in Recipe 5.16. Additionally, be sure to include some sort of integrity
validation whenever encrypting, as we discuss throughout Chapter 6.
The signatures for the encryption and decryption routines are identical, and the actual routines are
completely symmetric. Therefore, we'll only discuss the behavior of the encryption functions, and you
can infer the behavior of the decryption functions from that.
EVP_EncryptUpdate( ) has the following arguments:
ctx
Pointer to the cipher context previously initialized with EVP_EncryptInit_ex( ).
out
Buffer into which any output is placed.
outl
Pointer to an integer, into which the number of bytes written to the output buffer is placed.
in
Buffer containing the data to be encrypted.
inl
Number of bytes contained in the input buffer.
EVP_EncryptFinal_ex( ) takes the following arguments:
ctx
Pointer to the cipher context previously initialized with EVP_EncryptInit_ex( ).
out
Buffer into which any output is placed.
outl
Pointer to an integer, into which the number of bytes written to the output buffer is placed.
There are two phases to encryption in OpenSSL: update and finalization. The basic idea behind
update mode is that you're feeding in data to encrypt, and if there's incremental output, you get it.
Calling the finalization routine lets OpenSSL know that all the data to be encrypted with this current
context has already been given to the library. OpenSSL then does any cleanup work necessary, and it
will sometimes produce additional output. After a cipher is finalized, you need to reinitialize it if you
plan to reuse it, as described in Recipe 5.17.
In CBC and ECB modes, the cipher cannot always encrypt all the plaintext you give it as that
plaintext arrives, because it requires block-aligned data to operate. In the finalization phase, those
algorithms add padding if appropriate, then yield the remaining output. Note that, because of the
internal buffering that can happen in these modes, the output to any single call of
EVP_EncryptUpdate( ) or EVP_EncryptFinal_ex( ) can be about a full block larger or smaller than
the actual input. If you're encrypting data into a single buffer, you can always avoid overflow if you
make the output buffer an entire block bigger than the input buffer. Remember, however, that if
padding is turned off (as described in Recipe 5.19), the library will be expecting block-aligned data,
and the output will always be the same size as the input.
In OFB and CFB modes, the call to EVP_EncryptUpdate( ) will always return the amount of data you
passed in, and EVP_EncryptFinal_ex( ) will never return any data. This is because these modes
are stream-based modes that don't require aligned data to operate. Therefore, it is sufficient to call
only EVP_EncryptUpdate( ), skipping finalization entirely. Nonetheless, you should always call the
finalization function so that the library has the chance to do any internal cleanup that may be
necessary. For example, if you're using a cryptographic accelerator, the finalization call essentially
gives the hardware license to free up resources for other operations.
These functions all return 1 on success, and 0 on failure. EVP_EncryptFinal_ex( ) will fail if padding
is turned off and the data is not block-aligned. EVP_DecryptFinal_ex( ) will fail if the decrypted
padding is not in the proper format. Additionally, any of these functions may fail if they are using
hardware acceleration and the underlying hardware throws an error. Beyond those problems, they
should not fail. Note again that when decrypting, this API has no way of determining whether the
data decrypted properly. That is, the data may have been modified in transit; other means are
necessary to ensure integrity (i.e., use a MAC, as we discuss throughout Chapter 6).
Here's an example function that, when given an already instantiated cipher context, encrypts an
entire plaintext message 100 bytes at a time into a single heap-allocated buffer, which is returned at
the end of the function. This example demonstrates how you can perform multiple encryption
operations over time and keep encrypting into a single buffer. This code will work properly with any
of the OpenSSL-supported cipher modes.
#include <stdlib.h>
#include <openssl/evp.h>
/* The integer pointed to by rb receives the number of bytes in the output.
* Note that the malloced buffer can be realloced right before the return.
*/
char *encrypt_example(EVP_CIPHER_CTX *ctx, char *data, int inl, int *rb) {
int i, ol, tmp;
char *ret;
ol = 0;
if (!(ret = (char *)malloc(inl + EVP_CIPHER_CTX_block_size(ctx)))) abort( );
for (i = 0; i < inl / 100; i++) {
if (!EVP_EncryptUpdate(ctx, &ret[ol], &tmp, &data[ol], 100)) abort( );
ol += tmp;
}
if (inl % 100) {
if (!EVP_EncryptUpdate(ctx, &ret[ol], &tmp, &data[ol], inl % 100)) abort( );
ol += tmp;
}
if (!EVP_EncryptFinal_ex(ctx, &ret[ol], &tmp)) abort( );
ol += tmp;
if (rb) *rb = ol;
return ret;
}
Here's a simple function for decryption that decrypts an entire message at once:
#include <stdlib.h>
#include <openssl/evp.h>
char *decrypt_example(EVP_CIPHER_CTX *ctx, char *ct, int inl) {
/* We're going to null-terminate the plaintext under the assumption that it's
* non-null terminated ASCII text. The null can otherwise be ignored if it
* wasn't necessary, though the length of the result should be passed back in
* such a case.
*/
int ol;
char *pt;
if (!(pt = (char *)malloc(inl + EVP_CIPHER_CTX_block_size(ctx) + 1))) abort( );
EVP_DecryptUpdate(ctx, pt, &ol, ct, inl);
if (!ol) { /* There is no data to decrypt */
free(pt);
return 0;
}
pt[ol] = 0;
return pt;
}
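As a usage sketch (hypothetical; it follows the older OpenSSL interface assumed throughout this
chapter, with a stack-allocated EVP_CIPHER_CTX, and uses AES-128 in OFB mode so that no padding
is involved), the two functions above might be driven like this. The key and iv arguments are
16-byte placeholders supplied by the caller:
#include <stdlib.h>
#include <string.h>
#include <openssl/evp.h>
void encrypt_decrypt_demo(unsigned char *key, unsigned char *iv, char *msg) {
  int            ctlen;
  char           *ct, *pt;
  EVP_CIPHER_CTX ectx, dctx;

  /* EVP_aes_128_ofb( ) is assumed to be available (OpenSSL with AES support). */
  EVP_CIPHER_CTX_init(&ectx);
  EVP_EncryptInit_ex(&ectx, EVP_aes_128_ofb( ), 0, key, iv);
  ct = encrypt_example(&ectx, msg, strlen(msg), &ctlen);
  EVP_CIPHER_CTX_cleanup(&ectx);

  EVP_CIPHER_CTX_init(&dctx);
  EVP_DecryptInit_ex(&dctx, EVP_aes_128_ofb( ), 0, key, iv);
  pt = decrypt_example(&dctx, ct, ctlen);   /* null-terminated plaintext, or 0 */
  EVP_CIPHER_CTX_cleanup(&dctx);

  free(ct);
  if (pt) free(pt);
}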
5.22.4 See Also
Recipe 5.16, Recipe 5.17
5.23 Setting Up and Using RC4
5.23.1 Problem
You want to use RC4 securely.
5.23.2 Solution
You can't be very confident about the security of RC4 for general-purpose use, owing to theoretical
weaknesses. However, if you're willing to use only a very few RC4 outputs (a limit of about 100,000
bytes of output), you can take a risk, as long as you properly set it up.
Before using the standard initialization functions provided by your cryptographic library, take one of
the following two steps:
Cryptographically hash the key material before using it.
Discard the first 256 bytes of the generated keystream.
After initialization, RC4 is used just as any block cipher in a streaming mode is used.
Most libraries implement RC4, but it is so simple that we provide an implementation in the following
section.
5.23.3 Discussion
RC4 is a simple cipher that is really easy to use once you have it set up securely, which is actually
difficult to do! Due to this key-setup problem, RC4's theoretical weaknesses, and the availability of
faster solutions that look more secure, we recommend you just not use RC4. If you're looking for a
very fast solution, we recommend SNOW 2.0.
In this recipe, we'll start off ignoring the RC4 key-setup problem. We'll show you how to use RC4
properly, giving a complete implementation. Then, after all that, we'll discuss how to set it up
securely.
As with any other symmetric encryption algorithm, it is particularly important
to use a MAC along with RC4 to ensure data integrity. We discuss MACs
extensively in Chapter 6.
RC4 requires a little bit of state, including a 256-byte buffer and two 8-bit counters. Here's a
declaration for an RC4_CTX data type:
typedef struct {
unsigned char sbox[256];
unsigned char i, j;
} RC4_CTX;
In OpenSSL, the same sort of context is named RC4_KEY, which is a bit of a misnomer. Throughout
this recipe, we will use RC4_CTX, but our implementation is otherwise compatible with OpenSSL's (our
functions have the same names and parameters). You'll only need to include the correct header file,
and alias RC4_CTX to RC4_KEY.
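For example, if you would rather use OpenSSL's implementation than the one given below, a one-line
alias (our convention, not something OpenSSL provides) is all that's needed:
#include <openssl/rc4.h>
typedef RC4_KEY RC4_CTX;   /* use OpenSSL's RC4 context under the name used in this recipe */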
The "official" RC4 key setup function isn't generally secure without additional work, but we need to
have it around anyway:
#include <stdlib.h>
void RC4_set_key(RC4_CTX *c, size_t keybytes, unsigned char *key) {
int           i, j;
unsigned char keyarr[256], swap;
c->i = c->j = 0;
for (i = j = 0; i < 256; i++, j = (j + 1) % keybytes) {
c->sbox[i] = i;
keyarr[i] = key[j];
}
for (i = j = 0; i < 256; i++) {
j += c->sbox[i] + keyarr[i];
j %= 256;
swap = c->sbox[i];
c->sbox[i] = c->sbox[j];
c->sbox[j] = swap;
}
}
The RC4 function has the following arguments:
c
Pointer to an RC4_CTX object.
n
Number of bytes to encrypt.
in
Buffer to encrypt.
out
Output buffer.
void RC4(RC4_CTX *c, size_t n, unsigned char *in, unsigned char *out) {
unsigned char swap;
while (n--) {
c->j += c->sbox[++c->i];
swap = c->sbox[c->i];
c->sbox[c->i] = c->sbox[c->j];
c->sbox[c->j] = swap;
swap = c->sbox[c->i] + c->sbox[c->j];
*out++ = *in++ ^ c->sbox[swap];
}
}
That's it for an RC4 implementation. This function can be used incrementally or as an "all-in-one"
solution.
Now let's look at how to key RC4 properly.
Without going into the technical details of the problems with RC4 key setup, it's sufficient to say that
the real problem occurs when you key multiple RC4 instances with related keys. For example, in
some circles it is common to use a truncated base key, then concatenate a counter for each message
(which is not a good idea in and of itself because it reduces the effective key strength).
The first way to solve this problem is to use a cryptographic hash function to randomize the key. If
your key is 128 bits, you can use MD5 and take the entire digest value, or you can use a hash
function with a larger digest, such as SHA1 or SHA-256, truncating the result to the appropriate size.
Here's some code for setting up an RC4 context by hashing key material using MD5 (include
openssl/md5.h to have this work directly with OpenSSL's implementation). MD5 is fine for this
purpose; you can also use SHA1 and truncate to 16 bytes.
/* Assumes you have not yet initialized the context, but have allocated it. */
void secure_rc4_setup1(RC4_CTX *ctx, char *key) {
char res[16]; /* 16 is the size in bytes of the resulting MD5 digest. */
MD5(key, 16, res);
RC4_set_key(ctx, 16, res);
}
Note that RC4 does not use an initialization vector.
Another option is to start using RC4, but throw away the first 256 bytes worth of keystream. One
easy way to do that is to encrypt 256 bytes of garbage and ignore the results:
/* Assumes an already instantiated RC4 context. */
void secure_rc4_setup2(RC4_CTX *ctx) {
char buf[256] = {0,};
RC4(ctx, sizeof(buf), buf, buf);
spc_memset(buf, 0, sizeof(buf));
}
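Putting the pieces together, here is a hedged usage sketch (the function name is ours, and key must
point to 16 bytes of key material); it keys the context through the MD5 construction above and
encrypts a buffer in place. The receiver decrypts by running exactly the same code, and a MAC
should always accompany the result:
#include <stdlib.h>
void spc_rc4_encrypt_sketch(unsigned char *key, unsigned char *buf, size_t len) {
  RC4_CTX ctx;

  secure_rc4_setup1(&ctx, (char *)key);   /* hash the 128-bit key with MD5, then key RC4 */
  RC4(&ctx, len, buf, buf);               /* encryption is just XOR with the keystream */
  spc_memset(&ctx, 0, sizeof(ctx));       /* wipe the cipher state when done */
}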
5.24 Using One-Time Pads
5.24.1 Problem
You want to use an encryption algorithm that has provable secrecy properties, and deploy it in a
fashion that does not destroy the security properties of the algorithm.
5.24.2 Solution
Settle for more realistic security goals. Do not use a one-time pad.
5.24.3 Discussion
One-time pads are provably secure if implemented properly. Unfortunately, they are rarely used
properly. A one-time pad is very much like a stream cipher. Encryption is simply XOR'ing the
message with the keystream. The security comes from having every single bit of the keystream be
truly random instead of merely cryptographically random. If portions of the keystream are reused,
the security of data encrypted with those portions is incredibly weak.
There are a number of big hurdles when using one-time pads:
It is very close to impossible to generate a truly random keystream in software. (See Chapter
11 for more information.)
The keystream must somehow be shared between client and server. Because there can be no
algorithm to produce the keystream, some entity will need to produce the keystream and
transmit it securely to both parties.
The keystream must be as long as the message. If you have a message that's bigger than the
keystream you have remaining, you can't send the entire message.
Integrity checking is just as important with one-time pads as with any other encryption
technique. As with the output of any stream cipher, if you modify a bit in the ciphertext
generated by a one-time pad, the corresponding bit of the plaintext will flip. In addition, one-time pads have no built-in mechanism for detecting truncation or additive attacks. Message
authentication in a provably secure manner essentially requires a keystream twice the data
length.
Basically, the secure deployment of one-time pads is almost always highly impractical. You are
generally far better off using a good high-level interface to encryption and decryption, such as the
one provided in Recipe 5.16.
5.24.4 See Also
Recipe 5.16
5.25 Using Symmetric Encryption with Microsoft's
CryptoAPI
5.25.1 Problem
You are developing an application that will run on Windows and make use of symmetric encryption.
You want to use Microsoft's CryptoAPI.
5.25.2 Solution
Microsoft's CryptoAPI is available on most versions of Windows that are widely deployed, so it is a
reasonable solution for many uses of symmetric encryption. CryptoAPI contains a small, yet nearly
complete, set of functions for creating and manipulating symmetric encryption keys (which the
Microsoft documentation usually refers to as session keys), exchanging keys, and encrypting and
decrypting data. While the information in the following Section 5.25.3 will not provide you with all the
finer details of using CryptoAPI, it will give you enough background to get started using the API
successfully.
5.25.3 Discussion
CryptoAPI is designed as a high-level interface to various cryptographic constructs, including hashes,
MACs, public key encryption, and symmetric encryption. Its support for public key cryptography
makes up the majority of the API, but there is also a small subset of functions for symmetric
encryption.
Before you can do anything with CryptoAPI, you first need to acquire a provider context. CryptoAPI
provides a generic API that wraps around Cryptographic Service Providers (CSPs), which are
responsible for doing all the real work. Microsoft provides several different CSPs that provide
implementations of various algorithms. For symmetric cryptography, two CSPs are widely available
and of interest: Microsoft Base Cryptographic Service Provider and Microsoft Enhanced Cryptographic
Service Provider. A third, Microsoft AES Cryptographic Service Provider, is available only in the .NET
framework. The Base CSP provides RC2, RC4, and DES implementations. The Enhanced CSP adds
implementations for DES, two-key Triple-DES, and three-key Triple-DES. The AES CSP adds
implementations for AES with 128-bit, 192-bit, and 256-bit key lengths.
For our purposes, we'll concentrate only on the enhanced CSP. Acquiring a provider context is done
with the following code. We use the CRYPT_VERIFYCONTEXT flag here because we will not be using
private keys with the context. It doesn't necessarily hurt to omit the flag (which we will do in Recipe
5.26 and Recipe 5.27, for example), but if you don't need public key access with the context, you
should use the flag. Some CSPs may require user input when CryptAcquireContext( ) is called
without CRYPT_VERIFYCONTEXT.
#include <windows.h>
#include <wincrypt.h>
HCRYPTPROV SpcGetCryptContext(void) {
HCRYPTPROV hProvider;
if (!CryptAcquireContext(&hProvider, 0, MS_ENHANCED_PROV, PROV_RSA_FULL,
CRYPT_VERIFYCONTEXT)) return 0;
return hProvider;
}
Once a provider context has been successfully acquired, you need a key. The API provides three
ways to obtain a key object, which is stored by CryptoAPI as an opaque object to which you'll have
only a handle:
CryptGenKey( )
Generates a random key.
CryptDeriveKey( )
Derives a key from a password or passphrase.
CryptImportKey( )
Creates a key object from key data in a buffer.
All three functions return a new key object that keeps the key data hidden and has associated with it
a symmetric encryption algorithm and a set of flags that control the behavior of the key. The key
data can be obtained from the key object using CryptExportKey( ) if the key object allows it. The
CryptExportKey( ) and CryptImportKey( ) functions provide the means for exchanging keys.
The CryptExportKey( ) function will only allow you to export a symmetric
encryption key encrypted with another key. For maximum portability across all
versions of Windows, a public key should be used. However, Windows 2000
introduced the ability to encrypt the symmetric encryption key with another
symmetric encryption key. Similarly, CryptImportKey( ) can only import
symmetric encryption keys that are encrypted.
If you need the raw key data, you must first export the key in encrypted form,
then decrypt it (see Recipe 5.27). While this may seem like a lot of extra
work, the reason is that CryptoAPI was designed with the goal of making it very
difficult (if not impossible) to unintentionally disclose sensitive information.
Generating a new key with CryptGenKey( ) that can be exported is very simple, as illustrated in the
following code. If you don't want the new key to be exportable, simply remove the
CRYPT_EXPORTABLE flag.
HCRYPTKEY SpcGetRandomKey(HCRYPTPROV hProvider, ALG_ID Algid, DWORD dwSize) {
DWORD     dwFlags;
HCRYPTKEY hKey;
dwFlags = ((dwSize << 16) & 0xFFFF0000) | CRYPT_EXPORTABLE;
if (!CryptGenKey(hProvider, Algid, dwFlags, &hKey)) return 0;
return hKey;
}
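For instance (a hypothetical fragment; hProvider comes from SpcGetCryptContext( ) shown earlier), a
random, exportable 128-bit RC4 session key can be requested like this; the helper packs the key size
in bits into the upper 16 bits of the flags word, as CryptGenKey( ) expects:
HCRYPTKEY hKey;

if (!(hKey = SpcGetRandomKey(hProvider, CALG_RC4, 128))) {
  /* handle the error; GetLastError( ) has the details */
}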
Deriving a key with CryptDeriveKey( ) is a little more complex. It requires a hash object to be
created and passed into it in addition to the same arguments required by CryptGenKey( ). Note that
once the hash object has been used to derive a key, additional data cannot be added to it, and it
should be immediately destroyed.
HCRYPTKEY SpcGetDerivedKey(HCRYPTPROV hProvider, ALG_ID Algid, LPTSTR password) {
BOOL       bResult;
DWORD      cbData;
HCRYPTKEY  hKey;
HCRYPTHASH hHash;
if (!CryptCreateHash(hProvider, CALG_SHA1, 0, 0, &hHash)) return 0;
cbData = lstrlen(password) * sizeof(TCHAR);
if (!CryptHashData(hHash, (BYTE *)password, cbData, 0)) {
CryptDestroyHash(hHash);
return 0;
}
bResult = CryptDeriveKey(hProvider, Algid, hHash, CRYPT_EXPORTABLE, &hKey);
CryptDestroyHash(hHash);
return (bResult ? hKey : 0);
}
Importing a key with CryptImportKey( ) is, in most cases, just as easy as generating a new random
key. Most often, you'll be importing data obtained directly from CryptExportKey( ), so you'll
already have an encrypted key in the form of a SIMPLEBLOB, as required by CryptImportKey( ). If
you need to import raw key data, things get a whole lot trickier; see Recipe 5.26 for details.
HCRYPTKEY SpcImportKey(HCRYPTPROV hProvider, BYTE *pbData, DWORD dwDataLen,
HCRYPTKEY hPublicKey) {
HCRYPTKEY hKey;
if (!CryptImportKey(hProvider, pbData, dwDataLen, hPublicKey, CRYPT_EXPORTABLE,
&hKey)) return 0;
return hKey;
}
When a key object is created, the cipher to use is tied to that key, and it must be specified as an
argument to either CryptGenKey( ) or CryptDeriveKey( ). It is not required as an argument by
CryptImportKey( ) because the cipher information is stored as part of the SIMPLEBLOB structure
that is required. Table 5-8 lists the symmetric ciphers that are available using one of the three
Microsoft CSPs.
Table 5-8. Symmetric ciphers supported by Microsoft Cryptographic
Service Providers
Cipher             Cryptographic Service Provider   ALG_ID constant   Key length             Block size
RC2                Base, Enhanced, AES              CALG_RC2          40 bits                64 bits
RC4                Base                             CALG_RC4          40 bits                n/a
RC4                Enhanced, AES                    CALG_RC4          128 bits               n/a
DES                Enhanced, AES                    CALG_DES          56 bits                64 bits
2-key Triple-DES   Enhanced, AES                    CALG_3DES_112     112 bits (effective)   64 bits
3-key Triple-DES   Enhanced, AES                    CALG_3DES         168 bits (effective)   64 bits
AES                AES                              CALG_AES_128      128 bits               128 bits
AES                AES                              CALG_AES_192      192 bits               128 bits
AES                AES                              CALG_AES_256      256 bits               128 bits
The default cipher mode to be used depends on the underlying CSP and the algorithm that's being
used, but it's generally CBC mode. The Microsoft Base and Enhanced CSPs provide support for CBC,
CFB, ECB, and OFB modes (see Recipe 5.4 for a discussion of cipher modes). The mode can be set
using the CryptSetKeyParam( ) function:
BOOL SpcSetKeyMode(HCRYPTKEY hKey, DWORD dwMode) {
return CryptSetKeyParam(hKey, KP_MODE, (BYTE *)&dwMode, 0);
}
#define SpcSetMode_CBC(hKey) SpcSetKeyMode((hKey), CRYPT_MODE_CBC)
#define SpcSetMode_CFB(hKey) SpcSetKeyMode((hKey), CRYPT_MODE_CFB)
#define SpcSetMode_ECB(hKey) SpcSetKeyMode((hKey), CRYPT_MODE_ECB)
#define SpcSetMode_OFB(hKey) SpcSetKeyMode((hKey), CRYPT_MODE_OFB)
In addition, the initialization vector for block ciphers will be set to zero, which is almost certainly not
what you want. The function presented below, SpcSetIV( ), will allow you to set the IV for a key
explicitly or will generate a random one for you. The IV should always be the same size as the block
size for the cipher in use.
BOOL SpcSetIV(HCRYPTPROV hProvider, HCRYPTKEY hKey, BYTE *pbIV) {
BOOL bResult;
BYTE *pbTemp;
DWORD dwBlockLen, dwDataLen;
if (!pbIV) {
dwDataLen = sizeof(dwBlockLen);
if (!CryptGetKeyParam(hKey, KP_BLOCKLEN, (BYTE *)&dwBlockLen, &dwDataLen, 0))
return FALSE;
dwBlockLen /= 8;
if (!(pbTemp = (BYTE *)LocalAlloc(LMEM_FIXED, dwBlockLen))) return FALSE;
bResult = CryptGenRandom(hProvider, dwBlockLen, pbTemp);
if (bResult)
bResult = CryptSetKeyParam(hKey, KP_IV, pbTemp, 0);
LocalFree(pbTemp);
return bResult;
}
return CryptSetKeyParam(hKey, KP_IV, pbIV, 0);
}
Once you have a key object, it can be used for encrypting and decrypting data. Access to the low-level algorithm implementation is not permitted through CryptoAPI. Instead, a high-level OpenSSL
EVP-like interface is provided (see Recipe 5.17 and Recipe 5.22 for details on OpenSSL's EVP API),
though it's somewhat simpler. Both encryption and decryption can be done incrementally, but there
is only a single function for each.
The CryptEncrypt( ) function is used to encrypt data all at once or incrementally. As a convenience,
the function can also pass the plaintext to be encrypted to a hash object to compute the hash as data
is passed through for encryption. CryptEncrypt( ) can be somewhat tricky to use because it places
the resulting ciphertext into the same buffer as the plaintext. If you're using a stream cipher, this is
no problem because the ciphertext is usually the same size as the plaintext, but if you're using a
block cipher, the ciphertext can be up to a whole block longer than the plaintext. The following
convenience function handles the buffering issues transparently for you. It requires the spc_memcpy(
) function from Recipe 13.2.
BYTE *SpcEncrypt(HCRYPTKEY hKey, BOOL bFinal, BYTE *pbData, DWORD *cbData) {
BYTE   *pbResult;
DWORD dwBlockLen, dwDataLen;
ALG_ID Algid;
dwDataLen = sizeof(ALG_ID);
if (!CryptGetKeyParam(hKey, KP_ALGID, (BYTE *)&Algid, &dwDataLen, 0)) return 0;
if (GET_ALG_TYPE(Algid) != ALG_TYPE_STREAM) {
dwDataLen = sizeof(DWORD);
if (!CryptGetKeyParam(hKey, KP_BLOCKLEN, (BYTE *)&dwBlockLen, &dwDataLen, 0))
return 0;
dwDataLen = ((*cbData + (dwBlockLen * 2) - 1) / dwBlockLen) * dwBlockLen;
if (!(pbResult = (BYTE *)LocalAlloc(LMEM_FIXED, dwDataLen))) return 0;
CopyMemory(pbResult, pbData, *cbData);
if (!CryptEncrypt(hKey, 0, bFinal, 0, pbResult, &dwDataLen, *cbData)) {
LocalFree(pbResult);
return 0;
}
*cbData = dwDataLen;
return pbResult;
}
if (!(pbResult = (BYTE *)LocalAlloc(LMEM_FIXED, *cbData))) return 0;
CopyMemory(pbResult, pbData, *cbData);
if (!CryptEncrypt(hKey, 0, bFinal, 0, pbResult, cbData, *cbData)) {
LocalFree(pbResult);
return 0;
}
return pbResult;
}
The return from SpcEncrypt( ) will be a buffer allocated with LocalAlloc( ) that contains the
ciphertext version of the plaintext that's passed as an argument into the function as pbData. If the
function fails for some reason, the return from the function will be NULL, and a call to GetLastError( )
will return the error code. This function has the following arguments:
hKey
Key to use for performing the encryption.
bFinal
Boolean value that should be passed as FALSE for incremental encryption except for the last
piece of plaintext to be encrypted. To encrypt all at once, pass TRUE for bFinal in the single
call to SpcEncrypt( ). When CryptEncrypt( ) gets the final plaintext to encrypt, it performs
any cleanup that is needed to reset the key object back to a state where a new encryption or
decryption operation can be performed with it.
pbData
Plaintext.
cbData
Pointer to a DWORD type that should hold the length of the plaintext pbData buffer. If the
function returns successfully, it will be modified to hold the number of bytes returned in the
ciphertext buffer.
Decryption works similarly to encryption. The function CryptDecrypt( ) performs decryption either
all at once or incrementally, and it also supports the convenience function of passing plaintext data to
a hash object to compute the hash of the plaintext as it is decrypted. The primary difference between
encryption and decryption is that when decrypting, the plaintext will never be any longer than the
ciphertext, so the handling of data buffers is less complicated. The following function, SpcDecrypt( ),
mirrors the SpcEncrypt( ) function presented previously.
BYTE *SpcDecrypt(HCRYPTKEY hKey, BOOL bFinal, BYTE *pbData, DWORD *cbData) {
BYTE   *pbResult;
DWORD dwBlockLen, dwDataLen;
ALG_ID Algid;
dwDataLen = sizeof(ALG_ID);
if (!CryptGetKeyParam(hKey, KP_ALGID, (BYTE *)&Algid, &dwDataLen, 0)) return 0;
if (GET_ALG_TYPE(Algid) != ALG_TYPE_STREAM) {
dwDataLen = sizeof(DWORD);
if (!CryptGetKeyParam(hKey, KP_BLOCKLEN, (BYTE *)&dwBlockLen, &dwDataLen, 0))
return 0;
dwDataLen = ((*cbData + dwBlockLen - 1) / dwBlockLen) * dwBlockLen;
if (!(pbResult = (BYTE *)LocalAlloc(LMEM_FIXED, dwDataLen))) return 0;
} else {
if (!(pbResult = (BYTE *)LocalAlloc(LMEM_FIXED, *cbData))) return 0;
}
CopyMemory(pbResult, pbData, *cbData);
if (!CryptDecrypt(hKey, 0, bFinal, 0, pbResult, cbData)) {
LocalFree(pbResult);
return 0;
}
return pbResult;
}
Finally, when you're finished using a key object, be sure to destroy the object by calling
CryptDestroyKey( ) and passing the handle to the object to be destroyed. Likewise, when you're
done with a provider context, you must release it by calling CryptReleaseContext( ).
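To tie the pieces together, here is a rough, hypothetical end-to-end sketch (with minimal error
handling) that derives a Triple-DES key from a password, encrypts a buffer in CBC mode with a
random IV, and then decrypts it again with the same key object:
#include <windows.h>
#include <wincrypt.h>
void SpcCryptoApiDemo(LPTSTR password, BYTE *pbData, DWORD cbData) {
  BYTE       *pbCipherText, *pbPlainText;
  DWORD      cbCipherText, cbPlainText;
  HCRYPTKEY  hKey;
  HCRYPTPROV hProvider;

  if (!(hProvider = SpcGetCryptContext( ))) return;
  if (!(hKey = SpcGetDerivedKey(hProvider, CALG_3DES, password))) {
    CryptReleaseContext(hProvider, 0);
    return;
  }
  SpcSetMode_CBC(hKey);
  SpcSetIV(hProvider, hKey, 0);          /* pick a random IV for this key */

  cbCipherText = cbData;
  pbCipherText = SpcEncrypt(hKey, TRUE, pbData, &cbCipherText);

  /* ... store or transmit pbCipherText, along with the IV (KP_IV) ... */

  if (pbCipherText) {
    cbPlainText = cbCipherText;
    pbPlainText = SpcDecrypt(hKey, TRUE, pbCipherText, &cbPlainText);
    if (pbPlainText) LocalFree(pbPlainText);
    LocalFree(pbCipherText);
  }
  CryptDestroyKey(hKey);
  CryptReleaseContext(hProvider, 0);
}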
5.25.4 See Also
Recipe 5.4, Recipe 5.17, Recipe 5.22, Recipe 5.26, Recipe 5.27, Recipe 13.2
5.26 Creating a CryptoAPI Key Object from Raw Key Data
5.26.1 Problem
You have a symmetric key from another API, such as OpenSSL, that you would like to use with
CryptoAPI. Therefore, you must create a CryptoAPI key object with the key data.
5.26.2 Solution
The Microsoft CryptoAPI is designed to prevent unintentional disclosure of sensitive key information.
To do this, key information is stored in opaque data objects by the Cryptographic Service Provider
(CSP) used to create the key object. Key data is exportable from key objects, but the data must be
encrypted with another key to prevent accidental disclosure of the raw key data.
5.26.3 Discussion
In Recipe 5.25, we created a convenience function, SpcGetCryptContext( ), for obtaining a handle
to a CSP context object. This function uses the CRYPT_VERIFYCONTEXT flag with the underlying
CryptAcquireContext( ) function, which serves to prevent the use of private keys with the
obtained context object. To be able to import and export symmetric encryption keys, you need to
obtain a handle to a CSP context object without that flag, and use that CSP context object for
creating the keys you wish to use. We'll create a new function called SpcGetExportableContext( )
that will return a CSP context object suitable for creating, importing, and exporting symmetric
encryption keys.
#include <windows.h>
#include <wincrypt.h>
HCRYPTPROV SpcGetExportableContext(void) {
HCRYPTPROV hProvider;
if (!CryptAcquireContext(&hProvider, 0, MS_ENHANCED_PROV, PROV_RSA_FULL, 0)) {
if (GetLastError( ) != NTE_BAD_KEYSET) return 0;
if (!CryptAcquireContext(&hProvider, 0, MS_ENHANCED_PROV, PROV_RSA_FULL,
CRYPT_NEWKEYSET)) return 0;
}
return hProvider;
}
SpcGetExportableContext( ) will obtain a handle to the Microsoft Enhanced Cryptographic Service
Provider that allows for the use of private keys. Public key pairs are stored in containers by the
underlying CSP. This function will use the default container, creating it if it doesn't already exist.
Every public key container can have a special public key pair known as an exchange key, which is the
key that we'll use to encrypt the exported key data. The function CryptGetUserKey( ) is used to
obtain the exchange key. If it doesn't exist, SpcImportKeyData( ), listed later in this section, will
create a 1,024-bit exchange key, which will be stored as the exchange key in the public key container
so future attempts to get the key will succeed. The special algorithm identifier AT_KEYEXCHANGE is
used to reference the exchange key.
Symmetric keys are always imported via CryptImportKey( ) in "simple blob" format, specified by
the SIMPLEBLOB constant passed to CryptImportKey( ). A simple blob is composed of a BLOBHEADER
structure, followed by an ALG_ID for the algorithm used to encrypt the key data. The raw key data
follows the BLOBHEADER and ALG_ID header information. To import the raw key data into a CryptoAPI
key, a simple blob structure must be constructed and passed to CryptImportKey( ).
Finally, the raw key data must be encrypted using CryptEncrypt( ) and the exchange key. (The
CryptEncrypt( ) function is described in more detail in Recipe 5.25.) The return from
SpcImportKeyData( ) will be a handle to a CryptoAPI key object if the operation was performed
successfully; otherwise, it will be 0. The CryptoAPI makes a copy of the key data internally in the key
object it creates, so the key data passed into the function may be safely freed. The spc_memset( )
function from Recipe 13.2 is used here to destroy the unencrypted key data before returning.
HCRYPTKEY SpcImportKeyData(HCRYPTPROV hProvider, ALG_ID Algid, BYTE *pbKeyData,
DWORD cbKeyData) {
BOOL       bResult = FALSE;
BYTE       *pbData = 0;
DWORD      cbData, cbHeaderLen, cbKeyLen, dwDataLen;
ALG_ID     *pAlgid;
HCRYPTKEY hImpKey = 0, hKey;
BLOBHEADER *pBlob;
if (!CryptGetUserKey(hProvider, AT_KEYEXCHANGE, &hImpKey)) {
if (GetLastError( ) != NTE_NO_KEY) goto done;
if (!CryptGenKey(hProvider, AT_KEYEXCHANGE, (1024 << 16), &hImpKey))
goto done;
}
cbData = cbKeyData;
cbHeaderLen = sizeof(BLOBHEADER) + sizeof(ALG_ID);
if (!CryptEncrypt(hImpKey, 0, TRUE, 0, 0, &cbData, cbData)) goto done;
if (!(pbData = (BYTE *)LocalAlloc(LMEM_FIXED, cbData + cbHeaderLen)))
goto done;
CopyMemory(pbData + cbHeaderLen, pbKeyData, cbKeyData);
cbKeyLen = cbKeyData;
if (!CryptEncrypt(hImpKey, 0, TRUE, 0, pbData + cbHeaderLen, &cbKeyLen, cbData))
goto done;
pBlob = (BLOBHEADER *)pbData;
pAlgid = (ALG_ID *)(pbData + sizeof(BLOBHEADER));
pBlob->bType    = SIMPLEBLOB;
pBlob->bVersion = 2;
pBlob->reserved = 0;
pBlob->aiKeyAlg = Algid;
dwDataLen = sizeof(ALG_ID);
if (!CryptGetKeyParam(hImpKey, KP_ALGID, (BYTE *)pAlgid, &dwDataLen, 0))
goto done;
bResult = CryptImportKey(hProvider, pbData, cbData + cbHeaderLen, hImpKey, 0,
&hKey);
if (bResult) spc_memset(pbKeyData, 0, cbKeyData);
done:
if (pbData) LocalFree(pbData);
CryptDestroyKey(hImpKey);
return (bResult ? hKey : 0);
}
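For instance (a hypothetical example; the key bytes stand in for material obtained from another API
such as OpenSSL), raw 128-bit RC4 key data could be turned into a CryptoAPI key object like this.
Note that the provider handle must stay open for as long as the key object is in use:
/* Hypothetical example: turn 16 bytes of raw RC4 key material from another
 * API into a CryptoAPI key object.  On success, SpcImportKeyData( ) has
 * already zeroed pbRawKey for us.
 */
HCRYPTKEY example_import_rc4_key(BYTE *pbRawKey /* 16 bytes */) {
  HCRYPTPROV hProvider;

  if (!(hProvider = SpcGetExportableContext( ))) return 0;
  return SpcImportKeyData(hProvider, CALG_RC4, pbRawKey, 16);
}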
5.26.4 See Also
Recipe 5.25, Recipe 13.2
5.27 Extracting Raw Key Data from a CryptoAPI Key
Object
5.27.1 Problem
You have a symmetric key stored in a CryptoAPI key object that you want to use with another API,
such as OpenSSL.
5.27.2 Solution
The Microsoft CryptoAPI is designed to prevent unintentional disclosure of sensitive key information.
To do this, key information is stored in opaque data objects by the Cryptographic Service Provider
(CSP) used to create the key object. Key data is exportable from key objects, but the data must be
encrypted with another key to prevent accidental disclosure of the raw key data.
To extract the raw key data from a CryptoAPI key, you must first export the key using the CryptoAPI
function CryptExportKey( ). The key data obtained from this function will be encrypted with
another key, which you can then use to decrypt the encrypted key data to obtain the raw key data
that another API, such as OpenSSL, can use.
5.27.3 Discussion
To export a key using the CryptExportKey( ) function, you must provide the function with another
key that will be used to encrypt the key data that's to be exported. Recipe 5.26 includes a function,
SpcGetExportableContext( ), that obtains a handle to a CSP context object suitable for exporting
keys created with it. The CSP context object uses a "container" to store public key pairs. Every public
key container can have a special public key pair known as an exchange key, which is the key that
we'll use to decrypt the exported key data.
The function CryptGetUserKey( ) is used to obtain the exchange key. If it doesn't exist,
SpcExportKeyData( ), listed later in this section, will create a 1,024-bit exchange key, which will be
stored as the exchange key in the public key container so future attempts to get the key will succeed.
The special algorithm identifier AT_KEYEXCHANGE is used to reference the exchange key.
Symmetric keys are always exported via CryptExportKey( ) in "simple blob" format, specified by
the SIMPLEBLOB constant passed to CryptExportKey( ). The data returned in the buffer from
CryptExportKey( ) will have a BLOBHEADER structure, followed by an ALG_ID for the algorithm used
to encrypt the key data. The raw key data will follow the BLOBHEADER and ALG_ID header information.
For extracting the raw key data from a CryptoAPI key, the data in the BLOBHEADER structure and the
ALG_ID are of no interest, but you must be aware of their existence so that you can skip over them
to find the encrypted key data.
Finally, the encrypted key data can be decrypted using CryptDecrypt( ) and the exchange key. The
CryptDecrypt( ) function is described in more detail in Recipe 5.25. The decrypted data is the raw
key data that can now be passed off to other APIs or used in protocols that already provide their own
protection for the key. The return from SpcExportKeyData( ) will be a buffer allocated with
LocalAlloc( ) that contains the unencrypted symmetric key if no errors occur; otherwise, NULL will
be returned.
#include <windows.h>
#include <wincrypt.h>
BYTE *SpcExportKeyData(HCRYPTPROV hProvider, HCRYPTKEY hKey, DWORD *cbData) {
BOOL      bResult = FALSE;
BYTE      *pbData = 0, *pbKeyData;
HCRYPTKEY hExpKey = 0;
if (!CryptGetUserKey(hProvider, AT_KEYEXCHANGE, &hExpKey)) {
if (GetLastError( ) != NTE_NO_KEY) goto done;
if (!CryptGenKey(hProvider, AT_KEYEXCHANGE, (1024 << 16), &hExpKey))
goto done;
}
if (!CryptExportKey(hKey, hExpKey, SIMPLEBLOB, 0, 0, cbData)) goto done;
if (!(pbData = (BYTE *)LocalAlloc(LMEM_FIXED, *cbData))) goto done;
if (!CryptExportKey(hKey, hExpKey, SIMPLEBLOB, 0, pbData, cbData))
goto done;
pbKeyData = pbData + sizeof(BLOBHEADER) + sizeof(ALG_ID);
(*cbData) -= (sizeof(BLOBHEADER) + sizeof(ALG_ID));
bResult = CryptDecrypt(hExpKey, 0, TRUE, 0, pbKeyData, cbData);
done:
if (hExpKey) CryptDestroyKey(hExpKey);
if (!bResult && pbData) LocalFree(pbData);
else if (pbData) MoveMemory(pbData, pbKeyData, *cbData);
return (bResult ? (BYTE *)LocalReAlloc(pbData, *cbData, 0) : 0);
}
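A brief hypothetical usage sketch: given the exportable provider context and a key created with it,
the raw key bytes can be recovered, handed to the other API, and then wiped:
/* Sketch: hProvider and hKey come from SpcGetExportableContext( ) and one of
 * the key-creation calls from Recipe 5.25 or Recipe 5.26.
 */
void example_use_raw_key(HCRYPTPROV hProvider, HCRYPTKEY hKey) {
  BYTE  *pbKeyData;
  DWORD cbKeyData;

  if (!(pbKeyData = SpcExportKeyData(hProvider, hKey, &cbKeyData))) return;
  /* ... hand the cbKeyData raw key bytes to the other API here ... */
  spc_memset(pbKeyData, 0, cbKeyData);   /* don't leave key material in memory */
  LocalFree(pbKeyData);
}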
5.27.4 See Also
Recipe 5.25, Recipe 5.26
Chapter 6. Hashes and Message
Authentication
In Chapter 5, we discussed primitives for symmetric encryption. Some of those primitives were
capable of providing two of the most important security goals: secrecy and message integrity. There
are occasions where secrecy may not be important in the slightest, but you'd still like to ensure that
messages are not modified as they go over the Internet. In such cases, you can use a symmetric
primitive such as CWC mode, which allows you to authenticate data without encrypting any of it.
Alternatively, you can consider using a standalone message authentication code (MAC).
This chapter focuses on MACs, and it also covers two types of one-way hash functions: cryptographic
hash functions and "universal" hash functions. Cryptographic hash functions are used in public key
cryptography and are a popular component to use in a MAC (you can also use block ciphers), but
universal hash functions turn out to be a much better foundation for a secure MAC.
Many of the recipes in this chapter are too low-level for general-purpose use.
We recommend that you first try to find what you need in Chapter 9; the
recipes there are more generally applicable. If you do use these recipes, please
be careful, read all our warnings, and consider using the higher-level constructs
we suggest.
6.1 Understanding the Basics of Hashes and MACs
6.1.1 Problem
You would like to understand the basic concepts behind hash functions as used in cryptography and
message authentication codes (MACs).
6.1.2 Solution
See Section 6.1.3. Be sure to note the possible attacks on these constructs, and how to thwart them.
6.1.3 Discussion
One common thread running through the three types of primitives described in this chapter is that
they take an arbitrary amount of data as an input, and produce a fixed-size output. The output is
always identical given the exact same inputs (where inputs may include keys, nonces, and text). In
addition, in each case, given random inputs, every output is (just about) equally likely.
6.1.3.1 Types of primitives
These are the three types of primitives:
Message authentication codes
MACs are hash functions that take a message and a secret key (and possibly a nonce) as input,
and produce an output that cannot, in practice, be forged without possessing the secret key.
This output is often called a tag. There are many ways to build a secure MAC, and there are
several good MACs available, including OMAC, CMAC, and HMAC.
Cryptographic hash functions
These functions are the simplest of the primitives we'll discuss (even though they are difficult
to use securely). They simply take an input string and produce a fixed-size output string (often
called a hash value or message digest). Given the output string, there should be no way to
determine the input string other than guessing (a dictionary attack). Traditional algorithms
include SHA1 and MD5, but you can use algorithms based on block ciphers (and, indeed, you
can get more assurance from a block cipher-based construction). Cryptographic hash functions
generally are not secure on their own. They are securely used in public key cryptography, and
are used as a component in a type of MAC called HMAC.
Universal hash functions
These are keyed hash functions with specific mathematical properties that can also be used as
MACs, despite the fact that they're not cryptographically secure. It turns out that if you take
the output of a keyed universal hash function, and combine it with seemingly random bits in
particular ways (such as encrypting the result with a block cipher), the result has incredibly
good security properties. Or, if you are willing to use one-time keys that are securely
generated, you don't have to use encryption at all! Dan Bernstein's hash127 is an example of a
fast, freely available universal hash function. Most people don't use universal hash functions
directly. They're usually used under the hood in a MAC. For example, CMAC uses a hash127-like function as its foundation.
Generally, you should prefer an encryption mode like CWC that provides both encryption and
message integrity to one of these constructions. Using a MAC, you can get message integrity without
encryption, which is sometimes useful.
MACs aren't useful for software distribution, because the key itself must remain secret and can't be
public knowledge. Another limitation is that if there are two parties in the system, Alice and Bob, Alice
cannot prove that Bob sent a message by showing the MAC value sent by Bob (i.e., non-repudiation).
The problem is that Alice and Bob share a key; Alice could have forged the message and produced
the correct MAC value. Digital signature schemes (discussed in Chapter 7) can circumvent these
problems, but they have limitations of their own; the primary one is efficiency.
6.1.3.2 Attacks against one-way constructs
There are numerous classes of problems that you need to worry about when you're using a
cryptographic hash function or a MAC. Generally, if an attacker can find collisions for a hash function
(two inputs that give the same output), that can be turned into a real attack.
The most basic collision attack is this: given a known hash function {input, output} pair, somehow
produce another input that gives the same output. To see how this can be a real attack, consider a
public key-based digital signature scheme where the message to "sign" gets cryptographically
hashed, and the hash gets encrypted with the private key of the signer. In such a scenario, anyone
who has the associated public key can validate the signature, and no one can forge it. (We'll discuss
such schemes further in Chapter 7.)
Suppose that an attacker sees the message being signed. From that, he can determine the hash
value computed. If he can find another message that gives the same hash value, he can claim that a
different message is being signed from the one that actually was. For example, an attacker could get
someone to sign a benign document, then substitute a contract that is beneficial to the attacker.
Of course, we assume that if an attacker has a way to force collisions in a reasonably efficient
manner, he can force the second plaintext to be a message of his choice, more or less. (This isn't
always the case, but it is generally a good assumption, particularly because it applies for the most
basic brute-force attacks.)
To illustrate, let's say that an attacker uses a hash function that is cryptographically strong but
outputs only a 16-bit hash. Given a message and a digest, an attacker should be able to generate a
collision after generating, on average, 32,768 messages. An attacker could identify 16 places where a
one-bit change could be made without significantly changing the content (e.g., 16 places where you
could put an extra space after a period, or refrain from doing so).
If the attacker can control both messages, collisions are far easier to find. For example, if an attacker
can give the target a message of his choosing and get the target to sign it, there is an attack that will
find a collision after 256 attempts, on average.
The basic idea is to take two model documents, one that the target will sign, and one that the
attacker would like the target to sign. Then, vary a few places in each of those, and generate hashes
of each document.
The difference between these two attacks is that it's statistically a lot easier to find a collision when
you don't have to find a collision for a particular message.
This is canonically illustrated with something called the birthday paradox. The common analogy
involves finding people with the same birthday. If you're in a room of 253 people, the odds are just
about even that one of them will share your birthday. Surprisingly to some, if there are a mere 23
people in a room, the odds of finding two people with the same birth date are also a bit over 50
percent.
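To see where those numbers come from (an illustrative computation, not from the original text): the
chance that none of n people shares one fixed target birthday is (364/365)^n, while the chance that
n people all have distinct birthdays is (365/365) x (364/365) x ... x ((365-n+1)/365). The small
program below evaluates both, giving roughly 50.0 percent for 253 people against a fixed birthday
and roughly 50.7 percent for any shared birthday among 23 people:
#include <stdio.h>
int main(void) {
  double p_fixed = 1.0, p_any = 1.0;
  int    i;

  /* Probability that at least one of 253 people shares one fixed birthday. */
  for (i = 0; i < 253; i++) p_fixed *= 364.0 / 365.0;
  printf("fixed birthday, 253 people    : %.3f\n", 1.0 - p_fixed);

  /* Probability that some two of 23 people share a birthday. */
  for (i = 0; i < 23; i++) p_any *= (365.0 - i) / 365.0;
  printf("any shared birthday, 23 people: %.3f\n", 1.0 - p_any);

  return 0;
}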
In both cases, we've got a better than 50% chance after checking 253 pairs of people. The difference
is that in the first scenario, a fixed person must always be a part of the pairings, which seriously
reduces the number of possible combinations of people we can consider. For this reason, the situation
where an attacker can find a collision between any two messages is called a birthday attack.
When a birthday attack applies, the maximum bit strength of a hash function is half the length of the
hash function's output (the digest size). Also, birthday attacks are often possible when people think
they're not. That is, the attacker doesn't need to be able to control both messages for a birthday
attack to apply.
For example, let's say that the target hashes a series of messages. An attacker can precompute a
series of hashes and wait for one of the messages to give the same hash. That's the same problem,
even though the attacker doesn't control the messages the target processes.
Generally, the only reliable way to thwart birthday attacks is to use a per-message nonce, which is
typically done only with MAC constructs. Indeed, many MAC constructs have built-in facilities for this.
We discuss how to use a nonce with a hash function in Recipe 6.8, and we discuss how to use one
with MACs that aren't built to use one in Recipe 6.12.
Another problem that occurs with every practical cryptographic hash function is that it is
susceptible to length extension attacks. That is, if you have a message and a hash value associated
with that message, you can easily construct a new message and hash value by extending the original
message.
The MACs we recommend in this chapter avoid length-extension problems and other attack vectors
against hash functions.[1] We discuss how to thwart length extension problems when using a hash
function outside the context of a MAC in Recipe 6.7.
[1]
While most of the MACs we recommend are based on block ciphers, if a MAC isn't carefully designed, it will
still be susceptible to the attacks we describe in this section, even if it's built on a block cipher.
6.1.4 See Also
Recipe 6.7, Recipe 6.8, Recipe 6.12
6.2 Deciding Whether to Support Multiple Message
Digests or MACs
6.2.1 Problem
You need to figure out whether to support multiple algorithms in your system.
6.2.2 Solution
The simple answer is that there is no right answer, as we discuss next.
6.2.3 Discussion
Clearly, if you need to support multiple algorithms for standards compliance or legacy support, you
should do so. Beyond that, there are two schools of thought. The first school recommends that you
support multiple algorithms in order to allow users to pick their favorite. The other benefit of this
approach is that if an algorithm turns out to be seriously broken, supporting multiple algorithms can
make it easier for users to switch. The second school of thought points out that the reality is if an
algorithm is broken, many users will never switch, so that's not a good reason for providing options.
Moreover, by supporting multiple algorithms, you risk adding additional complexity to your
application, and that can be detrimental. In addition, if there are multiple interoperating
implementations of a protocol you're creating, often other developers will implement only their own
preferred algorithms, potentially leading to major interoperability problems.
We personally prefer picking a single algorithm that will do a good enough job of meeting the needs
of all users. That way, the application is simpler to comprehend, and there are no interoperability
issues. If you choose well-regarded algorithms, the hope is that there won't be a break that actually
impacts end users. However, if there is such a break, you should make the algorithm easy to replace.
Because cryptographic hash functions and MACs tend to have standard interfaces, that is usually
easy to do.
Besides dedicated hash algorithms such as SHA1 (Secure Hash Algorithm 1) and MD5 (Message
Digest 5 from Ron Rivest), there are several constructs for turning a block cipher into a cryptographic
hash function. One advantage of such a construct is that block ciphers are a better-studied construct
than hash functions. In addition, needing fewer cryptographic algorithms for an application can be
important when pushing cryptography into hardware.
One disadvantage of turning a block cipher into a hash function is speed. As we'll show in Recipe 6.3,
dedicated cryptographic hash constructs tend to be faster than those based on block ciphers.
In addition, all hash-from-cipher constructs assume that any cipher used will resist related-key
attacks, a type of attack that has not seen much mainstream study. Because cryptographic hash
functions aren't that well studied either, it's hard to say which of these types of hash constructs is
better.
It is clear that if you're looking for message authentication, a good universal MAC solution is better
than anything based on a cryptographic hash function, because such constructs tend to have
incredibly good, provable security properties, and they tend to be faster than traditional MACs.
Unfortunately, they're not often useful outside the context of message authentication.
6.2.4 See Also
Recipe 6.3
6.3 Choosing a Cryptographic Hash Algorithm
6.3.1 Problem
You need to use a hash algorithm for some purpose (often as a parameter to a MAC), and you want
to understand the important concerns so you can determine which algorithm best suits your needs.
6.3.2 Solution
Security requirements should be your utmost concern. SHA1 is generally a good compromise for
those in need of efficiency. We recommend that you do not use the popular favorite MD5, particularly
in new applications.
Note that outside the context of a well-designed MAC, it is difficult to use a cryptographic hash
function securely, as we discuss in Recipe 6.5 through Recipe 6.8.
6.3.3 Discussion
A secure message digest function (or one-way hash function) should have the following properties:
One-wayness
If given an arbitrary hash value, it should be computationally infeasible to find a plaintext value
that generated that hash value.
Noncorrelation
It should also be computationally infeasible to find out anything about the original plaintext
value; the input bits and output bits should not be correlated.
Weak collision resistance
If given a plaintext value and the corresponding hash value, it should be computationally
infeasible to find a second plaintext value that gives the same hash value.
Strong collision resistance
It should be computationally infeasible to find two arbitrary inputs that give the same hash
value.
Partial collision resistance
It should be computationally infeasible to find two arbitrary inputs that give two hashes that
differ only by a few bits. The difficulty of finding partial collisions of size n should, in the worst
case, be about as difficult as brute-forcing a symmetric key of length n/2.
Unfortunately, there are cryptographic hash functions that have been found to be broken with regard
to one or more of the above properties. MD4 is one example that is still in use today, despite its
insecurity. MD5 is worrisome as well. No full break of MD5 has been published, but there is a well-known
problem with a very significant component of MD5, resulting in very low trust in the security of
MD5. Most cryptographers recommend against using it in any new applications. In addition, because
the weakness in MD5 has been known for a long time (since 1995), it's a strong possibility that a
government or some other entity has a full break that is not being shared.
For the time being, it's not unreasonable to use MD5 in legacy applications and in some applications
where the ability to break MD5 buys little to nothing (don't try to be the judge of this yourself!), but
do realize that you might need to replace MD5 entirely in the short term.
The strength of a good hash function differs depending on the circumstances of its use. When given a
known hash value, finding an input that produces that hash value should have no attack much better
than brute force. In that case, the effective strength of the hash algorithm will usually be related to
the length of the algorithm's output. That is, the strength of a strong hash algorithm against such an
attack should be roughly equivalent to the strength of an excellent block cipher with keys of that
length.
However, hash algorithms are much better at protecting against attacks against the one-wayness of
the function than they are at protecting against attacks on the strong collision resistance. Basically, if
the application in question requires the strong collision resistance property, the algorithm will
generally have its effective strength halved in terms of number of bits. That is, SHA1, which has a
160-bit output, would have the equivalent of 80 bits of security, when this property is required.
It can be quite difficult to determine whether an application that uses hash functions really does need
the strong collision resistance property. Basically, it is best to assume that you always need it, then
figure out if your design somehow provides it. Generally, that will be the case if you use a hash
function in a component of a MAC that requires a nonce, and not true otherwise (however, see Recipe
6.8).
As a result, you should consider MD5 to have, at best, 64 bits of strength. In fact, considering the
weaknesses inherent in MD5, you should assume that, in practice, MD5's strength is less than that.
64 bits of security is on the borderline of what is breakable. (It may or may not be possible for
entities with enough resources to brute-force 64 bits in a reasonable time frame.)
Table 6-1 lists popular cryptographic hash functions and compares important properties of those
functions. Note that the two MDC-2 constructs we detail are covered by patent restrictions until
August 28, 2004, but everything else in this list is widely believed to be patent-free.
When comparing speeds, times were measured in x86 cycles per byte processed (lower numbers are
better), though results will vary slightly from place to place. Implementations used for speed testing
were either the default OpenSSL implementation (when available); the implementation in this book
using OpenSSL versions of the underlying cryptographic primitives; or, when neither of those two
were available, a reference implementation from the Web (in particular, for the last three SHA
algorithms). In many cases, implementations of each algorithm exist that are more efficient, but we
believe that our testing strategy should give you a reasonable idea of relative speeds between
algorithms.
Table 6-1. Cryptographic hash functions and their properties
Algorithm              Digest size                            Security confidence         Small message speed (64 bytes)[2]   Large message speed (8K)   Uses block cipher
Davies-Meyer-AES-128   128 bits (same length as block size)   Good                        46.7 cpb                            57.8 cpb                   Yes
MD2                    128 bits                               Good to low                 392 cpb                             184 cpb                    No
MD4                    128 bits                               Insecure                    32 cpb                              5.8 cpb                    No
MD5                    128 bits                               Very low, may be insecure   40.9 cpb                            7.7 cpb                    No
MDC-2-AES-128          256 bits                               Very high                   93 cpb                              116 cpb                    Yes
MDC-2-DES              128 bits                               Good                        444 cpb                             444 cpb                    Yes
RIPEMD-160             160 bits                               High                        62.2 cpb                            20.6 cpb                   No
SHA1                   160 bits                               High                        53 cpb                              15.9 cpb                   No
SHA-256                256 bits                               Very high                   119 cpb                             116 cpb                    No
SHA-384                384 bits                               Very high                   171 cpb                             166 cpb                    No
SHA-512                512 bits                               Very high                   171 cpb                             166 cpb                    No
[2]
All timing values are best cases based on our empirical testing, and assume that the data being processed is
already in cache. Do not expect that you'll quite be able to match these speeds in practice.
Let's look briefly at the pros and cons of using these functions.
Davies-Meyer
This function is one way of turning block ciphers into one-way hash functions (Matyas-Meyer-Oseas is a similar technique that is also commonly seen). This technique does not thwart
birthday attacks without additional measures, and it's therefore an inappropriate construct to
use with most block ciphers because most ciphers have 64-bit blocks. AES is a good choice for
this construct, though 64 bits of resistance to birthday attacks is somewhat liberal. While we
believe this to be adequate for the time being, it's good to be forward-thinking and require
something with at least 80 bits of resistance against a birthday attack. If you use Davies-Meyer
with a nonce, it offers sufficient security. We show how to implement Davies-Meyer in Recipe
6.15.
MD2
MD2 (Message Digest 2 from Ron Rivest[3]) isn't used in many situations. It is optimized for
16-bit platforms and runs slowly everywhere else. It also hasn't seen much scrutiny, has an
internal structure now known to be weak, and has a small digest size. For these reasons, we
strongly suggest that you use other alternatives if at all possible.
[3] MD1 was never public, nor was MD3.
MD4, MD5
As we mentioned, MD4 (Message Digest 4 from Ron Rivest) is still used in some applications,
but it is quite broken and should not be used, while MD5 should be avoided as well, because its
internal structure is known to be quite weak. This doesn't necessarily amount to a practical
attack, but cryptographers do not recommend the algorithm for new applications because there
probably is a practical attack waiting to be found.
MDC-2
MDC-2 is a way of improving Matyas-Meyer-Oseas to give an output that offers twice as many
bits of security (i.e., the digest is two blocks wide). This clearly imposes a speed hit over
Matyas-Meyer-Oseas, but it avoids the need for a nonce. Generally, when people say "MDC-2,"
they're talking about a DES-based implementation. We show how to implement MDC-2-AES in
Recipe 6.16.
RIPEMD-160, SHA1
RIPEMD-160 and SHA1 are both well-regarded hash functions with reasonable performance
characteristics. SHA1 is a bit more widely used, partially because it is faster, and partially
because the National Institute of Standards and Technology (NIST) has standardized it. While
there is no known attack better than a birthday attack against either of these algorithms,
RIPEMD-160 is generally regarded as having a somewhat more conservative design, but SHA1
has seen more study.
SHA-256, SHA-384, SHA-512
After the announcement of AES, NIST moved to standardize hash algorithms that, when
considering the birthday attack, offer comparable levels of security to AES-128, AES-192, and
AES-256. The result was SHA-256, SHA-384, and SHA-512. SHA-384 is merely SHA-512 with a
truncated digest value, and it therefore isn't very interesting in and of itself.
These algorithms are designed in a very conservative manner, and therefore their speed is
closer to that expected from a block cipher than that expected from a traditional cryptographic
message digest function. Clearly, if birthday-style attacks are not an issue (usually due to
proper use of a nonce), then AES-256 and SHA-256 offer equivalent security margins, making
SHA-384 and SHA-512 overkill. In such a scenario, SHA1 is an excellent algorithm to pair with
AES-128. In practice, a nonce is a good idea, and we therefore recommend AES-128 and SHA1
when you want to use a block cipher and a separate message digest algorithm. Note also that
performance numbers for SHA-384 and SHA-512 would improve on a platform with native 64-bit operations.
The cryptographic hash function constructs based on block ciphers not only tend to run more slowly
than dedicated functions, but they also rely on assumptions that are a bit unusual. In particular,
these constructions demand that the underlying cipher resist related-key attacks, which are relatively
unstudied compared with traditional attacks. On the other hand, dedicated hash functions have
received a whole lot less scrutiny from the cryptanalysts in the world, so assuming that SHA1 acts
like a pseudo-random function (or close to it) is about as dicey.
In practice, if you really need to use a one-way hash function, we believe that SHA1 is suitable for
almost all needs, particularly if you are savvy about thwarting birthday attacks and collision attacks
on the block cipher (see Recipe 5.3). If you're using AES with 128-bit keys, SHA1 makes a
reasonable pairing. However, if you ever feel the need to use stronger key sizes (which is quite
unnecessary for the foreseeable future), you should also switch to SHA-256.
6.3.4 See Also
Recipe 5.3, Recipe 6.5-Recipe 6.8, Recipe 6.15, Recipe 6.16
6.4 Choosing a Message Authentication Code
6.4.1 Problem
You need to use a MAC (which yields a tag that can only be computed correctly on a piece of data by
an entity with a particular secret key), and you want to understand the important concerns so you
can determine which algorithm best suits your needs.
6.4.2 Solution
In most cases, instead of using a standalone MAC, we recommend that you use a dual-use mode that
provides both authentication and encryption all at once (such as CWC mode, discussed in Recipe
5.10). Dual-use modes can also be used for authentication when encryption is not required.
If a dual-use mode does not suit your needs, the best solution depends on your particular
requirements. In general, HMAC is a popular and well-supported alternative based on hash functions
(it's good for compatibility), and OMAC is a good solution based on a block cipher (which we see as a
strong advantage). If you care about maximizing efficiency, a hash127-based MAC is a reasonable
solution (though it has some limitations, so CMAC may be better in such cases; see Recipe 6.13 and
Recipe 6.14).
We recommend against using RMAC and UMAC, for reasons discussed in the following section.
6.4.3 Discussion
Do not use the same key for encryption that you use in a MAC. See Recipe 4.11
for how to overcome this restriction.
As with hash functions, there are a large number of available algorithms for performing message
authentication, each with its own advantages and drawbacks. Besides algorithms designed explicitly
for message authentication, some encryption modes such as CWC provide message authentication as
a side effect. (See Recipe 5.4 for an overview of several such modes, and Recipe 5.10 for a discussion
of CWC.) Such dual-use modes are designed for general-purpose needs, and they are high-level
enough that it is far more difficult to use these modes in an insecure manner than regular
cryptography.
Table 6-2 lists interesting message authentication functions, all with provable security properties
assuming the security of the underlying primitive upon which they were based. This table also
compares important properties of those functions. When comparing speeds, we used an x86-based
machine and unoptimized implementations for testing. Results will vary depending on platform and
other operating conditions. Speeds are measured in cycles per byte; lower numbers are better.
Table 6-2. MACs and their properties

MAC        | Built upon                | Small message speed (64 bytes)[4] | Large message speed (8K) | Appropriate for hardware | Patent restrictions | Parallelizable
CMAC       | A universal hash and AES  | ~18 cpb  | ~18 cpb | Yes | No  | Yes
HMAC-SHA1  | Message digest function   | 90 cpb   | 20 cpb  | Yes | No  | No
MAC127     | hash127 + AES             | ~6 cpb   | ~6 cpb  | Yes | No  | Yes
OMAC1      | AES                       | 29.5 cpb | 37 cpb  | Yes | No  | No
OMAC2      | AES                       | 29.5 cpb | 37 cpb  | Yes | No  | No
PMAC-AES   | Block cipher              | 72 cpb   | 70 cpb  | Yes | Yes | Yes
RMAC       | Block cipher              | 89 cpb   | 80 cpb  | Yes | No  | No
UMAC32     | UHASH and AES             | 19 cpb   | cpb     | No  | No  | Yes
XMACC-SHA1 | Any cipher or MD function | 162 cpb  | 29 cpb  | Yes | Yes | Yes

[4] All timing values are best cases based on our empirical testing, and assume that the data being processed is
already in cache. Do not expect that you'll quite be able to match these speeds in practice.
Note that our considerations for comparing MACs are different from our considerations for comparing
cryptographic hash functions. First, all of the MACs we discuss provide a reasonable amount of
assurance, assuming that the underlying construct is secure (though MACs without nonces do not
resist the birthday attack without additional work; see Recipe 6.12). Second, all of the cryptographic
hash functions we discussed are suitable for hardware, patent-free, and not parallelizable.
Let's look briefly at the pros and cons of using these functions.
CMAC
CMAC is the MAC portion of the CWC encryption mode, which can be used in a standalone
fashion. It's built upon a universal hash function that can be made to run very fast, especially
in hardware. CMAC is discussed in Recipe 6.13.
HMAC
HMAC, discussed in Recipe 6.10, is a widely used MAC, largely because it was one of the first
MAC constructs with provable security, even though the other MACs on this list also have
provable security (and the proofs for those other MACs tend to be based on somewhat more
favorable assumptions). HMAC is fairly fast, largely because it performs only two cryptographic
operations, both hashes. One of the hashes is constant time; the other takes time
proportional to the length of the input, but it doesn't have the large overhead block ciphers
typically do as a result of hash functions having a very large block size internally (usually 64
bytes).
HMAC is designed to take a one-way hash function with an arbitrary input and a key to produce
a fixed-sized digest. Therefore, it cannot use block ciphers, unless you use a construction to
turn a block cipher into a strong hash function, which will significantly slow down HMAC. If you
want to use a block cipher to MAC (which we recommend), we strongly recommend that you
use another alternative. Note that HMAC does not use a nonce by default, making HMAC
vulnerable to capture replay attacks (and theoretically vulnerable to a birthday attack).
Additional effort can thwart such attacks, as shown in Recipe 6.12.
MAC127
MAC127 is a MAC we define in Recipe 6.14 that is based on Dan Bernstein's hash127. This MAC
is very similar to CMAC, but it runs faster in software. It's the fastest MAC in software that we
would actually recommend using.
OMAC1, OMAC2
OMAC1 and OMAC2, which we discuss in Recipe 6.11, are MACs built upon AES. They are
almost identical to each other, working by running the block cipher in CBC mode and
performing a bit of additional magic at the end. These are "fixed" versions of a well-known MAC
called CBC-MAC. CBC-MAC, without the kinds of modifications OMAC1 and OMAC2 make, was
insecure unless all messages MAC'd with it were exactly the same size. The OMAC algorithms
are a nice, general-purpose pair of MACs for when you want to keep your system simple, with
only one cryptographic primitive. What's more, if you use an OMAC with AES in CTR mode, you
need only have an implementation of the AES encryption operation (which is quite different
code from the decryption operation). There is little practical difference between OMAC1 and
OMAC2, although they both give different outputs. OMAC1 is slightly preferable, as it has a
very slight speed advantage. Neither OMAC1 nor OMAC2 takes a nonce. As of this writing, NIST
is expected to standardize OMAC1.
PMAC
PMAC is also parallelizable, but it is protected by patent. We won't discuss this MAC further
because there are reasonable free alternatives.
RMAC
RMAC is another MAC built upon a block cipher. It works by running the block cipher in CBC
mode and performing a bit of additional magic at the end. This is a mode created by NIST, but
cryptographers have found theoretical problems with it under certain conditions;[5] thus, we do
not recommend it for any use.
[5]
In particular, RMAC makes more assumptions about the underlying block cipher than other MACs need
to make. The extra assumptions are a bit unreasonable, because they require the block cipher to resist
related-key attacks, which are not well studied.
UMAC32
On many platforms, UMAC is the reigning speed champion for MACs implemented in software.
The version of UMAC timed for Table 6-2 uses 64-bit tags, which are sufficient for most
applications, if a bit liberal. That size is sufficient because tags generally need to have security
for only a fraction of a second, assuming some resistance to capture replay attacks. 64 bits of
strength should easily last years. The 128-bit version generally does a bit better than half the
speed of the 64-bit version. Nevertheless, although there are a few things out there using
UMAC, we don't recommend it. The algorithm is complex enough that, as of this writing, the
reference implementation of UMAC apparently has never been validated. In addition,
interoperability with UMAC is exceptionally difficult because there are many different
parameters that can be tweaked.
XMACC
XMACC can be built from a large variety of cryptographic primitives. It provides good
performance characteristics, and it is fully parallelizable. Unfortunately, it is patented, and for
this reason we won't discuss it further in this book.
All in all, we personally prefer MAC127 or CMAC. When you want to avoid using a nonce, OMAC1 is an
excellent choice.
6.4.4 See Also
Recipe 4.11, Recipe 5.4, Recipe 5.10, Recipe 6.9 through Recipe 6.14
6.5 Incrementally Hashing Data
6.5.1 Problem
You want to use a hash function to process data incrementally, returning a result when the last of the
data is finally available.
6.5.2 Solution
Most hash functions use a standard interface for operation, following these steps:
1. The user creates a "context" object to hold intermediate state.
2. The context object gets initialized.
3. The context is "updated" by passing in the data to be hashed.
4. Once all of the data has been passed in, "finalization" returns the output of the cryptographic hash function.
6.5.3 Discussion
Hash functions are not secure by themselves: not for a password system, not
for message authentication, not for anything! If you do need a hash function by
itself, be sure to at least protect against length extension attacks, as described
in Recipe 6.7 and Recipe 6.8.
Libraries with cryptographic hash functions tend to support incremental operation using a standard
structure. In fact, this structure is standardized for cryptographic hardware APIs in PKCS (Public Key
Cryptography Standard) #11. There are four steps:
1. Allocate a context object. The context object holds the internal state of the hash until data
processing is complete. The type can be specific to the hash function, or it can be a single type
that works for all hash functions in a library (such as the EVP_MD_CTX type in the OpenSSL
library or HCRYPTHASH in Microsoft's CryptoAPI).
2. Initialize the context object, resetting internal parameters of the hash function. Generally, this
function takes no arguments other than a pointer to the context object, unless you're using a
generic API, in which case you will need to specify which hash algorithm to use.
3. "Update" the context object by passing in data to be hashed and the associated length of that
input. The results of the hash will be dependent on the order of the data you pass, but you can
pass in all the partial data you wish. That is, calling the update routine with the string "he" then
"llo" would produce the same results as calling it once with the string "hello". The update
function generally takes the context object, the data to process, and the associated length of
that data as arguments.
4. "Finalize" the context object and produce the message digest. Most APIs take as arguments the
context object and a buffer into which the message digest is placed.
The OpenSSL API has both a single generic interface to all its hash functions and a separate API for
each hash function. Here's an example using the SHA1 API:
#include <stdio.h>
#include <string.h>
#include <openssl/sha.h>
int main(int argc, char *argv[ ]) {
  int           i;
  SHA_CTX       ctx;
  unsigned char result[SHA_DIGEST_LENGTH]; /* SHA1 has a 20-byte digest. */
  unsigned char *s1 = "Testing";
  unsigned char *s2 = "...1...2...3...";

  SHA1_Init(&ctx);
  SHA1_Update(&ctx, s1, strlen(s1));
  SHA1_Update(&ctx, s2, strlen(s2));
  /* Yes, the context object is last. */
  SHA1_Final(result, &ctx);
  printf("SHA1(\"%s%s\") = ", s1, s2);
  for (i = 0; i < SHA_DIGEST_LENGTH; i++) printf("%02x", result[i]);
  printf("\n");
  return 0;
}
Every hash function that OpenSSL supports has a similar API. In addition, every such function has an
"all-in-one" API that allows you to combine the work of calls for initialization, updating, and
finalization, obviating the need for a context object:
unsigned char *SHA1(unsigned char *in, unsigned long len, unsigned char *out);
This function returns a pointer to the out argument.
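For example, a minimal sketch of hashing a single string with the all-in-one call might look like
this (it should produce the same digest as the incremental example above):

#include <stdio.h>
#include <string.h>
#include <openssl/sha.h>

int main(int argc, char *argv[ ]) {
  int           i;
  unsigned char md[SHA_DIGEST_LENGTH];
  unsigned char *msg = "Testing...1...2...3...";

  SHA1(msg, strlen(msg), md);   /* SHA1() fills md and returns a pointer to it. */
  for (i = 0; i < SHA_DIGEST_LENGTH; i++) printf("%02x", md[i]);
  printf("\n");
  return 0;
}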
Both the incremental API and the all-in-one API are very standard, even beyond OpenSSL. The
reference versions of most hash algorithms look incredibly similar. In fact, Microsoft's CryptoAPI for
Windows provides a very similar API. Any of the Microsoft CSPs provide implementations of MD2,
MD5, and SHA1. The following code is the CryptoAPI version of the OpenSSL code presented
previously:
#include <windows.h>
#include <wincrypt.h>
#include <stdio.h>
int main(int argc, char *argv[ ]) {
  BYTE          *pbData;
  DWORD         cbData = sizeof(DWORD), cbHashSize, i;
  HCRYPTHASH    hSHA1;
  HCRYPTPROV    hProvider;
  unsigned char *s1 = "Testing";
  unsigned char *s2 = "...1...2...3...";

  CryptAcquireContext(&hProvider, 0, MS_DEF_PROV, PROV_RSA_FULL, 0);
  CryptCreateHash(hProvider, CALG_SHA1, 0, 0, &hSHA1);
  CryptHashData(hSHA1, s1, strlen(s1), 0);
  CryptHashData(hSHA1, s2, strlen(s2), 0);
  CryptGetHashParam(hSHA1, HP_HASHSIZE, (BYTE *)&cbHashSize, &cbData, 0);
  pbData = (BYTE *)LocalAlloc(LMEM_FIXED, cbHashSize);
  CryptGetHashParam(hSHA1, HP_HASHVAL, pbData, &cbHashSize, 0);
  CryptDestroyHash(hSHA1);
  CryptReleaseContext(hProvider, 0);
  printf("SHA1(\"%s%s\") = ", s1, s2);
  for (i = 0; i < cbHashSize; i++) printf("%02x", pbData[i]);
  printf("\n");
  LocalFree(pbData);
  return 0;
}
The preferred API for accessing hash functions from OpenSSL, though, is the EVP API, which provides
a generic API to all of the hash functions OpenSSL supports. The following code does the same thing
as the first example with the EVP interface instead of the SHA1 interface:
#include <stdio.h>
#include <string.h>
#include <openssl/evp.h>
int main(int argc, char *argv[ ]) {
  int           i;
  unsigned int  ol;
  EVP_MD_CTX    ctx;
  unsigned char result[EVP_MAX_MD_SIZE]; /* enough for any hash function */
  unsigned char *s1 = "Testing";
  unsigned char *s2 = "...1...2...3...";

  /* Note the extra parameter */
  EVP_DigestInit(&ctx, EVP_sha1( ));
  EVP_DigestUpdate(&ctx, s1, strlen(s1));
  EVP_DigestUpdate(&ctx, s2, strlen(s2));
  /* Here, the context object is first. Notice the pointer to the output length */
  EVP_DigestFinal(&ctx, result, &ol);
  printf("SHA1(\"%s%s\") = ", s1, s2);
  for (i = 0; i < ol; i++) printf("%02x", result[i]);
  printf("\n");
  return 0;
}
Note particularly that EVP_DigestFinal( ) requires you to pass in a pointer to an integer, into which
the output length is stored. You should use this value in your computations instead of hardcoding
SHA1's digest size, under the assumption that you might someday have to replace crypto algorithms
in a hurry, in which case the digest size may change. For that reason, allocate EVP_MAX_MD_SIZE
bytes for any buffer into which you store a message digest, even if some of that space may go
unused.
Alternatively, if you'd like to allocate a buffer of the correct size for output dynamically (which is a
good idea if you're space-constrained, because if SHA-512 is ever added to OpenSSL,
EVP_MAX_MD_SIZE will become 512 bits), you can use the function EVP_MD_CTX_size( ), which takes
a context object and returns the size of the digest. For example:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <openssl/evp.h>
int main(int argc, char *argv[ ]) {
  int           i;
  unsigned int  ol;
  EVP_MD_CTX    ctx;
  unsigned char *result;
  unsigned char *s1 = "Testing";
  unsigned char *s2 = "...1...2...3...";

  EVP_DigestInit(&ctx, EVP_sha1( ));
  EVP_DigestUpdate(&ctx, s1, strlen(s1));
  EVP_DigestUpdate(&ctx, s2, strlen(s2));
  if (!(result = (unsigned char *)malloc(EVP_MD_CTX_size(&ctx)))) abort( );
  EVP_DigestFinal(&ctx, result, &ol);
  printf("SHA1(\"%s%s\") = ", s1, s2);
  for (i = 0; i < ol; i++) printf("%02x", result[i]);
  printf("\n");
  free(result);
  return 0;
}
The OpenSSL library supports only two cryptographic hash functions that we recommend, SHA1 and
RIPEMD-160. It also supports MD2, MD4, MD5, and MDC-2-DES. MDC-2-DES is reasonable, but it is
slow and provides only 64 bits of resistance to birthday attacks, whereas we recommend a minimum
baseline of 80 bits of security. As an alternative, you could initialize the hash function with a nonce,
as discussed in Recipe 6.8.
Nonetheless, Table 6-3 contains a summary of the necessary information on each hash function to
use both the EVP and hash-specific APIs with OpenSSL.
Table 6-3. OpenSSL-supported hash functions

Message digest function | EVP function to specify MD | Context type for MD-specific API | Prefix for MD-specific API calls (i.e., XXX_Init, ...) | Include file for MD-specific API
MD2                     | EVP_md2()                  | MD2_CTX                          | MD2                                                    | openssl/md2.h
MD4                     | EVP_md4()                  | MD4_CTX                          | MD4                                                    | openssl/md4.h
MD5                     | EVP_md5()                  | MD5_CTX                          | MD5                                                    | openssl/md5.h
MDC-2-DES               | EVP_mdc2()                 | MDC2_CTX                         | MDC2                                                   | openssl/mdc2.h
RIPEMD-160              | EVP_ripemd160()            | RIPEMD160_CTX                    | RIPEMD160                                              | openssl/ripemd.h
SHA1                    | EVP_sha1()                 | SHA_CTX                          | SHA1                                                   | openssl/sha.h
Of course, you may want to use an off-the-shelf hash function that isn't supported by either OpenSSL
or CryptoAPI; for example, SHA-256, SHA-384, or SHA-512. Aaron Gifford has produced a good, free
library with implementations of these functions and released it under a BSD-style license. It is
available from http://www.aarongifford.com/computers/sha.html.
That library exports an API that should look very familiar:
SHA256_Init(SHA256_CTX *ctx);
SHA256_Update(SHA256_CTX *ctx, unsigned char *data, size_t inlen);
SHA256_Final(unsigned char out[SHA256_DIGEST_LENGTH], SHA256_CTX *ctx);
SHA384_Init(SHA384_CTX *ctx);
SHA384_Update(SHA384_CTX *ctx, unsigned char *data, size_t inlen);
SHA384_Final(unsigned char out[SHA384_DIGEST_LENGTH], SHA384_CTX *ctx);
SHA512_Init(SHA512_CTX *ctx);
SHA512_Update(SHA512_CTX *ctx, unsigned char *data, size_t inlen);
SHA512_Final(unsigned char out[SHA512_DIGEST_LENGTH], SHA512_CTX *ctx);
All of the previous functions are prototyped in the sha2.h header file.
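A minimal usage sketch of this interface might look like the following; it assumes the library has
been built and that sha2.h is on your include path:

#include <stdio.h>
#include <string.h>
#include "sha2.h"

int main(int argc, char *argv[ ]) {
  int           i;
  SHA256_CTX    ctx;
  unsigned char out[SHA256_DIGEST_LENGTH];

  SHA256_Init(&ctx);
  SHA256_Update(&ctx, (unsigned char *)"Testing", 7);
  SHA256_Update(&ctx, (unsigned char *)"...1...2...3...", 15);
  SHA256_Final(out, &ctx);
  for (i = 0; i < SHA256_DIGEST_LENGTH; i++) printf("%02x", out[i]);
  printf("\n");
  return 0;
}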
6.5.4 See Also
Implementations of SHA-256 and SHA-512 from Aaron Gifford:
http://www.aarongifford.com/computers/sha.html
Recipe 6.7, Recipe 6.8
6.6 Hashing a Single String
6.6.1 Problem
You have a single string of data that you would like to hash, and you don't like the complexity of the
incremental interface.
6.6.2 Solution
Use an "all-in-one" interface, if available, or write your own wrapper, as shown inSection 6.6.3.
6.6.3 Discussion
Hash functions are not secure by themselves: not for a password system, not
for message authentication, not for anything! If you do need a hash function by
itself, be sure to at least protect against length extension attacks, as described
in Recipe 6.7.
Complexity can certainly get you in trouble, and a simpler API can be better. While not every API
provides a single function that can perform a cryptographic hash, many of them do. For example,
OpenSSL provides an all-in-one API for each of the message digest algorithms it supports:
unsigned char *MD2(unsigned char *in, unsigned long n, unsigned char *md);
unsigned char *MD4(unsigned char *in, unsigned long n, unsigned char *md);
unsigned char *MD5(const unsigned char *in, unsigned long n, unsigned char *md);
unsigned char *MDC2(const unsigned char *in, unsigned long n, unsigned char *md);
unsigned char *RIPEMD160(const unsigned char *in, unsigned long n,
                         unsigned char *md);
unsigned char *SHA1(const unsigned char *in, unsigned long n, unsigned char *md);
APIs in this style are commonly seen, even outside the context of OpenSSL. Note that these functions
require you to pass in a buffer into which the digest is placed, but they also return a pointer to that
same buffer.
OpenSSL does not provide an all-in-one API for calculating message digests with the EVP interface.
However, here's a simple wrapper that even allocates its result with malloc( ):
#include <stdio.h>
#include <stdlib.h>
#include <openssl/evp.h>
/* Returns 0 when malloc() fails. */
unsigned char *spc_digest_message(EVP_MD *type, unsigned char *in,
unsigned long n, unsigned int *outlen) {
  EVP_MD_CTX    ctx;
  unsigned char *ret;
EVP_DigestInit(&ctx, type);
EVP_DigestUpdate(&ctx, in, n);
  if (!(ret = (unsigned char *)malloc(EVP_MD_CTX_size(&ctx)))) return 0;
EVP_DigestFinal(&ctx, ret, outlen);
return ret;
}
Here's a simple example that uses the previous wrapper:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <openssl/evp.h>
int main(int argc, char *argv[ ]) {
  int           i;
  unsigned int  ol;
  unsigned char *s = "Testing...1...2...3...";
  unsigned char *r;

  r = spc_digest_message(EVP_sha1( ), s, strlen(s), &ol);
  printf("SHA1(\"%s\") = ", s);
  for (i = 0; i < ol; i++) printf("%02x", r[i]);
  printf("\n");
  free(r);
  return 0;
}
Such a wrapper can be adapted easily to any incremental hashing API, simply by changing the names
of the functions and the underlying data type, and removing the first argument of the wrapper if it is
not necessary. Here is the same wrapper implemented using Microsoft's CryptoAPI:
#include <windows.h>
#include <wincrypt.h>
BYTE *SpcDigestMessage(ALG_ID Algid, BYTE *pbIn, DWORD cbIn, DWORD *cbOut) {
  BYTE       *pbOut;
  DWORD      cbData = sizeof(DWORD);
HCRYPTHASH hHash;
HCRYPTPROV hProvider;
CryptAcquireContext(&hProvider, 0, MS_DEF_PROV, PROV_RSA_FULL, 0);
CryptCreateHash(hProvider, Algid, 0, 0, &hHash);
CryptHashData(hHash, pbIn, cbIn, 0);
CryptGetHashParam(hHash, HP_HASHSIZE, (BYTE *)cbOut, &cbData, 0);
pbOut = (BYTE *)LocalAlloc(LMEM_FIXED, *cbOut);
CryptGetHashParam(hHash, HP_HASHVAL, pbOut, cbOut, 0);
CryptDestroyHash(hHash);
CryptReleaseContext(hProvider, 0);
return pbOut;
}
6.6.4 See Also
Recipe 6.7
6.7 Using a Cryptographic Hash
6.7.1 Problem
You need to use a cryptographic hash function outside the context of a MAC, and you want to avoid
length-extension attacks, which are quite often possible.
6.7.2 Solution
A good way to thwart length-extension attacks is to run the hash function twice, once over the
message, and once over the output of the first hash. This does not protect against birthday attacks,
which probably aren't a major problem in most situations. If you need to protect against those
attacks as well, use the advice in Recipe 6.8 on the first hash operation.
6.7.3 Discussion
Hash functions are not secure by themselves: not for a password system, not
for message authentication, not for anything!
Because all of the commonly used cryptographic hash functions break a message into blocks that get
processed in an iterative fashion, it's often possible to extend the message and at the same time
extend the associated hash, even if some sort of "secret" data was processed at the start of a
message.
It's easy to get rid of this kind of problem at the application level. When you need a cryptographic
hash, don't use SHA1 or something similar directly. Instead, write a wrapper that hashes the
message with your cryptographic hash function, then takes that output and hashes it as well,
returning the result.
For example, here's a wrapper for the all-in-one SHA1 interface discussed in Recipe 6.6:
#define SPC_SHA1_DGST_LEN (20)
/* Include anything else you need. */
void spc_extended_sha1(unsigned char *message, unsigned long n, unsigned char *md) {
unsigned char tmp[SPC_SHA1_DGST_LEN];
SHA1(message, n, tmp);
SHA1(tmp, sizeof(tmp), md);
}
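A minimal usage sketch might look like this, assuming the wrapper above (and the OpenSSL SHA1
interface it relies on) is in scope:

#include <stdio.h>
#include <string.h>

int main(int argc, char *argv[ ]) {
  int           i;
  unsigned char md[SPC_SHA1_DGST_LEN];
  unsigned char *msg = "Testing...1...2...3...";

  spc_extended_sha1(msg, strlen(msg), md);
  for (i = 0; i < SPC_SHA1_DGST_LEN; i++) printf("%02x", md[i]);
  printf("\n");
  return 0;
}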
Note that this solution does not protect against birthday attacks. When using SHA1, birthday attacks
are generally considered totally impractical. However, to be conservative, you can use a nonce to
protect against such attacks, as discussed in Recipe 6.8.
6.7.4 See Also
Recipe 6.6, Recipe 6.8
6.8 Using a Nonce to Protect Against Birthday Attacks
6.8.1 Problem
You want to harden a hash function against birthday attacks instead of switching to an algorithm with
a longer digest.
6.8.2 Solution
Use a nonce or salt before and after your message (preferably a securely generated random salt),
padding the nonce to the internal block size of the hash function.
6.8.3 Discussion
Hash functions are not secure by themselves: not for a password system, not
for message authentication, not for anything! If you do need a hash function by
itself, be sure to at least protect against length extension attacks, as described
in Recipe 6.7.
In most cases, when using a nonce or salt with a hash function, where the nonce is as large as the
output length of the hash function, you double the effective strength of the hash function in
circumstances where a birthday attack would apply. Even smaller nonces help improve security.
To ensure the best security, we strongly recommend that you follow these steps:
1. Select a nonce using a well-seeded cryptographic random number generator (see Chapter 11).
If you're going to have multiple messages to process, select a random portion that is common
to all messages (at least 64 bits) and use a counter for the rest. (The counter should be big
enough to handle any possible number of messages. Here we also recommend dedicating at
least 64 bits.)
2. Determine the internal block length of the hash function (discussed later in this section).
3. Pad the nonce to the internal block length by adding as many zero-bytes as necessary.
4. Add the padded nonce to both the beginning and the end of the message.
5. Hash, creating a value V.
6. Hash V to get the final output. This final step protects against length-extension attacks, as
discussed in Recipe 6.7.
One thing that you need to be sure to avoid is a situation in which the attacker can control the nonce
value. A nonce works well only if it cannot be reused. If an attacker can control the nonce, he can
generally guarantee it gets reused, in which case problems like the birthday attack still apply.
In cases where having a nonce that the attacker can't control isn't appropriate, you can probably live
with birthday attacks if you're using SHA1 or better. To protect against other attacks without using a
nonce, see Recipe 6.7.
All hash functions have a compression function as an element. The size to which that function
compresses is the internal block size of the function, and it is usually larger than the actual digest
value. For hash functions based on block ciphers, the internal block size is the output length of the
hash function (and the compression function is usually built around XOR'ing multiple pieces of block-sized data). Table 6-4 lists the internal block sizes of common message digest functions not based on
block ciphers.
Table 6-4. Internal block sizes of common message digest functions

Algorithm   | Digest size | Internal block size
MD2         | 128 bits    | 16 bytes (128 bits)
MD4         | 128 bits    | 64 bytes (512 bits)
MD5         | 128 bits    | 64 bytes (512 bits)
RIPEMD-160  | 160 bits    | 64 bytes (512 bits)
SHA1        | 160 bits    | 64 bytes (512 bits)
SHA-256     | 256 bits    | 64 bytes (512 bits)
SHA-384     | 384 bits    | 128 bytes (1,024 bits)
SHA-512     | 512 bits    | 128 bytes (1,024 bits)
Here's a pair of functions that do all-in-one wrapping of the OpenSSL EVP message digest interface:
#include <openssl/evp.h>
#include <openssl/rand.h>
#include <string.h>
unsigned char *spc_create_nonced_digest(EVP_MD *type, unsigned char *in,
unsigned long n, unsigned int *outlen) {
  int           bsz, dlen;
  EVP_MD_CTX    ctx;
  unsigned char *pad, *ret;
EVP_DigestInit(&ctx, type);
dlen = EVP_MD_CTX_size(&ctx);
if (!(ret = (unsigned char *)malloc(dlen * 2))) return 0;
RAND_bytes(ret, dlen);
EVP_DigestUpdate(&ctx, ret, dlen);
bsz = EVP_MD_CTX_block_size(&ctx);
if (!(pad = (unsigned char *)malloc(bsz - dlen))) {
free(ret);
return 0;
}
memset(pad, 0, bsz - dlen);
EVP_DigestUpdate(&ctx, pad, bsz - dlen);
EVP_DigestUpdate(&ctx, in, n);
EVP_DigestUpdate(&ctx, ret, dlen);
EVP_DigestUpdate(&ctx, pad, bsz - dlen);
free(pad);
EVP_DigestFinal(&ctx, ret + dlen, outlen);
*outlen *= 2;
return ret;
}
int spc_verify_nonced_digest(EVP_MD *type, unsigned char *in, unsigned long n,
unsigned char *toverify) {
  int           dlen, bsz, i;
  unsigned int  outlen;
  EVP_MD_CTX    ctx;
  unsigned char *pad, *vfy;
EVP_DigestInit(&ctx, type);
bsz = EVP_MD_CTX_block_size(&ctx);
dlen = EVP_MD_CTX_size(&ctx);
EVP_DigestUpdate(&ctx, toverify, dlen);
if (!(pad = (unsigned char *)malloc(bsz - dlen))) return 0;
memset(pad, 0, bsz - dlen);
EVP_DigestUpdate(&ctx, pad, bsz - dlen);
EVP_DigestUpdate(&ctx, in, n);
EVP_DigestUpdate(&ctx, toverify, dlen);
EVP_DigestUpdate(&ctx, pad, bsz - dlen);
free(pad);
if (!(vfy = (unsigned char *)malloc(dlen))) return 0;
EVP_DigestFinal(&ctx, vfy, &outlen);
in += dlen;
for (i = 0; i < dlen; i++)
if (vfy[i] != toverify[i + dlen]) {
free(vfy);
return 0;
}
free(vfy);
return 1;
}
The first function, spc_create_nonced_digest( ), automatically selects a nonce from the OpenSSL
random number generator and returns twice the digest size in output, where the first digest-sized
block is the nonce and the second is the hash. The second function, spc_verify_nonced_digest( ),
takes data consisting of a nonce concatenated with a hash value, and returns 1 if the hash validates,
and 0 otherwise.
Two macros can make extracting the nonce and the hash easier:
#include <stdio.h>
#include <string.h>
#include <openssl/evp.h>
/* Here, l is the output length of spc_create_nonced_digest( ) */
#define spc_extract_nonce(l, s)  (s)
#define spc_extract_digest(l, s) ((s) + ((l) / 2))
Here's a sample program using this API:
int main(int argc, char *argv[ ]) {
unsigned int i, ol;
unsigned char *s = "Testing hashes with nonces.";
unsigned char *dgst, *nonce, *ret;
  ret   = spc_create_nonced_digest(EVP_sha1( ), s, strlen(s), &ol);
nonce = spc_extract_nonce(ol, ret);
dgst = spc_extract_digest(ol, ret);
printf("Nonce = ");
for(i = 0; i < ol / 2; i++)
printf("%02x", nonce[i]);
printf("\nSHA1-Nonced(Nonce, \"%s\") = \n\t", s);
for(i = 0; i < ol / 2; i++)
printf("%02x", dgst[i]);
printf("\n");
if (spc_verify_nonced_digest(EVP_sha1( ), s, strlen(s), ret))
printf("Recalculation verified integrity.\n");
else
printf("Recalculation FAILED to match.\n");
return 0;
}
6.8.4 See Also
Recipe 6.7
6.9 Checking Message Integrity
6.9.1 Problem
You want to provide integrity for messages in such a way that people with a secret key can verify
that the message has not changed since the integrity value (often called a tag) was first calculated.
6.9.2 Solution
Use a message integrity check. As with hash functions, there are somewhat standard interfaces,
particularly an incremental interface.
6.9.3 Discussion
Libraries that support MACs tend to support incremental operation using a standard structure, very
similar to that used by hash functions:
1. Allocate and key a context object. The context object holds the internal state of the MAC until
data processing is complete. The type of the context object can be specific to the MAC, or there
can be a single type that works for all hash functions in a library. OpenSSL supports only one
MAC and has only the associated context type. The key can be reused numerous times without
reallocating. Often, you will need to specify the underlying algorithm you are using for your
MAC.
2. Reset the context object, setting the internal parameters of the MAC to their initial state so that
another message's authentication tag can be calculated. Many MACs accept a nonce, and this is
where you would pass that in. This is often combined with the "init" call when the algorithm
does not take a nonce, such as with OMAC and HMAC.
3. "Update" the context object by passing in data to be authenticated and the associated length of
that input. The results of the MAC'ing process will be dependent on the order of the data that
you pass, but you can pass in all the partial data you wish. That is, calling the update routine
with the strings "he" then "llo" would produce the same results as calling it once with the string
"hello". The update function generally takes as arguments the context object, the data to
process, and the associated length of that data.
4. "Finalize" the context object and produce the authentication tag. Most APIs will generally take
as arguments the context object and a buffer into which the message digest is placed.
Often, you may have a block cipher or a hash function that you'd like to turn into a MAC, but no
associated code comes with the cryptographic primitive. Alternately, you might use a library such as
OpenSSL or CryptoAPI that provides very narrow choices. For this reason, the next several recipes
provide implementations of MACs we recommend for general-purpose use, particularly OMAC, CMAC,
and HMAC.
Security Recommendations for MACs
MACs are not quite as low-level as cryptographic hash functions. Yet they are still fairly
low-level constructs, and there are some common pitfalls associated with them. We
discuss these elsewhere in the book, but here's a summary of steps you should take to
defend yourself against common problems:
- Don't use the same MAC key as an encryption key. If you'd like to have a system with a
  single key, key your MAC and encryption separately, using the technique from Recipe 4.11.
- Use a securely generated, randomly chosen key for your MAC, not something hardcoded or
  otherwise predictable!
- Be sure to read Recipe 6.18 on how to use a MAC and encryption together securely, as it
  can be difficult to do.
- Use an always-increasing nonce, and use this to actively thwart capture replay attacks.
  Do this even if the MAC doesn't have built-in support for nonces. (See Recipe 6.21 for
  information on how to thwart capture replay attacks, and Recipe 6.12 for using a nonce
  with MACs that don't have direct support for them.)
- It is of vital importance that any parties computing a MAC agree on exactly what data is
  to be processed. To that end, it pays to get very detailed in specifying the content of
  messages, including any fields you have and how they are encoded before the MAC is
  computed. Any encoding should be unambiguous.
Some MAC interfaces may not remove key material from memory when done. Be sure to check the
particular implementation you're using.
OpenSSL provides only a single MAC implementation, HMAC, while CryptoAPI supports both CBC-MAC
and HMAC. Neither quite follows the API outlined in this recipe, though they stray in different ways.
OpenSSL performs the reset operation the same way as the initialization operation (you just pass in 0
in place of the key and the algorithm arguments). CryptoAPI does not allow resetting the context
object, and instead requires that a completely new context object be created.
OMAC and HMAC do not take a nonce by default. See Recipe 6.12 to see how to use these algorithms
with a nonce. To see how to use the incremental HMAC interface in OpenSSL and CryptoAPI, see
Recipe 6.10. CryptoAPI does not have an all-in-one interface, but instead requires use of its
incremental API.
Most libraries also provide an all-in-one interface to the MACs they provide. For example, the HMAC
all-in-one function for OpenSSL looks like this:
unsigned char *HMAC(const EVP_MD *evp_md, const void *key, int key_len,
const unsigned char *msg, int msglen, unsigned char *tag,
unsigned int *tag_len);
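A minimal sketch of calling it might look like the following; the all-zero key is only a
placeholder, and you should use a securely generated random key:

#include <stdio.h>
#include <string.h>
#include <openssl/hmac.h>

int main(int argc, char *argv[ ]) {
  unsigned int  i, taglen;
  unsigned char tag[EVP_MAX_MD_SIZE];
  unsigned char key[16] = {0,};   /* Placeholder only; use a random key! */
  unsigned char *msg = "Testing HMAC.";

  HMAC(EVP_sha1( ), key, sizeof(key), msg, strlen(msg), tag, &taglen);
  for (i = 0; i < taglen; i++) printf("%02x", tag[i]);
  printf("\n");
  return 0;
}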
There is some variation in all-in-one APIs. Some are single-pass, like the OpenSSL API described in
this section. Others have a separate initialization step and a context object, so that you do not need
to specify the underlying cryptographic primitive and rekey every single time you want to use the
MAC. That is, such interfaces automatically call functions for resetting, updating, and finalization for
you.
6.9.4 See Also
Recipe 4.11, Recipe 6.10, Recipe 6.12, Recipe 6.18, Recipe 6.21
6.10 Using HMAC
6.10.1 Problem
You want to provide message authentication using HMAC.
6.10.2 Solution
If you are using OpenSSL, you can use the HMAC API:
/* The incremental interface */
void HMAC_Init(HMAC_CTX *ctx, const void *key, int len, const EVP_MD *md);
void HMAC_Update(HMAC_CTX *ctx, const unsigned char *data, int len);
void HMAC_Final(HMAC_CTX *ctx, unsigned char *tag, unsigned int *tag_len);
/* HMAC_cleanup erases the key material from memory. */
void HMAC_cleanup(HMAC_CTX *ctx);
/* The all-in-one interface. */
unsigned char *HMAC(const EVP_MD *evp_md, const void *key, int key_len,
const unsigned char *msg, int msglen, unsigned char *tag,
unsigned int *tag_len);
If you are using CryptoAPI, you can use the CryptCreateHash( ), CryptHashData( ),
CryptGetHashParam( ), CryptSetHashParam( ), and CryptDestroyHash( ) functions:
BOOL WINAPI CryptCreateHash(HCRYPTPROV hProv, ALG_ID Algid, HCRYPTKEY hKey,
DWORD dwFlags, HCRYPTHASH *phHash);
BOOL WINAPI CryptHashData(HCRYPTHASH hHash, BYTE *pbData, DWORD cbData,
DWORD dwFlags);
BOOL WINAPI CryptGetHashParam(HCRYPTHASH hHash, DWORD dwParam, BYTE *pbData,
DWORD *pcbData, DWORD dwFlags);
BOOL WINAPI CryptSetHashParam(HCRYPTHASH hHash, DWORD dwParam, BYTE *pbData,
DWORD dwFlags);
BOOL WINAPI CryptDestroyHash(HCRYPTHASH hHash);
Otherwise, you can use the HMAC implementation provided with this recipe in combination with any
cryptographic hash function you have handy.
6.10.3 Discussion
Be sure to look at our generic recommendations for using a MAC (Recipe 6.9).
Here's an example of using OpenSSL's incremental interface to MAC two messages using HMAC-SHA1:
#include <stdio.h>
#include <openssl/hmac.h>
void spc_incremental_hmac(unsigned char *key, size_t keylen) {
  int           i;
  HMAC_CTX      ctx;
unsigned int len;
unsigned char out[20];
HMAC_Init(&ctx, key, keylen, EVP_sha1( ));
HMAC_Update(&ctx, "fred", 4);
HMAC_Final(&ctx, out, &len);
for (i = 0; i < len; i++) printf("%02x", out[i]);
printf("\n");
HMAC_Init(&ctx, 0, 0, 0);
HMAC_Update(&ctx, "fred", 4);
HMAC_Final(&ctx, out, &len);
for (i = 0; i < len; i++) printf("%02x", out[i]);
printf("\n");
HMAC_cleanup(&ctx); /* Remove key from memory */
}
To reset the HMAC context object, we call HMAC_Init( ), passing in zeros (NULLs) in place of the
key, key length, and digest type to use. The NULL argument when initializing in OpenSSL generally
means "I'm not supplying this value right now; use what you already have."
The following example shows an implementation of the same code provided for OpenSSL, this time
using CryptoAPI (with the exception of resetting the context, because CryptoAPI actually requires a
new one to be created). This implementation requires the use of the code in Recipe 5.26 to convert
raw key data into an HCRYPTKEY object as required by CryptCreateHash( ). Note the difference in
the arguments required between spc_incremental_hmac( ) as implemented for OpenSSL, and
SpcIncrementalHMAC( ) as implemented for CryptoAPI. The latter requires an additional argument
that specifies the encryption algorithm for the key. Although the information is never really used,
CryptoAPI insists on tying an encryption algorithm to key data. In general, CALG_RC4 should work
fine for arbitrary key data (the value will effectively be ignored).
#include <windows.h>
#include <wincrypt.h>
#include <stdio.h>
void SpcIncrementalHMAC(BYTE *pbKey, DWORD cbKey, ALG_ID Algid) {
  BYTE       out[20];
  DWORD      cbData = sizeof(out), i;
  HCRYPTKEY  hKey;
  HMAC_INFO  HMACInfo;
  HCRYPTHASH hHash;
  HCRYPTPROV hProvider;

  hProvider = SpcGetExportableContext( );
  hKey = SpcImportKeyData(hProvider, Algid, pbKey, cbKey);
  CryptCreateHash(hProvider, CALG_HMAC, hKey, 0, &hHash);

  HMACInfo.HashAlgid     = CALG_SHA1;
  HMACInfo.pbInnerString = HMACInfo.pbOuterString = 0;
  HMACInfo.cbInnerString = HMACInfo.cbOuterString = 0;
  CryptSetHashParam(hHash, HP_HMAC_INFO, (BYTE *)&HMACInfo, 0);

  CryptHashData(hHash, (BYTE *)"fred", 4, 0);
  CryptGetHashParam(hHash, HP_HASHVAL, out, &cbData, 0);
  for (i = 0; i < cbData; i++) printf("%02x", out[i]);
  printf("\n");

  CryptDestroyHash(hHash);
  CryptDestroyKey(hKey);
  CryptReleaseContext(hProvider, 0);
}
If you aren't using OpenSSL or CryptoAPI, but you have a hash function that you'd like to use with
HMAC, you can use the following HMAC implementation:
#include <stdlib.h>
#include <string.h>
typedef struct {
  DGST_CTX      mdctx;
  unsigned char inner[DGST_BLK_SZ];
  unsigned char outer[DGST_BLK_SZ];
} SPC_HMAC_CTX;

void SPC_HMAC_Init(SPC_HMAC_CTX *ctx, unsigned char *key, size_t klen) {
  int           i;
  unsigned char dk[DGST_OUT_SZ];

  DGST_Init(&(ctx->mdctx));
  memset(ctx->inner, 0x36, DGST_BLK_SZ);
  memset(ctx->outer, 0x5c, DGST_BLK_SZ);
  if (klen <= DGST_BLK_SZ) {
    for (i = 0; i < klen; i++) {
      ctx->inner[i] ^= key[i];
      ctx->outer[i] ^= key[i];
    }
  } else {
    DGST_Update(&(ctx->mdctx), key, klen);
    DGST_Final(dk, &(ctx->mdctx));
    DGST_Reset(&(ctx->mdctx));
    for (i = 0; i < DGST_OUT_SZ; i++) {
      ctx->inner[i] ^= dk[i];
      ctx->outer[i] ^= dk[i];
    }
  }
  DGST_Update(&(ctx->mdctx), ctx->inner, DGST_BLK_SZ);
}
void SPC_HMAC_Reset(SPC_HMAC_CTX *ctx) {
DGST_Reset(&(ctx->mdctx));
DGST_Update(&(ctx->mdctx), ctx->inner, DGST_BLK_SZ);
}
void SPC_HMAC_Update(SPC_HMAC_CTX *ctx, unsigned char *m, size_t l) {
DGST_Update(&(ctx->mdctx), m, l);
}
void SPC_HMAC_Final(unsigned char *tag, SPC_HMAC_CTX *ctx) {
unsigned char is[DGST_OUT_SZ];
DGST_Final(is, &(ctx->mdctx));
DGST_Reset(&(ctx->mdctx));
DGST_Update(&(ctx->mdctx), ctx->outer, DGST_BLK_SZ);
DGST_Update(&(ctx->mdctx), is, DGST_OUT_SZ);
DGST_Final(tag, &(ctx->mdctx));
}
void SPC_HMAC_Cleanup(SPC_HMAC_CTX *ctx) {
volatile char *p = ctx->inner;
volatile char *q = ctx->outer;
int i;
  for (i = 0; i < DGST_BLK_SZ; i++) *p++ = *q++ = 0;
}
The previous code does require a particular interface to a hash function interface. First, it requires
two constants: DGST_BLK_SZ, which is the internal block size of the underlying hash function (see
Recipe 6.3), and DGST_OUT_SZ, which is the size of the resulting message digest. Second, it requires
a context type for the message digest, which you should typedef to DGST_CTX. Finally, it requires an
incremental interface to the hash function:
void DGST_Init(DGST_CTX *ctx);
void DGST_Reset(DGST_CTX *ctx);
void DGST_Update(DGST_CTX *ctx, unsigned char *m, size_t len);
void DGST_Final(unsigned char *tag, DGST_CTX *ctx);
Some hash function implementations won't have an explicit reset implementation, in which case you
can implement the reset functionality by calling DGST_Init( ) again.
Even though OpenSSL already has an HMAC implementation, here is an example of binding the
previous HMAC implementation to OpenSSL's SHA1 implementation:
typedef SHA_CTX DGST_CTX;
#define DGST_BLK_SZ 64
#define DGST_OUT_SZ 20
#define DGST_Init(x)         SHA1_Init(x)
#define DGST_Reset(x)        DGST_Init(x)
#define DGST_Update(x, m, l) SHA1_Update(x, m, l)
#define DGST_Final(o, x)     SHA1_Final(o, x)
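A minimal usage sketch might look like the following, assuming the SPC_HMAC implementation above
has been compiled together with the SHA1 binding just shown:

#include <stdio.h>

void spc_hmac_example(unsigned char *key, size_t keylen) {
  int           i;
  SPC_HMAC_CTX  ctx;
  unsigned char tag[DGST_OUT_SZ];

  SPC_HMAC_Init(&ctx, key, keylen);
  SPC_HMAC_Update(&ctx, (unsigned char *)"fred", 4);
  SPC_HMAC_Final(tag, &ctx);
  SPC_HMAC_Cleanup(&ctx);   /* Remove key material from memory. */
  for (i = 0; i < DGST_OUT_SZ; i++) printf("%02x", tag[i]);
  printf("\n");
}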
6.10.4 See Also
Recipe 5.26, Recipe 6.3, Recipe 6.4, Recipe 6.9
6.11 Using OMAC (a Simple Block Cipher-Based MAC)
6.11.1 Problem
You want to use a simple MAC based on a block cipher, such as AES.
6.11.2 Solution
Use the OMAC implementation provided in Section 6.11.3.
6.11.3 Discussion
Be sure to look at our generic recommendations for using a MAC (see Recipe
6.9).
OMAC is a straightforward message authentication algorithm based on the CBC-encryption mode. It
fixes some security problems with the naïve implementation of a MAC from CBC mode (CBC-MAC). In
particular, that MAC is susceptible to length-extension attacks, similar to the ones we consider for
cryptographic hash functions in Recipe 6.7.
OMAC has been explicitly specified for AES, and it is easy to adapt to any 128-bit block cipher. It is
possible, but a bit more work, to get it working with ciphers with 64-bit blocks. In this section, we
only cover using OMAC with AES.
The basic idea behind using CBC mode as a MAC is to encrypt a message in CBC mode and throw
away everything except the very last block of output. That's not generally secure, though. It only
works when all messages you might possibly process are a particular size.
Besides OMAC, there are several MACs that try to fix the CBC-MAC problem, including XCBC-MAC,
TMAC, and RMAC:
RMAC
RMAC (the R stands for randomized) has security issues in the general case, and is not favored
by the cryptographic community.[6]
[6]
Most importantly, RMAC requires the underlying block cipher to protect against related-key attacks,
where other constructs do not. Related-key attacks are not well studied, so it's best to prefer constructs
that can avoid them when possible.
XCBC-MAC
XCBC-MAC (eXtended CBC-MAC) is the foundation for TMAC and OMAC, but it uses three
different keys.
TMAC
TMAC uses two keys (thus the T in the name).
OMAC is the first good CBC-MAC derivative that uses a single key. OMAC works the same way CBC-MAC does until the last block, where it XORs the state with an additional value before encrypting.
That additional value is derived from the result of encrypting all zeros, and that derivation can be performed at key
setup time. That is, the additional value is key-dependent, not message-dependent.
OMAC is actually the name of a family of MAC algorithms. There are two concrete versions, OMAC1
and OMAC2, which are slightly different but equally secure. OMAC1 is slightly preferable because its
key setup can be done a few cycles more quickly than OMAC2's key setup. NIST is expected to
standardize on OMAC1.
First, we provide an incremental API for using OMAC. This code requires linking against an AES
implementation, and also that the macros developed in Recipe 5.5 be defined (they bridge the API of
your AES implementation with this book's API). The secure memory function spc_memset( ) from
Recipe 13.2 is also required.
To use this API, you must instantiate an SPC_OMAC_CTX object and pass it to the various API
functions. To initialize the context, call either spc_omac1_init( ) or spc_omac2_init( ), depending
on whether you want to use OMAC1 or OMAC2. The initialization functions always return success
unless the key length is invalid, in which case they return 0. Successful initialization is indicated by a
return value of 1.
int spc_omac1_init(SPC_OMAC_CTX *ctx, unsigned char *key, int keylen);
int spc_omac2_init(SPC_OMAC_CTX *ctx, unsigned char *key, int keylen);
These functions have the following arguments:
ctx
Context object to be initialized.
key
Block cipher key.
keylen
Length of the key in bytes. The length of the key must be 16, 24, or 32 bytes; any other key
length is invalid.
Once initialized, spc_omac_update( ) can be used to process data. Note that the only differences
between OMAC1 and OMAC2 in this implementation are handled at key setup time, so they both use
the same functions for updating and finalization. Multiple calls to spc_omac_update( ) act just like
making a single call where all of the data was concatenated together. Here is its signature:
void spc_omac_update(SPC_OMAC_CTX *ctx, unsigned char *in, size_t il);
This function has the following arguments:
ctx
Context object to use for the current message.
in
Buffer that contains the data to be processed.
il
Length of the data buffer to be processed in bytes.
To obtain the output of the MAC operation, call spc_omac_final( ), which has the following
signature:
int spc_omac_final(SPC_OMAC_CTX *ctx, unsigned char *out);
This function has the following arguments:
ctx
Context object to be finalized.
out
Buffer into which the output will be placed. This buffer must be at least 16 bytes in size. No
more than 16 bytes will ever be written to it.
Here is the code implementing OMAC:
#include <stdlib.h>
typedef struct {
  SPC_KEY_SCHED ks;
  int           ix;
  unsigned char iv[SPC_BLOCK_SZ];
  unsigned char c1[SPC_BLOCK_SZ]; /* L * u */
  unsigned char c2[SPC_BLOCK_SZ]; /* L / u */
} SPC_OMAC_CTX;

int spc_omac1_init(SPC_OMAC_CTX *ctx, unsigned char *key, int keylen) {
  int           condition, i;
  unsigned char L[SPC_BLOCK_SZ] = {0,};

  if (keylen != 16 && keylen != 24 && keylen != 32) return 0;
  SPC_ENCRYPT_INIT(&(ctx->ks), key, keylen);
  SPC_DO_ENCRYPT(&(ctx->ks), L, L);
  spc_memset(ctx->iv, 0, SPC_BLOCK_SZ);
  ctx->ix = 0;

  /* Compute L * u */
  condition = L[0] & 0x80;
  ctx->c1[0] = L[0] << 1;
  for (i = 1; i < SPC_BLOCK_SZ; i++) {
    ctx->c1[i - 1] |= L[i] >> 7;
    ctx->c1[i]      = L[i] << 1;
  }
  if (condition) ctx->c1[SPC_BLOCK_SZ - 1] ^= 0x87;

  /* Compute L * u * u */
  condition = ctx->c1[0] & 0x80;
  ctx->c2[0] = ctx->c1[0] << 1;
  for (i = 1; i < SPC_BLOCK_SZ; i++) {
    ctx->c2[i - 1] |= ctx->c1[i] >> 7;
    ctx->c2[i]      = ctx->c1[i] << 1;
  }
  if (condition) ctx->c2[SPC_BLOCK_SZ - 1] ^= 0x87;

  spc_memset(L, 0, SPC_BLOCK_SZ);
  return 1;
}
int spc_omac2_init(SPC_OMAC_CTX *ctx, unsigned char *key, int keylen) {
  int           condition, i;
  unsigned char L[SPC_BLOCK_SZ] = {0,};

  if (keylen != 16 && keylen != 24 && keylen != 32) return 0;
  SPC_ENCRYPT_INIT(&(ctx->ks), key, keylen);
  SPC_DO_ENCRYPT(&(ctx->ks), L, L);
  spc_memset(ctx->iv, 0, SPC_BLOCK_SZ);
  ctx->ix = 0;

  /* Compute L * u, storing it in c1 */
  condition = L[0] >> 7;
  ctx->c1[0] = L[0] << 1;
  for (i = 1; i < SPC_BLOCK_SZ; i++) {
    ctx->c1[i - 1] |= L[i] >> 7;
    ctx->c1[i]      = L[i] << 1;
  }
  if (condition) ctx->c1[SPC_BLOCK_SZ - 1] ^= 0x87;

  /* Compute L * u ^ -1, storing it in c2 */
  condition = L[SPC_BLOCK_SZ - 1] & 0x01;
  i = SPC_BLOCK_SZ;
  while (--i) ctx->c2[i] = (L[i] >> 1) | (L[i - 1] << 7);
  ctx->c2[0] = L[0] >> 1;
  L[0] >>= 1;
  if (condition) {
    ctx->c2[0]                ^= 0x80;
    ctx->c2[SPC_BLOCK_SZ - 1] ^= 0x43;
  }

  spc_memset(L, 0, SPC_BLOCK_SZ);
  return 1;
}
void spc_omac_update(SPC_OMAC_CTX *ctx, unsigned char *in, size_t il) {
  int i;

  if (il < SPC_BLOCK_SZ - ctx->ix) {
    while (il--) ctx->iv[ctx->ix++] ^= *in++;
    return;
  }
  if (ctx->ix) {
    while (ctx->ix < SPC_BLOCK_SZ) --il, ctx->iv[ctx->ix++] ^= *in++;
    SPC_DO_ENCRYPT(&(ctx->ks), ctx->iv, ctx->iv);
  }
  while (il > SPC_BLOCK_SZ) {
    for (i = 0; i < SPC_BLOCK_SZ / sizeof(int); i++)
      ((unsigned int *)(ctx->iv))[i] ^= ((unsigned int *)in)[i];
    SPC_DO_ENCRYPT(&(ctx->ks), ctx->iv, ctx->iv);
    in += SPC_BLOCK_SZ;
    il -= SPC_BLOCK_SZ;
  }
  for (i = 0; i < il; i++) ctx->iv[i] ^= in[i];
  ctx->ix = il;
}
int spc_omac_final(SPC_OMAC_CTX *ctx, unsigned char *out) {
int i;
if (ctx->ix != SPC_BLOCK_SZ) {
ctx->iv[ctx->ix] ^= 0x80;
for (i = 0; i < SPC_BLOCK_SZ / sizeof(int); i++)
((int *)ctx->iv)[i] ^= ((int *)ctx->c2)[i];
} else {
for (i = 0; i < SPC_BLOCK_SZ / sizeof(int); i++)
((int *)ctx->iv)[i] ^= ((int *)ctx->c1)[i];
}
SPC_DO_ENCRYPT(&(ctx->ks), ctx->iv, out);
return 1;
}
For those interested in the algorithm itself, note that we precompute two special values at key setup
time, both of which are derived from the value we get from encrypting the all-zero data block. Each
precomputed value is computed by using a 128-bit shift and a conditional XOR. The last block of data
is padded, if necessary, and XOR'd with one of these two values, depending on its length.
Here is an all-in-one wrapper to OMAC, exporting both OMAC1 and OMAC2:
int SPC_OMAC1(unsigned char key[ ], int keylen, unsigned char in[ ], size_t l,
              unsigned char out[16]) {
  SPC_OMAC_CTX c;

  if (!spc_omac1_init(&c, key, keylen)) return 0;
  spc_omac_update(&c, in, l);
  spc_omac_final(&c, out);
  return 1;
}

int SPC_OMAC2(unsigned char key[ ], int keylen, unsigned char in[ ], size_t l,
              unsigned char out[16]) {
  SPC_OMAC_CTX c;

  if (!spc_omac2_init(&c, key, keylen)) return 0;
  spc_omac_update(&c, in, l);
  spc_omac_final(&c, out);
  return 1;
}
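A minimal usage sketch of the all-in-one wrapper might look like the following; the all-zero key is
only a placeholder, and you should use a securely generated random key:

#include <stdio.h>
#include <string.h>

void spc_omac1_example(void) {
  int           i;
  unsigned char key[16] = {0,};   /* Placeholder only; use a random key! */
  unsigned char tag[16];
  unsigned char *msg = (unsigned char *)"Testing OMAC1.";

  if (!SPC_OMAC1(key, sizeof(key), msg, strlen((char *)msg), tag)) return;
  for (i = 0; i < 16; i++) printf("%02x", tag[i]);
  printf("\n");
}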
6.11.4 See Also
Recipe 5.5, Recipe 6.7, Recipe 6.9, Recipe 13.2
6.12 Using HMAC or OMAC with a Nonce
6.12.1 Problem
You want to use HMAC or OMAC, but improve its resistance to birthday attacks and capture replay
attacks.
6.12.2 Solution
Use an ever-incrementing nonce that is concatenated to your message.
6.12.3 Discussion
Be sure to actually test the nonce value when validating the MAC, so as to
thwart capture replay attacks. (See Recipe 6.21.)
If you're using an off-the-shelf HMAC implementation, such as OpenSSL's or CryptoAPI's, you can
easily concatenate your nonce to the beginning of your message.
You should use a nonce that's at least half as large as your key size, if not larger. Ultimately, we
would recommend that any nonce contain a message counter that is 64 bits (it can be smaller if
you're 100% sure you'll never use every counter value) and a random portion that is at least 64 bits.
The random portion can generally be chosen per session instead of per message.
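For example, one possible way (a sketch, not the only reasonable layout) to build such a 16-byte
nonce is to combine an 8-byte per-session random value with an 8-byte big-endian message counter:

#include <string.h>

/* Fill spc_session_part once per session from a well-seeded cryptographic
 * random number generator (see Chapter 11), and never let spc_msg_counter
 * repeat within a session. */
static unsigned char      spc_session_part[8];
static unsigned long long spc_msg_counter;

void spc_next_nonce(unsigned char nonce[16]) {
  int i;

  memcpy(nonce, spc_session_part, 8);
  for (i = 0; i < 8; i++)
    nonce[8 + i] = (unsigned char)(spc_msg_counter >> (56 - 8 * i));
  spc_msg_counter++;
}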
Here's a simple wrapper that provides a nonced all-in-one version of OMAC1, using the
implementation from Recipe 6.11 and a 16-byte nonce:
void spc_OMAC1_nonced(unsigned char key[ ], int keylen, unsigned char in[ ],
                      size_t l, unsigned char nonce[16], unsigned char out[16]) {
  SPC_OMAC_CTX c;

  if (!spc_omac1_init(&c, key, keylen)) abort( );
  spc_omac_update(&c, nonce, 16);
  spc_omac_update(&c, in, l);
  spc_omac_final(&c, out);
}

6.12.4 See Also
Recipe 6.11, Recipe 6.21
6.13 Using a MAC That's Reasonably Fast in Software and
Hardware
6.13.1 Problem
You want to use a MAC that is fast in both software and hardware.
6.13.2 Solution
Use CMAC. It is available from http://www.zork.org/cmac/.
6.13.3 Discussion
Be sure to look at our generic recommendations for using a MAC (see Recipe
6.9).
CMAC is the message-integrity component of the CWC encryption mode. It is based on a universal
hash function that is similar to hash127. It requires an 11-byte nonce per message. The Zork
implementation has the following API:
int cmac_init(cmac_t *ctx, unsigned char key[16]);
void cmac_mac(cmac_t *ctx, unsigned char *msg, u_int32 msglen,
unsigned char nonce[11], unsigned char output[16]);
void cmac_cleanup(cmac_t *ctx);
void cmac_update(cmac_t *ctx, unsigned char *msg, u_int32 msglen);
void cmac_final(cmac_t *ctx, unsigned char nonce[11], unsigned char output[16]);
The cmac_t type keeps track of state and needs to be initialized only when you key the algorithm.
You can then MAC messages interchangeably using the all-in-one API or the incremental API.
The all-in-one API consists of the cmac_mac( ) function. It takes an entire message and a nonce as
arguments and produces a 16-byte output. If you want to use the incremental API, cmac_update( )
is used to pass in part of the message, and cmac_final( ) is used to set the nonce and get the
resulting tag. The cmac_cleanup( ) function securely erases the context object.
To use the CMAC API, just copy the cmac.h and cmac.c files, and compile and link against cmac.c.
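As a quick illustration (our code, not part of the CMAC distribution), here is how the
all-in-one interface might be driven. We assume a nonzero return from cmac_init( ) indicates
success and that the u_int32 type comes from cmac.h; check the library before relying on either.
#include "cmac.h"

/* Sketch: MAC a single message with the all-in-one CMAC interface. */
int example_cmac_tag(unsigned char key[16], unsigned char nonce[11],
                     unsigned char *msg, u_int32 msglen, unsigned char tag[16]) {
  cmac_t ctx;

  if (!cmac_init(&ctx, key)) return 0;       /* assumes nonzero means success */
  cmac_mac(&ctx, msg, msglen, nonce, tag);   /* whole message plus nonce in, 16-byte tag out */
  cmac_cleanup(&ctx);                        /* securely erase the keyed context */
  return 1;
}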
6.13.4 See Also
The CMAC home page: http://www.zork.org/cmac/
Recipe 6.9
6.14 Using a MAC That's Optimized for Software Speed
6.14.1 Problem
You want to use the MAC that is fastest in software.
6.14.2 Solution
Use a MAC based on Dan Bernstein's hash127, as discussed in the next section. The hash127 library
is available from http://cr.yp.to.
6.14.3 Discussion
Be sure to look at our generic recommendations for using a MAC (see Recipe
6.9).
The hash127 algorithm is a universal hash function that can be turned into a secure MAC using AES.
It is available from Dan Bernstein's web page: http://cr.yp.to/hash127.html. Follow the directions on
how to install the hash127 library. Once the library is compiled, just include the directory containing
hash127.h in your include path and link against hash127.a.
Unfortunately, at the time of this writing, the hash127 implementation has not
been ported to Windows. Aside from differences in inline assembler syntax
between GCC and Microsoft Visual C++, some constants used in the
implementation overflow Microsoft Visual C++'s internal token buffer. When a
port becomes available, we will update the book's web site with the relevant
information.
The way to use hash127 as a MAC is to hash the message you want to authenticate (the hash
function takes a key and a nonce as inputs, as well as the message), then encrypt the result of the
hash function using AES.
In this recipe, we present an all-in-one MAC API based on hash127, which we call MAC127. This
construction first hashes a message using hash127, then uses two constant-time postprocessing
operations based on AES. The postprocessing operations give this MAC excellent provable security
under strong assumptions.
When initializing the MAC, a 16-byte key is turned into three 16-byte keys by AES-encrypting three
constant values. The first two derived keys are AES keys, used for postprocessing. The third derived
key is the hash key (though the hash127 algorithm will actually ignore one bit of this key).
Note that Bernstein's hash127 interface has some practical limitations:
The entire message must be present at the time hash127( ) is called. That is, there's no
incremental interface. If you need a fast incremental MAC, use CMAC (discussed in Recipe 6.13)
instead.
The API takes an array of 32-bit values as input, meaning that it cannot accept an arbitrary
character string.
However, we can encode the leftover bytes of input in the last parameter passed to hash127( ).
Bernstein expects the last parameter to be used for additional per-message keying material. We're
not required to use that parameter for keying material (i.e., our construction is still a secure MAC).
Instead, we encode any leftover bytes, then unambiguously encode the length of the message.
To postprocess, we encrypt the hash output with one AES key, encrypt the nonce with the other AES
key, then XOR the two ciphertexts together. This gives us provable security with good assumptions,
plus the additional benefits of a nonce (see Recipe 6.12).
The core MAC127 data type is SPC_MAC127_CTX. There are only two functions: one to initialize a
context, and one to MAC a message. The initialization function has the following signature:
void spc_mac127_init(SPC_MAC127_CTX *ctx, unsigned char *key);
This function has the following arguments:
ctx
Context object that holds key material so that several messages may be MAC'd with a single
key.
key
Buffer that contains a 16-byte key.
To MAC a message, we use the function spc_mac127( ):
void spc_mac127(SPC_MAC127_CTX *ctx, unsigned char *m, size_t l,
unsigned char *nonce, unsigned char *out);
This function has the following arguments:
ctx
Context object to be used to perform the MAC.
m
Buffer that contains the message to be authenticated.
l
Length of the message buffer in octets.
nonce
Buffer that contains a 16-byte value that must not be repeated.
out
Buffer into which the output will be placed. It must be at least 16 bytes in size. No more than
16 bytes will ever be written to it.
Here is our implementation of MAC127:
#include <stdlib.h>
#ifndef WIN32
#include <sys/types.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#else
#include <windows.h>
#include <winsock.h>
#endif
#include <hash127.h>
typedef struct {
struct hash127 hctx;
SPC_KEY_SCHED ekey;
SPC_KEY_SCHED nkey;
} SPC_MAC127_CTX;
void spc_mac127_init(SPC_MAC127_CTX *ctx, unsigned char key[16]) {
  int                    i;
  unsigned char          pt[16] = {0, };
  volatile int32         hk[4];
  volatile unsigned char ek[16], nk[16];
SPC_ENCRYPT_INIT(&(ctx->ekey), key, 16);
SPC_DO_ENCRYPT(&(ctx->ekey), pt, (unsigned char *)ek);
pt[15] = 1;
SPC_DO_ENCRYPT(&(ctx->ekey), pt, (unsigned char *)nk);
pt[15] = 2;
SPC_DO_ENCRYPT(&(ctx->ekey), pt, (unsigned char *)hk);
SPC_ENCRYPT_INIT(&(ctx->ekey), (unsigned char *)ek, 16);
SPC_ENCRYPT_INIT(&(ctx->nkey), (unsigned char *)nk, 16);
hk[0] = htonl(hk[0]);
hk[1] = htonl(hk[1]);
hk[2] = htonl(hk[2]);
hk[3] = htonl(hk[3]);
hash127_expand(&(ctx->hctx), (int32 *)hk);
hk[0] = hk[1] = hk[2] = hk[3] = 0;
for (i = 0; i < 16; i++) ek[i] = nk[i] = 0;
}
void spc_mac127(SPC_MAC127_CTX *c, unsigned char *msg, size_t mlen,
unsigned char nonce[16], unsigned char out[16]) {
  int   i, r = mlen % 4; /* leftover bytes to stick into final block */
  int32 x[4] = {0,};

  for (i = 0;  i < r;  i++) ((unsigned char *)x)[i] = msg[mlen - r + i];
x[3] = (int32)mlen;
hash127_little((int32 *)out, (int32 *)msg, mlen / 4, &(c->hctx), x);
  /* Convert the hash words to a canonical byte order in place before encrypting. */
  ((int32 *)out)[0] = htonl(((int32 *)out)[0]);
  ((int32 *)out)[1] = htonl(((int32 *)out)[1]);
  ((int32 *)out)[2] = htonl(((int32 *)out)[2]);
  ((int32 *)out)[3] = htonl(((int32 *)out)[3]);
SPC_DO_ENCRYPT(&(c->ekey), out, out);
SPC_DO_ENCRYPT(&(c->nkey), nonce, (unsigned char *)x);
((int32 *)out)[0] ^= x[0];
((int32 *)out)[1] ^= x[1];
((int32 *)out)[2] ^= x[2];
((int32 *)out)[3] ^= x[3];
}
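For completeness, here is a small usage sketch of the two functions above; the wrapper name is
ours, and the nonce must never repeat for a given key.
/* Sketch: key a MAC127 context once, then MAC a message with a fresh,
 * never-repeated 16-byte nonce. Reuse the context for additional messages.
 */
void example_mac127_tag(unsigned char key[16], unsigned char *msg, size_t mlen,
                        unsigned char nonce[16], unsigned char tag[16]) {
  SPC_MAC127_CTX ctx;

  spc_mac127_init(&ctx, key);
  spc_mac127(&ctx, msg, mlen, nonce, tag);
}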
6.14.4 See Also
hash127 home page: http://cr.yp.to/hash127.html
Recipe 6.9, Recipe 6.12, Recipe 6.13
6.15 Constructing a Hash Function from a Block Cipher
6.15.1 Problem
You're in an environment in which you'd like to use a hash function, but you would prefer to use one
based on a block cipher. This might be because you have only a block cipher available, or because
you would like to minimize security assumptions in your system.
6.15.2 Solution
There are several good algorithms for doing this. We present one, Davies-Meyer, where the digest
size is the same as the block length of the underlying cipher. With 64-bit block ciphers, Davies-Meyer
does not offer sufficient security unless you add a nonce, in which case it is barely sufficient. Even
with AES-128, without a nonce, Davies-Meyer is somewhat liberal when you consider birthday
attacks.
Unfortunately, there is only one well-known scheme worth using for converting a block cipher into a
hash function that outputs twice the block length (MDC-2), and it is patented at the time of this
writing. However, those patent issues will go away by August 28, 2004. MDC-2 is covered in Recipe
6.16.
Note that such constructs assume that block ciphers resist related-key attacks. See Recipe 6.3 for a
general comparison of such constructs compared to dedicated constructs like SHA1.
6.15.3 Discussion
Hash functions do not provide security in and of themselves! If you need to
perform message integrity checking, use a MAC instead.
The Davies-Meyer hash function uses the message to hash as key material for the block cipher. The
input is padded, strengthened, and broken into blocks based on the key length, each block used as a
key to encrypt a single value. Essentially, the message is broken into a series of keys.
With Davies-Meyer, the first value encrypted is an initialization vector (IV) that is usually agreed
upon in advance. You may treat it as a nonce instead, however, which we strongly recommend. (The
nonce is then as big as the block size of the cipher.) The result of encryption is XOR'd with the IV,
then used as a new IV. This is repeated until all keys are exhausted, resulting in the hash output. See
Figure 6-1 for a visual description of one pass of Davies-Meyer.
Figure 6-1. The Davies-Meyer construct
Traditionally, hash functions pad by appending a bit with a value of 1, then however many zeros are
necessary to align to the next block of input. Input is typically strengthened by adding a block of data
to the end that encodes the message length. Nonetheless, such strengthening does not protect
against length-extension attacks. (To protect against those, see Recipe 6.7.)
Matyas-Meyer-Oseas is a similar construction that is preferable in that the plaintext itself is not used
as the key to a block cipher (this could make related-key attacks on Davies-Meyer easier); we'll
present that as a component when we show how to implement MDC-2 in Recipe 6.16.
Here is an example API for using Davies-Meyer without a nonce:
void spc_dm_init(SPC_DM_CTX *c);
void spc_dm_update(SPC_DM_CTX *c, unsigned char *msg, size_t len);
void spc_dm_final(SPC_DM_CTX *c, unsigned char out[SPC_BLOCK_SZ]);
The following is an implementation using AES-128. This code requires linking against an AES
implementation, and it also requires that the macros developed in Recipe 5.5 be defined (they bridge
the API of your AES implementation with this book's API).
#include <stdlib.h>
#include <string.h>
#ifndef WIN32
#include <sys/types.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#else
#include <windows.h>
#include <winsock.h>
#endif
#define SPC_KEY_SZ 16
typedef struct {
unsigned char h[SPC_BLOCK_SZ];
unsigned char b[SPC_KEY_SZ];
  size_t        ix;
  size_t        tl;
} SPC_DM_CTX;
void spc_dm_init(SPC_DM_CTX *c) {
memset(c->h, 0x52, SPC_BLOCK_SZ);
c->ix = 0;
c->tl = 0;
}
static void spc_dm_once(SPC_DM_CTX *c, unsigned char b[SPC_KEY_SZ]) {
  int           i;
  SPC_KEY_SCHED ks;
  unsigned char tmp[SPC_BLOCK_SZ];

  SPC_ENCRYPT_INIT(&ks, b, SPC_KEY_SZ);
  SPC_DO_ENCRYPT(&ks, c->h, tmp);
  for (i = 0;  i < SPC_BLOCK_SZ / sizeof(int);  i++)
    ((int *)c->h)[i] ^= ((int *)tmp)[i];
}
void spc_dm_update(SPC_DM_CTX *c, unsigned char *t, size_t l) {
c->tl += l; /* if c->tl < l: abort */
while (c->ix && l) {
c->b[c->ix++] = *t++;
l--;
if (!(c->ix %= SPC_KEY_SZ)) spc_dm_once(c, c->b);
}
  while (l >= SPC_KEY_SZ) { /* >= so an exactly full block is processed now, not left to overfill the buffer */
spc_dm_once(c, t);
t += SPC_KEY_SZ;
l -= SPC_KEY_SZ;
}
c->ix = l;
for (l = 0; l < c->ix; l++) c->b[l] = *t++;
}
void spc_dm_final(SPC_DM_CTX *c, unsigned char output[SPC_BLOCK_SZ]) {
int i;
c->b[c->ix++] = 0x80;
while (c->ix < SPC_KEY_SZ) c->b[c->ix++] = 0;
spc_dm_once(c, c->b);
memset(c->b, 0, SPC_KEY_SZ - sizeof(size_t));
c->tl = htonl(c->tl);
for (i = 0; i < sizeof(size_t); i++)
c->b[SPC_KEY_SZ - sizeof(size_t) + i] = ((unsigned char *)(&c->tl))[i];
spc_dm_once(c, c->b);
memcpy(output, c->h, SPC_BLOCK_SZ);
}
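Here is a brief usage sketch of the incremental interface above (the wrapper is ours); for a long
message you would simply call spc_dm_update( ) repeatedly before finalizing.
/* Sketch: compute a Davies-Meyer digest of a buffer in one pass. */
void example_dm_digest(unsigned char *msg, size_t mlen,
                       unsigned char digest[SPC_BLOCK_SZ]) {
  SPC_DM_CTX ctx;

  spc_dm_init(&ctx);
  spc_dm_update(&ctx, msg, mlen);
  spc_dm_final(&ctx, digest);
}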
6.15.4 See Also
Recipe 5.5, Recipe 6.3, Recipe 6.7, Recipe 6.16
6.16 Using a Block Cipher to Build a Full-Strength Hash
Function
6.16.1 Problem
Given a block cipher, you want to produce a one-way hash function, where finding collisions should
always be as hard as inverting the block cipher.
6.16.2 Solution
Use MDC-2, which is a construction that turns a block cipher into a hash function using two Matyas-Meyer-Oseas hashes and a bit of postprocessing.
6.16.3 Discussion
Hash functions do not provide security in and of themselves! If you need to
perform message integrity checking, use a MAC instead.
The MDC-2 message digest construction turns an arbitrary block cipher into a one-way hash function.
It's different from Davies-Meyer and Matyas-Meyer-Oseas in that the output of the hash function is
twice the block length of the cipher. It is also protected by patent until August 28, 2004.
However, MDC-2 does use two instances of Matyas-Meyer-Oseas as components in its construction.
Matyas-Meyer-Oseas hashes block by block and uses the internal state as a key used to encrypt each
block of input. The resulting ciphertext is XOR'd with the block of input, and the output of that
operation becomes the new internal state. The output of the hash function is the final internal state
(though if the block size is not equal to the key size, it may need to be expanded, usually by
repeating the value). The initial value of the internal state can be any arbitrary constant. See Figure
6-2 for a depiction of how one block of the message is treated.
Figure 6-2. The Matyas-Meyer-Oseas construct
An issue with Matyas-Meyer-Oseas is that the cipher block size can be smaller than the key size, so
you might need to expand the internal state somehow before using it to encrypt. Simply duplicating
part of the key is sufficient. In the code we provide with this recipe, though, we'll assume that you
want to use AES with 128-bit keys. Because the block size of AES is also 128 bits, there doesn't need
to be an expansion operation.
MDC-2 is based on Matyas-Meyer-Oseas. There are two internal states instead of one, and each is
initialized with a different value. Each block of input is copied, and the two copies go through one
round of Matyas-Meyer-Oseas separately. Then, before the next block of input is processed, the two
internal states are shuffled a bit; the lower halves of the two states are swapped. This is all illustrated
for one block of the message in Figure 6-3.
Figure 6-3. The MDC-2 construct
Clearly, input needs to be padded to the block size of the cipher. We do this internally to our
implementation by adding a 1 bit to the end of the input, then as many zeros as are necessary to
make the resulting string block-aligned.
One important thing to note about MDC-2 (as well as Matyas-Meyer-Oseas) is that there are ways to
extend a message to get the same hash as a result, unless you do something to improve the
function. The typical solution is to use MD-strengthening, which involves adding to the end of the
input a block that encodes the length of the input. We do that in the code presented later in this
section.
Our API allows for incremental processing of messages, which means that there is a context object.
The type for our context object is named SPC_MDC2_CTX. As with other hash functions presented in
this chapter, the incremental API has three operations: initialization, updating (where data is
processed), and finalization (where the resulting hash is output).
The initialization function has the following signature:
void spc_mdc2_init(SPC_MDC2_CTX *c);
All this function does is set internal state to the correct starting values.
Processing data is actually done by the following updating function:
void spc_mdc2_update(SPC_MDC2_CTX *c, unsigned char *t, size_t l);
This function hashes l bytes located at memory address t into the context c.
The result is obtained with the following finalization function:
void spc_mdc2_final(SPC_MDC2_CTX *c, unsigned char *output);
The output argument is always a pointer to a buffer that is twice the block size of the cipher being
used. In the case of AES, the output buffer should be 32 bytes.
Following is our implementation of MDC-2, which is intended for use with AES-128. Remember: if you
want to use this for other AES key sizes or for ciphers where the key size is different from the block
size, you will need to perform some sort of key expansion before calling SPC_ENCRYPT_INIT( ). Of
course, you'll also have to change that call to SPC_ENCRYPT_INIT( ) to pass in the desired key
length.
#include <stdlib.h>
#include <string.h>
#ifndef WIN32
#include <sys/types.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#else
#include <windows.h>
#include <winsock.h>
#endif
/* This implementation only works when the block size is equal to the key size */
typedef struct {
unsigned char h1[SPC_BLOCK_SZ];
unsigned char h2[SPC_BLOCK_SZ];
unsigned char bf[SPC_BLOCK_SZ];
  size_t        ix;
  size_t        tl;
} SPC_MDC2_CTX;
void spc_mdc2_init(SPC_MDC2_CTX *c) {
memset(c->h1, 0x52, SPC_BLOCK_SZ);
memset(c->h2, 0x25, SPC_BLOCK_SZ);
c->ix = 0;
c->tl = 0;
}
static void spc_mdc2_oneblock(SPC_MDC2_CTX *c, unsigned char bl[SPC_BLOCK_SZ]) {
  int           i, j;
  SPC_KEY_SCHED ks1, ks2;

  SPC_ENCRYPT_INIT(&ks1, c->h1, SPC_BLOCK_SZ);
  SPC_ENCRYPT_INIT(&ks2, c->h2, SPC_BLOCK_SZ);
  SPC_DO_ENCRYPT(&ks1, bl, c->h1);
  SPC_DO_ENCRYPT(&ks2, bl, c->h2);
  j = SPC_BLOCK_SZ / (sizeof(int) * 2);
  for (i = 0;  i < SPC_BLOCK_SZ / (sizeof(int) * 2);  i++) {
    ((int *)c->h1)[i]     ^= ((int *)bl)[i];
    ((int *)c->h2)[i]     ^= ((int *)bl)[i];
    ((int *)c->h1)[i + j] ^= ((int *)bl)[i + j];
    ((int *)c->h2)[i + j] ^= ((int *)bl)[i + j];
    /* Now swap the lower halves using XOR. */
    ((int *)c->h1)[i + j] ^= ((int *)c->h2)[i + j];
    ((int *)c->h2)[i + j] ^= ((int *)c->h1)[i + j];
    ((int *)c->h1)[i + j] ^= ((int *)c->h2)[i + j];
  }
}
void spc_mdc2_update(SPC_MDC2_CTX *c, unsigned char *t, size_t l) {
c->tl += l; /* if c->tl < l: abort */
while (c->ix && l) {
c->bf[c->ix++] = *t++;
l--;
if (!(c->ix %= SPC_BLOCK_SZ))
spc_mdc2_oneblock(c, c->bf);
}
  while (l >= SPC_BLOCK_SZ) { /* >= so an exactly full block is processed now, not left to overfill the buffer */
spc_mdc2_oneblock(c, t);
t += SPC_BLOCK_SZ;
l -= SPC_BLOCK_SZ;
}
c->ix = l;
for (l = 0; l < c->ix; l++)
c->bf[l] = *t++;
}
void spc_mdc2_final(SPC_MDC2_CTX *c, unsigned char output[SPC_BLOCK_SZ * 2]) {
int i;
c->bf[c->ix++] = 0x80;
while (c->ix < SPC_BLOCK_SZ)
c->bf[c->ix++] = 0;
spc_mdc2_oneblock(c, c->bf);
memset(c->bf, 0, SPC_BLOCK_SZ - sizeof(size_t));
c->tl = htonl(c->tl);
for (i = 0; i < sizeof(size_t); i++)
c->bf[SPC_BLOCK_SZ - sizeof(size_t) + i] = ((unsigned char *)(&c->tl))[i];
spc_mdc2_oneblock(c, c->bf);
memcpy(output, c->h1, SPC_BLOCK_SZ);
memcpy(output+SPC_BLOCK_SZ, c->h2, SPC_BLOCK_SZ);
}
6.17 Using Smaller MAC Tags
6.17.1 Problem
You want to trade off security for smaller authentication tags.
6.17.2 Solution
Truncate the least significant bytes of the MAC, but make sure to retain adequate security.
6.17.3 Discussion
Normal software environments should not have a need for smaller MACs because space is not at a
premium. However, if you're working in a space-constrained embedded environment, it's acceptable
to truncate MAC tags if space is a requirement. Note that doing so will not reduce computation costs.
In addition, keep in mind that security goes down as the tag size decreases, particularly if you are
not using a nonce (or are using a small nonce).
6.18 Making Encryption and Message Integrity Work
Together
6.18.1 Problem
You need to encrypt data and ensure the integrity of your data at the same time.
6.18.2 Solution
Use either an encryption mode that performs both encryption and message integrity checking, such
as CWC mode, or encrypt data with one secret key and use a second key to MAC the encrypted data.
6.18.3 Discussion
Unbelievably, many subtle things can go wrong when you try to perform encryption and message
integrity checking in tandem. This is part of the reason encryption modes such as CWC and CCM are
starting to appear, both of which perform encryption and message integrity checking together, and
they are still secure (such modes are compared in Recipe 5.4, and CWC is discussed in Recipe 5.10).
However, if you're not willing to use one of those encryption modes, follow these guidelines to ensure
security:
Use two separate keys, one for encryption and one for MAC'ing.
Encrypt first, then MAC the ciphertext.
We recommend encrypting, then MAC'ing the ciphertext (the encrypt-then-authenticate paradigm;
see Figure 6-4) because other approaches aren't always secure.
Figure 6-4. The encrypt-then-authenticate paradigm
For example, if you're using a stream-based mode such as CTR (discussed in Recipe 5.9), or if you're
using CBC mode (Recipe 5.6), you will still have a good design if you use a MAC to authenticate the
plaintext, then encrypt both the plaintext and the MAC tag (the authenticate-then-encrypt paradigm;
see Figure 6-5). But if you fail to encrypt the MAC tag (this is actually called the authenticate-and-encrypt paradigm, because the two operations could happen in parallel with the same results; see
Figure 6-6), or if you use an encryption mode with bad security properties (such as ECB mode), you
might have something significant to worry about.
Figure 6-5. The authenticate-then-encrypt paradigm
Another advantage of encrypting first is that if you're careful, your servers can reject bogus
messages before decrypting them, which can help improve resistance to denial of service attacks. We
consider this of minor interest at best.
Figure 6-6. The authenticate-and-encrypt paradigm
The one significant reason you might want to encrypt first is to give extra protection for message
authentication, assuming your MAC is cryptographically broken. The hope is that if the privacy
component isn't broken, the MAC may still be secure, which may or may not be the case, depending
on the nature of the attack.
In practice, if you're using a well-designed system (a dual-use scheme such as CWC mode), the
correct functioning of authentication and encryption both assume the correct functioning of an
underlying cipher such as AES. If this is broken, we consider all bets to be off anyway!
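To make the encrypt-then-authenticate composition concrete, here is a structural sketch using the
HMAC-SHA1 wrappers from Recipe 6.10. The spc_encrypt_ctr( ) call is only a stand-in for whatever
symmetric encryption routine you use (for example, CTR mode from Recipe 5.9); it is not an API
defined in this book. Note the two independent keys.
#include <stddef.h>

/* Hypothetical stand-in for your symmetric encryption routine; assumed to
 * return the ciphertext length.
 */
extern size_t spc_encrypt_ctr(unsigned char *ct, unsigned char *pt, size_t ptlen,
                              unsigned char *enc_key, unsigned char *nonce);

void spc_encrypt_then_mac(unsigned char *pt, size_t ptlen, unsigned char *ct,
                          unsigned char tag[20], unsigned char *enc_key,
                          unsigned char *mac_key, int mac_keylen,
                          unsigned char *nonce) {
  size_t       ctlen;
  SPC_HMAC_CTX ctx;

  ctlen = spc_encrypt_ctr(ct, pt, ptlen, enc_key, nonce);  /* encrypt first... */
  SPC_HMAC_Init(&ctx, mac_key, mac_keylen);
  SPC_HMAC_Update(&ctx, ct, ctlen);                        /* ...then MAC the ciphertext */
  SPC_HMAC_Final(tag, &ctx);
}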
6.18.4 See Also
Recipe 5.4, Recipe 5.6, Recipe 5.9
6.19 Making Your Own MAC
6.19.1 Problem
You do not want to use an off-the-shelf MAC; you would prefer just to use a hash function.
6.19.2 Solution
Don't do it.
6.19.3 Discussion
Many things can go wrong, and there's really no reason not to use one of the excellent existing
solutions. Nonetheless, some people believe they can do message authentication in a straightforward
manner using a hash function, and they believe they would be better off doing this than using an off-the-shelf solution. Basically, they think they can do something less complex and faster with just a
hash function. Other people think that creating some sort of "encryption with redundancy" scheme is
a good idea, even though many such schemes are known to be bad.
OMAC, HMAC, CMAC, and MAC127, which we compare in Recipe 6.4, are all simple and efficient, and
there are proofs that those constructions are secure with some reasonable assumptions. Will that be
the case for anything you put together manually?
6.19.4 See Also
Recipe 6.4
6.20 Encrypting with a Hash Function
6.20.1 Problem
You want to encrypt with a hash function, possibly because you want only a single cryptographic
primitive and to use a hash function instead of a block cipher.
6.20.2 Solution
Use a hash-based MAC in counter mode.
6.20.3 Discussion
Use a separate key from the one you use to authenticate, and don't forget to
use the MAC for message authentication as well!
You can turn any MAC into a stream cipher essentially by using the MAC in counter (CTR) mode. You
should not use a hash function by itself, because it's difficult to ensure that you're doing so securely.
Basically, if you have a MAC built on a hash function that is known to be a secure MAC, it will be
secure for encryption in CTR mode.
There is no point in using any MAC that uses a block cipher in any way, such as OMAC, CMAC, or
MAC127 (see Recipe 6.4 for a discussion of MAC solutions). Instead, just use the underlying block
cipher in CTR mode, which will produce the same results. This recipe should be used only when you
don't want to use a block cipher.
Using a MAC in CTR mode is easy. As illustrated in Figure 6-7, key it, then use it to "MAC" a nonce
concatenated with a counter. XOR the results with the plaintext.
Figure 6-7. Encrypting with a MAC in counter mode
For example, here's a function that encrypts a stream of data using the HMAC-SHA1 implementation
from Recipe 6.10:
#include <stdlib.h>
#include <string.h>
#define NONCE_LEN 16
#define CTR_LEN    16
#define MAC_OUT_SZ 20
unsigned char *spc_MAC_encrypt(unsigned char *in, size_t len, unsigned char *key,
int keylen, unsigned char *nonce) {
/* We're using a 128-bit nonce and a 128-bit counter, packed into one variable */
  int           i;
  size_t        blks;
SPC_HMAC_CTX ctx;
unsigned char ctr[NONCE_LEN + CTR_LEN];
unsigned char keystream[MAC_OUT_SZ];
  unsigned char *out, *outp;  /* out keeps the start of the buffer; outp walks it */

  if (!(out = outp = (unsigned char *)malloc(len))) abort( );
SPC_HMAC_Init(&ctx, key, keylen);
memcpy(ctr, nonce, NONCE_LEN);
memset(ctr + NONCE_LEN, 0, CTR_LEN);
blks = len / MAC_OUT_SZ;
while (blks--) {
SPC_HMAC_Reset(&ctx);
SPC_HMAC_Update(&ctx, ctr, sizeof(ctr));
    SPC_HMAC_Final(keystream, &ctx);  /* fill the keystream block, not the output buffer */
i = NONCE_LEN + CTR_LEN;
/* Increment the counter. */
while (i-- != NONCE_LEN)
if (++ctr[i]) break;
    for (i = 0;  i < MAC_OUT_SZ;  i++) *outp++ = *in++ ^ keystream[i];
}
if (len % MAC_OUT_SZ) {
SPC_HMAC_Reset(&ctx);
SPC_HMAC_Update(&ctx, ctr, sizeof(ctr));
    SPC_HMAC_Final(keystream, &ctx);  /* fill the keystream block, not the output buffer */
    for (i = 0;  i < len % MAC_OUT_SZ;  i++) *outp++ = *in++ ^ keystream[i];
}
return out;
}
Note that this code is not optimized; it works on individual characters to avoid potential endian-ness
problems.
6.20.4 See Also
Recipe 6.4, Recipe 6.10
6.21 Securely Authenticating a MAC (Thwarting Capture
Replay Attacks)
6.21.1 Problem
You are using a MAC, and you need to make sure that when you get a message, you properly validate
the MAC.
6.21.2 Solution
If you're using an ever-increasing nonce (which we strongly recommend), check to make sure that the
nonce associated with the message is indeed larger than the last one. Then, of course, recalculate the
MAC and check against the transmitted MAC.
6.21.3 Discussion
The following is an example of validating a MAC using the OMAC1 implementation in Recipe 6.11,
along with AES-128. We nonce the MAC by using a 16-byte nonce as the first block of input, as
discussed in Recipe 6.12. Note that we expect you to be MAC'ing the ciphertext, as discussed in Recipe 6.18.
#include <stdlib.h>
#include <string.h>
/* last_nonce must be a pointer to a NULL on first invocation. */
int spc_omac1_validate(unsigned char *ct, size_t ctlen, unsigned char sent_nonce[16],
unsigned char *sent_tag, unsigned char *k,
unsigned char **last_nonce) {
  int           i;
  SPC_OMAC_CTX  c;
unsigned char calc_tag[16]; /* Maximum tag size for OMAC. */
spc_omac1_init(&c, k, 16);
if (*last_nonce) {
    for (i = 0;  i < 16;  i++) {
      if (sent_nonce[i] > (*last_nonce)[i]) goto nonce_okay;
      if (sent_nonce[i] < (*last_nonce)[i]) return 0; /* Nonce is less than the last nonce. */
    }
    return 0; /* Nonce is equal to the last nonce. */
}
nonce_okay:
spc_omac_update(&c, sent_nonce, 16);
spc_omac_update(&c, ct, ctlen);
spc_omac_final(&c, calc_tag);
for (i = 0; i < 16; i++)
if (calc_tag[i] != sent_tag[i]) return 0;
if (sent_nonce) {
if (!*last_nonce) *last_nonce = (unsigned char *)malloc(16);
if (!*last_nonce) abort(); /* Consider an exception instead. */
memcpy(*last_nonce, sent_nonce, 16);
}
return 1;
}
This code requires you to pass in a char ** to track the last nonce that was received. You're expected
to allocate your own char *, set it to NULL, and pass in the address of that char *. The validate
function will update that memory with the last valid nonce it saw, so that it can check the new nonce
against the last nonce to make sure it got bigger. The function will return 1 if the MAC validates;
otherwise, it will return 0.
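For example (our wrapper, assuming a single sender so that one nonce pointer suffices), the
function might be driven like this:
#include <stdlib.h>

/* Sketch: validate incoming messages from a single peer. last_nonce starts
 * out NULL and is updated by spc_omac1_validate( ) on each success.
 */
static unsigned char *last_nonce = NULL;

int check_incoming(unsigned char *ct, size_t ctlen, unsigned char nonce[16],
                   unsigned char tag[16], unsigned char key[16]) {
  return spc_omac1_validate(ct, ctlen, nonce, tag, key, &last_nonce);
}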
6.21.4 See Also
Recipe 6.11, Recipe 6.12, Recipe 6.18
6.22 Parallelizing MACs
6.22.1 Problem
You want to use a MAC, but parallelize the computation.
6.22.2 Solution
Run multiple MACs at the same time, then MAC the resulting tags together (and in order) to yield one
tag.
6.22.3 Discussion
If you want to perform message authentication in parallel, you can do so with a variation of
interleaving (which we discussed for block ciphers in Recipe 5.12 through Recipe 5.14). Basically, you
can run multiple MACs keyed separately at the same time and divide up the data stream between
those MACs. For example, you might run two MACs in parallel and alternate sending 64 bytes to each
MAC.
The problem with doing this is that your two MACs' authentication values need to be tied together;
otherwise, someone could rearrange the two halves of your stream. For example, if you were to MAC
this message:
ABCDEFGHIJKL
where MAC 1 processed the first six characters, yielding tag A, and MAC 2 processed the final six,
yielding tag B, an attacker could rearrange the message to be:
GHIJKLABCDEF
and report the tags in the reverse order. Authentication would not detect the change. To solve this
problem, once all the MACs are reported, MAC all the resulting tags to create a composite MAC.
Alternatively, you could take the last MAC context and add in the MAC values for the other contexts
before generating the tag, as illustrated in Figure 6-8.
Figure 6-8. Properly interleaving MACs
If your MAC accepts a nonce, you can use the same key for each context, as long as you never reuse
a {key, nonce} pair.
Here's a simple sequential example that runs two OMAC1 contexts, alternating every 512 bytes, that
produces a single resulting tag of 16 bytes. It uses the OMAC1 implementation from Recipe 6.11.
#include <stddef.h>
#define INTERLEAVE_SIZE 512
unsigned char *spc_double_mac(unsigned char *text, size_t len,
unsigned char key[16]) {
  SPC_OMAC_CTX  ctx1, ctx2;
unsigned char *out = (unsigned char *)malloc(16);
unsigned char tmp[16];
if (!out) abort(); /* Consider throwing an exception instead. */
spc_omac1_init(&ctx1, key, 16);
spc_omac1_init(&ctx2, key, 16);
while (len > 2 * INTERLEAVE_SIZE) {
    spc_omac_update(&ctx1, text, INTERLEAVE_SIZE);
    spc_omac_update(&ctx2, text + INTERLEAVE_SIZE, INTERLEAVE_SIZE);
text += 2 * INTERLEAVE_SIZE;
len -= 2 * INTERLEAVE_SIZE;
}
if (len > INTERLEAVE_SIZE) {
    spc_omac_update(&ctx1, text, INTERLEAVE_SIZE);
    spc_omac_update(&ctx2, text + INTERLEAVE_SIZE, len - INTERLEAVE_SIZE);
  } else spc_omac_update(&ctx1, text, len);
  spc_omac_final(&ctx1, tmp);
  spc_omac_update(&ctx2, tmp, sizeof(tmp));
  spc_omac_final(&ctx2, out);
return out;
}
6.22.4 See Also
Recipe 5.11, Recipe 6.12 through Recipe 6.14
Chapter 7. Public Key Cryptography
Many of the recipes in this chapter are too low-level for general-purpose use.
We recommend that you first try to find what you need in Chapter 9 before
resorting to building solutions yourself. If you do use this chapter, please be
careful, read all of our warnings, and do consider the higher-level constructs we
suggest.
Public key cryptography offers a number of important advantages over traditional, or symmetric,
cryptography:
Key agreement
Traditional cryptography is done with a single shared key. There are obvious limitations to that
kind of cryptography, though. The biggest one is the key agreement problem: how do two
parties that wish to communicate do so securely? One option is to use a more secure out-of-band medium for transport, such as telephone or postal mail. Such a solution is rarely
practical, however, considering that we might want to do business securely with an online
merchant we've never previously encountered. Public key cryptography can help solve the key
agreement problem, although doing so is not as easy as one might hope. We touch upon this
issue throughout this chapter and expand upon it in Chapter 8.
Digital signatures
Another useful service that public key cryptography can provide is digital signatures, which
allow for message integrity checks without a shared secret. In a symmetric environment with
message authentication codes (MACs) for message authentication, a user can determine that
someone with the MAC key sent a particular message, but it isn't possible to provide third
parties any assurance as to who signed a message (this ability is called non-repudiation). That
is, if Alice and Bob exchange messages using a MAC, and somehow Charlie has been given a
copy of the message and the MAC key, Charlie will be able to determine only that someone
who had the MAC key at some point before him generated the message. Using only symmetric
cryptography, he cannot distinguish between messages created by Alice and messages created
by Bob in a secure manner.
Establishing identity
A third use of public key cryptography is in authentication schemes for purposes of identity
establishment (e.g., login). We'll largely skip this topic for now, coming back to it in Chapter 8.
In practice, public key cryptography is a complex field with a lot of infrastructure built around it.
Using it effectively requires a trusted third party, which is usually a public key infrastructure (PKI).
This entire chapter is effective only in the context of some kind of working PKI,
even if it is an ad hoc PKI. Refer to Chapter 10 for PKI basics.
In this chapter, we'll describe the fundamentals of key exchange and digital signatures at a low level.
Unfortunately, this area is quite vast, and we've had to limit our discussion to the topics we believe
are most relevant to the average developer. We expect that supplemental recipes for more esoteric
topics will gradually become available on this book's web site, based on reader contributions.
There are certain interesting topics that we simply don't have room for in this chapter. For example,
elliptic curve cryptography is a type of public key encryption that can offer security similar to that of
the traditional algorithms presented in this chapter, with notable speed gains. While elliptic curve
cryptography doesn't speed things up so much that you would want to use it in places where
traditional public key cryptography isn't useful, it does allow you to better scale the number of
simultaneous connections you can handle. While elliptic curve cryptography is a fascinating and useful
area, however, it's not nearly as important as the rest of the material in this chapter, particularly
considering that standards and implementations for this kind of public key cryptography have
emerged only in the last few years, and that the technology isn't yet deployed on a wide scale (plus,
there are intellectual property issues when using the standard).
We've also limited our examples to OpenSSL whenever it supports the topic under discussion. While
we do cover Microsoft's CryptoAPI in several other chapters side by side with OpenSSL, we won't be
discussing it in this chapter. CryptoAPI's support for public key cryptography is sufficiently crippled
that providing solutions that use it would be incomplete to the point of providing you with little or no
utility. In particular, CryptoAPI provides no means to exchange keys in any kind of recognized
portable format (such as DER or PEM; see Recipe 7.16 and Recipe 7.17) and no means by which keys
other than randomly generated ones can generate digital signatures. These limitations effectively rule
out a large portion of public key cryptography's common uses, which make up the majority of code-related recipes in this chapter.
The code presented in this chapter should otherwise translate easily to most other functionally
complete libraries. Again, in situations where this is not the case, we expect that reader contributions
will eventually mend this problem.
We expect that for most purposes, the general-purpose networking recipes
provided in Chapter 9 are likely to be more applicable to the average
developer. Unless you really know what you're doing, there is significant risk of
needing a prosthetic foot when using this chapter.
7.1 Determining When to Use Public Key Cryptography
7.1.1 Problem
You want to know when to use public key cryptography as opposed to symmetric cryptography.
7.1.2 Solution
Use public key cryptography only for key exchange or digital signatures. Otherwise, there are a lot of
disadvantages and things that can go wrong (particularly when using it for general-purpose
encryption). Because public key operations are computationally expensive, limit digital signatures to
authentication at connection time and when you need non-repudiation.
Whenever you use public key encryption, be sure to remember also to perform
proper authentication and message integrity checking.
7.1.3 Discussion
Public key cryptography allows parties to communicate securely without having to establish a key
through a secure channel in advance of communication, as long as a trusted third party is involved.
Therein lies the first rub. Generally, if you use public key cryptography, you need to determine
explicitly with whom you're communicating, and you need to check with a trusted third party in a
secure manner. To do that, you will need to have identification data that is bound to your trusted
third party, which you'll probably need to authenticate over some secure channel.
Figure 7-1 (A) illustrates why public key cryptography on its own does not provide secure
communication. Suppose the server has a {public key, private key} pair, and the client wishes to
communicate with the server. If the client hasn't already securely obtained the public key of the
server, it will need to request those credentials, generally over an insecure channel (e.g., over the
Internet). What is to stop an attacker from replacing the server's credentials with its own credentials?
Then, when the client tries to establish a secure connection, it could actually be talking to an
attacker, who may choose to either masquerade as the server or just sit in the middle,
communicating with the server on the client's behalf, as shown in Figure 7-1 (B). Such an attack is
known as a man-in-the-middle attack.
Figure 7-1. A man-in-the-middle attack
Getting a server's key over an insecure channel is okay as long as there is some way of determining
whether the key the client gets back is actually the right one. The most common way of establishing
trust is by using a PKI, a concept we explain in Recipe 10.1.
Another issue when it comes to public key cryptography is speed. Even the fastest public key
cryptography that's believed to be secure is orders of magnitude slower than traditional symmetric
encryption. For example, a Pentium class machine may encrypt data using RC4 with 128-bit keys at
about 11 cycles per byte (the key size isn't actually a factor in RC4's speed). The same machine can
process data at only about 2,500 cycles per byte when using an optimized version of vanilla RSA and
2,048-bit keys (the decrypt speed is the limiting factor; encryption is usually about 20 times faster).
True, versions of RSA based on elliptic curves can perform better, but they still don't perform well for
general-purpose use.
Because public key encryption is so expensive, it is only really useful for processing small pieces of
data. As a result, there are two ways in which public key cryptography is widely used: key exchange
(done by encrypting a symmetric encryption key) and digital signatures (done by encrypting a hash
of the data to sign; see Recipe 7.12, Recipe 7.13 and Recipe 7.15).
When using digital signatures for authentication, a valid signature on a piece of data proves that the
signer has the correct secret key that corresponds to the public key we have (of course, we then
need to ensure that the public key really does belong to the entity we want to authenticate). The
signature also validates that the message arrived without modification. However, it's not a good idea
to use digital signatures for all of our message integrity needs because it is incredibly slow. You
essentially need public key cryptography to provide message integrity for a key exchange, and while
you're doing that, you might as well use it to authenticate (the authentication is often free).
However, once you have a symmetric key to use, you should use MACs to provide message integrity
because they're far more efficient.
The only time it makes sense to use a digital signature outside the context of initial connection
establishment is when there is a need for non-repudiation. That is, if you wish to be able to
demonstrate that a particular user "signed" a piece of data to a third party, you must use public key-
based algorithms. Symmetric key integrity checks are not sufficient for implementing non-repudiation, because anyone who has the shared secret can create valid message integrity values.
There's no way to bind the output of the integrity check algorithm to a particular entity in the system.
Public key cryptography allows you to demonstrate that someone who has the private key associated
with a particular public key "signed" the data, and that the data hasn't changed since it was signed.
7.1.4 See Also
Recipe 7.12, Recipe 7.13, Recipe 7.15, Recipe 10.1
7.2 Selecting a Public Key Algorithm
7.2.1 Problem
You want to determine which public key algorithms you should support in your application.
7.2.2 Solution
RSA is a good all-around solution. There is also nothing wrong with using Diffie-Hellman for key
exchange and DSA for digital signatures.
Elliptic curve cryptography can provide the same levels of security with much smaller key sizes and
with faster algorithms, but this type of cryptography is not yet in widespread use.
7.2.3 Discussion
Be sure to see the general recommendations for using public key cryptography
in Recipe 7.1.
Security-wise, there's no real reason to choose any one of the common algorithms over the others.
There are also no intellectual property restrictions on any of these algorithms (though there may be
on some elliptic curve variants). RSA definitely sees the most widespread use.
RSA private key operations can be made much faster than operations in other algorithms, which is a
major reason it's preferred in many circumstances. Public key operations across RSA and the two
other major algorithms (Diffie-Hellman and DSA) tend to be about the same speed.
When signing messages, RSA tends to be about the same speed or perhaps a bit slower than DSA,
but it is about 10 times faster for verification, if implemented properly. RSA is generally much
preferable for key establishment, because some protocols can minimize server load better if they're
based on RSA.
Elliptic curve cryptography is appealing in terms of efficiency, but there is a practical downside in that
the standard in this space (IEEE P1363) requires licensing patents from Certicom. We believe you can
probably implement nonstandard yet still secure elliptic curve cryptosystems that completely avoid
any patent restrictions, but we would never pursue such a thing without first obtaining legal counsel.
7.2.4 See Also
Recipe 7.1
7.3 Selecting Public Key Sizes
7.3.1 Problem
You've decided to use public key cryptography, and you need to know what size numbers you should
use in your system. For example, if you want to use RSA, should you use 512-bit RSA or 4,096-bit
RSA?
7.3.2 Solution
There's some debate on this issue. When using RSA, we recommend a 2,048-bit instantiation for
general-purpose use. Certainly don't use fewer than 1,024 bits, and use that few only if you're not
worried about long-term security from attackers with big budgets. For Diffie-Hellman and DSA, 1,024
bits should be sufficient. Elliptic curve systems can use far fewer bits.
7.3.3 Discussion
The commonly discussed "bit size" of an algorithm should be an indication of the algorithm's
strength, but it measures different things for different algorithms. For example, with RSA, the bit size
really refers to the bit length of a public value that is a part of the public key. It just so happens that
the combined bit length of the two secret primes tends to be about the same size. With Diffie-Hellman, the bit length refers to a public value, as it does with DSA.[1] In elliptic curve
cryptosystems, bit length does roughly map to key size, but there's a lot you need to understand to
give an accurate depiction of exactly what is being measured (and it's not worth understanding for
the sake of this discussion: "key size" will do!).
[1] With DSA, there is another parameter that's important to the security of the algorithm, which few people ever
mention, let alone understand (though the second parameter tends not to be a worry in practice). See any good
cryptography book, such as Applied Cryptography, or the Handbook of Applied Cryptography, for more
information.
Obviously, we can't always compare numbers directly, even across public key algorithms, never mind
trying to make a direct comparison to symmetric algorithms. A 256-bit AES key probably offers more
security than you'll ever need, whereas the strength of a 256-bit key in a public key cryptosystem
can be incredibly weak (as with vanilla RSA) or quite strong (as is believed to be the case for
standard elliptic variants of RSA). Nonetheless, relative strengths in the public key world tend to be
about equal for all elliptic algorithms and for all nonelliptic algorithms. That is, if you were to talk
about "1,024-bit RSA" and "1,024-bit Diffie-Hellman," you'd be talking about two things that are
believed to be about as strong as each other.
In addition, in the block cipher world, there's an assumption that the highly favored ciphers do their
job well enough that the best practical attack won't be much better than brute force. Such an
assumption seems quite reasonable because recent ciphers such as AES were developed to resist all
known attacks. It's been quite a long time since cryptographers have found a new methodology for
attacking block ciphers that turns into a practical attack when applied to a well-regarded algorithm
with 128-bit key sizes or greater. While there are certainly no proofs, cryptographers tend to be very
comfortable with the security of 128-bit AES for the long term, even if quantum computing becomes
a reality.
In the public key world, the future impact of number theory and other interesting approaches such as
quantum computing is a much bigger unknown. Cryptographers have a much harder time predicting
how far out in time a particular key size is going to be secure. For example, in 1990, Ron Rivest, the
"R" in RSA, believed that a 677-bit modulus would provide average security, and 2,017 bits would
provide high security, at least through the year 2020. Ten years later, 512 bits was clearly weak, and
1,024 was the minimum size anyone was recommending (though few people have recommended
anything higher until more recently, when 2,048 bits is looking like the conservative bet).
Cryptographers try to relate the bit strength of public key primitives to the key strength of symmetric
key cryptosystems. That way, you can figure out what sort of protection you'd like in a symmetric
world and pick public key sizes to match. Usually, the numbers you will see are guesses, but they
should be as educated as possible if they come from a reputable source. Table 7-1 lists our
recommendations. Note that not everyone agrees what numbers should be in each of these boxes
(for example, the biggest proponents of elliptic curve cryptography will suggest larger numbers in the
nonelliptic curve public key boxes). Nonetheless, these recommendations shouldn't get you into
trouble, as long as you check current literature in four or five years to make sure that there haven't
been any drastic changes.
Table 7-1. Recommended key strengths for public key cryptography
Desired security level                                 Symmetric length   "Regular" public key lengths                     Elliptic curve sizes
Acceptable (probably secure 5 years out, perhaps 10)   80 bits            2048 bits (1024 bits in some cases; see below)   160 bits
Good (may even last forever)                           128 bits           2048 bits                                        224 bits
Paranoid                                               192 bits           4096 bits                                        384 bits
Very paranoid                                          256 bits           8192 bits                                        512 bits
Remember that "acceptable" is usually good enough; cryptography is rarely the
weakest link in a system!
Until recently, 1,024 bits was the public key size people were recommending. Then, in 2003, Adi
Shamir (the "S" in RSA) and Eran Tromer demonstrated that a $10 million machine could be used to
break RSA keys in under a year. That means 1,024-bit keys are very much on the liberal end of the
spectrum. They certainly do not provide adequate secrecy if you're worried about well-funded
attackers such as governments.
7.4 Manipulating Big Numbers
7.4.1 Problem
You need to do integer-based arithmetic on numbers that are too large to represent in 32 (or 64)
bits. For example, you may need to implement a public key algorithm that isn't supported by the
library you're using.
7.4.2 Solution
Use a preexisting library for arbitrary-precision integer math, such as the BIGNUM library that comes
with OpenSSL (discussed here) or the GNU Multi-Precision (gmp) library.
7.4.3 Discussion
Most of the world tends to use a small set of public key primitives, and the popular libraries reflect
that fact. There are a lot of interesting things you can do with public key cryptography that are in the
academic literature but not in real libraries, such as a wide variety of different digital signature
techniques.
If you need such a primitive and there aren't good free libraries that implement it, you may need to
strike off on your own, which will generally require doing math with very large numbers.
In general, arbitrary-precision libraries work by keeping an array of words that represents the value
of a number, then implementing operations on that representation in software. Math on very large
numbers tends to be slow, and software implementation of such math tends to be even slower. While
there are tricks that occasionally come in handy (such as using a fast Fourier transform for
multiplication instead of longhand multiplication when the numbers are large enough to merit it),
such libraries still tend to be slow, even though the most speed-critical parts are often implemented
in hand-optimized assembly. For this reason, it's a good idea to stick with a preexisting library for
arbitrary-precision arithmetic if you have general-purpose needs.
In this recipe, we'll cover the OpenSSL BIGNUM library, which supports arbitrary precision math,
albeit with a very quirky interface.
7.4.3.1 Initialization and cleanup
The BIGNUM library generally lives in libcrypto, which comes with OpenSSL. Its API is defined in
openssl/bn.h. This library exports the BIGNUM type. BIGNUM objects always need to be initialized
before use, even if they're statically declared. For example, here's how to initialize a statically
allocated BIGNUM object:
BIGNUM bn;

BN_init(&bn);
If you're dynamically allocating a BIGNUM object, OpenSSL provides a function that allocates and
initializes in one fell swoop:
BIGNUM *bn = BN_new( );
You should not use malloc( ) to allocate a BIGNUM object because you are likely to confuse the
library (it may believe that your object is unallocated).
If you would like to deallocate a BIGNUM object that was allocated using BN_new( ), pass it to
BN_free( ).
In addition, for security purposes, you may wish to zero out the memory used by aBIGNUM object
before you deallocate it. If so, pass it to BN_clear( ), which explicitly overwrites all memory in use
by a BIGNUM context. You can also zero and free in one operation by passing the object to
BN_clear_free( ).
void BN_free(BIGNUM *bn);
void BN_clear(BIGNUM *bn);
void BN_clear_free(BIGNUM *bn);
Some operations may require you to allocate BN_CTX objects. These objects are scratch space for
temporary values. You should always create BN_CTX objects dynamically by calling BN_CTX_new( ),
which will return a dynamically allocated and initializedBN_CTX object. When you're done with a
BN_CTX object, destroy it by passing it to BN_CTX_free( ).
BN_CTX *BN_CTX_new(void);
int BN_CTX_free(BN_CTX *c);
7.4.3.2 Assigning to BIGNUM objects
Naturally, we'll want to assign numerical values to BIGNUM objects. The easiest way to do this is to
copy another number. OpenSSL provides a way to allocate a new BIGNUM object and copy a second
BIGNUM object all at once:
BIGNUM *BN_dup(BIGNUM *bn_to_copy);
In addition, if you already have an allocated context, you can just callBN_copy( ), which has the
following signature:
BIGNUM *BN_copy(BIGNUM *destination_bn, BIGNUM *src_bn);
This function returns destination_bn on success.
You can assign the value 0 to a BIGNUM object with the following function:
int BN_zero(BIGNUM *bn);
You can also use BN_clear( ), which will write over the old value first.
There's a similar function for assigning the value 1:
int BN_one(BIGNUM *bn);
You can also assign any nonnegative value that fits in an unsigned long using the function
BN_set_word( ):
int BN_set_word(BIGNUM *bn, unsigned long value);
The previous three functions return 1 on success.
If you need to assign a positive number that is too large to represent as an unsigned long, you can
represent it in binary as a sequence of bytes and have OpenSSL convert the binary buffer to a
BIGNUM object. Note that the bytes must be in order from most significant to least significant. That is,
you can't just point OpenSSL at memory containing a 64-bit long long (__int64 on Windows) on a
little-endian machine, because the bytes will be backwards. Once your buffer is in the right format,
you can use the function BN_bin2bn( ), which has the following signature:
BIGNUM *BN_bin2bn(unsigned char *buf, int len, BIGNUM *c);
This function has the following arguments:
buf
Buffer containing the binary representation to be converted.
len
Length of the buffer in bytes.
c
BIGNUM object to be loaded with the value from the binary representation. This may be
specified as NULL, in which case a new BIGNUM object will be dynamically allocated. The new
BIGNUM object will be returned if one is allocated; otherwise, the specifiedBIGNUM object will be
returned.
None of the previously mentioned techniques allows us to represent a negative number. The simplest
technique is to get the corresponding positive integer, then use the following macro that takes a
pointer to a BIGNUM object and negates it (i.e., multiplies by -1):
#define BN_negate(x) ((x)->neg = (!((x)->neg)) & 1)
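For example (a small sketch of our own, using the BN_negate macro just defined), building the
value -1,000,000 looks like this:
#include <openssl/bn.h>

/* Sketch: create a BIGNUM object holding -1000000, using the BN_negate
 * macro defined above. The caller must pass the result to BN_free( ).
 */
BIGNUM *make_negative_million(void) {
  BIGNUM *bn = BN_new( );

  if (!bn) return NULL;
  if (!BN_set_word(bn, 1000000UL)) {   /* assign the positive magnitude */
    BN_free(bn);
    return NULL;
  }
  BN_negate(bn);                       /* flip the sign */
  return bn;
}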
7.4.3.3 Getting BIGNUM objects with random values
Before you can get BIGNUM objects with random values, you need to have seeded the OpenSSL
random number generator. (With newer versions of OpenSSL, the generator will be seeded for you
on most platforms; see Recipe 11.9).
One common thing to want to do is generate a random prime number. The API for this is somewhat
complex:
BIGNUM *BN_generate_prime(BIGNUM *ret, int num, int safe, BIGNUM *add, BIGNUM *rem,
void (*callback)(int, int, void *), void *cb_arg);
This function has the following arguments:
ret
An allocated BIGNUM object, which will also be returned on success. If it is specified asNULL, a
new BIGNUM object will be dynamically allocated and returned instead. The prime number that
is generated will be stored in this object.
num
Number of bits that should be in the generated prime number.
safe
Boolean value that indicates whether a safe prime should be generated. A safe prime is a prime
number for which the prime minus 1 divided by 2 is also a prime number. For Diffie-Hellman
key exchange, a safe prime is required; otherwise, it usually isn't necessary.
add
If this argument is specified as non-NULL, the remainder must be the value of the rem
argument when the generated prime number is divided by this number. The use of this
argument is important for Diffie-Hellman key exchange.
rem
If the add argument is specified as non-NULL, this value should be the remainder when the
generated prime number is divided by the value of the add argument. If this argument is
specified as NULL, a value of 1 is used.
callback
Pointer to a callback function to be called during prime generation to report progress. It may be
specified as NULL, in which case no progress information is reported.
cb_arg
If a callback function to monitor progress is specified, this argument is passed directly to the
callback function.
Note that, depending on your hardware, it can take several seconds to generate a prime number,
even if you have sufficient entropy available. The callback functionality allows you to monitor the
progress of prime generation. Unfortunately, there's no way to determine how much time finding a
prime will actually take, so it's not feasible to use this callback to implement a progress meter. We do
not discuss the callback mechanism any further in this book. However, callbacks are discussed in the
book Network Security with OpenSSL by John Viega, Matt Messier, and Pravir Chandra (O'Reilly &
Associates) as well as in the online OpenSSL documentation.
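For the common case, a minimal sketch (the wrapper name spc_new_safe_prime( ) is ours) asks for a
safe prime with no progress callback:

#include <openssl/bn.h>

BIGNUM *spc_new_safe_prime(void) {
  /* 512 bits keeps the example quick; use a much larger size in practice. */
  return BN_generate_prime(NULL, 512, 1, NULL, NULL, NULL, NULL);
}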
It's much simpler to get a BIGNUM object with a random value:
int BN_rand_range(BIGNUM *result, BIGNUM *range);
This function requires you to pass in a pointer to an initializedBIGNUM object that receives the
random value. The possible values for the random number are zero through one less than the
specified range.
Additionally, you can ask for a random number with a specific number of bits:
int BN_rand(BIGNUM *result, int bits, int top, int bottom);
This function has the following arguments:
result
The generated random number will be stored in this BIGNUM object.
bits
Number of bits that the generated random number should contain.
top
If the value of this argument is 0, the most significant bit in the generated random number will
be set. If it is -1, the most significant bit can be anything. If it is 1, the 2 most significant bits
will be set. This is useful when you want to make sure that the product of 2 numbers of a
particular bit length will always have exactly twice as many bits.
bottom
If the value of this argument is 1, the resulting random number will be odd. Otherwise, it may
be either odd or even.
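For example, assuming the random number generator has been seeded, the following sketch draws
one value uniformly from 0 through 999,999 and another that is exactly 128 bits long and odd:

#include <stdlib.h>
#include <openssl/bn.h>

void spc_random_bignum_demo(void) {
  BIGNUM *range = BN_new( ), *result = BN_new( );

  if (!range || !result) abort( );
  BN_set_word(range, 1000000);
  if (!BN_rand_range(result, range)) abort( );  /* 0 <= result <= 999,999 */
  if (!BN_rand(result, 128, 0, 1)) abort( );    /* 128 bits, top bit set, odd */
  BN_free(range);
  BN_free(result);
}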
7.4.3.4 Outputting BIGNUM objects
If you wish to represent your BIGNUM object as a binary number, you can use BN_bn2bin( ), which
will store the binary representation of the BIGNUM object in the buffer pointed to by the outbuf
argument:
int BN_bn2bin(BIGNUM *bn, unsigned char *outbuf);
Unfortunately, you first need to know in advance how big the output buffer needs to be. You can
learn this by calling BN_num_bytes( ), which has the following signature:
int BN_num_bytes(BIGNUM *bn);
BN_bn2bin( ) will not output the sign of a number. You can manually query
the sign of the number by using the following macro:
#define BN_is_negative(x) ((x)->neg)
The following is a wrapper that converts a BIGNUM object to binary, allocating its result via malloc( )
and properly setting the most significant bit to 1 if the result is negative. Note that you have to pass
in a pointer to an unsigned integer. That integer gets filled with the size of the returned buffer in
bytes.
#include <stdlib.h>
#include <openssl/bn.h>
#define BN_is_negative(x) ((x)->neg)
unsigned char *BN_to_binary(BIGNUM *b, unsigned int *outsz) {
unsigned char *ret;
*outsz = BN_num_bytes(b);
if (BN_is_negative(b)) {
(*outsz)++;
if (!(ret = (unsigned char *)malloc(*outsz))) return 0;
BN_bn2bin(b, ret + 1);
ret[0] = 0x80;
} else {
if (!(ret = (unsigned char *)malloc(*outsz))) return 0;
BN_bn2bin(b, ret);
}
return ret;
}
Remember that the binary format used by a BIGNUM object is big-endian, so if
you wish to take the binary output and put it in an integer on a little-endian
architecture (such as an Intel x86 machine), you must byte-swap each word.
If you wish to print BIGNUM objects, you can print to a FILE pointer using BN_print_fp( ). It will
only print in hexadecimal format, but it does get negative numbers right:
int BN_print_fp(FILE *f, BIGNUM *bn);
Note that you have to supply your own newline if required.
You can also convert a BIGNUM object into a hexadecimal or a base-10 string using one of the
following two functions:
char *BN_bn2hex(BIGNUM *bn);
char *BN_bn2dec(BIGNUM *bn);
You can then do what you like with the string, but note that when it comes time to deallocate the
string, you must call OPENSSL_free( ).
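Here is a small sketch (the helper name spc_print_bn( ) is ours) that prints a BIGNUM in both formats
and releases the strings correctly:

#include <stdio.h>
#include <openssl/bn.h>

void spc_print_bn(BIGNUM *bn) {
  char *hex, *dec;

  if ((hex = BN_bn2hex(bn)) != 0) {
    printf("hex: %s\n", hex);
    OPENSSL_free(hex);
  }
  if ((dec = BN_bn2dec(bn)) != 0) {
    printf("dec: %s\n", dec);
    OPENSSL_free(dec);
  }
}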
7.4.3.5 Common tests on BIGNUM objects
The function BN_cmp( ) compares two BIGNUM objects, returning 0 if they're equal, 1 if the first one
is larger, or -1 if the second one is larger:
int BN_cmp(BIGNUM *a, BIGNUM *b);
The function BN_ucmp( ) is the same as BN_cmp( ), except that it compares the absolute values of
the two numbers:
int BN_ucmp(BIGNUM *a, BIGNUM *b);
The following functions are actually macros that test the value of a single BIGNUM object, and return 1
or 0 depending on whether the respective condition is true or false:
BN_is_zero(BIGNUM *bn);
BN_is_one(BIGNUM *bn);
BN_is_odd(BIGNUM *bn);
In addition, you might wish to test a number to see if it is prime. The API for that one is a bit
complex:
int BN_is_prime(BIGNUM *bn, int numchecks, void (*callback)(int, int, void *),
BN_CTX *ctx, void *cb_arg);
int BN_is_prime_fasttest(BIGNUM *bn, int numchecks,
                         void (*callback)(int, int, void *), BN_CTX *ctx,
                         void *cb_arg, int do_trial_division);
These functions do not guarantee that the number is prime. OpenSSL uses the Rabin-Miller primality
test, which is an iterative, probabilistic algorithm, where the probability that the algorithm is right
increases dramatically with every iteration. The numchecks argument specifies how many iterations to
use. We strongly recommend using the built-in constant BN_prime_checks, which makes the probability
of the result being wrong negligible. When using that value, the odds of the result being wrong are 1
in 2^80.
These functions require you to pass in a pointer to an initialized BN_CTX object, which they use as scratch
space.
Prime number testing isn't that cheap. BN_is_prime_fasttest( ) explicitly tries factoring by a
bunch of small primes, which speeds things up when the value you're checking might not be prime
(which is the case when you're generating a random prime).
Because testing the primality of a number can be quite expensive, OpenSSL provides a way to
monitor status by using the callback and cb_arg arguments. In addition, because the primality-testing
algorithm consists of performing a fixed number of iterations, this callback can be useful for
implementing a status meter of some sort.
If you define the callback, it is called after each iteration. The first argument is always 1, the second
is always the iteration number (starting with 0), and the third is the value ofcb_arg (this can be
used to identify the calling thread if multiple threads are sharing the same callback).
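Here is a minimal sketch (the wrapper name spc_check_prime( ) is ours) that runs the recommended
number of Rabin-Miller iterations with no progress callback:

#include <openssl/bn.h>

/* Returns 1 if bn is almost certainly prime, 0 if it is composite, -1 on error. */
int spc_check_prime(BIGNUM *bn) {
  int    result;
  BN_CTX *ctx;

  if (!(ctx = BN_CTX_new( ))) return -1;
  result = BN_is_prime(bn, BN_prime_checks, NULL, ctx, NULL);
  BN_CTX_free(ctx);
  return result;
}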
7.4.3.6 Math operations on BIGNUM objects
Yes, we saved the best for last. Table 7-2 lists the math operations supported by OpenSSL's BIGNUM
library.
Table 7-2. Math operations supported by OpenSSL's BIGNUM library

int BN_add(BIGNUM *r, BIGNUM *a, BIGNUM *b);
    Description: r = a+b
    Limitations: r ≠ a and r ≠ b (the values may be the same, but the objects may not be)

int BN_sub(BIGNUM *r, BIGNUM *a, BIGNUM *b);
    Description: r = a-b
    Limitations: r ≠ a and r ≠ b (the values may be the same, but the objects may not be)

int BN_mul(BIGNUM *r, BIGNUM *a, BIGNUM *b, BN_CTX *ctx);
    Description: r = a×b
    Comments: Use BN_lshift or BN_lshift1 instead to multiply by a known power of 2 (it's faster).

int BN_lshift1(BIGNUM *r, BIGNUM *a);
    Description: r = a×2
    Comments: Fastest way to multiply by 2.

int BN_lshift(BIGNUM *r, BIGNUM *a, int n);
    Description: r = a×2^n
    Comments: Fastest way to multiply by a power of 2 where n>1.

int BN_rshift1(BIGNUM *r, BIGNUM *a);
    Description: r = a÷2
    Comments: Fastest way to divide by 2.

int BN_rshift(BIGNUM *r, BIGNUM *a, int n);
    Description: r = a÷2^n
    Comments: Fastest way to divide by a power of 2 where n>1.

int BN_sqr(BIGNUM *r, BIGNUM *a, BN_CTX *ctx);
    Description: r = a×a
    Limitations: r ≠ a
    Comments: Faster than BN_mul.

int BN_exp(BIGNUM *r, BIGNUM *a, BIGNUM *p, BN_CTX *ctx);
    Description: r = a^p
    Limitations: r ≠ a and r ≠ p (the values may be the same, but the objects may not be)

int BN_div(BIGNUM *d, BIGNUM *r, BIGNUM *a, BIGNUM *b, BN_CTX *ctx);
    Description: d = a÷b, r = a mod b
    Limitations: d and r must be distinct objects from a and b (the values may be the same, but the
    objects may not be); either d or r may be NULL

int BN_mod(BIGNUM *r, BIGNUM *a, BIGNUM *b, BN_CTX *ctx);
    Description: r = a mod b
    Limitations: r ≠ a and r ≠ b (the values may be the same, but the objects may not be)

int BN_nnmod(BIGNUM *r, BIGNUM *a, BIGNUM *b, BN_CTX *ctx);
    Description: r = |a mod b|
    Limitations: r ≠ a and r ≠ b (the values may be the same, but the objects may not be)

int BN_mod_add(BIGNUM *r, BIGNUM *a, BIGNUM *b, BIGNUM *m, BN_CTX *ctx);
    Description: r = |a+b mod m|
    Limitations: r ≠ a, r ≠ b, and r ≠ m (the values may be the same, but the objects may not be)

int BN_mod_sub(BIGNUM *r, BIGNUM *a, BIGNUM *b, BIGNUM *m, BN_CTX *ctx);
    Description: r = |a-b mod m|
    Limitations: r ≠ a, r ≠ b, and r ≠ m (the values may be the same, but the objects may not be)

int BN_mod_mul(BIGNUM *r, BIGNUM *a, BIGNUM *b, BIGNUM *m, BN_CTX *ctx);
    Description: r = |a×b mod m|
    Limitations: r ≠ a, r ≠ b, and r ≠ m (the values may be the same, but the objects may not be)

int BN_mod_sqr(BIGNUM *r, BIGNUM *a, BIGNUM *m, BN_CTX *ctx);
    Description: r = |a×a mod m|
    Limitations: r ≠ a and r ≠ m
    Comments: Faster than BN_mod_mul.

int BN_mod_exp(BIGNUM *r, BIGNUM *a, BIGNUM *p, BIGNUM *m, BN_CTX *ctx);
    Description: r = |a^p mod m|
    Limitations: r ≠ a, r ≠ p, and r ≠ m (the values may be the same, but the objects may not be)

BIGNUM *BN_mod_inverse(BIGNUM *r, BIGNUM *a, BIGNUM *m, BN_CTX *ctx);
    Description: r = a^-1 mod m
    Comments: Returns NULL on error, such as when no modular inverse exists.

int BN_gcd(BIGNUM *r, BIGNUM *a, BIGNUM *b, BN_CTX *ctx);
    Description: r = GCD(a, b)
    Comments: Greatest common divisor.

int BN_add_word(BIGNUM *a, BN_ULONG w);
    Description: a = a+w

int BN_sub_word(BIGNUM *a, BN_ULONG w);
    Description: a = a-w

int BN_mul_word(BIGNUM *a, BN_ULONG w);
    Description: a = a×w

BN_ULONG BN_div_word(BIGNUM *a, BN_ULONG w);
    Description: a = a÷w
    Comments: Returns the remainder.

BN_ULONG BN_mod_word(BIGNUM *a, BN_ULONG w);
    Description: returns a mod w
All of the above functions that return an int return 1 on success or 0 on failure. BN_div_word( )
and BN_mod_word( ) return their result. Note that the type BN_ULONG is simply a typedef for
unsigned long.
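As a short usage sketch, here is a wrapper of our own (spc_modexp( )) that performs a modular
exponentiation with a temporary BN_CTX as scratch space, returning 1 on success and 0 on failure:

#include <openssl/bn.h>

int spc_modexp(BIGNUM *r, BIGNUM *a, BIGNUM *p, BIGNUM *m) {
  int    ok;
  BN_CTX *ctx;

  if (!(ctx = BN_CTX_new( ))) return 0;
  ok = BN_mod_exp(r, a, p, m, ctx);   /* r = |a^p mod m| */
  BN_CTX_free(ctx);
  return ok;
}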
7.4.4 See Also
Recipe 11.9
[ Team LiB ]
[ Team LiB ]
7.5 Generating a Prime Number (Testing for Primality)
7.5.1 Problem
You need to generate a random prime number or test to see if a number is prime.
7.5.2 Solution
Use the routines provided by your arbitrary-precision math library, or generate a random odd
number and use the Rabin-Miller primality test to see whether the number generated is actually
prime.
7.5.3 Discussion
Good arbitrary-precision math libraries have functions that can automatically generate primes and
determine to a near certainty whether a number is prime. In addition, these libraries should have
functionality that produces "safe" primes (that is, a prime whose value minus 1 divided by 2 is also
prime). You should also be able to ask for a prime that gives a particular remainder when you divide
that prime by a particular number. The last two pieces of functionality are useful for generating
parameters for Diffie-Hellman key exchange.
The OpenSSL functionality for generating and testing primes is discussed in Recipe 7.4.
The most common way primes are generated is by choosing a random odd number of the desired bit
length from a secure pseudo-random source (we discuss pseudo-randomness in depth in Recipe
11.1). Generally, the output of the random number generator will have the first and last bits set.
Setting the last bit ensures that the number is odd; no even number greater than 2 is prime. Setting
the first bit ensures that the generated number really is of the desired bit length.
When generating RSA keys, people usually set the first two bits of all their potential primes. That
way, if you multiply two primes of the same bit length together, they'll produce a result that's exactly
twice the bit length. When people talk about the "bit length of an RSA key," they're generally talking
about the size of such a product.
For determining whether a number is prime, most people use the Rabin-Miller test, which can
determine primality with high probability. Every time you run the Rabin-Miller test and the test
reports the number "may be prime," the actual probability of the number being prime increases
dramatically. By the time you've run five iterations and have received "may be prime" every time,
the odds of the random value's not being prime aren't worth worrying about.
If you are generating a prime number for use in Diffie-Hellman key exchange (i.e., a "safe" prime),
you should test the extra conditions before you even check to see if the number itself is prime
because doing so will speed up tests.
We provide the following code that implements Rabin-Miller on top of the OpenSSL BIGNUM library,
which almost seems worthless, because if you're using OpenSSL, it already contains this test as an
API function (again, see Recipe 7.4). However, the OpenSSL BIGNUM API is straightforward. It
should be easy to take this code and translate it to work with whatever package you're using for
arbitrary precision math.
Do note, though, that any library you use is likely already to have a function
that performs this work for you.
In this code, we explicitly attempt division for the first 100 primes, although we recommend trying
more primes than that. (OpenSSL itself tries 2,048, a widely recommended number.) We omit the
additional primes for space reasons, but you can find a list of those primes on this book's web site. In
addition, we use spc_rand( ) to get a random binary value. See Recipe 11.2 for a discussion of this
function.
#include <stdlib.h>
#include <openssl/bn.h>
#define NUMBER_ITERS  5
#define NUMBER_PRIMES 100

static unsigned long primes[NUMBER_PRIMES] = {
    2,   3,   5,   7,  11,  13,  17,  19,  23,  29,
   31,  37,  41,  43,  47,  53,  59,  61,  67,  71,
   73,  79,  83,  89,  97, 101, 103, 107, 109, 113,
  127, 131, 137, 139, 149, 151, 157, 163, 167, 173,
  179, 181, 191, 193, 197, 199, 211, 223, 227, 229,
  233, 239, 241, 251, 257, 263, 269, 271, 277, 281,
  283, 293, 307, 311, 313, 317, 331, 337, 347, 349,
  353, 359, 367, 373, 379, 383, 389, 397, 401, 409,
  419, 421, 431, 433, 439, 443, 449, 457, 461, 463,
  467, 479, 487, 491, 499, 503, 509, 521, 523, 541
};
static int is_obviously_not_prime(BIGNUM *p);
static int passes_rabin_miller_once(BIGNUM *p);
static unsigned int calc_b_and_m(BIGNUM *p, BIGNUM *m);
int spc_is_probably_prime(BIGNUM *p) {
int i;
if (is_obviously_not_prime(p)) return 0;
for (i = 0; i < NUMBER_ITERS; i++)
if (!passes_rabin_miller_once(p))
return 0;
return 1;
}
BIGNUM *spc_generate_prime(int nbits) {
  BIGNUM        *p = BN_new( );
unsigned char binary_rep[nbits / 8];
/* This code assumes we'll only ever want to generate primes with the number of
* bits a multiple of eight!
*/
if (nbits % 8 || !p) abort( );
for (;;) {
spc_rand(binary_rep, nbits / 8);
/* Set the two most significant and the least significant bits to 1. */
binary_rep[0] |= 0xc0;
binary_rep[nbits / 8 - 1] |= 1;
/* Convert this number to its BIGNUM representation */
if (!BN_bin2bn(binary_rep, nbits / 8, p)) abort( );
/* If you're going to test for suitability as a Diffie-Hellman prime, do so
* before calling spc_is_probably_prime(p).
*/
if (spc_is_probably_prime(p)) return p;
}
}
/* Try simple division with all our small primes. That is, for each prime, if it
 * evenly divides p, return 1 (obviously not prime). Note that this obviously
 * doesn't work if we're checking a prime number that's in the list!
 */
static int is_obviously_not_prime(BIGNUM *p) {
int i;
for (i = 0; i < NUMBER_PRIMES; i++)
if (!BN_mod_word(p, primes[i])) return 1;
return 0;
}
static int passes_rabin_miller_once(BIGNUM *p) {
  BIGNUM       a, m, z, tmp;
  BN_CTX       *ctx;
  unsigned int b, i;

  /* Initialize a, m, z and tmp properly. */
  BN_init(&a);
  BN_init(&m);
  BN_init(&z);
  BN_init(&tmp);
  ctx = BN_CTX_new( );

  b = calc_b_and_m(p, &m);

  /* a is a random number less than p: */
  if (!BN_rand_range(&a, p)) abort( );

  /* z = a^m mod p. */
  if (!BN_mod_exp(&z, &a, &m, p, ctx)) abort( );

  /* if z = 1 at the start, pass. */
  if (BN_is_one(&z)) return 1;

  for (i = 0;  i < b;  i++) {
    if (BN_is_one(&z)) return 0;

    /* if z = p-1, pass! */
    BN_copy(&tmp, &z);
    if (!BN_add_word(&tmp, 1)) abort( );
    if (!BN_cmp(&tmp, p)) return 1;

    /* z = z^2 mod p */
    BN_mod_sqr(&tmp, &z, p, ctx);
    BN_copy(&z, &tmp);
  }

  /* if z = p-1, pass! */
  BN_copy(&tmp, &z);
  if (!BN_add_word(&tmp, 1)) abort( );
  if (!BN_cmp(&tmp, p)) return 1;

  /* Fail! */
  return 0;
}
/* b = How many times does 2 divide p - 1?  This gets returned.
 * m is (p-1)/(2^b).
 */
static unsigned int calc_b_and_m(BIGNUM *p, BIGNUM *x) {
  unsigned int b;

  if (!BN_copy(x, p)) abort( );
  if (!BN_sub_word(x, 1)) abort( );
  for (b = 0;  !BN_is_odd(x);  b++)
    BN_div_word(x, 2);
  return b;
}
7.5.4 See Also
Recipe 7.4, Recipe 11.1, Recipe 11.2
[ Team LiB ]
[ Team LiB ]
7.6 Generating an RSA Key Pair
7.6.1 Problem
You want to use RSA to encrypt data, and you need to generate a public key and its corresponding
private key.
7.6.2 Solution
Use a cryptography library's built-in functionality to generate an RSA key pair. Here we'll describe the
OpenSSL API. If you insist on implementing RSA yourself (generally a bad idea), see the following
discussion.
7.6.3 Discussion
Be sure to see Recipe 7.1 and Recipe 7.2 for general-purpose guidance on
using public key cryptography.
The OpenSSL library provides a function, RSA_generate_key( ), that generates a {public key,
private key} pair, which is stored in an RSA object. The signature for this function is:
RSA *RSA_generate_key(int bits, unsigned long exp, void (*cb)(int, int, void *),
                      void *cb_arg);
This function has the following arguments:
bits
Size of the key to be generated, in bits. This must be a multiple of 16, and at a bare minimum
it should be at least 1,024. 2,048 is a common value, and 4,096 is used occasionally. The more
bits in the number, the more secure and the slower operations will be. We recommend 2,048
bits for general-purpose use.
exp
Fixed exponent to be used with the key pair. This value is typically 3, 17, or 65,537, and it can
vary depending on the exact context in which you're using RSA. For example, public key
certificates encode the public exponent within them, and it is almost universally one of these
three values. These numbers are common because it's fast to multiply other numbers with
these numbers, particularly in hardware. This number is stored in the RSA object, and it is used
for both encryption and decryption operations.
cb
Callback function; when called, it allows for monitoring the progress of generating a prime. It is
passed directly to the function's internal call to BN_generate_prime( ), as discussed in Recipe
7.4.
cb_arg
Application-specific argument that is passed directly to the callback function, if one is specified.
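For instance, a minimal sketch of generating a general-purpose key pair (the wrapper name
spc_new_rsa_keypair( ) is ours) looks like this:

#include <openssl/rsa.h>

RSA *spc_new_rsa_keypair(void) {
  /* 2,048-bit modulus, public exponent 65,537, no progress callback. */
  return RSA_generate_key(2048, 65537, NULL, NULL);
}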
If you need to generate an "n-bit" key manually, you can do so as follows:
1. Choose two random primes p and q, both of length n/2, using the techniques discussed in
Recipe 7.5. Ideally, both primes will have their two most significant bits set to ensure that the
public key (derived from these primes) is exactly n bits long.
2. Compute n, the product of p and q. This is the public key.
3. Compute d, the inverse of the chosen exponent, modulo (p - 1) x (q - 1). This is generally done
using the extended Euclidean algorithm, which is outside the scope of this book. See the
Handbook of Applied Cryptography by Alfred J. Menezes, Paul C. Van Oorschot, and Scott A.
Vanstone for a good discussion of the extended Euclidean algorithm.
4. Optionally, precompute some values that will significantly speed up private key operations
(decryption and signing): d mod (p - 1), d mod (q - 1), and the inverse of q mod p (again using
the extended Euclidean algorithm).
Here's an example, using the OpenSSL BIGNUM library, of computing all the values you need for a
key, given two primes p and q:
#include <openssl/bn.h>
typedef struct {
  BIGNUM        *n;
  unsigned long e; /* This number should generally be small. */
} RSA_PUBKEY;
typedef struct {
BIGNUM *n;
BIGNUM *d; /* The actual private key. */
/* These aren't necessary, but speed things up if used. If you do use them,
you don't need to keep n or d around. */
BIGNUM *p;
BIGNUM *q;
BIGNUM *dP, *dQ, *qInv;
} RSA_PRIVATE;
void spc_keypair_from_primes(BIGNUM *p, BIGNUM *q, unsigned long e,
RSA_PUBKEY *pubkey, RSA_PRIVATE *privkey)
{
BN_CTX *x = BN_CTX_new( );
BIGNUM p_minus_1, q_minus_1, one, tmp, bn_e;
  pubkey->n     = privkey->n = BN_new( );
  privkey->d    = BN_new( );
  pubkey->e     = e;
  privkey->p    = p;
  privkey->q    = q;
BN_mul(pubkey->n, p, q, x);
BN_init(&p_minus_1);
BN_init(&q_minus_1);
BN_init(&one);
BN_init(&tmp);
BN_init(&bn_e);
BN_set_word(&bn_e, e);
BN_one(&one);
BN_sub(&p_minus_1, p, &one);
BN_sub(&q_minus_1, q, &one);
BN_mul(&tmp, &p_minus_1, &q_minus_1, x);
BN_mod_inverse(privkey->d, &bn_e, &tmp, x);
  /* Compute extra values. */
  privkey->dP   = BN_new( );
  privkey->dQ   = BN_new( );
  privkey->qInv = BN_new( );
BN_mod(privkey->dP, privkey->d, &p_minus_1, x);
BN_mod(privkey->dQ, privkey->d, &q_minus_1, x);
BN_mod_inverse(privkey->qInv, q, p, x);
}
7.6.4 See Also
Recipe 7.1, Recipe 7.2, Recipe 7.5
[ Team LiB ]
[ Team LiB ]
7.7 Disentangling the Public and Private Keys in OpenSSL
7.7.1 Problem
You are using OpenSSL and have a filled RSA object. You wish to remove the private parts of the key,
leaving only the public key, so that you can serialize the data structure and send it off to a party who
should not have the private information.
7.7.2 Solution
Remove all elements of the structure except for n and e.
7.7.3 Discussion
OpenSSL lumps the private key and the public key into a single RSA structure. It does this because
the information in the public key is useful to anyone with the private key. If an entity needs only the
public key, you're supposed to clear out the rest of the data.
#include <openssl/rsa.h>
void remove_private_key(RSA *r) {
r->d = r->p = r->q = r->dmp1 = r->dmq1 = r->iqmp = 0;
}
Be sure to deallocate the BIGNUM objects if you're erasing the last reference to them.
Any party that has the private key should also hold on to the public key.
[ Team LiB ]
[ Team LiB ]
7.8 Converting Binary Strings to Integers for Use with
RSA
7.8.1 Problem
You need to encode a string as a number for use with the RSA encryption algorithm.
7.8.2 Solution
Use the standard PKCS #1 method for converting a nonnegative integer to a string of a specified
length. PKCS #1 is the RSA Security standard for encryption with the RSA encryption algorithm.[2]
[2]
For the PKCS #1 specification, see http://www.rsasecurity.com/rsalabs/pkcs/pkcs-1/.
7.8.3 Discussion
The PKCS #1 method for representing binary strings as integers is simple. You simply treat the
binary representation of the string directly as the binary representation of the number, where the
string is considered a list of bytes from most significant to least significant (big-endian notation).
For example, if you have the binary string "Test", you would have a number represented as a list of
ASCII values. In decimal, these values are:
84, 101, 115, 116
This would map to the hexadecimal value:
0x54657374
If you simply treat the hexadecimal value as a number, you'll get the integer representation. In base
10, the previous number would be 1415934836.
If, for some reason, you need to calculate this value manually given the ASCII values of the integers,
you would compute the following:
84×256^3 + 101×256^2 + 115×256^1 + 116×256^0
In the real world, your arbitrary-precision math library will probably have a way to turn binary strings
into numbers that is compatible with the PKCS algorithm. For example, OpenSSL provides
BN_bin2bn( ), which is discussed in Recipe 7.4.
If you need to perform this conversion yourself, make sure that your numerical representation uses
either an array of char values or an array of unsigned int values. If you use the former, you can
use the binary string directly as a number. If you use the latter, you will have to byte-swap each
word on a little-endian machine before treating the string as a number. On a big-endian machine,
you need not perform any swap.
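With OpenSSL, the whole conversion is a one-liner; this sketch (the function name is ours) maps the
4-byte string "Test" to the integer 1415934836:

#include <openssl/bn.h>

BIGNUM *spc_string_to_int(unsigned char *str, unsigned int len) {
  /* The bytes of str are treated as a big-endian integer, per PKCS #1. */
  return BN_bin2bn(str, len, NULL);
}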
7.8.4 See Also
PKCS #1 page: http://www.rsasecurity.com/rsalabs/pkcs/pkcs-1/
Recipe 7.4
[ Team LiB ]
[ Team LiB ]
7.9 Converting Integers into Binary Strings for Use with
RSA
7.9.1 Problem
You have a number as a result of an RSA operation that you'd like to turn into a binary string of a
fixed length.
7.9.2 Solution
Use the inverse of the previous recipe, padding the start of the string with zero-bits, if necessary, to
reach the desired output length. If the number is too big, return an error.
7.9.3 Discussion
In practice, you should be using a binary representation of very large integers that stores a value as
an array of values of type unsigned int or type char. If you're using a little-endian machine and
word-sized storage, each word will need to be byte-swapped before the value can be treated as a
binary string.
Byte swapping can be done with the htonl( ) macro, which can be imported by including arpa/inet.h
on Unix or winsock.h on Windows.
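With OpenSSL, a minimal sketch of the conversion (the function name is ours) pads the front of the
output buffer with zero bytes and writes the big-endian representation after them:

#include <string.h>
#include <openssl/bn.h>

/* Returns 1 on success, or 0 if the number is too big for the buffer. */
int spc_int_to_fixed_binary(BIGNUM *bn, unsigned char *out, unsigned int olen) {
  unsigned int blen = BN_num_bytes(bn);

  if (blen > olen) return 0;
  memset(out, 0, olen - blen);          /* leading zero padding */
  BN_bn2bin(bn, out + (olen - blen));   /* big-endian value */
  return 1;
}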
[ Team LiB ]
[ Team LiB ]
7.10 Performing Raw Encryption with an RSA Public Key
7.10.1 Problem
You want to encrypt a small message using an RSA public key so that only an entity with the
corresponding private key can decrypt the message.
7.10.2 Solution
Your cryptographic library should have a straightforward API to the RSA encryption algorithm: you
should be able to give it the public key, the data to encrypt, a buffer for the results, an indication of
the data's length, and a specification as to what kind of padding to use (EME-OAEP padding is
recommended).
When using OpenSSL, this can be done with the RSA_public_encrypt( ) function, defined in
openssl/rsa.h.
If, for some reason, you need to implement RSA on your own (which we strongly recommend
against), refer to the Public Key Cryptography Standard (PKCS) #1, Version 2.1 (the latest version).
7.10.3 Discussion
Be sure to read the generic considerations for public key cryptography inRecipe
7.1 and Recipe 7.2.
Conceptually, RSA encryption is very simple. A message is translated into an integer and encrypted
with integer math. Given a message m written as an integer, if you want to encrypt to a public key,
you take the modulus n and the exponent e from that public key. Then compute c = m^e mod n,
where c is the ciphertext, written as an integer. Given the ciphertext, you must have the private key
to recover m. The private key consists of a single integer d, which can undo the encipherment with
the operation m = c^d mod n.
This scheme is believed to be as "hard" as factoring a very large number. That's because n is the
product of two secret primes, p and q. Given p and q, it is easy to compute d. Without those two
primes, it's believed that the most practical way to decrypt messages is by factoring n to get p and q.
RSA is mathematically simple and elegant. Unfortunately, a straightforward implementation of RSA
based directly on the math will usually fall prey to a number of attacks. RSA itself is secure, but only
if it is deployed correctly, and that can be quite a challenge. Therefore, if you're going to use RSA
(and not something high-level), we strongly recommend sticking to preexisting standards. In
particular, you should use a preexisting API or, at the very worst, follow PKCS#1 recommendations
for deployment.
It's important to note that using RSA properly is predicated on your having
received a known-to-be-valid public key over a secure channel (otherwise,
man-in-the-middle attacks are possible; see Recipe 7.1 for a discussion of this
problem). Generally, secure public key distribution is done with a PKI (see
Recipe 10.1 for an introduction to PKI).
From the average API's point of view, RSA encryption is similar to standard symmetric encryption,
except that there are practical limitations imposed on RSA, mainly due to the fact that RSA is brutally
slow compared to symmetric encryption. As a result, many libraries have two APIs for RSA
encryption: one performs "raw" RSA encryption, and the other uses RSA to encrypt a temporary key,
then uses that temporary key to encrypt the data you actually wanted to encrypt. Such an interface
is sometimes called an enveloping interface.
As with symmetric encryption, you need to pass in relevant key material, the input buffer, and the
output buffer. There will be a length associated with the input buffer, but you are probably expected
to know the size of the output in advance. With OpenSSL, if you have a pointer to an RSA object x,
you can call RSA_size(x) to determine the output size of an RSA encryption, measured in bytes.
When performing raw RSA encryption, you should expect there to be a small maximum message
length. Generally, the maximum message length is dependent on the type of padding that you're
using.
While RSA is believed to be secure if used properly, it is very easy not to use
properly. Secure padding schemes are an incredibly important part of securely
deploying RSA. Note that there's no good reason to invent your own padding
format (you strongly risk messing something up, too). Instead, we recommend
EME-OAEP padding (specified in PKCS #1 v2.0 or later).
There are primarily two types of padding: PKCS #1 v1.5 padding and EME-OAEP padding. The latter
is specified in Version 2.0 and later of PKCS #1, and is recommended for all new applications. Use
PKCS #1 v1.5 padding only for legacy systems. Do not mix padding types in a single application.
With EME-OAEP padding, the message is padded by a random value output from a cryptographic
one-way hash function. There are two parameters for EME-OAEP padding: the hash function to use
and an additional function used internally by the padding mechanism. The only internal function in
widespread use is called MGF1 and is defined in PKCS #1 v2.0 and later. While any cryptographic
one-way hash algorithm can be used with EME-OAEP padding, many implementations are hardwired
to use SHA1. Generally, you should decide which hash algorithm to use based on the level of security
you need overall in your application, assuming that hash functions give you half their output length in
security. That is, if you're comfortable with 80 bits of security (which we believe you should be for the
foreseeable future), SHA1 is sufficient. If you're feeling conservative, use SHA-256, SHA-384, or
SHA-512 instead.
When using EME-OAEP padding, if k is the number of bytes in your public RSA modulus, and if h is
the number of bytes output by the hash function you choose, the maximum message length you can
encrypt is k - (2h + 2) bytes. For example, if you're using 2,048-bit RSA and SHA1, then k = 2,048 /
8 and h = 20. Therefore, you can encrypt up to 214 bytes. With OpenSSL, specifying EME-OAEP
padding forces the use of SHA1.
Do not use PKCS #1 v1.5 public key padding for any purpose other than encrypting session keys or
hash values. This form of padding can encrypt messages up to 11 bytes smaller than the modulus
size in bytes. For example, if you're using 2,048-bit RSA, you can encrypt 245-byte messages.
With OpenSSL, encryption with RSA can be done using the function RSA_public_encrypt( ):
int RSA_public_encrypt(int l, unsigned char *pt, unsigned char *ct, RSA *r, int p);
This function has the following arguments:
l
Length of the plaintext to be encrypted.
pt
Buffer that contains the plaintext data to be encrypted.
ct
Buffer into which the resulting ciphertext data will be placed. The size of the buffer must be
equal to the size in bytes of the public modulus. This value can be obtained by passing the RSA
object to RSA_size( ).
r
RSA object containing the public key to be used to encrypt the plaintext data. The public
modulus (n) and the public exponent (e) must be filled in, but everything else may be absent.
p
Type of padding to use.
The constants that may be used to specify the type of padding to use, as well as the prototype for
RSA_public_encrypt( ), are defined in the header file openssl/rsa.h. The defined constants are:
RSA_PKCS1_PADDING
Padding mode specified in version 1.5 of PKCS #1. This mode is in wide use, but it should only
be used for compatibility. Use the EME-OAEP padding method instead.
RSA_PKCS1_OAEP_PADDING
EME-OAEP padding as specified in PKCS #1 Version 2.0 and later. It is what you should use for
new applications.
RSA_SSLV23_PADDING
The SSL and TLS protocols specify a slight variant of PKCS #1 v1.5 padding. This shouldn't be
used outside the context of the SSL or TLS protocols.
RSA_NO_PADDING
This mode disables padding. Do not use this mode unless you're using it to implement a
known-secure padding mode.
When you're encrypting with RSA, the message you're actually trying to encrypt is represented as an
integer. The binary string you pass in is converted to an integer for you, using the algorithm
described in Recipe 7.8.
You can encrypt only one integer at a time with most low-level interfaces, and the OpenSSL interface
is no exception. This is part of the reason there are limits to message size. In practice, you should
never need a larger message size. Instead, RSA is usually used to encrypt a temporary key for a
much faster encryption algorithm, or to encrypt some other small piece of data.
If there are a small number of possible plaintext inputs to RSA encryption, the
attacker can figure out which plaintext was used via a dictionary attack.
Therefore, make sure that there are always a reasonable number of possible
plaintexts and that all plaintexts are equally likely. Again, it is best to simply
encrypt a 16-byte symmetric key.
If you forego padding (which is insecure; we discuss it just to explain how RSA works), the number
you encrypt must be a value between 0 and n - 1, where n is the public modulus (the public key).
Also, the value must be represented in the minimum number of bytes it takes to representn. We
recommend that you not do this unless you absolutely understand the security issues involved. For
example, if you're using OpenSSL, the only reason you should ever consider implementing your own
padding mechanism would be if you wanted to use EME-OAEP padding with a hash algorithm stronger
than SHA1, such as SHA-256. See the PKCS #1 v2.1 document for a comprehensive implementation
guide for EME-OAEP padding.
If you are using a predefined padding method, you don't have to worry about performing any
padding yourself. However, you do need to worry about message length. If you try to encrypt a
message that is too long, RSA_public_encrypt( ) will fail and return -1. Again, you should be expecting
to encrypt messages of no more than 32 bytes, so this should not be a problem.
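Here is a minimal sketch (the wrapper name spc_rsa_encrypt( ) is ours) that encrypts a small
plaintext, such as a 16-byte symmetric key, with EME-OAEP padding. It returns a malloc( )'d buffer of
RSA_size(r) bytes that the caller must free( ), or NULL on failure:

#include <stdlib.h>
#include <openssl/rsa.h>

unsigned char *spc_rsa_encrypt(RSA *r, unsigned char *pt, unsigned int len) {
  unsigned char *ct;

  if (!(ct = (unsigned char *)malloc(RSA_size(r)))) return 0;
  if (RSA_public_encrypt(len, pt, ct, r, RSA_PKCS1_OAEP_PADDING) == -1) {
    free(ct);
    return 0;
  }
  return ct;
}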
7.10.4 See Also
PKCS #1 page: http://www.rsasecurity.com/rsalabs/pkcs/pkcs-1/
Recipe 7.1, Recipe 7.2, Recipe 7.8, Recipe 10.1
[ Team LiB ]
[ Team LiB ]
7.11 Performing Raw Decryption Using an RSA Private
Key
7.11.1 Problem
You have a session key encrypted with an RSA public key (probably using a standard padding
algorithm), and you need to decrypt the value with the corresponding RSA private key.
7.11.2 Solution
Your cryptographic library should have a straightforward API to the RSA decryption algorithm: you
should be able to give it the public key, the data to decrypt, a buffer for the results, and a
specification as to what kind of padding was used for encryption (EME-OAEP padding is
recommended; see Recipe 7.10). The size of the input message will always be equal to the size of the
RSA modulus in bytes. The API function should return the length of the result, and this length will
usually be significantly smaller than the input.
If, for some reason, you need to implement RSA on your own (which we strongly recommend
against), refer to the Public Key Cryptography Standard (PKCS) #1, Version 2.1 (the latest version).
7.11.3 Discussion
While RSA is believed to be secure if used properly, it is very easy to use
improperly. Be sure to read the recipe on RSA encryption (Recipe 7.10) and the general-purpose
considerations for public key encryption in Recipe 7.1 and Recipe 7.2 in
addition to this one.
When using OpenSSL, decryption can be done with the RSA_private_decrypt( ) function, defined in
openssl/rsa.h and shown below. It will return the length of the decrypted string, or -1 if an error
occurs.
int RSA_private_decrypt(int l, unsigned char *ct, unsigned char *pt, RSA *r, int p);
This function has the following arguments:
l
Length in bytes of the ciphertext to be decrypted, which must be equal to the size in bytes of
the public modulus. This value can be obtained by passing theRSA object to RSA_size( ).
ct
Buffer containing the ciphertext to be decrypted.
pt
Buffer into which the plaintext will be written. The size of this buffer must be at least
RSA_size(r) bytes.
r
RSA object containing the private key to be used to decrypt the ciphertext.
p
Type of padding that was used when encrypting. The defined constants for padding types are
enumerated in Recipe 7.10.
Some implementations of RSA decryption are susceptible to timing attacks. Basically, if RSA
decryption operations do not happen in a fixed amount of time, such attacks may be a possibility. A
technique called blinding can thwart timing attacks. The amount of time it takes to decrypt is
randomized somewhat by operating on a random number in the process. To eliminate the possibility
of such attacks, you should always turn blinding on before doing a decryption operation. To enable
blinding in OpenSSL, you can use the RSA_blinding_on( ) function, which has the following
signature:
int RSA_blinding_on(RSA *r, BN_CTX *x);
This function has the following arguments:
r
RSA object for which blinding should be enabled.
x
BN_CTX object that will be used by the blinding operations as scratch space (see Recipe 7.4 for
a discussion of BN_CTX objects). It may be specified as NULL, in which case a new one will be
allocated and used internally.
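Putting the two calls together, here is a minimal sketch (the wrapper name spc_rsa_decrypt( ) is
ours) that turns blinding on, decrypts a ciphertext that was produced with EME-OAEP padding, and
returns the plaintext length or -1 on error. The pt buffer must be at least RSA_size(r) bytes:

#include <openssl/bn.h>
#include <openssl/rsa.h>

int spc_rsa_decrypt(RSA *r, unsigned char *ct, unsigned char *pt) {
  int    len;
  BN_CTX *ctx;

  if (!(ctx = BN_CTX_new( ))) return -1;
  if (!RSA_blinding_on(r, ctx)) {
    BN_CTX_free(ctx);
    return -1;
  }
  len = RSA_private_decrypt(RSA_size(r), ct, pt, r, RSA_PKCS1_OAEP_PADDING);
  RSA_blinding_off(r);
  BN_CTX_free(ctx);
  return len;
}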
7.11.4 See Also
Recipe 7.1, Recipe 7.2, Recipe 7.4, Recipe 7.10
[ Team LiB ]
[ Team LiB ]
7.12 Signing Data Using an RSA Private Key
7.12.1 Problem
You want to use RSA to digitally sign data.
7.12.2 Solution
Use a well-known one-way hash function to compress the data, then use a digital signing technique
specified in PKCS #1 v2.0 or later. Any good cryptographic library should have primitives for doing
exactly this. OpenSSL provides both a low-level interface and a high-level interface, although the
high-level interface doesn't end up removing any complexity.
7.12.3 Discussion
Digital signing with RSA is roughly equivalent to encrypting with a private key. Basically, the signer
computes a message digest, then encrypts the value with his private key. The verifier also computes
the digest and decrypts the signed value, comparing the two. Of course, the verifier has to have the
valid public key for the entity whose signature is to be verified, which means that the public key
needs to be validated by some trusted third party or transmitted over a secure medium such as a
trusted courier.
Digital signing works because only the person with the correct private key will produce a "signature"
that decrypts to the correct result. An attacker cannot use the public key to come up with a correct
encrypted value that would authenticate properly. If that were possible, it would end up implying that
the entire RSA algorithm could be broken.
PKCS #1 v2.0 specifies two different signing standards, both of which are assumed to operate on
message digest values produced by standard algorithms. Basically, these standards dictate how to
take a message digest value and produce a "signature." The preferred standard isRSASSA-PSS,
which is analogous to RSAES-OAEP, the padding standard used for encryption. It has provable
security properties and therefore is no less robust than the alternative, RSASSA-PKCS1v1.5.[3] There
aren't any known problems with RSASSA-PKCS1v1.5, however, and it is in widespread use. On
the other hand, few people are currently using RSASSA-PSS. In fact, OpenSSL doesn't support
RSASSA-PSS. If RSASSA-PSS is available in your cryptographic library, we recommend using it,
unless you are concerned about interoperating with a legacy application. Otherwise, there is nothing
wrong with RSASSA-PKCS1v1.5.
[3]
There is a known theoretical problem with RSASSA-PKCS1v1.5, but it is not practical, in that it's actually
harder to attack the scheme than it is to attack the underlying message digest algorithm when using SHA1.
Both schemes should have a similar interface in a cryptographic library supporting RSA. That is,
signing should take the following parameters:
The signer's private key.
The message to be signed. In a low-level API, instead of the actual message, you will be
expected to provide a hash digest of the data you really want to be signing. High-level APIs will
do the message digest operation for you.
An indication of which message digest algorithm was used in the signing. This may be assumed
for you in a high-level API (in which case it will probably be SHA1).
RSASSA-PKCS1v1.5 encodes the message digest value into its result to avoid certain classes of
attack. RSASSA-PSS does no such encoding, but it uses a hash function internally, and that function
should generally be the same one used to create the digest to be signed.
You may or may not need to give an indication of the length of the input message digest. The value
can be deduced easily if the API enforces that the input should be a message digest value. Similarly,
the API may output the signature size, even though it is a well-known value (the same size as the
public RSA modulus; for example, 2,048 bits in 2,048-bit RSA).
OpenSSL supports RSASSA-PKCS1v1.5 only for digital signatures. It does
support raw encrypting with the private key, which you can use to implement
RSASSA-PSS. However, we don't generally recommend this, and you certainly
should not use the raw interface (RSA_private_encrypt( )) for any other
purpose whatsoever.
In OpenSSL, we recommend always using the low-level interface to RSA signing, using the function
RSA_sign( ) to perform signatures when you've already calculated the appropriate hash. The
signature, defined in openssl/rsa.h, is:
int RSA_sign(int md_type, unsigned char *dgst, unsigned int dlen,
unsigned char *sig, unsigned int *siglen, RSA *r);
This function has the following arguments:
md_type
OpenSSL-specific identifier for the hash function. Possible values are NID_sha1, NID_ripemd, or
NID_md5. A fourth value, NID_md5_sha1, can be used to combine MD5 and SHA1 by hashing
with both hash functions and concatenating the results. These four constants are defined in the
header file openssl/objects.h.
dgst
Buffer containing the digest to be signed. The digest should have been generated by the
algorithm specified by the md_type argument.
dlen
Length in bytes of the digest buffer. For MD5, the digest buffer should always be 16 bytes. For
SHA1 and RIPEMD, it should always be 20 bytes. For the MD5 and SHA1 combination, it should
always be 36 bytes.
sig
Buffer into which the generated signature will be placed.
siglen
The number of bytes written into the signature buffer will be placed in the integer pointed to by
this argument. The number of bytes will always be the same size as the public modulus, which
can be determined by calling RSA_size( ) with the RSA object that will be used to generate the
signature.
r
RSA object to be used to generate the signature. The RSA object must contain the private key
for signing.
The high-level interface to RSA signatures is certainly no less complex than computing the digest and
calling RSA_sign( ) yourself. The only advantage of it is that you can minimize the amount of code
you need to change if you would additionally like to support DSA signatures. If you're interested in
this API, see the book Network Security with OpenSSL for more information.
Here's an example of signing an arbitrary message using OpenSSL's RSA_sign( ) function:
#include <openssl/sha.h>
#include <openssl/rsa.h>
#include <openssl/objects.h>
int spc_sign(unsigned char *msg, unsigned int mlen, unsigned char *out,
unsigned int *outlen, RSA *r) {
unsigned char hash[20];
if (!SHA1(msg, mlen, hash)) return 0;
return RSA_sign(NID_sha1, hash, 20, out, outlen, r);
}
[ Team LiB ]
[ Team LiB ]
7.13 Verifying Signed Data Using an RSA Public Key
7.13.1 Problem
You have some data, an RSA digital signature of that data, and the public key that you believe
corresponds to the signature. You want to determine whether the signature is valid. A successful
check would demonstrate both that the data was not modified from the time it was signed (message
integrity) and that the entity with the corresponding public key signed the data (authentication).
7.13.2 Solution
Use the verification algorithm that corresponds to the chosen signing algorithm from Recipe 7.12.
Generally, this should be included with your cryptographic library.
7.13.3 Discussion
Recipe 7.12 explains the basic components of digital signatures with RSA. When verifying, you will
generally need to provide the following inputs:
The signer's public key.
The signature to be verified.
The message digest corresponding to the message you want to authenticate. If it's a high-level
API, you might be able to provide only the message.
An indication of the message digest algorithm used in the signing operation. Again, this may be
assumed in a high-level API.
The API should simply return an indication of success or failure.
Some implementations of RSA signature verification are susceptible to timing attacks. Basically, if
RSA private key operations do not happen in a fixed amount of time, such attacks are possible. A
technique called blinding can thwart timing attacks. The amount of time it takes to decrypt is
randomized somewhat by operating on a random number in the process. To eliminate the possibility
of such attacks, you should always turn blinding on before doing a signature validation operation.
With OpenSSL, blinding can be enabled by calling RSA_blinding_on( ), which has the following
signature:
int RSA_blinding_on(RSA *r, BN_CTX *x);
This function has the following arguments:
r
RSA object for which blinding should be enabled.
x
BN_CTX object that will be used by the blinding operations as scratch space. (See Recipe 7.4 for
a discussion of BN_CTX objects.) It may be specified as NULL, in which case a new one will be
allocated and used internally.
The OpenSSL analog to RSA_sign( ) (discussed in Recipe 7.12) is RSA_verify( ), which has the
following signature:
int RSA_verify(int md_type, unsigned char *dgst, unsigned int dlen,
unsigned char *sig, unsigned int siglen, RSA *r);
This function has the following arguments:
md_type
OpenSSL-specific identifier for the hash function. Possible values are NID_sha1, NID_ripemd, or
NID_md5. A fourth value, NID_md5_sha1, can be used to combine MD5 and SHA1 by hashing
with both hash functions and concatenating the results. These four constants are defined in the
header file openssl/objects.h.
dgst
Buffer containing the digest of the data whose signature is to be verified. The digest should
have been generated by the algorithm specified by the md_type argument.
dlen
Length in bytes of the digest buffer. For MD5, the digest buffer should always be 16 bytes. For
SHA1 and RIPEMD, it should always be 20 bytes. For the MD5 and SHA1 combination, it should
always be 36 bytes.
sig
Buffer containing the signature that is to be verified.
siglen
Number of bytes contained in the signature buffer. The number of bytes should always be the
same size as the public modulus, which can be determined by calling RSA_size( ) with the
RSA object that will be used to verify the signature.
r
RSA object to be used to verify the signature. The RSA object must contain the signer's public
key for verification to be successful.
As we discussed in Recipe 7.12, OpenSSL RSA signatures only support PKCS #1 v1.5 and do not
support RSASSA-PSS.
Here's code that implements verification on an arbitrary message, given a signature and the public
RSA key of the signer:
#include <openssl/bn.h>
#include <openssl/sha.h>
#include <openssl/rsa.h>
#include <openssl/objects.h>
int spc_verify(unsigned char *msg, unsigned int mlen, unsigned char *sig,
unsigned int siglen, RSA *r) {
  unsigned char hash[20];
  BN_CTX        *c;
  int           ret;
if (!(c = BN_CTX_new( ))) return 0;
if (!SHA1(msg, mlen, hash) || !RSA_blinding_on(r, c)) {
BN_CTX_free(c);
return 0;
}
ret = RSA_verify(NID_sha1, hash, 20, sig, siglen, r);
RSA_blinding_off(r);
BN_CTX_free(c);
return ret;
}
7.13.4 See Also
Recipe 7.4, Recipe 7.12
[ Team LiB ]
[ Team LiB ]
7.14 Securely Signing and Encrypting with RSA
7.14.1 Problem
You need to both sign and encrypt data using RSA.
7.14.2 Solution
Sign the concatenation of the public key of the message recipient and the data you actually wish to
sign. Then concatenate the signature to the plaintext, and encrypt everything, in multiple messages if
necessary.
7.14.3 Discussion
Naïve implementations where a message is both signed and encrypted with public key cryptography
tend to be insecure. Simply signing data with a private key and then encrypting the data with a public
key isn't secure, even if the signature is part of the data you encrypt. Such a scheme is susceptible to
an attack called surreptitious forwarding. For example, suppose that there are two servers, S1 and
S2. The client C signs a message and encrypts it with S1's public key. Once S1 decrypts the
message, it can reencrypt it with S2's public key and make it look as if the message came from C.
In a connection-oriented protocol, it could allow a compromised S1 to replay a key transport between
C and S1 to a second server S2. That is, if an attacker compromises S1, he may be able to imitate C
to S2. In a document-based environment such as an electronic mail system, if Alice sends email to
Bob, Bob can forward it to Charlie, making it look as if it came from Alice instead of Bob. For
example, if Alice sends important corporate secrets to Bob, who also works for the company, Bob can
send the secrets to the competition and make it look as if it came from Alice. When the CEO finds
out, it will appear that Alice, not Bob, is responsible.
There are several strategies for fixing this problem. However, encrypting and then signing does not
fix the problem. In fact, it makes the system far less secure. A secure solution to this problem is to
concatenate the recipient's public key with the message, and sign that. The recipient can then easily
determine that he or she was indeed the intended recipient.
One issue with this solution is how to represent the public key. The important thing is to be
consistent. If your public keys are stored as X.509 certificates (see Chapter 10 for more on these),
you can include the entire certificate when you sign. Otherwise, you can simply represent the public
modulus and exponent as a single binary string (the DER-encoding of the X.509 certificate) and
include that string when you sign.
The other issue is that RSA operations such as encryption tend to work on small messages. A digital
signature of a message will often be too large to encrypt using public key encryption. Plus, you will
need to encrypt your actual message as well! One way to solve this problem is to perform multiple
public key encryptions. For example, let's say you have a 2,048-bit modulus, and the recipient has a
1,024-bit modulus. You will be encrypting a 16-byte secret and your signature, where that signature
will be 256 bytes, for a total of 272 bytes. The output of encryption to the 1,024-bit modulus is 128
bytes, but the input can only be 86 bytes, because of the need for padding. Therefore, we'd need four
encryption operations to encrypt the entire 272 bytes.
In many client-server architectures where the client initiates a connection, the
client won't have the server's public key in advance. In such a case, the server
will often send a copy of its public key at its first opportunity (or a digital
certificate containing the public key). In this case, the client can't assume that
public key is valid; there's nothing to distinguish it from an attacker's public
key! Therefore, the key needs to be validated using a trusted third party before
the client trusts that the party on the other end is really the intended server.
See Recipe 7.1.
Here is an example of generating, signing, and encrypting a 16-byte secret in a secure manner using
OpenSSL, given a private key for signing and a public key for the recipient. The secret is placed in the
buffer pointed to by the final argument, which must be 16 bytes. The encrypted result is placed in the
third argument, which must be big enough to hold the modulus for the public key.
Note that we represent the public key of the recipient as the binary representation of the modulus
concatenated with the binary representation of the exponent. If you are using any sort of high-level
key storage format such as an X.509 certificate, it makes sense to use the canonical representation
of that format instead. See Recipe 7.16 and Recipe 7.17 for information on converting common
formats to a binary string.
#include <openssl/sha.h>
#include <openssl/rsa.h>
#include <openssl/objects.h>
#include <openssl/rand.h>
#include <string.h>
#define MIN(x,y) ((x) > (y) ? (y) : (x))
unsigned char *generate_and_package_128_bit_secret(RSA *recip_pub_key,
RSA *signers_key, unsigned char *sec, unsigned int *olen) {
  unsigned char *tmp = 0, *to_encrypt = 0, *sig = 0, *out = 0, *p, *ptr;
  unsigned int  len, ignored, b_per_ct;
  int           bytes_remaining; /* MUST NOT BE UNSIGNED. */
  unsigned char hash[20];
/* Generate the secret. */
if (!RAND_bytes(sec, 16)) return 0;
/* Now we need to sign the public key and the secret both.
* Copy the secret into tmp, then the public key and the exponent.
*/
len = 16 + RSA_size(recip_pub_key) + BN_num_bytes(recip_pub_key->e);
if (!(tmp = (unsigned char *)malloc(len))) return 0;
memcpy(tmp, sec, 16);
if (!BN_bn2bin(recip_pub_key->n, tmp + 16)) goto err;
if (!BN_bn2bin(recip_pub_key->e, tmp + 16 + RSA_size(recip_pub_key))) goto err;
  /* Now sign tmp (the hash of it), again mallocing space for the signature. */
  if (!(sig = (unsigned char *)malloc(BN_num_bytes(signers_key->n)))) goto err;
  if (!SHA1(tmp, len, hash)) goto err;
  if (!RSA_sign(NID_sha1, hash, 20, sig, &ignored, signers_key)) goto err;
/* How many bytes we can encrypt each time, limited by the modulus size
* and the padding requirements.
*/
b_per_ct = RSA_size(recip_pub_key) - (2 * 20 + 2);
if (!(to_encrypt = (unsigned char *)malloc(16 + RSA_size(signers_key))))
goto err;
/* The calculation before the mul is the number of encryptions we're
* going to make. After the mul is the output length of each
* encryption.
*/
*olen = ((16 + RSA_size(signers_key) + b_per_ct - 1) / b_per_ct) *
RSA_size(recip_pub_key);
if (!(out = (unsigned char *)malloc(*olen))) goto err;
/* Copy the data to encrypt into a single buffer. */
ptr = to_encrypt;
bytes_remaining = 16 + RSA_size(signers_key);
memcpy(to_encrypt, sec, 16);
memcpy(to_encrypt + 16, sig, RSA_size(signers_key));
p = out;
while (bytes_remaining > 0) {
/* encrypt b_per_ct bytes up until the last loop, where it may be fewer. */
    if (RSA_public_encrypt(MIN(bytes_remaining, b_per_ct), ptr, p,
                           recip_pub_key, RSA_PKCS1_OAEP_PADDING) == -1) {
free(out);
out = 0;
goto err;
}
bytes_remaining -= b_per_ct;
ptr += b_per_ct;
/* Remember, output is larger than the input. */
p += RSA_size(recip_pub_key);
}
err:
if (sig) free(sig);
if (tmp) free(tmp);
if (to_encrypt) free(to_encrypt);
return out;
}
Once the message generated by this function is received on the server side, the following code will
validate the signature on the message and retrieve the secret:
#include <openssl/sha.h>
#include <openssl/rsa.h>
#include <openssl/objects.h>
#include <openssl/rand.h>
#include <string.h>
#define MIN(x,y) ((x) > (y) ? (y) : (x))
/* recip_key must contain both the public and private key. */
int validate_and_retreive_secret(RSA *recip_key, RSA *signers_pub_key,
unsigned char *encr, unsigned int inlen,
unsigned char *secret) {
  int           result = 0;
  BN_CTX        *tctx;
  unsigned int  ctlen, stlen, i, l;
  unsigned char *decrypt, *signedtext = 0, *p, hash[20];
if (inlen % RSA_size(recip_key)) return 0;
if (!(p = decrypt = (unsigned char *)malloc(inlen))) return 0;
if (!(tctx = BN_CTX_new( ))) {
free(decrypt);
return 0;
}
RSA_blinding_on(recip_key, tctx);
for (ctlen = i = 0; i < inlen / RSA_size(recip_key); i++) {
if (!(l = RSA_private_decrypt(RSA_size(recip_key), encr, p, recip_key,
RSA_PKCS1_OAEP_PADDING))) goto err;
encr += RSA_size(recip_key);
p += l;
ctlen += l;
}
if (ctlen != 16 + RSA_size(signers_pub_key)) goto err;
stlen = 16 + BN_num_bytes(recip_key->n) + BN_num_bytes(recip_key->e);
if (!(signedtext = (unsigned char *)malloc(stlen))) goto err;
memcpy(signedtext, decrypt, 16);
if (!BN_bn2bin(recip_key->n, signedtext + 16)) goto err;
if (!BN_bn2bin(recip_key->e, signedtext + 16 + RSA_size(recip_key))) goto err;
if (!SHA1(signedtext, stlen, hash)) goto err;
if (!RSA_verify(NID_sha1, hash, 20, decrypt + 16, RSA_size(signers_pub_key),
signers_pub_key)) goto err;
memcpy(secret, decrypt, 16);
result = 1;
err:
RSA_blinding_off(recip_key);
BN_CTX_free(tctx);
free(decrypt);
if (signedtext) free(signedtext);
return result;
}
7.14.4 See Also
Recipe 7.1, Recipe 7.16, Recipe 7.17
[ Team LiB ]
[ Team LiB ]
7.15 Using the Digital Signature Algorithm (DSA)
7.15.1 Problem
You want to perform public key-based digital signatures, and you have a requirement necessitating
the use of DSA.
7.15.2 Solution
Use an existing cryptographic library's implementation of DSA.
7.15.3 Discussion
DSA and Diffie-Hellman are both based on the same math problem. DSA only provides digital
signatures; it does not do key agreement or general-purpose encryption. Unlike Diffie-Hellman, the
construction is quite a bit more complex. For that reason, we recommend using an existing
implementation. If you must implement it yourself, obtain the standard available from the NIST web
site (http://www.nist.gov).
With DSA, the private key is used to sign arbitrary data. As is traditionally done with RSA signatures,
the data is actually hashed before it's signed. The DSA standard mandates the use of SHA1 as the
hash function.
Anyone who has the DSA public key corresponding to the key used to sign a piece of data can
validate signatures. DSA signatures are most useful for authentication during key agreement and for
non-repudiation. We discuss how to perform authentication during key agreement in Recipe 8.18,
using Diffie-Hellman as the key agreement algorithm.
DSA requires three public parameters in addition to the public key: a very large prime number, p; a
generator, g; and a prime number, q, which is a 160-bit prime factor of p - 1.[4] Unlike the generator
in Diffie-Hellman, the DSA generator is not a small constant. Instead, it's a computed value derived
from p, q, and a random number.
[4]
The size of q does impact security, and higher bit lengths can be useful. However, 160 bits is believed to
offer good security, and the DSA standard currently does not allow for other sizes.
Most libraries should have a type representing a DSA public key with the same basic fields. We'll
cover OpenSSL's API; other APIs should be similar.
OpenSSL defines a DSA object that can represent both the private key and the public key in one
structure. Here's the interesting subset of the declaration:
typedef struct {
BIGNUM *p, *q, *g, *pub_key, *priv_key;
} DSA;
The function DSA_generate_parameters( ) will allocate a DSA object and generate a set of
parameters. The new DSA object that it returns can be destroyed with the function DSA_free( ).
DSA *DSA_generate_parameters(int bits, unsigned char *seed, int seed_len,
int *counter_ret, unsigned long *h_ret,
void (*callback)(int, int, void *), void *cb_arg);
This function has the following arguments:
bits
Size in bits of the prime number to be generated. This value must be a multiple of 64. The DSA
standard only allows values up to 1,024, but it's somewhat common to use larger sizes
anyway, and OpenSSL supports that.
seed
Optional buffer containing a starting point for the prime number generation algorithm. It
doesn't seem to speed anything up; we recommend setting it to NULL.
seed_len
If the starting point buffer is not specified as NULL, this is the length in bytes of that buffer. If
the buffer is specified as NULL, this should be specified as 0.
counter_ret
Optional argument that, if not specified as NULL, will have the number of iterations the function
went through to find suitable primes for p and q stored in it.
h_ret
Optional argument that, if not specified as NULL, will have the number of iterations the function
went through to find a suitable generator stored in it.
callback
Pointer to a function that will be called by BN_generate_prime( ) to report status when
generating the primes p and q. It may be specified as NULL, in which case no progress will be
reported. See Recipe 7.4 for a discussion of BN_generate_prime( ).
cb_arg
Application-specific value that will be passed directly to the callback function for progress
reporting if one is specified.
Note that DSA_generate_parameters( ) does not generate an actual key pair. Parameter sets can
be reused across multiple users; key pairs cannot. An OpenSSL DSA object with the parameters set
properly can be used to generate a key pair with the function DSA_generate_key( ), which will
allocate and load BIGNUM objects for the pub_key and priv_key fields. It returns 1 on success.
int DSA_generate_key(DSA *ctx);
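As a concrete illustration, here is a minimal sketch (ours, not from the original text) that generates a
fresh set of 1,024-bit parameters and a key pair in one step. Error handling is kept to the bare minimum,
and the spc_DSA_generate( ) name is ours.
#include <openssl/dsa.h>

DSA *spc_DSA_generate(void) {
  DSA *dsa;

  /* No seed, no counter or h outputs, and no progress callback. */
  if (!(dsa = DSA_generate_parameters(1024, 0, 0, 0, 0, 0, 0))) return 0;
  if (!DSA_generate_key(dsa)) {   /* Fills in the pub_key and priv_key fields. */
    DSA_free(dsa);
    return 0;
  }
  return dsa;
}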
With OpenSSL, there is an optional precomputation step to DSA signing. Basically, for each message
you sign, DSA requires you to select a random value and perform some expensive math operations
on that value. You can do this precomputation before there's actually data to sign, or you can wait
until you have data to sign, which will slow down the signature process.
To maintain security, the results of precomputation can only be used for a
single signature. You can precompute again before the next signature, though.
DSA signature precomputation is a two-step process. First, you use DSA_sign_setup( ), which will
actually perform the precomputation of two values, kinv and r:
int DSA_sign_setup(DSA *dsa, BN_CTX *ctx, BIGNUM **kinvp, BIGNUM **rp);
This function has the following arguments:
dsa
Context object containing the parameters and the private key that will be used for signing.
ctx
Optional BN_CTX object that will be used for scratch space (see Recipe 7.4). If it is specified as
NULL, DSA_sign_setup( ) will internally create its own BN_CTX object and free it before
returning.
kinvp
Pointer to a BIGNUM object, which will receive the precomputed kinv value. If the BIGNUM object
is specified as NULL (in other words, a pointer to NULL is specified), a new BIGNUM object will be
automatically allocated. In general, it's best to let OpenSSL allocate the BIGNUM object for you.
rp
Pointer to a BIGNUM object, which will receive the precomputed r value. If the BIGNUM object is
specified as NULL (in other words, a pointer to NULL is specified), a new BIGNUM object will be
automatically allocated. In general, it's best to let OpenSSL allocate the BIGNUM object for you.
The two values computed by the call to DSA_sign_setup( ) must then be stored in the DSA object.
DSA_sign_setup( ) does not automatically store the precomputed values in the DSA object so that a
large number of precomputed values may be stored up during idle cycles and used as needed.
Ideally, OpenSSL would provide an API for storing the precomputed values in a DSA object without
having to directly manipulate the members of the DSA object, but it doesn't. The BIGNUM object
returned as kinvp must be assigned to the kinv member of the DSA object, and the BIGNUM object
returned as rp must be assigned to the r member of the DSA object. The next time a signature is
generated with the DSA object, the precomputed values will be used and freed so that they're not
used again.
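The following sketch (ours, not from the original text) shows one way to perform the precomputation
during idle time and store the results in the DSA object, following the convention just described. The
spc_DSA_precompute( ) name is ours.
#include <openssl/dsa.h>

int spc_DSA_precompute(DSA *dsa) {
  BIGNUM *kinv = 0, *r = 0;

  /* Let OpenSSL allocate the BIGNUM objects and create its own scratch BN_CTX. */
  if (!DSA_sign_setup(dsa, 0, &kinv, &r)) return 0;
  dsa->kinv = kinv;   /* Used and freed by the next call to DSA_sign( ). */
  dsa->r    = r;
  return 1;
}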
Whether or not you've performed the precomputation step, generating a signature with OpenSSL is
done in a uniform way by calling DSA_sign( ), which maps directly to the RSA equivalent (see Recipe
7.12):
int DSA_sign(int md_type, const unsigned char *dgst, int dlen, unsigned char *sig,
unsigned int *siglen, DSA *dsa);
This function has the following arguments:
md_type
OpenSSL-specific identifier for the hash function. It is always ignored because DSA mandates
the use of SHA1. For that reason, you should always specify NID_sha1, which is defined in the
header file openssl/objects.h.
dgst
Buffer containing the digest to be signed. The digest should have been generated by the
algorithm specified by the md_type argument, which for DSA must always be SHA1.
dlen
Length in bytes of the digest buffer. For SHA1, it should always be 20 bytes.
sig
Buffer into which the generated signature will be placed.
siglen
The number of bytes written into the signature buffer will be placed in the integer pointed to by
this argument. The number of bytes will always be the same size as the prime parameter q,
which can be determined by calling DSA_size( ) with the DSA object that will be used to
generate the signature.
dsa
DSA object to be used to generate the signature. The DSA object must contain the parameters
and the private key for signing.
Here's a slightly higher-level function that wraps the DSA_sign( ) function, signing an arbitrary
message:
#include <openssl/dsa.h>
#include <openssl/sha.h>
#include <openssl/objects.h>
int spc_DSA_sign(unsigned char *msg, int msglen, unsigned char *sig, DSA *dsa) {
unsigned int ignored;
unsigned char hash[20];
if (!SHA1(msg, msglen, hash)) return 0;
return DSA_sign(NID_sha1, hash, 20, sig, &ignored, dsa);
}
Verification of a signature is done with the function DSA_verify( ):
int DSA_verify(int type, unsigned char *md, int mdlen, unsigned char *sig,
int siglen, DSA *dsa);
The arguments for DSA_verify( ) are essentially the same as the arguments for DSA_sign( ). The
DSA object must contain the public key of the signer, and the fourth argument, sig, must contain the
signature that is to be verified. Unlike with DSA_sign( ), it actually makes sense to pass in the
length of the signature because it saves the caller from having to check to see if the signature is of
the proper length. Nonetheless, DSA_verify( ) could do without the first argument, and it could
hash the message for you. Here's our wrapper for it:
#include <openssl/dsa.h>
#include <openssl/sha.h>
#include <openssl/objects.h>
int spc_DSA_verify(unsigned char *msg, int msglen, unsigned char *sig, int siglen,
DSA *dsa) {
unsigned char hash[20];
if (!SHA1(msg, msglen, hash)) return 0;
return DSA_verify(NID_sha1, hash, 20, sig, siglen, dsa);
}
7.15.4 See Also
NIST web site: http://www.nist.gov/
Recipe 7.4, Recipe 7.11, Recipe 8.18
7.16 Representing Public Keys and Certificates in Binary
(DER Encoding)
7.16.1 Problem
You want to represent a digital certificate or some other cryptographic primitive in a standard binary
format, either for signing or for storing to disk.
7.16.2 Solution
There is an industry-standard way to represent cryptographic objects in binary, but it isn't very
pretty at all. (You need to use this standard if you want to programmatically sign an X.509 certificate
in a portable way.) We strongly recommend sticking to standard APIs for encoding and decoding
instead of writing your own encoding and decoding routines.
When storing data on disk, you may want to use a password to encrypt the DER-encoded
representation, as discussed in Recipe 4.10.
7.16.3 Discussion
ASN.1 is a language for specifying the fields a data object must contain. It's similar in purpose to XML
(which it predates). Cryptographers use ASN.1 extensively for defining precise descriptions of data.
For example, the definition of X.509 certificates is specified in the language. If you look at that
specification, you can clearly see which parts of the certificate are optional and which are required,
and see important properties of all of the fields.
ASN.1 is supposed to be a high-level specification of data. By that, we mean that there could be a
large number of ways to translate ASN.1 data objects into a binary representation. That is, data may
be represented however you want it to be internal to your applications, but if you want to exchange
data in a standard way, you need to be able to go back and forth from your internal representation to
some sort of standard representation. An ASN.1 representation can be encoded in many ways,
though!
The cryptographic community uses distinguished encoding rules (DER) to specify how to map an
ASN.1 specification of a data object to a binary representation. That is, if you look at the ASN.1
specification of an X.509 certificate, and you have all the data ready to go into the certificate, you
can use DER and the ASN.1 specification to encode the data into an interoperable binary
representation.
ASN.1 specifications of data objects can be quite complex. In particular, the specification for X.509v3
is vast because X.509v3 is a highly versatile certificate format. If you plan on reading and writing
DER-encoded data on your own instead of using a cryptographic library, we recommend using an
ASN.1 "compiler" that can take an ASN.1 specification as input and produce C data structures and
routines that encode and parse data in a DER-encoded format. The Enhanced SNACC ASN.1 compiler
is available under the GNU GPL from http://www.getronicsgov.com/hot/snacc_lib.htm.
If you need to do sophisticated work with certificates, you may want to look at the freeware
Certificate Management Library, available from http://www.getronicsgov.com/hot/cml_home.htm. It
handles most operations you can perform on X.509 certificates, including retrieving certificates from
LDAP databases.
Here, we'll show you the OpenSSL APIs for DER-encoding data objects and for converting binary data
into OpenSSL data types. All of the functions in the OpenSSL API either convert OpenSSL's internal
representation to a DER representation (the i2d functions) or convert DER into the internal
representation (the d2i functions).
The basic i2d functions output to memory and take two arguments: the object to convert to DER and
a buffer into which to write the result. The second argument is a pointer to a buffer of unsigned
characters, represented as unsigned char **. That is, if you are outputting into an unsigned char
*x, where x doesn't actually hold the string, but holds the address in memory where that string
starts, you need to pass in the address of x.
OpenSSL requires you to pass in a pointer to a pointer because it takes your
actual pointer and "advances" it. We don't like this feature and have never
found it useful. In general, you should copy over the pointer to your buffer into
a temporary variable, then send in the address of the temporary variable.
Note that you need to know how big a buffer to pass in as the second parameter. To figure that out,
call the function with a NULL value as the second argument. That causes the function to calculate and
return the size.
For example, here's how to DER-encode an RSA public key:
#include <openssl/rsa.h>
/* Returns the malloc'd buffer, and puts the size of the buffer into the integer
* pointed to by the second argument.
*/
unsigned char *DER_encode_RSA_public(RSA *rsa, int *len) {
unsigned char *buf, *next;
*len = i2d_RSAPublicKey(rsa, 0);
if (!(buf = next = (unsigned char *)malloc(*len))) return 0;
i2d_RSAPublicKey(rsa, &next); /* If we use buf here, return buf; becomes wrong */
return buf;
}
For each basic function in the i2d API, there are two additional functions-implemented as
macros-that output to a FILE object or an OpenSSL BIO object, which is the library's generic IO
abstraction.[5] The name of the base function is suffixed with _fp or _bio as appropriate, and the
second argument changes to a FILE or a BIO pointer as appropriate.
[5]
There are three exceptions to this rule, having to do with the OpenSSL EVP interface. We don't discuss (or
even list) the functions here, because we don't cover the OpenSSL EVP interface (it's not a very good
abstraction of anything in our opinion). If you do want to look at this interface, it's covered in the book Network
Security with OpenSSL.
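For example, the following sketch (ours) uses the _fp variant to write the DER encoding of an RSA public
key straight to an open FILE object; it returns nonzero on success.
#include <stdio.h>
#include <openssl/rsa.h>

int DER_write_RSA_public_fp(FILE *fp, RSA *rsa) {
  /* The _fp variant writes directly to the FILE; no buffer management needed. */
  return i2d_RSAPublicKey_fp(fp, rsa);
}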
The d2i API converts DER-encoded data to an internal OpenSSL representation. The functions in this
API take three arguments. The first is a pointer to a pointer to the appropriate OpenSSL object (for
example, an RSA ** instead of the expected RSA *). The second is a pointer to a pointer to the buffer
storing the representation (i.e., a char ** instead of a char *). The third is the input length of the
buffer (a long int). The first two arguments are pointers to pointers because OpenSSL "advances"
your pointer just as it does in the i2d API.
The return value is a pointer to the object written. However, if the object cannot be decoded
successfully (i.e., if there's an error in the encoded data stream), a NULL value will be returned. The
first argument may be a NULL value, in which case an object of the appropriate type is allocated and
returned.
Here's an example of converting an RSA public key from DER format to OpenSSL's internal
representation:
#include <openssl/rsa.h>
/* Note that the pointer to the buffer gets copied in. Therefore, when
* d2i_... changes its value, those changes aren't reflected in the caller's copy
* of the pointer.
*/
RSA *DER_decode_RSA_public(unsigned char *buf, long len) {
return d2i_RSAPublicKey(0, &buf, len);
}
As with the i2d interface, all of the functions have macros that allow you to pass in a FILE or an
OpenSSL BIO object, this time so that you may use one as the input source. Those macros take only
two arguments, where the base function takes three. The first argument is the BIO or FILE pointer
from which to read. The second argument is a pointer to a pointer to the output object (for example,
an RSA **). Again, you can pass in a NULL value for this argument. The len argument is omitted; the
library figures it out for itself. It could have figured it out for itself in the base API, but it requires you
to pass in the length so that it may ensure that it doesn't read or write past the bounds of your
buffer.
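Here is a corresponding sketch (ours) for the input direction: it reads a DER-encoded RSA public key from
an open FILE object, letting OpenSSL allocate the RSA object, and returns NULL on a decoding error.
#include <stdio.h>
#include <openssl/rsa.h>

RSA *DER_read_RSA_public_fp(FILE *fp) {
  /* Passing NULL as the second argument makes OpenSSL allocate the RSA object. */
  return d2i_RSAPublicKey_fp(fp, 0);
}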
Table 7-3 lists the most prominent things you can convert to DER and back. The last two rows
enumerate calls that are intended for people implementing actual infrastructure for a PKI, and they
will not generally be of interest to the average developer applying cryptography.[6]
[6]
However, PKCS #7 can be used to store multiple certificates in one data object, which may be appealing to
some, instead of DER-encoding multiple X.509 objects separately.
Table 7-3. Objects that can be converted to and from DER format
Kind of object               OpenSSL object type   Base encoding function   Base decoding function   Header file
RSA public key               RSA                   i2d_RSAPublicKey()       d2i_RSAPublicKey()       openssl/rsa.h
RSA private key              RSA                   i2d_RSAPrivateKey()      d2i_RSAPrivateKey()      openssl/rsa.h
Diffie-Hellman parameters    DH                    i2d_DHparams()           d2i_DHparams()           openssl/dh.h
DSA parameters               DSA                   i2d_DSAparams()          d2i_DSAparams()          openssl/dsa.h
DSA public key               DSA                   i2d_DSAPublicKey()       d2i_DSAPublicKey()       openssl/dsa.h
DSA private key              DSA                   i2d_DSAPrivateKey()      d2i_DSAPrivateKey()      openssl/dsa.h
X.509 certificate            X509                  i2d_X509()               d2i_X509()               openssl/x509.h
X.509 CRL                    X509_CRL              i2d_X509_CRL()           d2i_X509_CRL()           openssl/x509.h
PKCS #10 certificate
  signing request            X509_REQ              i2d_X509_REQ()           d2i_X509_REQ()           openssl/x509.h
PKCS #7 container            PKCS7                 i2d_PKCS7()              d2i_PKCS7()              openssl/x509.h
7.16.4 See Also
Enhanced SNACC ASN.1 compiler: http://www.getronicsgov.com/hot/snacc_lib.htm
Certificate Management Library: http://www.getronicsgov.com/hot/cml_home.htm
Recipe 4.10
7.17 Representing Keys and Certificates in Plaintext (PEM
Encoding)
7.17.1 Problem
You want to represent cryptographic data such as public keys or certificates in a plaintext format, so
that you can use it in protocols that don't accept arbitrary binary data. This may include storing an
encrypted version of a private key.
7.17.2 Solution
The PEM format represents DER-encoded data in a printable format. Traditionally, PEM encoding
simply base64-encodes DER-encoded data and adds a simple header and footer. OpenSSL provides
an API for such functionality that handles the DER encoding and header writing for you.
OpenSSL has introduced extensions for using encrypted DER representations, allowing you to use
PEM to store encrypted private keys and other cryptographic data in ASCII format.
7.17.3 Discussion
Privacy Enhanced Mail (PEM) is the original encrypted email standard. Although the standard is long
dead, a small subset of its encoding mechanism has managed to survive.
In today's day and age, PEM-encoded data is usually just DER-encoded data with a header and
footer. The header is a single line consisting of five dashes followed by the word "BEGIN", followed by
anything. The data following the word "BEGIN" is not really standardized. In some cases, there might
not be anything following this word. However, if you are using the OpenSSL PEM outputting routines,
there is a textual description of the type of data object encoded. For example, OpenSSL produces the
following header line for an RSA private key:
-----BEGIN RSA PRIVATE KEY-----
This is a good convention, and one that is widely used.
The footer has the same format, except that "BEGIN" is replaced with "END". You should expect that
anything could follow. Again, OpenSSL uses a textual description of the content.
In between the two lines is a base64-encoded DER representation, which may contain line breaks
(\r\n, often called CRLFs for "carriage return and line feed"), which get ignored. We cover base64 in
Recipe 4.5 and Recipe 4.6, and DER encoding in Recipe 7.16.
If you want to encrypt a DER object, the original PEM format supported that as well, but no one uses
these extensions today. OpenSSL does implement something similar. First, we'll describe what
OpenSSL does, because this will offer compatibility with applications built with OpenSSL that use this
format-most notably Apache with mod_ssl. Next, we'll demonstrate how to use OpenSSL's PEM API
directly.
We'll explain this format by walking through an example. Here's a PEM-encoded, encrypted RSA
private key:
-----BEGIN RSA PRIVATE KEY-----
Proc-Type: 4,ENCRYPTED
DEK-Info: DES-EDE3-CBC,F2D4E6438DBD4EA8

LjKQ2r1Yt9foxbHdLKZeClqZuzN7PoEmy+b+dKq9qibaH4pRcwATuWt4/Jzl6y85
NHM6CM4bOV1MHkyD01tFsT4kJ0GwRPg4tKAiTNjE4Yrz9V3rESiQKridtXMOToEp
Mj2nSvVKRSNEeG33GNIYUeMfSSc3oTmZVOlHNp9f8LEYWNmIjfzlHExvgJaPrixX
QiPGJ6K05kV5FJWRPET9vI+kyouAm6DBcyAhmR80NYRvaBbXGM/MxBgQ7koFVaI5
zoJ/NBdEIMdHNUh0h11GQCXAQXOSL6Fx2hRdcicm6j1CPd3AFrTt9EATmd4Hj+D4
91jDYXElALfdSbiO0A9Mz6USUepTXwlfVV/cbBpLRz5Rqnyg2EwI2tZRU+E+Cusb
/b6hcuWyzva895YMUCSyDaLgSsIqRWmXxQV1W2bAgRbs8jD8VF+G9w==
-----END RSA PRIVATE KEY-----
The first line is as discussed at the beginning of this section. Table 7-4 lists the most useful values for
the data type specified in the first and last line. Other values can be found in openssl/pem.h.
Table 7-4. PEM header types
Name                       Comments
RSA PUBLIC KEY             -
RSA PRIVATE KEY            -
DSA PUBLIC KEY             -
DSA PRIVATE KEY            -
DH PARAMETERS              Parameters for Diffie-Hellman key exchange
CERTIFICATE                An X.509 digital certificate
TRUSTED CERTIFICATE        A fully trusted X.509 digital certificate
CERTIFICATE REQUEST        A PKCS #10 certificate signing request
X509 CRL                   An X.509 certificate revocation list
SSL SESSION PARAMETERS     -
The header line is followed by three lines that look like MIME headers. Do not treat them as MIME
headers, though. Yes, the base64-encrypted text is separated from the header information by a line
with nothing on it (two CRLFs). However, you should assume that there is no real flexibility in the
headers. You should have either the two headers that are there, or nothing (and if you're not
including headers, be sure to remove the blank line). In addition, the headers should be in the order
shown above, and they should have the same comma-separated fields.
As far as we can determine, the second line must appear exactly as shown above for OpenSSL
compatibility. There's some logic in OpenSSL to handle two other options that would add an integrity-checking value to the data being encoded, but it appears that the OpenSSL team never actually
finished a full implementation, so these other options aren't used (it's left over from a time when the
OpenSSL implementers were concerned about compliance with the original PEM RFCs). The first
parameter on the "DEK-Info" line (where DEK stands for "data encrypting key") contains an ASCII
representation of the algorithm used for encryption, which should always be a CBC-based mode.
Table 7-5 lists the identifiers OpenSSL currently supports.
Table 7-5. PEM encryption algorithms supported by OpenSSL
Cipher                                 String
AES with 128-bit keys                  AES-128-CBC
AES with 192-bit keys                  AES-192-CBC
AES with 256-bit keys                  AES-256-CBC
Blowfish                               BF-CBC
CAST5                                  CAST-CBC
DES                                    DES-CBC
DESX                                   DESX
2-key Triple-DES                       DES-EDE-CBC
3-key Triple-DES                       DES-EDE3-CBC
IDEA                                   IDEA-CBC
RC2                                    RC2-CBC
RC5 with 128-bit keys and 12 rounds    RC5-CBC
The part of the DEK-Info field after the comma is a CBC initialization vector (which should be
randomly generated), represented in uppercase hexadecimal.
The way encrypted PEM representations work in OpenSSL is as follows:
1. The data is DER-encoded.
2. The data is encrypted using a key that isn't specified anywhere (i.e., it's not placed in the
headers, for obvious reasons). Usually, the user must type in a password to derive an
encryption key. (See Recipe 4.10.[7]) The key-from-password functionality has the initialization
vector double as a salt value, which is probably okay.
[7]
OpenSSL uses PKCS #5 Version 1.5 for key derivation. PKCS #5 is an earlier version of the algorithm
described in Recipe 4.10. MD5 is used as the hash algorithm with an iteration count of 1. There are some
differences between PKCS #5 Version 1.5 and Version 2.0. If you don't care about OpenSSL compatibility,
you should definitely use Version 2.0 (the man pages even recommend it).
3. The encrypted data is base64-encoded.
The OpenSSL API for PEM encoding and decoding (include openssl/pem.h) only allows you to operate
on FILE or OpenSSL BIO objects, which are the generic OpenSSL IO abstraction. If you need to
output to memory, you can either use a memory BIO or get the DER representation and encode it by
hand.
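For instance, the following sketch (ours, jumping ahead to the PEM_write_bio_... functions described
below) PEM-encodes an RSA public key into a memory BIO and copies the result into a malloc'd,
NUL-terminated string that the caller must free; it returns NULL on failure.
#include <stdlib.h>
#include <string.h>
#include <openssl/bio.h>
#include <openssl/pem.h>
#include <openssl/rsa.h>

char *spc_PEM_encode_RSA_public(RSA *rsa) {
  long len;
  char *pem = 0, *data;
  BIO  *bio;

  if (!(bio = BIO_new(BIO_s_mem( )))) return 0;
  if (PEM_write_bio_RSAPublicKey(bio, rsa)) {
    /* BIO_get_mem_data( ) returns the length and points data at the BIO's buffer. */
    len = BIO_get_mem_data(bio, &data);
    if ((pem = (char *)malloc(len + 1)) != 0) {
      memcpy(pem, data, len);
      pem[len] = 0;
    }
  }
  BIO_free(bio);
  return pem;
}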
The BIO API and the FILE API are similar. The BIO API changes the name of each function in a
predictable way, and the first argument to each function is a pointer to a BIO object instead of a FILE
object. The object type on which you're operating is always the second argument to a PEM function
when outputting PEM. When reading in data, pass in a pointer to a pointer to the encoded object. As
with the DER functions described in Recipe 7.16, OpenSSL increments this pointer.
All of the PEM functions are highly regular. All the input functions and all the output functions take
the same arguments and have the same signature, except that the second argument changes type
based on the type of data object with which you're working. For example, the second argument to
PEM_write_RSAPrivateKey( ) will be an RSA object pointer, whereas the second argument to
PEM_write_DSAPrivateKey( ) will be a DSA object pointer.
We'll show you the API by demonstrating how to operate on RSA private keys. Then we'll provide a
table that gives you the relevant functions for other data types.
Here's the signature for PEM_write_RSAPrivateKey( ):
int PEM_write_RSAPrivateKey(FILE *fp, RSA *obj, EVP_CIPHER *enc,
unsigned char *kstr, int klen,
pem_password_cb callback, void *cb_arg);
This function has the following arguments:
fp
Pointer to the open file for output.
obj
RSA object that is to be PEM-encoded.
enc
Optional argument that, if not specified as NULL, is the EVP_CIPHER object for the symmetric
encryption algorithm (see Recipe 5.17 for a list of possibilities) that will be used to encrypt the
data before it is base64-encoded. It is a bad idea to use anything other than a CBC-based
cipher.
kstr
Buffer containing the key to be used to encrypt the data. If the data is not encrypted, this
argument should be specified as NULL. Even if the data is to be encrypted, this buffer may be
specified as NULL, in which case the key to use will be derived from a password or passphrase.
klen
If the key buffer is not specified as NULL, this specifies the length of the buffer in bytes. If the
key buffer is specified as NULL, this should be specified as 0.
callback
If the data is to be encrypted and the key buffer is specified as NULL, this specifies a pointer to
a function that will be called to obtain the password or passphrase used to derive the
encryption key. It may be specified as NULL, in which case OpenSSL will query the user for the
password or passphrase to use.
cb_arg
If a callback function is specified to obtain the password or passphrase for key derivation, this
application-specific value is passed directly to the callback function.
If encryption is desired, OpenSSL will use PKCS #5 Version 1.5 to derive an encryption key from a
password. This is an earlier version of the algorithm described in Recipe 4.10.
This function will return 1 if the encoding is successful, 0 otherwise (for example, if the underlying file
is not open for writing).
The type pem_password_cb is defined as follows:
typedef int (*pem_password_cb)(char *buf, int len, int rwflag, void *cb_arg);
It has the following arguments:
buf
Buffer into which the password or passphrase is to be written.
len
Length in bytes of the password or passphrase buffer.
rwflag
Indicates whether the password is to be used for encryption or decryption. For encryption
(when writing out data in PEM format), the argument will be 1; otherwise, it will be 0.
cb_arg
This application-specific value is passed in from the final argument to the PEM encoding or
decoding function that caused this callback to be made.
Make sure that you do not overflow buf when writing data into it!
Your callback function is expected to return the number of characters it wrote into buf if it
successfully obtains a password; otherwise, it should return 0 to indicate failure.
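To tie these pieces together, here is a sketch (ours, not from the original text) of a callback that copies
a password supplied through cb_arg, together with a wrapper that writes a private key encrypted with
three-key Triple-DES in CBC mode. The spc_ names are ours, and real code should take more care with how
the password itself is obtained and stored.
#include <stdio.h>
#include <string.h>
#include <openssl/evp.h>
#include <openssl/pem.h>
#include <openssl/rsa.h>

static int spc_password_cb(char *buf, int len, int rwflag, void *cb_arg) {
  int  plen;
  char *password = (char *)cb_arg;   /* rwflag is 1 when encrypting; unused here. */

  plen = strlen(password);
  if (plen >= len) return 0;         /* Never overflow buf! */
  strcpy(buf, password);
  return plen;
}

int spc_PEM_write_encrypted_RSA(FILE *fp, RSA *rsa, char *password) {
  return PEM_write_RSAPrivateKey(fp, rsa, EVP_des_ede3_cbc( ), 0, 0,
                                 spc_password_cb, password);
}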
The function for writing an RSA private key to a BIO object has the following signature, which is
essentially the same as the function for writing an RSA private key to a FILE object. The only
difference is that the first argument is the BIO object to write to instead of a FILE object.
int PEM_write_bio_RSAPrivateKey(BIO *bio, RSA *obj, EVP_CIPHER *enc,
unsigned char *kstr, int klen,
pem_password_cb callback, void *cbarg);
Table 7-6 lists the FILE object-based functions for the most useful PEM-encoding variants.[8] The BIO
object-based functions can be derived by adding _bio_ after read or write.
[8]
The remainder can be found by looking for uses of the IMPLEMENT_PEM_rw macro in the OpenSSL
crypto/pem source directory.
Table 7-6. FILE object-based functions for PEM encoding
Kind of object               Object type   FILE object-based encoding function   FILE object-based decoding function
RSA public key               RSA           PEM_write_RSAPublicKey()              PEM_read_RSAPublicKey()
RSA private key              RSA           PEM_write_RSAPrivateKey()             PEM_read_RSAPrivateKey()
Diffie-Hellman parameters    DH            PEM_write_DHparams()                  PEM_read_DHparams()
DSA parameters               DSA           PEM_write_DSAparams()                 PEM_read_DSAparams()
DSA public key               DSA           PEM_write_DSA_PUBKEY()                PEM_read_DSA_PUBKEY()
DSA private key              DSA           PEM_write_DSAPrivateKey()             PEM_read_DSAPrivateKey()
X.509 certificate            X509          PEM_write_X509()                      PEM_read_X509()
X.509 CRL                    X509_CRL      PEM_write_X509_CRL()                  PEM_read_X509_CRL()
PKCS #10 certificate
  signing request            X509_REQ      PEM_write_X509_REQ()                  PEM_read_X509_REQ()
PKCS #7 container            PKCS7         PEM_write_PKCS7()                     PEM_read_PKCS7()
The last two rows enumerate calls that are intended for people implementing actual infrastructure for
a PKI, and they will not generally be of interest to the average developer applying cryptography.[9]
[9]
PKCS #7 can be used to store multiple certificates in one data object, however, which may be appealing to
some, instead of DER-encoding multiple X.509 objects separately.
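Reading objects back in is just as uniform. For example, this sketch (ours) reads a PEM-encoded RSA
private key from an open FILE object; passing a NULL callback makes OpenSSL prompt for the passphrase if
the key turns out to be encrypted, and NULL is returned on failure.
#include <stdio.h>
#include <openssl/pem.h>
#include <openssl/rsa.h>

RSA *spc_PEM_read_RSA_private(FILE *fp) {
  /* NULL object pointer, NULL callback, NULL callback argument. */
  return PEM_read_RSAPrivateKey(fp, 0, 0, 0);
}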
7.17.4 See Also
Recipe 4.5, Recipe 4.6, Recipe 4.10, Recipe 5.17, Recipe 7.16
Chapter 8. Authentication and Key
Exchange
At first glance, it may not be clear that authentication and key exchange are two topics that go
together. But they do. This chapter is really all about secure connection establishment-everything
the client and server need to do before they start talking. Generally, the server will need to
authenticate the client; the client will need to make sure the server is the correct machine (not some
attacker). Then the two parties will need to come to some agreement on how to communicate
securely beyond that, also agreeing on an encryption key (or a set of keys).
Yes, authentication doesn't always happen over an insecure network connection-it is certainly
possible to authenticate over a console or some other medium where network attacks pose little to
no risk. In the real world, however, it's rare that one can assume a secure channel for authentication.
Nonetheless, many authentication mechanisms need some kind of secure channel, such as an
authenticated SSL connection, before they can offer even reasonable security levels.
In this chapter, we'll sort through these technologies for connection establishment. Note that in these
recipes we cover only standalone technologies for authentication and key exchange. In Chapter 9, we
cover authentication with SSL/TLS, and in Chapter 10, we cover authentication in the context of
public key infrastructures (PKI).
8.1 Choosing an Authentication Method
8.1.1 Problem
You need to perform authentication, and you need to choose an appropriate method.
8.1.2 Solution
The correct method depends on your needs. When a server needs to be authenticated, and the client
does not, SSL/TLS is a popular solution. When mutual authentication is desirable, there are a whole
bevy of options, such as tunneling a traditional protocol over SSL/TLS or using a dedicated protocol.
The best dedicated protocols not only perform mutual authentication but also exchange keys that can
then be used for encryption.
8.1.3 Discussion
An authentication factor is some thing that contributes to establishing an identity. For example, a
password is an authentication factor, as is a driver's license. There are three major categories of
authentication factors:
Things you know
This category generally refers to passwords, PIN numbers, or passphrases. However, there are
systems that are at least partially based on the answers to personal questions (though such
systems are low on the usability scale; they are primarily used to reset forgotten passwords
without intervention from customer service people, in order to thwart social engineering
attacks).
Things you have
ATM cards are common physical tokens that are often implicitly used for authentication. That
is, when you go to an ATM, having the card is one factor in having the ATM accept who you
are. Your PIN by itself is not going to allow someone to get money out in your name.
Things you are
This category generally refers to biometrics such as fingerprints or voice analysis. It includes
things you have that you are not going to lose. Of course, an attacker could mimic your
information in an attempt to impersonate you.
No common authentication factors are foolproof. Passwords tend to be easy to guess. While
cryptography can help keep properly used physical tokens from being forged, they can still be lost or
stolen. And biometric devices today have a significant false positive rate. In addition, it can be simple
to fool biometric devices; see http://www.puttyworld.com/thinputdeffi.html.
In each of these major categories, there are many different technologies. In addition, it is easy to
have a multifactor system in which multiple technologies are required to log in (supporting the
common security principle of defense in depth). Similarly, you can have "either-or" authentication to
improve usability, but that tends to decrease security by opening up new attack vectors.
Clearly, choosing the right technology requires a thorough analysis of requirements for an
authentication system. In this chapter, we'll look at several common requirements, then examine
common technologies in light of those requirements.
However, let us first point out that it is good to build software in such a way that authentication is
implemented as a framework, where the exact requirements can be determined by an operational
administrator instead of a programmer. PAM (Pluggable Authentication Modules) lets you do just
that, at least on the server side, in a client-server system. SASL (Simple Authentication and Security
Layer) is another such technology that tries to push the abstraction that provides plugability off the
server and into the network. We find SASL a large mess and therefore do not cover it here. PAM is
covered in Recipe 8.12.
There are several common and important requirements for authentication mechanisms. Some of
these may be more or less important to you in your particular environment:
Practicality of deployment
This is the reason that password systems are so common even though there are so many
problems with them. Biometrics and physical tokens both require physical hardware and cost
money. When deploying Internet-enabled software, it is generally highly inconvenient to force
users to adopt one of these solutions.
Usability
Usability is a very important consideration. Unfortunately, usability often trades off against
good security. Passwords are a good example: a more secure mechanism would require public
keys to establish identity. Often, the user's private key will be password-protected for defense
in depth, but that only protects against local attacks where an attacker might get access to
steal the key-a well-designed public key-based protocol should not be vulnerable to password-guessing attacks.
Another common usability-related requirement is that the user should not have to bring any
special bits with him to a computer to be able to log in. That is, many people want a user to be
able to sit down at an arbitrary computer and be able to authenticate with data in his head
(e.g., a password), even if it means weaker security. For others, it is not unreasonable to ask
users to carry a public key around.
When passwords are used, there are many different mechanisms to improve security, but most
of them decrease usability. You can, for example, expire passwords, but users hate that.
Alternatively, you can enforce passwords that seem to have sufficient entropy in them (e.g., by
checking against a dictionary of words), but again, users will often get upset with the system.
In many cases, adding something like a public key mechanism adds more security and is less
burdensome than such hacks turn out to be.
Use across applications
For some people, it is important to manage authentication centrally across a series of
applications. In such a situation, authentication should involve a separate server that manages
credentials. Kerberos is the popular technology for meeting this requirement, but a privately
run public key infrastructure can be used to do the same thing.
Patents
Many people also want to avoid any algorithms that are likely to be covered by patent.
Efficiency
Other people may be concerned about efficiency, particularly on a server that might need to
process many connections in a short period of time. In that situation, it could be important to
avoid public key cryptography altogether, or to find some other way to minimize the impact on
the server, to prevent against denial of service.
Common mechanism
It may also be a requirement to have authentication and key exchange be done by the same
mechanism. This can improve ease of development if you pick the right solution.
Economy of expression
An authentication protocol should use a minimal number of messages to do work. Generally,
three messages are considered the target to hit, even when authentication and key exchange
are combined. This is usually not such a big deal, however. A few extra messages generally will
not noticeably impact performance. Protocol designers like to strive to minimize the number of
messages, because it makes their work more elegant and less ad hoc. Of course, simplicity
should be a considered requirement, but then again, we have seen simple five-message
protocols, and ridiculously complex three-message protocols!
Security
Security is an obvious requirement at the highest level, but there are many different security
properties you might care about, as we'll describe in the rest of this section.
In terms of the security of your mechanism, you might require a mechanism that effectively provides
its own secure channel, resisting sniffing attacks, man-in-the-middle attacks, and so on that might
lead to password compromise, or even just the attacker's somehow masquerading as either the client
or server without compromising the password. (This could happen, for example, if the attacker
manages to get the server password database.)
On the other hand, you might want to require something that does not build its own secure channel.
For example, if you are writing something that will be used only on the console, you will already be
assuming a trusted path from the user to your code, so why bother building a secure channel?
Similarly, you might already be able to establish an authenticated remote connection to a server
through something like SSL, in which case you get a secure channel over which you can do a simpler
authentication protocol. (Mutual authentication versus one-sided authentication is therefore another
potentially interesting requirement.) Of course, that works only if the server really is authenticated,
which people often fail to do properly.
Whether or not you have a secure channel, you will probably want to make sure that you avoid
capture replay attacks. In addition, you should consider which possible masquerading scenarios worry
you. Obviously, it is bad if an arbitrary person can masquerade as either the client or the server just
from watching network traffic. What if an attacker manages to break into a server, however? Should
the attacker then be able to masquerade as the user to that server? To other servers where the user
has the same credentials (e.g., the same password)?
In addition, when a user shares authentication credentials across multiple servers, should he be able
to distinguish those servers? Such a requirement can demand significant trade-offs, because to meet
it, you will need either a public key infrastructure or some other secure secret that users need to
carry around that authenticates each server. If you are willing to assume that the server is not
compromised at account creation time but may be compromised at some later point, you can meet
the requirement more easily.
We have already mentioned no susceptibility to password guessing attacks as a possible requirement.
When that is too strict, there are other requirements we can impose that are actually reasonable:
When an attacker steals the authentication database on the server, an offline cracking job
should be incredibly difficult-with luck, infeasible, even if the password being attacked is fairly
predictable.
Guessing attacks should be possible only by attempting to authenticate directly with the server,
and the login attempt should not reveal any information about the actual password beyond
whether or not the guess was correct.
There should not be large windows of vulnerability where the server has the password. That is,
the server should need to see the password only at account initialization time, or not at all. It
should always be unacceptable for a server to store the actual password.
No doubt there are other interesting requirements for password systems.
For authentication systems that also do key exchange, there are other interesting requirements you
should consider:
Recoverability from randomness problems
You might want to require that the system be able to recover if either the client or the server
has a bad source of randomness. That is generally done by using a key agreement protocol,
where both sides contribute to the key, instead of a key transport protocol, where one side
selects the key and sends it to the other.
Forward secrecy
You might want to require that an attacker who manages to break one key exchange should
not be able to decrypt old connections, if he happens to capture the data. Achieving this
property often involves some tradeoffs.
Let's look at common technologies in light of these requirements.
8.1.3.1 Traditional UNIX crypt( )
This solution is a single-factor, password-based system. Using it requires a preexisting secure
channel (and one that thwarts capture replay attacks). There are big windows of vulnerability
because the user's password must be sent to the server every time the user wishes to authenticate.
It does not meet any of the desirable security requirements for a password-based system we outlined
above (it is susceptible to offline guessing attacks, for example), and the traditional mechanism is not
even very strong cryptographically. Using this mechanism on an unencrypted channel would expose
the password. Authentication using crypt( ) is covered in Recipe 8.9.
8.1.3.2 MD5 Modular Crypt Format (a.k.a. md5crypt or MD5-MCF)
This function replaces crypt( ) on many operating systems (the API is the same, but it is not
backward-compatible). It makes offline cracking attacks a little harder, and it uses stronger
cryptography. There are extensions to the basic modular format that use other algorithms and
provide better protection against offline guessing; the OpenBSD project's Blowfish-based
authentication mechanism is one. Using this mechanism on an unencrypted channel would expose
the password. Authentication using MD5-MCF is covered in Recipe 8.10.
8.1.3.3 PBKDF2
You can use PBKDF2 (Password-Based Key Derivation Function 2; see Recipe 4.10) as a password
storage mechanism. It meets all the same requirements as the Blowfish variant of MD5-MCF
discussed in the previous subsection. Authentication using PBKDF2 is covered in Recipe 8.11.
8.1.3.4 S/KEY and OPIE
S/KEY and OPIE are one-time password systems, meaning that the end user sends a different
password over the wire each time. This requires the user and the server to preestablish a secret. As a
result, if an attacker somehow gets the secret database (e.g., if he manages to dumpster-dive for an
old backup disk), he can masquerade as the client.
In addition, the user will need to keep some kind of physical token, like a sheet of one-time
passwords (which will occasionally need to be refreshed) or a calculator to compute correct
passwords. To avoid exposing the password if the server database is compromised, the user will also
need to reinitialize the server from time to time (and update her calculator).
These mechanisms do not provide their own secure channel. S/KEY, as specified, relies on MD4,
which is now known to be cryptographically broken. If it's used on an unencrypted channel, no
information about the password is revealed, but an attacker can potentially hijack a connection.
8.1.3.5 CRAM
CRAM (Challenge-Response Authentication Mechanism) is a password-based protocol that avoids
sending the password out over the wire by using a challenge-response protocol, meaning that the
two ends each prove to the other that they have the secret, without someone actually sending the
secret. Therefore, CRAM (which does not itself provide a secure channel) can be used over an
insecure channel. However, it is still subject to a number of password attacks on the server,
particularly because the server must store the actual password. Therefore, you should not use CRAM
in new systems.
8.1.3.6 Digest-Auth (RFC 2617)
Digest-Auth is one of the authentication mechanisms specified for HTTP/1.1 and later (the other is
quite weak). It does not provide a secure channel, and it provides only moderate protections against
attacks on passwords (much of it through an optional nonce that is rarely used).
8.1.3.7 SRP
All of the mechanisms we've looked at so far have been password-based. None of them create their
own secure channel, nor do they provide mutual authentication. SRP (Secure Remote Password) is a
password-based mechanism that does all of the above, and it has a host of other benefits:
Client-server authentication
SRP not only allows a server to authenticate clients, but it also allows clients to know that
they're talking to the right server-as long as the authentication database isn't stolen.
Protection against information leakage
SRP also prevents all but a minimal amount of information leakage. That is, an attacker can try
one password at a time by contacting the server, but that is the only way he can get any
information at all about the password's value. Throttling the number of allowed login attempts
to a few dozen a day should reasonably thwart most attacks, though it opens up a denial of
service risk. You might consider slightly more sophisticated throttling, such as a limit of 12
times a day per IP address. (Of course, even that is not perfect). A far less restrictive method
of throttling failed authentication attempts is discussed in Recipe 8.8.
Protection against compromise
SRP protects against most server-compromise attacks (but not a multiserver masquerading
attack, which we do not think is worth worrying about anyway). It even prevents an attacker
who compromises the server from logging into other machines using information in the
database.
Key exchange
Another big benefit is that SRP exchanges a key as a side effect of authentication. SRP uses
public key cryptography, which can be a denial-of-service issue.
The big problem with SRP is that patents cover it. As a result, we do not explore SRP in depth.
Another potential issue is that this algorithm does not provide forward secrecy, although you could
easily introduce forward secrecy on top of it.
8.1.3.8 Basic public key exchange
There are plenty of strong authentication systems based on public key cryptography. These systems
can meet most of the general requirements we've discussed, depending on how they're implemented.
Generally, the public key is protected by a password, but the password-protected key must be
transported to any client machine the user might wish to use. This is a major reason why people
often implement password-based protocols instead of using public key-based protocols. We discuss a
basic protocol using public key cryptography in Recipe 8.16.
8.1.3.9 SAX
SAX (Symmetric Authenticated eXchange) is a protocol that offers most of the same benefits of SRP,
but it is not covered by patents. Unlike SRP, it does not use public key encryption, which means that
it minimizes computational overhead. There is a masquerading attack in the case of server
compromise, but it effectively requires compromise of two servers and does not buy the attacker any
new capabilities, so it is not very interesting in practice.
SAX has two modes of use:
You can avoid leaking any information about the password if the user is willing to carry around
or memorize a secret provided by the server at account creation time (that secret needs to be
entered into any single client only once, though).
Otherwise, SAX can be used in an SRP-like manner, where the user need not carry around
anything other than the password; information about the password can be learned, but
primarily through guessing attacks. Someone can mount an offline dictionary attack on the
server side, but the cost of such an attack can be made prohibitive.
If an attacker somehow gets the secret database (e.g., if he manages to dumpster-dive for an old
backup disk), he can masquerade as the client. PAX is a similar protocol that fixes this problem.
8.1.3.10 PAX
PAX (Public key Authenticated eXchange) is a basic two-way authenticating key exchange using
public key encryption that uses passwords to generate the keys. The server needs to know the
password once at initialization time, and never again.
This protocol is similar to SAX, but has some minor advantages because it uses public key
cryptography. For example, you can back away from using passwords (for example, you might take
the key and put the client's private key onto a smart card, obviating the need to type in a password
on the client end). Additionally, if an attacker does get the authentication database, he nonetheless
cannot masquerade as the client.
PAX can be used in one of two modes:
You can get all the advantages of a full public-key based system if the user is willing to carry
around or memorize a secret provided by the server at account creation time (that secret needs
to be entered into any single client only once, though).
Otherwise, PAX can be used in an SRP-like manner, where the user need not carry around
anything other than the password; information about the password can be learned, but only
through guessing attacks.
As with SRP, you can easily layer forward secrecy on top of PAX (by adding another layer of
cryptography; see Recipe 8.21).
Unlike SRP, PAX is not believed to be covered by patents.
8.1.3.11 Kerberos
Kerberos is a password-based authentication mechanism that requires a central authentication
server. It does not use any public key cryptography whatsoever, instead relying on symmetric
cryptography for encryption and authentication (typically DES or Triple-DES in CBC mode with MD5
or SHA1 for authentication).
Although Kerberos never transmits passwords in the clear, it does make the assumption that users
will not use weak passwords, which is a poor assumption to make, because users will invariably use
passwords that they find easy to remember. That typically also makes these passwords easy for an
attacker to guess or to discover by way of a dictionary attack.
Kerberos does assume that the environment in which it operates is insecure. It can overcome a
compromised system or network; however, if the system on which its central database resides is
compromised, the security afforded by Kerberos is seriously compromised.
We cover authentication with Kerberos in Recipe 8.13. Because of the complexity of the SSPI API in
Windows, we do not cover Kerberos on Windows in this book. Instead, recipes are available on our
web site.
8.1.3.12 Windows NT LAN Manager (NTLM)
Windows NT LAN Manager is a password-based protocol that avoids sending the password out over
the wire by using a challenge-response protocol, meaning that the two ends each prove to the other
that they have the secret, without someone actually sending the secret. Therefore, NTLM (which does
not itself provide a secure channel) can be used over an insecure channel. However, it is still subject
to a number of password attacks on the server, particularly because the server must store the actual
password.
Windows uses NTLM for network authentication and for interactive authentication on standalone
systems. Beginning with Windows 2000, Kerberos is the preferred network authentication method on
Windows, but NTLM can still be used in the absence of a Kerberos infrastructure.
Because of the complexity of the SSPI API in Windows, we do not cover authentication with NTLM in
this book. Instead, recipes are available on our web site.
8.1.3.13 SSL certificate-based checking
Secure Sockets Layer (SSL) and its successor, Transport Layer Security (TLS), use certificates to
allow entities in a system to identify one another. Certificates are verified using a PKI, in which a mutually
trusted third party vouches for the identity of a certificate holder. See Recipe 10.1 for an introduction
to certificates and PKI.
Certificates are obtained from a trusted third party known as a certification authority (CA), which
digitally signs the certificate with its own private key. If the CA is trusted, and its signature on the
certificate is valid, the certificate can be trusted. Certificates typically also contain other important
pieces of information that must also be verified-for example, validity dates and the name of the
entity that will present the certificate.
To be effective, certificates require the mutually trusted third party. One of the primary problems
with certificates and PKI is one of revocation. If the private key for a certificate is compromised, how
is everyone supposed to know that the certificate should no longer be trusted? CAs periodically
publish lists known as certificate revocation lists (CRLs) that identify all of the certificates that have
been revoked and should no longer be trusted, but it is the responsibility of the party verifying a
certificate to seek out these lists and use them properly. In addition, there is often a significant
window of time between when a CA revokes a certificate and when a new CRL is published.
SSL is widely deployed and works sufficiently well for many applications; however, because it is
difficult to use properly, it is often deployed insecurely. We discuss certificate verification in Recipe
10.4 through Recipe 10.7.
8.1.4 See Also
Thinking Putty article on defeating biometric fingerprint scanners:
http://www.puttyworld.com/thinputdeffi.html
RFC 1510: The Kerberos Network Authentication Service (V5)
RFC 2617: HTTP Authentication: Basic and Digest Access Authentication
Recipe 4.10, Recipe 8.8, Recipe 8.9, Recipe 8.10, Recipe 8.11, Recipe 8.12, Recipe 8.13, Recipe
8.16, Recipe 8.21, Recipe 10.1, Recipe 10.4, Recipe 10.5, Recipe 10.6, Recipe 10.7
8.2 Getting User and Group Information on Unix
8.2.1 Problem
You need to discover information about a user or group, and you have a username or user ID or a
group name or ID.
8.2.2 Solution
On Unix, user and group names correspond to numeric identifiers. Most system calls require numeric
identifiers upon which to operate, but names are typically easier for people to remember. Therefore,
most user interactions involve the use of names rather than numbers. The standard C runtime library
provides several functions to map between names and numeric identifiers for both groups and users.
8.2.3 Discussion
Declarations for the functions and data types needed to map between names and numeric identifiers
for users are in the header file pwd.h. Strictly speaking, mapping functions do not actually exist.
Instead, one function provides the ability to look up user information using the user's numeric
identifier, and another function provides the ability to look up user information using the user's name.
The function used to look up user information by numeric identifier has the following signature:
#include <sys/types.h>
#include <pwd.h>
struct passwd *getpwuid(uid_t uid);
The function used to look up user information by name has the following signature:
#include <sys/types.h>
#include <pwd.h>
struct passwd *getpwnam(const char *name);
Both functions return a pointer to a structure allocated internally by the runtime library. One side
effect of this behavior is that successive calls replace the information from the previous call. Another
is that the functions are not thread-safe. If either function fails to find the requested user
information, a NULL pointer is returned.
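If you need to perform lookups from multiple threads, most modern Unix systems also provide the
reentrant variants getpwnam_r( ) and getpwuid_r( ), which write their results into caller-supplied
storage. The following is a minimal sketch of a reentrant name-to-UID lookup; the function name
spc_user_getuid_r( ) and the fixed 4,096-byte buffer are our own choices for illustration
(sysconf(_SC_GETPW_R_SIZE_MAX) can be used to size the buffer instead).
#include <sys/types.h>
#include <pwd.h>

/* Reentrant lookup of a user's numeric ID by name using getpwnam_r( ).
 * Returns 0 on success, -1 if the user does not exist or an error occurs.
 */
int spc_user_getuid_r(const char *name, uid_t *uid) {
  char          buf[4096];  /* illustrative size; see sysconf(_SC_GETPW_R_SIZE_MAX) */
  struct passwd pw, *result;

  if (getpwnam_r(name, &pw, buf, sizeof(buf), &result) != 0 || !result)
    return -1;
  *uid = result->pw_uid;
  return 0;
}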
The contents of the passwd structure differ across platforms, but some fields remain the same
everywhere. Of particular interest to us in this recipe are the two fields pw_name and pw_uid. These
two fields are what enable mapping between names and numeric identifiers. For example, the
following two functions will obtain mappings:
#include <sys/types.h>
#include <pwd.h>
#include <string.h>
int spc_user_getname(uid_t uid, char **name) {
struct passwd *pw;
if (!(pw = getpwuid(uid)) ) {
endpwent( );
return -1;
}
*name = strdup(pw->pw_name);
endpwent( );
return 0;
}
int spc_user_getuid(char *name, uid_t *uid) {
struct passwd *pw;
if (!(pw = getpwnam(name))) {
endpwent( );
return -1;
}
*uid = pw->pw_uid;
endpwent( );
return 0;
}
Note that spc_user_getname( ) will dynamically allocate a buffer to return the user's name, which
must be freed by the caller. Also notice the use of the function endpwent( ). This function frees any
resources allocated by the lookup functions. Its use is important because failure to free the resources
can cause unexpected leaking of memory, file descriptors, socket descriptors, and so on. Exactly
which resources may be leaked varies with the underlying implementation, which may differ not
only from platform to platform, but also from installation to installation.
In our example code, we call endpwent( ) after every lookup operation, but this isn't necessary if
you need to perform multiple lookups. In fact, if you know you will be performing a large number of
lookups, always calling endpwent( ) after each one is wasteful. Any number of lookup operations
may be performed safely before eventually calling endpwent( ).
Looking up group information is similar to looking up user information. The header file grp.h contains
the declarations for the needed functions and data types. Two functions similar to getpwnam( ) and
getpwuid( ) also exist for groups:
#include <sys/types.h>
#include <grp.h>
struct group *getgrgid(gid_t gid);
struct group *getgrnam(const char *name);
These two functions behave as their user counterparts do. Thus, we can use them to perform
name-to-numeric-identifier mappings, and vice versa. Just as user information lookups require a call to
endpwent( ) to clean up any resources allocated during the lookup, group information lookups
require a call to endgrent( ) to do the same.
#include <sys/types.h>
#include <grp.h>
#include <string.h>
int spc_group_getname(gid_t gid, char **name) {
struct group *gr;
if (!(gr = getgrgid(gid))) {
endgrent( );
return -1;
}
*name = strdup(gr->gr_name);
endgrent( );
return 0;
}
int spc_group_getgid(char *name, gid_t *gid) {
struct group *gr;
if (!(gr = getgrnam(name))) {
endgrent( );
return -1;
}
*gid = gr->gr_gid;
endgrent( );
return 0;
}
Groups may contain more than a single user. Theoretically, groups may contain any number of
members, but be aware that some implementations may impose artificial limits on the number of
users that may belong to a group.
The group structure that is returned by either getgrnam( ) or getgrgid( ) contains a field called
gr_mem that is an array of strings containing the names of all the member users. The last element in
the array will always be a NULL pointer. Determining whether a user is a member of a group is a
simple matter of iterating over the elements in the array, comparing each one to the name of the
user for which to look:
#include <sys/types.h>
#include <grp.h>
#include <string.h>
int spc_group_ismember(char *group_name, char *user_name) {
  int          i;
  struct group *gr;

  if (!(gr = getgrnam(group_name))) {
    endgrent( );
    return 0;
  }
  for (i = 0;  gr->gr_mem[i];  i++)
    if (!strcmp(user_name, gr->gr_mem[i])) {
      endgrent( );
      return 1;
    }
  endgrent( );
  return 0;
}
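As a brief usage sketch (the user and group names here are purely illustrative), the functions from
this recipe can be combined as follows:
#include <stdio.h>

void show_lookup_example(void) {
  uid_t uid;

  if (spc_user_getuid("nobody", &uid) != -1)
    printf("user nobody has uid %lu\n", (unsigned long)uid);
  if (spc_group_ismember("wheel", "nobody"))
    printf("nobody is a member of group wheel\n");
}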
[ Team LiB ]
[ Team LiB ]
8.3 Getting User and Group Information on Windows
8.3.1 Problem
You need to discover information about a user or group, and you have a username or user ID or a
group name or ID.
8.3.2 Solution
Windows identifies users and groups using security identifiers (SIDs), which are unique, variably
sized values assigned by an authority such as the local machine or a Windows NT server domain.
Functions and data structures typically represent users and groups using SIDs, rather than using
names.
The Win32 API provides numerous functions for manipulating SIDs, but of particular interest to us in
this recipe are the functions LookupAccountName( ) and LookupAccountSid( ), which are used to
map between names and SIDs.
8.3.3 Discussion
The Win32 API function LookupAccountName( ) is used to find the SID that corresponds to a name.
You can use it to obtain information about a name on either the local system or a remote system.
While it might seem that mapping a name to a SID is a simple operation, LookupAccountName( )
actually requires a large number of arguments to allow it to complete its work.
LookupAccountName( ) has the following signature:
BOOL LookupAccountName(LPCTSTR lpSystemName, LPCTSTR lpAccountName, PSID Sid,
LPDWORD cbSid, LPTSTR ReferencedDomainName,
LPDWORD cbReferencedDomainName, PSID_NAME_USE peUse);
This function has the following arguments:
lpSystemName
String representing the name of the remote system on which to look up the name. If you
specify this argument as NULL, the lookup will be done on the local system.
lpAccountName
String representing the name of the user or group to look up. This argument may not be
specified as NULL.
Sid
Buffer into which the SID will be written. Initially, you may specify this argument as NULL to
determine how large a buffer is required to hold the SID.
cbSid
Pointer to an integer that both specifies the size of the buffer to receive the SID, and receives
the size of the buffer required for the SID.
ReferencedDomainName
Buffer into which the domain name where the user or group name was found is to be written.
Initially, you may specify this argument as NULL to determine how large a buffer is required to
hold the domain name.
cbReferencedDomainName
Pointer to an integer that both specifies the size of the buffer to receive the domain name, and
receives the size of the buffer required for the domain name.
peUse
Pointer to an enumeration that receives the type of SID to which the looked-up name
corresponds. The most commonly returned values are SidTypeUser (1) and SidTypeGroup (2).
The following function, SpcLookupName( ), is essentially a wrapper around LookupAccountName( ).
It handles the nuances of performing user and group name lookup, including allocating the necessary
buffers and error conditions. If the name is successfully found, the return will be a pointer to a
dynamically allocated SID structure, which you must later free using LocalFree( ). If the name
could not be found, NULL will be returned, and GetLastError( ) will return ERROR_NONE_MAPPED. If
any other kind of error occurs, SpcLookupName( ) will return NULL, and GetLastError( ) will return
the relevant error code.
#include <windows.h>
PSID SpcLookupName(LPCTSTR lpszSystemName, LPCTSTR lpszAccountName) {
  PSID         Sid;
  DWORD        cbReferencedDomainName, cbSid;
  LPTSTR       ReferencedDomainName;
  SID_NAME_USE eUse;

  cbReferencedDomainName = cbSid = 0;
if (LookupAccountName(lpszSystemName, lpszAccountName, 0, &cbSid,
0, &cbReferencedDomainName, &eUse)) {
SetLastError(ERROR_NONE_MAPPED);
return 0;
}
if (GetLastError( ) != ERROR_INSUFFICIENT_BUFFER) return 0;
if (!(Sid = (PSID)LocalAlloc(LMEM_FIXED, cbSid))) return 0;
ReferencedDomainName = (LPTSTR)LocalAlloc(LMEM_FIXED, cbReferencedDomainName);
if (!ReferencedDomainName) {
LocalFree(Sid);
return 0;
}
if (!LookupAccountName(lpszSystemName, lpszAccountName, Sid, &cbSid,
ReferencedDomainName, &cbReferencedDomainName, &eUse)) {
LocalFree(ReferencedDomainName);
LocalFree(Sid);
return 0;
}
LocalFree(ReferencedDomainName);
return Sid;
}
The Win32 API function LookupAccountSid( ) is used to find the name that corresponds to a SID.
You can use it to obtain information about a SID on either the local system or a remote system.
While it might seem that mapping a SID to a name is a simple operation, LookupAccountSid( )
actually requires a large number of arguments to allow it to complete its work.
LookupAccountSid( ) has the following signature:
BOOL LookupAccountSid(LPCTSTR lpSystemName, PSID Sid,LPTSTR Name, LPDWORD cbName,
LPTSTR ReferencedDomainName, LPDWORD cbReferencedDomainName,
PSID_NAME_USE peUse);
This function has the following arguments:
lpSystemName
String representing the name of the remote system on which to look up the SID. If you specify
this argument as NULL, the lookup will be done on the local system.
Sid
Buffer containing the SID to look up. This argument may not be specified asNULL.
Name
Buffer into which the name will be written. Initially, you may specify this argument as NULL to
determine how large a buffer is required to hold the name.
cbName
Pointer to an integer that both specifies the size of the buffer to receive the name, and receives
the size of the buffer required for the name.
ReferencedDomainName
Buffer into which the domain name where the SID was found is to be written. Initially, you may
specify this argument as NULL to determine how large a buffer is required to hold the domain
name.
cbReferencedDomainName
Pointer to an integer that both specifies the size of the buffer to receive the domain name, and
receives the size of the buffer required for the domain name.
peUse
Pointer to an enumeration that receives the type of SID to which the looked-up SID
corresponds. The most commonly returned values are SidTypeUser (1) and SidTypeGroup (2).
The following function, SpcLookupSid( ), is essentially a wrapper around LookupAccountSid( ). It
handles the nuances of performing SID lookup, including allocating the necessary buffers and error
conditions. If the SID is successfully found, the return will be a pointer to a dynamically allocated
buffer containing the user or group name, which you must later free using LocalFree( ). If the SID
could not be found, NULL will be returned, and GetLastError( ) will return ERROR_NONE_MAPPED. If
any other kind of error occurs, SpcLookupSid( ) will return NULL, and GetLastError( ) will return
the relevant error code.
#include <windows.h>
LPTSTR SpcLookupSid(LPCTSTR lpszSystemName, PSID Sid) {
  DWORD        cbName, cbReferencedDomainName;
  LPTSTR       lpszName, ReferencedDomainName;
  SID_NAME_USE eUse;

  cbName = cbReferencedDomainName = 0;
if (LookupAccountSid(lpszSystemName, Sid, 0, &cbName,
0, &cbReferencedDomainName, &eUse)) {
SetLastError(ERROR_NONE_MAPPED);
return 0;
}
if (GetLastError( ) != ERROR_INSUFFICIENT_BUFFER) return 0;
if (!(lpszName = (LPTSTR)LocalAlloc(LMEM_FIXED, cbName))) return 0;
ReferencedDomainName = (LPTSTR)LocalAlloc(LMEM_FIXED, cbReferencedDomainName);
if (!ReferencedDomainName) {
LocalFree(lpszName);
return 0;
}
if (!LookupAccountSid(lpszSystemName, Sid, lpszName, &cbName,
ReferencedDomainName, &cbReferencedDomainName, &eUse)) {
LocalFree(ReferencedDomainName);
LocalFree(lpszName);
return 0;
}
LocalFree(ReferencedDomainName);
return lpszName;
}
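As a brief usage sketch (the account name is illustrative), the two wrappers can be used to
round-trip between a name and its SID; both results must eventually be released with LocalFree( ):
#include <windows.h>

void SidRoundTripExample(void) {
  PSID   pSid;
  LPTSTR lpszName;

  if ((pSid = SpcLookupName(0, TEXT("Administrator"))) != 0) {
    if ((lpszName = SpcLookupSid(0, pSid)) != 0) {
      /* lpszName now holds the account name to which the SID maps back */
      LocalFree(lpszName);
    }
    LocalFree(pSid);
  }
}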
[ Team LiB ]
[ Team LiB ]
8.4 Restricting Access Based on Hostname or IP Address
8.4.1 Problem
You want to restrict access to the network based on hostname or IP address.
8.4.2 Solution
First, get the IP address of the remote connection, and verify that the address has a hostname
associated with it. To ensure that the hostname is not being spoofed (i.e., the address reverses to
one hostname, but the hostname does not map to that IP address), look up the hostname and
compare the resulting IP address with the IP address of the connection; if the IP addresses do not
match, the hostname is likely being spoofed.
Next, compare the IP address and/or hostname with a set of rules that determine whether to grant
the remote connection access.
8.4.3 Discussion
Restricting access based on the remote connection's IP address or hostname is
risky at best. The hostname and/or IP address could be spoofed, or the remote
system could be compromised with an attacker in control. Address-based
access control is no substitute for strong authentication methods.
The first step in restricting access from the network based on hostname or IP address is to ensure
that the remote connection is not engaging in a DNS spoofing attack. No foolproof method exists for
guaranteeing that the address is not being spoofed, though the code presented here can provide a
reasonable assurance for most cases. In particular, if the DNS server for the domain that an IP
address reverse-maps to has been compromised, there is no way to know.
The first code listing that we present implements a worker function, check_spoofdns( ), which
performs a set of DNS lookups and compares the results. The first lookup retrieves the hostname to
which an IP address maps. An IP address does not necessarily have to reverse-map to a hostname,
so if this first lookup yields no mapping, it is generally safe to assume that no spoofing is taking
place.
If the IP address does map to a hostname, a lookup is performed on that hostname to retrieve the IP
address or addresses to which it maps. The hostname should exist, but if it does not, the connection
should be considered suspect. Although it is possible that something funny is going on with the
remote connection, the lack of a name-to-address mapping could be innocent.
Each of the addresses returned by the hostname lookup is compared against the IP address of the
remote connection. If the IP address of the remote connection is not matched, the likelihood of a
spoofing attack is high, though still not guaranteed. If the IP address of the remote connection is
matched, the code assumes that no spoofing attack is taking place.
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <netdb.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>

#define SPC_ERROR_NOREVERSE   1 /* IP address does not map to a hostname */
#define SPC_ERROR_NOHOSTNAME  2 /* Reversed hostname does not exist      */
#define SPC_ERROR_BADHOSTNAME 3 /* IP addresses do not match             */
#define SPC_ERROR_HOSTDENIED  4 /* TCP/SPC Wrappers denied host access   */
static int check_spoofdns(int sockfd, struct sockaddr_in *addr, char **name) {
  int            addrlen, i;
  char           *hostname;
  struct hostent *he;

  *name = 0;
  for (;;) {
    addrlen = sizeof(struct sockaddr_in);
    if (getpeername(sockfd, (struct sockaddr *)addr, &addrlen) != -1) break;
    if (errno != EINTR && errno != EAGAIN) return -1;
  }
  for (;;) {
    he = gethostbyaddr((char *)&addr->sin_addr, sizeof(addr->sin_addr), AF_INET);
    if (he) break;
    if (h_errno == HOST_NOT_FOUND) {
      endhostent( );
      return SPC_ERROR_NOREVERSE;
    }
    if (h_errno != TRY_AGAIN) {
      endhostent( );
      return -1;
    }
  }
  hostname = strdup(he->h_name);
  for (;;) {
    if ((he = gethostbyname(hostname)) != 0) break;
    if (h_errno == HOST_NOT_FOUND) {
      endhostent( );
      free(hostname);
      return SPC_ERROR_NOHOSTNAME;
    }
    if (h_errno != TRY_AGAIN) {
      endhostent( );
      free(hostname);
      return -1;
    }
  }

  /* Check all IP addresses returned for the hostname.  If one matches, return
   * 0 to indicate that the address is not likely being spoofed.
   */
  for (i = 0;  he->h_addr_list[i];  i++)
    if (*(in_addr_t *)he->h_addr_list[i] == addr->sin_addr.s_addr) {
      *name = hostname;
      endhostent( );
      return 0;
    }

  /* No matches.  Spoofing very likely */
  free(hostname);
  endhostent( );
  return SPC_ERROR_BADHOSTNAME;
}
The next code listing contains several worker functions as well as the function spc_host_init( ),
which requires a single argument that is the name of a file from which access restriction information
is to be read. The access restriction information is read from the file and stored in an in-memory list,
which is then used by spc_host_check( ) (we'll describe that function shortly).
Access restriction information read by spc_host_init( ) is required to be in a very specific format.
Whitespace is mostly ignored, and lines beginning with a hash mark (#) or a semicolon (;) are
considered comments and ignored. Any other line in the file must begin with either "allow:" or
"deny:" to indicate the type of rule.
Following the rule type is a whitespace-separated list of addresses that are to be either allowed or
denied access. Addresses may be hostnames or IP addresses. IP addresses may be specified as an
address and mask or simply as an address. In the former case, the address may contain up to four
parts, where each part must be expressed in decimal (ranging from 0 to 255), and a period (.) must
be used to separate them. A forward slash (/) separates the address from the mask, and the mask is
expressed as the number of bits to set. Table 8-1 lists example representations that are accepted as
valid.
Table 8-1. Example address representations accepted by spc_host_init( )

Representation     Meaning
www.oreilly.com    The host to which the reverse-and-forward maps www.oreilly.com will be matched.
12.109.142.4       Only the specific address 12.109.142.4 will be matched.
10/24              Any address starting with 10 will be matched.
192.168/16         Any address starting with 192.168 will be matched.
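For example, a rules file in this format might look like the following; the addresses and hostname
are illustrative only. Because spc_host_check( ) (described below) evaluates rules in the order they
appear, the more specific deny rule is listed before the broader allow rule.
# Deny one internal subnet explicitly, then allow the remaining internal
# networks and a single trusted host.
deny:  192.168.7/24
allow: 10/8 192.168/16 trusted.example.com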
If any errors are encountered when parsing the access restriction data file, a message containing the
name of the file and the line number is printed. Parsing of the file then continues on the next line.
Fatal errors (e.g., out of memory) are also noted in a similar fashion, but parsing terminates
immediately and any data successfully parsed so far is thrown away.
When spc_host_init( ) completes successfully (even if parse errors are encountered), it will return
1; otherwise, it will return 0.
#define SPC_HOST_ALLOW 1
#define SPC_HOST_DENY 0
typedef struct {
  int       action;
  char      *name;
  in_addr_t addr;
  in_addr_t mask;
} spc_hostrule_t;

static int            spc_host_rulecount;
static spc_hostrule_t *spc_host_rules;

static int add_rule(spc_hostrule_t *rule) {
  spc_hostrule_t *tmp;

  if (!(spc_host_rulecount % 256)) {
    if (!(tmp = (spc_hostrule_t *)realloc(spc_host_rules,
                 sizeof(spc_hostrule_t) * (spc_host_rulecount + 256))))
      return 0;
    spc_host_rules = tmp;
  }
  spc_host_rules[spc_host_rulecount++] = *rule;
  return 1;
}
static void free_rules(void) {
int i;
if (spc_host_rules) {
for (i = 0; i < spc_host_rulecount; i++)
if (spc_host_rules[i].name) free(spc_host_rules[i].name);
free(spc_host_rules);
spc_host_rulecount = 0;
spc_host_rules = 0;
}
}
static in_addr_t parse_addr(char *str) {
  int       shift = 24;
  char      *tmp;
  in_addr_t addr = 0;

  for (tmp = str;  *tmp;  tmp++) {
    if (*tmp == '.') {
      *tmp = 0;
      addr |= (atoi(str) << shift);
      str = tmp + 1;
      if ((shift -= 8) < 0) return INADDR_NONE;
    } else if (!isdigit(*tmp)) return INADDR_NONE;
  }
  addr |= (atoi(str) << shift);
  return htonl(addr);
}

static in_addr_t make_mask(int bits) {
  in_addr_t mask;

  bits = (bits < 0 ? 0 : (bits > 32 ? 32 : bits));
  for (mask = 0;  bits--;  mask |= (1 << (31 - bits)));
  return htonl(mask);
}
int spc_host_init(const char *filename) {
  int            lineno = 0;
  char           *buf, *p, *slash, *tmp;
  FILE           *f;
  size_t         bufsz, len = 0;
  spc_hostrule_t rule;

  if (!(f = fopen(filename, "r"))) return 0;
  if (!(buf = (char *)malloc(bufsz = 256))) {
    fclose(f);
    return 0;
  }

  while (fgets(buf + len, bufsz - len, f) != 0) {
    len += strlen(buf + len);
    if (buf[len - 1] != '\n') {
      if (!(buf = (char *)realloc((tmp = buf), bufsz += 256))) {
        fprintf(stderr, "%s line %d: out of memory\n", filename, ++lineno);
        free(tmp);
        fclose(f);
        free_rules( );
        return 0;
      }
      continue;
    }
    buf[--len] = 0;
    lineno++;

    for (tmp = buf;  *tmp && isspace(*tmp);  tmp++) len--;
    while (len && isspace(tmp[len - 1])) len--;
    tmp[len] = 0;
    len = 0;
    if (!tmp[0] || tmp[0] == '#' || tmp[0] == ';') continue;

    memset(&rule, 0, sizeof(rule));
    if (strncasecmp(tmp, "allow:", 6) && strncasecmp(tmp, "deny:", 5)) {
      fprintf(stderr, "%s line %d: parse error; continuing anyway.\n",
              filename, lineno);
      continue;
    }
    if (!strncasecmp(tmp, "deny:", 5)) {
      rule.action = SPC_HOST_DENY;
      tmp += 5;
    } else {
      rule.action = SPC_HOST_ALLOW;
      tmp += 6;
    }
    while (*tmp && isspace(*tmp)) tmp++;
    if (!*tmp) {
      fprintf(stderr, "%s line %d: parse error; continuing anyway.\n",
              filename, lineno);
      continue;
    }

    for (p = tmp;  *p;  tmp = p) {
      while (*p && !isspace(*p)) p++;
      if (*p) *p++ = 0;
      if ((slash = strchr(tmp, '/')) != 0) {
        *slash++ = 0;
        rule.name = 0;
        rule.addr = parse_addr(tmp);
        rule.mask = make_mask(atoi(slash));
      } else {
        if (inet_addr(tmp) == INADDR_NONE) rule.name = strdup(tmp);
        else {
          rule.name = 0;
          rule.addr = inet_addr(tmp);
          rule.mask = 0xFFFFFFFF;
        }
      }
      if (!add_rule(&rule)) {
        fprintf(stderr, "%s line %d: out of memory\n", filename, lineno);
        free(buf);
        fclose(f);
        free_rules( );
        return 0;
      }
    }
  }

  free(buf);
  fclose(f);
  return 1;
}
Finally, the function spc_host_check( ) performs access restriction checks. If the remote connection
should be allowed, the return will be 0. If some kind of error unrelated to access restriction occurs
(e.g., out of memory, bad socket descriptor, etc.), the return will be -1. Otherwise, one of the
following error constants may be returned:
SPC_ERROR_NOREVERSE
Indicates that the IP address of the remote connection has no reverse mapping. If strict
checking is not being done, this error code will not be returned.
SPC_ERROR_NOHOSTNAME
Indicates that the IP address of the remote connection reverse-maps to a hostname that does
not map to any IP address. This condition does not necessarily indicate that a DNS spoofing
attack is taking place; however, we do recommend that you treat it as such.
SPC_ERROR_BADHOSTNAME
Indicates that the likelihood of a DNS spoofing attack is high. The IP address of the remote
connection does not match any of the IP addresses that its hostname maps to.
SPC_ERROR_HOSTDENIED
Indicates that no DNS spoofing attack is believed to be taking place, but the access restriction
rules have matched the remote address with a deny rule.
The function spc_host_check( ) has the following signature:
int spc_host_check(int sockfd, int strict, int action);
This function has the following arguments:
sockfd
Socket descriptor for the remote connection. This argument is used solely to obtain the IP
address of the remote connection.
strict
Boolean value indicating whether strict DNS spoofing checks are to be done. If this argument is
specified as 0, IP addresses that do not have a reverse mapping will be allowed; otherwise,
SPC_ERROR_NOREVERSE will be returned for such connections.
action
Default action to take if the remote IP address does not match any of the defined access
restriction rules. It may be specified as either SPC_HOST_ALLOW or SPC_HOST_DENY. Any other
value will be treated as equivalent to SPC_HOST_DENY.
You may use spc_host_check( ) without using spc_host_init( ), in which case it will essentially
only perform DNS spoofing checks. If you do not use spc_host_init( ), spc_host_check( ) will
have an empty rule set, and it will always use the default action if the remote connection passes the
DNS spoofing checks.
int spc_host_check(int sockfd, int strict, int action) {
  int                i, rc;
  char               *hostname;
  struct sockaddr_in addr;

  if ((rc = check_spoofdns(sockfd, &addr, &hostname)) == -1) return -1;
  if (rc && (rc != SPC_ERROR_NOREVERSE || strict)) return rc;

  for (i = 0;  i < spc_host_rulecount;  i++) {
    if (spc_host_rules[i].name) {
      if (hostname && !strcasecmp(hostname, spc_host_rules[i].name)) {
        free(hostname);
        return (spc_host_rules[i].action == SPC_HOST_ALLOW ? 0 : SPC_ERROR_HOSTDENIED);
      }
    } else {
      if ((addr.sin_addr.s_addr & spc_host_rules[i].mask) ==
          spc_host_rules[i].addr) {
        free(hostname);
        return (spc_host_rules[i].action == SPC_HOST_ALLOW ? 0 : SPC_ERROR_HOSTDENIED);
      }
    }
  }
  if (hostname) free(hostname);
  return (action == SPC_HOST_ALLOW ? 0 : SPC_ERROR_HOSTDENIED);
}
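The following sketch shows how a server might tie the two functions together for a newly accepted
connection. The rules filename and the default-deny policy are assumptions for the sake of the
example; in real code, spc_host_init( ) would normally be called once at startup rather than per
connection.
#include <unistd.h>

void handle_connection(int sockfd) {
  int rc;

  if (!spc_host_init("/etc/myapp/hosts.rules")) {  /* illustrative path */
    close(sockfd);
    return;
  }
  rc = spc_host_check(sockfd, 1, SPC_HOST_DENY);   /* strict checks, deny by default */
  if (rc != 0) {  /* spoofing suspected, an error occurred, or a deny rule matched */
    close(sockfd);
    return;
  }
  /* The connection passed the checks; proceed to authenticate the client. */
}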
[ Team LiB ]
[ Team LiB ]
8.5 Generating Random Passwords and Passphrases
8.5.1 Problem
You would like to avoid problems with easy-to-guess passwords by randomly generating passwords
that are difficult to guess.
8.5.2 Solution
For passwords, choose random characters from an acceptable set of characters using
spc_rand_range( ) (see Recipe 11.11). For passphrases, choose random words from a predefined
list of acceptable words.
8.5.3 Discussion
In many situations, it may be desirable to present a user with a pregenerated password. For
example, if the user is not present at the time of account creation, you will want to generate a
reasonably secure password for the account and deliver the password to the user via some secure
mechanism such as in person or over the phone.
Randomly generated passwords are also useful when you want to enforce safe password
requirements. If the user cannot supply an adequately secure password after a certain number of
attempts, it may be best to present her with a randomly generated password to use, which will most
likely pass all of the requirements tests.
The primary disadvantage of randomly generated passwords is that they are usually difficult to
memorize (and type), which often results in users writing them down. In many cases, however, this
is a reasonable trade-off.
The basic strategy for generating a random password is to define a character set that contains all of
the characters that are valid for the type of password you are generating, then choose random
members of that set until enough characters have been chosen to meet the length requirements.
The string spc_password_characters defines the character set from which random password
characters are chosen. The function spc_generate_password( ) requires a buffer and the size of the
buffer as arguments. The buffer is filled with randomly chosen password characters and is properly
NULL-terminated. As written, the function will always succeed, and it will return a pointer to the buffer
filled with the randomly generated password.
#include <string.h>
static char *spc_password_characters = "abcdefghijklmnopqrstuvwxyz0123456789"
                                       "ABCDEFGHIJKLMNOPQRSTUVWXYZ!@#$%^&*()"
                                       "-=_+;[]{}\\|,./<>?;";
char *spc_generate_password(char *buf, size_t bufsz) {
size_t choices, i;
choices = strlen(spc_password_characters) - 1;
for (i = 0; i < bufsz - 1; i++) /* leave room for NULL terminator */
buf[i] = spc_password_characters[spc_rand_range(0, choices)];
buf[bufsz - 1] = 0;
return buf;
}
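For example, filling a 13-byte buffer produces a 12-character random password; clear the buffer
once the password has been delivered:
#include <string.h>

void password_example(void) {
  char password[13];  /* 12 random characters plus the NULL terminator */

  spc_generate_password(password, sizeof(password));
  /* ...deliver the password to the user... */
  memset(password, 0, sizeof(password));
}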
Although there is no conceptual difference between a password and a passphrase, each has different
connotations to users:
Password
Typically one word, short or medium in length (usually under 10 characters, and rarely longer
than 15).
Passphrase
Usually short sentences, or a number of unrelated words grouped together with no coherent
meaning.
While a passphrase can be a long string of random characters and a password can be multiple words,
the typical passphrase is a sentence that the user picks, usually because it is related to something
that is easily remembered. Even though their length and freeform nature make passphrases much
harder to attack with a tool such as the Crack program, they are still subject to guessing.
For example, if you are trying to guess someone's passphrase, and you know that person's favorite
song, trying some lyrics from that song may prove to be a very good strategy for discovering what
the passphrase is. It is important to choose a passphrase carefully. It should be something easy to
remember, but it should not be something that someone who knows a little bit about you will be able
to guess quickly.
As with passwords, there are times when a randomly generated passphrase is needed. The strategy
for randomly generating a passphrase is not altogether different from randomly generating a
password. Instead of using single characters, whole words are used, separated by spaces.
The function spc_generate_passphrase( ) uses a data file to obtain the list of words from which to
choose. The words in the file should be ordered one per line, and they should not be related in any
way. In addition, the selection of words should be sufficiently large that a brute-force attack on
generated passphrases is not feasible. Most Unix systems have a file, /usr/share/dict/words, that
contains a large number of words from the English dictionary.
This implementation of spc_generate_passphrase( ) keeps the word data file open and builds an
in-memory list of the offsets into the file for the beginning of each word. The function keeps offsets
instead of the whole words as a memory-saving measure, although with a large enough list of words,
the amount of memory required for this list is not insignificant. To choose a word, the function
chooses an index into the list of offsets, moves the file pointer to the proper offset, and reads the
word. Word lengths can be determined by computing the difference between the next offset and the
selected one.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define SPC_WORDLIST_FILE "/usr/share/dict/words"
static FILE         *spc_wordlist_file;
static size_t       *spc_wordlist_offsets;
static size_t       spc_wordlist_shortest;
static unsigned int spc_wordlist_count;
static int load_wordlist(void) {
  char         buf[80];
  FILE         *f;
  size_t       *offsets, shortest, *tmp;
  unsigned int count;

  if (!(f = fopen(SPC_WORDLIST_FILE, "r"))) return 0;
  if (!(offsets = (size_t *)malloc(sizeof(size_t) * 1024))) {
    fclose(f);
    return 0;
  }

  count      = 0;
  shortest   = ~0;
  offsets[0] = 0;
  while (fgets(buf, sizeof(buf), f))
    if (buf[strlen(buf) - 1] == '\n') {
      if (!((count + 1) % 1024)) {
        if (!(offsets = (size_t *)realloc((tmp = offsets),
                         sizeof(size_t) * (count + 1025)))) {
          fclose(f);
          free(tmp);
          return 0;
        }
      }
      offsets[++count] = ftell(f);
      if (offsets[count] - offsets[count - 1] < shortest)
        shortest = offsets[count] - offsets[count - 1];
    }
  if (!feof(f)) {
    fclose(f);
    free(offsets);
    return 0;
  }
  if (ftell(f) - offsets[count - 1] < shortest)
    shortest = ftell(f) - offsets[count - 1];

  spc_wordlist_file     = f;
  spc_wordlist_offsets  = offsets;
  spc_wordlist_count    = count;
  spc_wordlist_shortest = shortest - 1;  /* shortest includes NULL terminator */
  return 1;
}
static int get_wordlist_word(unsigned int num, char *buf, size_t bufsz) {
  size_t end, length;

  if (num >= spc_wordlist_count) return -1;
  if (num == spc_wordlist_count - 1) {
    fseek(spc_wordlist_file, 0, SEEK_END);
    end = ftell(spc_wordlist_file);
  } else end = spc_wordlist_offsets[num + 1];
  length = end - spc_wordlist_offsets[num];  /* includes NULL terminator */
  if (length > bufsz) return 0;
  if (fseek(spc_wordlist_file, spc_wordlist_offsets[num], SEEK_SET) == -1)
    return -1;
  fread(buf, length, 1, spc_wordlist_file);
  buf[length - 1] = 0;
  return 1;
}
char *spc_generate_passphrase(char *buf, size_t bufsz) {
  int          attempts = 0, rc;
  char         *outp;
  size_t       left, len;
  unsigned int idx;

  if (!spc_wordlist_file && !load_wordlist( )) return 0;

  outp = buf;
  left = bufsz - 1;
  while (left > spc_wordlist_shortest) {
    idx = spc_rand_range(0, spc_wordlist_count - 1);
    rc  = get_wordlist_word(idx, outp, left + 1);
    if (rc == -1) return 0;
    else if (!rc && ++attempts < 10) continue;
    else if (!rc) break;
    len = strlen(outp) + 1;
    *(outp + len - 1) = ' ';
    outp += len;
    left -= len;
  }
  *(outp - 1) = 0;
  return buf;
}
When spc_generate_passphrase( ) is called, it opens the data file containing the words to choose
from and leaves it open. In addition, depending on the size of the file, it may allocate a sizable
amount of memory that remains allocated. When you're done generating passphrases, you should
call spc_generate_cleanup( ) to close the data file and free the memory allocated by
spc_generate_passphrase( ).
void spc_generate_cleanup(void) {
  if (spc_wordlist_file) fclose(spc_wordlist_file);
  if (spc_wordlist_offsets) free(spc_wordlist_offsets);
  spc_wordlist_file     = 0;
  spc_wordlist_offsets  = 0;
  spc_wordlist_count    = 0;
  spc_wordlist_shortest = 0;
}
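A typical caller generates one or more passphrases into a fixed-size buffer and then releases the
cached word list, roughly as follows:
#include <string.h>

void passphrase_example(void) {
  char phrase[80];

  if (spc_generate_passphrase(phrase, sizeof(phrase))) {
    /* ...deliver the passphrase to the user... */
    memset(phrase, 0, sizeof(phrase));
  }
  spc_generate_cleanup( );
}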
8.5.4 See Also
Recipe 11.11
[ Team LiB ]
[ Team LiB ]
8.6 Testing the Strength of Passwords
8.6.1 Problem
You want to ensure that passwords are not easily guessable or crackable.
8.6.2 Solution
Use CrackLib, which is available from http://www.crypticide.org/users/alecm/.
8.6.3 Discussion
When users are allowed to choose their own passwords, a large number of people will inevitably
choose passwords that are relatively simple, making them either easy to guess or easy to crack.
Secure passwords are often difficult for people to remember, so they tend to choose passwords that
are easy to remember, but not very secure. Some of the more common choices are simple words,
dates, names, or some variation of these things.
Recognizing this tendency, Alec Muffett developed a program named Crack that takes an encrypted
password from the system password file and attempts to guess, or crack, the password. It works by
trying words found in a dictionary, combinations of the user's login name and real name, and simple
patterns and combinations of words.
CrackLib is the core functionality of Crack, extracted into a library for the intended purpose of
including it in password-setting and -changing programs to prevent users from choosing insecure
passwords. It exports a simple API, consisting of a single function, FascistCheck( ), which has the
following signature:
char *FascistCheck(char *pw, char *dictpath);
This function has the following arguments:
pw
Buffer containing the password that the user is attempting to use.
dictpath
Buffer containing the name of a file that contains a list of dictionary words for CrackLib to use
in its checks.
The dictionary file used by CrackLib is a binary data file (actually, several of them) that is normally
built as part of building CrackLib itself. A small utility built as part of CrackLib (but not normally
installed) reads in a text file containing a list of words one per line, and builds the binary dictionary
files that can be used by CrackLib.
If the FascistCheck( ) function is unable to match the password against the words in the dictionary
and its other tests, it will return NULL to indicate that the password is secure and may be used safely.
Otherwise, an error message (rather than an error code) is returned; it is suitable for display to the
user as a reason why the password could not be accepted.
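A call to FascistCheck( ) typically looks like the following sketch. The header name and the
dictionary path prefix vary between installations; crack.h and /usr/lib/cracklib_dict are common
choices, but both are assumptions here.
#include <stdio.h>
#include <crack.h>  /* header name varies by installation */

int password_is_acceptable(char *password) {
  char *msg;

  /* The dictionary path prefix is installation-specific. */
  if ((msg = FascistCheck(password, "/usr/lib/cracklib_dict")) != 0) {
    fprintf(stderr, "Password rejected: %s\n", msg);
    return 0;
  }
  return 1;
}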
CrackLib is intended to be used on Unix systems. It relies on certain Unix-specific functions to obtain
information about users. In addition, it requires a list of words (a dictionary). Porting CrackLib to
Windows should not be too difficult, but we are not aware of any efforts to do so.
8.6.4 See Also
CrackLib by Alec Muffett: http://www.crypticide.org/users/alecm/
[ Team LiB ]
[ Team LiB ]
8.7 Prompting for a Password
8.7.1 Problem
You need to prompt an interactive user for a password.
8.7.2 Solution
On Unix systems, you can use the standard C runtime function getpass( ) if you can accept limiting
passwords to _PASSWORD_LEN, which is typically defined to be 128 characters. If you want to read
longer passwords, you can use the function described in the following Section 8.7.3.
On Windows, you can use the standard EDIT control with ES_PASSWORD specified as a style flag to
mask the characters typed by a user.
8.7.3 Discussion
In the following subsections we'll look at several different approaches to prompting for passwords.
8.7.3.1 Prompting for a password on Unix using getpass( ) or readpassphrase( )
The standard C runtime function getpass( ) is the most portable way to obtain a password from a
user interactively. Unfortunately, it does have several limitations that you may find unacceptable. The
first is that only up to _PASSWORD_LEN (typically 128) characters may be entered; any characters
after that are simply discarded. The second is that the password is stored in a statically defined
buffer, so it is not thread-safe, but ordinarily this is not much of a problem because there is
fundamentally no way to read from the terminal in a thread-safe manner anyway.
The getpass( ) function has the following signature:
#include <sys/types.h>
#include <unistd.h>
char *getpass(const char *prompt);
The text passed as the function's only argument is displayed on the terminal, terminal echo is
disabled, and input is gathered in a buffer internal to the function until the user presses Enter. The
return value from the function is a pointer to the internal buffer, which will be at most
_PASSWORD_LEN + 1 bytes in size, with the additional byte left to hold the NULL terminator.
FreeBSD and OpenBSD both support an alternative function, readpassphrase( ), that provides the
underlying implementation for getpass( ). It is more flexible than getpass( ), allowing the caller to
preallocate a buffer to hold a password or passphrase of any size. In addition, it also supports a
variety of control flags that control its behavior.
The readpassphrase( ) function has the following signature:
#include <sys/types.h>
#include <readpassphrase.h>
char *readpassphrase(const char *prompt, char *buf, size_t bufsiz, int flags);
This function has the following arguments:
prompt
String that will be displayed to the user before accepting input.
buf
Buffer into which the input read from the interactive user will be placed.
bufsiz
Size of the buffer (in bytes) into which input read from the interactive user is placed. Up to one
less byte than the size specified may be read. Any additional input is silently discarded.
flags
Set of flags that may be logically OR'd together to control the behavior of the function.
A number of flags are defined as macros in the readpassphrase.h header file. While some of the flags
are mutually exclusive, some of them may be logically combined together:
RPP_ECHO_OFF
Disables echoing of the user's input on the terminal. If neither this flag nor RPP_ECHO_ON is
specified, this is the default. The two flags are mutually exclusive, but if both are specified,
echoing will be enabled.
RPP_ECHO_ON
Enables echoing of the user's input on the terminal.
RPP_REQUIRE_TTY
If there is no controlling tty, and this flag is specified,readpassphrase( ) will return an error;
otherwise, the prompt will be written to stderr, and input will be read from stdin. When input
is read from stdin, it's often not possible to disable echoing.
RPP_FORCELOWER
Causes all input from the user to be automatically converted to lowercase. This flag is mutually
exclusive with RPP_FORCEUPPER; however, if both flags are specified, RPP_FORCEUPPER will take
precedence.
RPP_FORCEUPPER
Causes all input from the user to be automatically converted to uppercase.
RPP_SEVENBIT
Indicates that the high bit will be stripped from all user input.
For both getpass( ) and readpassphrase( ), a pointer to the input buffer will be returned if the
function completes successfully; otherwise, a NULL pointer will be returned, and the error that
occurred will be stored in the global errno variable.
Both getpass( ) and readpassphrase( ) can return an error with errno set
to EINTR, which means that the input from the user was interrupted by a
signal. If such a condition occurs, all input from the user up to the point when
the signal was delivered will be stored in the buffer, but in the case of
getpass( ), there will be no way to retrieve that data.
Once getpass( ) or readpassphrase( ) return successfully, you should perform as quickly as
possible whatever operation you need to perform with the password that was obtained. Then clear
the contents of the returned buffer so that the cleartext password or passphrase will not be left
visible in memory to a potential attacker.
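On systems that provide readpassphrase( ), the advice above translates into something like the
following sketch, which reads the password into a caller-supplied buffer and scrubs it as soon as it
has been used:
#include <string.h>
#include <readpassphrase.h>

void password_prompt_example(void) {
  char buf[512];

  if (readpassphrase("Password: ", buf, sizeof(buf),
                     RPP_ECHO_OFF | RPP_REQUIRE_TTY) != 0) {
    /* ...use buf to authenticate... */
  }
  memset(buf, 0, sizeof(buf));  /* clear the cleartext password */
}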
8.7.3.2 Prompting for a password on Unix without getpass( ) or readpassphrase( )
The function presented in this subsection, spc_read_password( ), requires two arguments. The first
is a prompt to be displayed to the user, and the second is theFILE object that points to the input
source. If the input source is specified as NULL, spc_read_password( ) will use _PATH_TTY, which is
usually defined to be /dev/tty.
The function reads as much data from the input source as memory is available to hold. It allocates an
internal buffer, which grows incrementally as it is filled. If the function is successful, the return value
will be a pointer to this buffer; otherwise, it will be aNULL pointer.
Note that we use the unbuffered I/O API for reading data from the input source. The unbuffered read
is necessary to avoid potential odd side effects in the I/O. We cannot use the stream API because
there is no way to save and restore the size of the stream buffer. That is, we cannot know whether
the stream was previously buffered.
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <termios.h>
#include <signal.h>
#include <paths.h>

#define BUF_STEP 1024  /* Allocate this much space for the password, and if it gets
                        * this long, reallocate twice the space.
                        * Rinse, lather, repeat.
                        */
static unsigned char *read_password(int termfd) {
unsigned char ch, *ret, *tmp;
unsigned long ctr = 0;
if (!(ret = (unsigned char *)malloc(BUF_STEP + 1))) return 0;
for (;;) {
switch (read(termfd, &ch, 1)) {
case 1:
if (ch != '\n') break;
/* FALL THROUGH */
case 0:
ret[ctr] = 0;
return ret;
default:
free(ret);
return 0;
}
ret[ctr] = ch;
if (ctr && !(ctr & BUF_STEP)) {
if (!(tmp = (unsigned char *)realloc(ret, ctr + BUF_STEP + 1))) {
free(ret);
return 0;
}
ret = tmp;
}
ctr++;
}
}
unsigned char *spc_read_password(unsigned char *prompt, FILE *term) {
  int            close = 0, termfd;
  sigset_t       saved_signals, set_signals;
  unsigned char  *retval;
  struct termios saved_term, set_term;
if (!term) {
if (!(term = fopen(_PATH_TTY, "r+"))) return 0;
close = 1;
}
termfd = fileno(term);
fprintf(term, "%s", prompt);
fflush(term);
/* Defer interruption when echo is turned off */
sigemptyset(&set_signals);
sigaddset(&set_signals, SIGINT);
sigaddset(&set_signals, SIGTSTP);
sigprocmask(SIG_BLOCK, &set_signals, &saved_signals);
/* Save the current state and set the terminal to not echo */
tcgetattr(termfd, &saved_term);
set_term = saved_term;
set_term.c_lflag &= ~(ECHO|ECHOE|ECHOK|ECHONL);
tcsetattr(termfd, TCSAFLUSH, &set_term);
retval = read_password(termfd);
fprintf(term, "\n");
tcsetattr(termfd, TCSAFLUSH, &saved_term);
sigprocmask(SIG_SETMASK, &saved_signals, 0);
if (close) fclose(term);
return retval;
}
8.7.3.3 Prompting for a password on Windows
On Windows, prompting for a password is as simple as setting the ES_PASSWORD style flag for an EDIT
control. When this flag is set, Windows will not display the characters typed by the user. Instead, the
password character will be displayed for each character that is typed. By default, the password
character is an asterisk (*), but you can change it by sending the control an EM_SETPASSWORDCHAR
message with wParam set to the character to display.
Unfortunately, there is no way to prevent Windows from displaying something as the user types. The
closest that can be achieved is to set the password character to a space, which will make it difficult
for an onlooker to determine how many characters have been typed.
To safely retrieve the password stored in the EDIT control's internal buffer, the control should first be
queried to determine how many characters it holds. Allocate a buffer to hold the data and query the
data from the control. The control will make a copy of the data but leave the original internal buffer
unchanged.
To be safe, it's a good idea to set the contents of the buffer to clear the password from internal
memory used by the EDIT control. Simply setting the control's internal buffer to an empty string is
not sufficient. Instead, set a string that is the length of the string retrieved, then set an empty string
if you wish. For example:
#include <windows.h>
BOOL IsPasswordValid(HWND hwndPassword) {
  BOOL   bValid = FALSE;
  DWORD  dwTextLength;
  LPTSTR lpText;
if (!(dwTextLength = (DWORD)SendMessage(hwndPassword, WM_GETTEXTLENGTH, 0, 0)))
return FALSE;
lpText = (LPTSTR)LocalAlloc(LMEM_FIXED, (dwTextLength + 1) * sizeof(TCHAR));
if (!lpText) return FALSE;
SendMessage(hwndPassword, WM_GETTEXT, dwTextLength + 1, (LPARAM)lpText);
/* Do something to validate the password */
while (dwTextLength--) *(lpText + dwTextLength) = ' ';
SendMessage(hwndPassword, WM_SETTEXT, 0, (LPARAM)lpText);
LocalFree(lpText);
return bValid;
}
Other processes running on the same machine can access the contents of your
edit control. Unfortunately, the best mitigation strategy, at this time, is to get
rid of the edit control as soon as possible.
[ Team LiB ]
[ Team LiB ]
8.8 Throttling Failed Authentication Attempts
8.8.1 Problem
You want to prevent an attacker from making too many attempts at guessing a password through
normal interactive means.
8.8.2 Solution
It's best to use a protocol where such attacks don't leak any information about a password, such as a
public key-based mechanism.
Delay program execution after a failed authentication attempt. For each additional failure, increase
the delay before allowing the user to make another attempt to authenticate.
8.8.3 Discussion
Throttling failed authentication attempts is a balance between allowing legitimate users who simply
mistype a password or passphrase to have a quick retry and delaying attackers who are trying to
brute-force passwords or passphrases.
Our recommended strategy has three variables that control how it delays repeated authentication
attempts:
Maximum number of attempts
If this limit is reached, the authentication should be considered a complete failure, resulting in
a disconnection of the network connection or shutting down of the program that requires
authentication. A reasonable limit on the maximum number of allowed authentication attempts
is three, or perhaps five at most.
Maximum number of failed attempts allowed before enabling throttling
In general, it is reasonable to allow one or two failed attempts before instituting delays,
depending on the maximum number of allowed authentication failures.
Number of seconds to delay between successive authentication attempts
For each successive failure, the delay increases exponentially. For example, if the base number
of seconds to delay is set to two, the first delay will be two seconds, the second delay will be
four seconds, the third delay will be eight seconds, and so on. A reasonable starting delay is
generally one or two seconds, but depending on the settings you choose for the first two
variables, you may want to increase the starting delay. In particular, if you allow a large
number of attempts, it is probably a good idea to increase the delay.
The best way to institute a delay depends entirely upon the architecture of your program. If
authentication is being performed over a network in a single-threaded server that is multiplexing
connections with select( ) or poll( ), the best option may be to compute the future time at which
the next authentication attempt will be accepted, and ignore any input until that time arrives.
When authenticating a user interactively on a terminal on Unix, the best solution is likely to be to use
the sleep( ) function. On Windows, there is no strict equivalent. The Win32 API functions Sleep( )
and SleepEx( ) will both return immediately, regardless of the specified wait time, if there are no
other threads of equal priority waiting to run.
Some of these techniques can increase the risk of denial-of-service attacks.
In a GUI environment, any authentication dialog presented to the user will have a button labeled
"OK" or some equivalent. When a delay must be made, disable the button for the duration of the
delay, then enable it. On Windows, this is easily accomplished using timers.
The following function, spc_throttle( ), computes the number of seconds to delay based on the
three variables we've described and the number of failed authentication attempts. It has four
arguments:
attempts
Pointer to an integer used to count the number of failed attempts. Initially, the value of the
integer to which it points should be zero, and each call to spc_throttle( ) will increment it by
one.
max_attempts
Maximum number of attempts to allow. When this number of attempts has been made, the
return from spc_throttle( ) will be -1 to indicate a complete failure to authenticate.
allowed_fails
Number of attempts allowed before enabling throttling.
delay
Base delay in seconds.
If the maximum number of attempts has been reached, the return value from spc_throttle( ) will
be -1. If there is to be no delay, the return value will be 0; otherwise, the return value will be the
number of seconds to delay before allowing another authentication attempt.
int spc_throttle(int *attempts, int max_attempts, int allowed_fails, int delay) {
  int exp;

  (*attempts)++;
  if (*attempts > max_attempts) return -1;
  if (*attempts <= allowed_fails) return 0;
  for (exp = *attempts - allowed_fails - 1;  exp;  exp--) delay *= 2;
  return delay;
}
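As an illustration, an interactive Unix login loop might combine spc_throttle( ) with sleep( ) as
follows. The limits (five attempts, two free failures, a two-second base delay) are arbitrary, and
check_password( ) is a placeholder for the caller's own verification routine.
#include <unistd.h>

int check_password(void);  /* placeholder for the caller's own check */

int authenticate_user(void) {
  int attempts = 0, delay;

  for (;;) {
    if (check_password( )) return 1;              /* authentication succeeded    */
    if ((delay = spc_throttle(&attempts, 5, 2, 2)) == -1)
      return 0;                                   /* too many failures; give up  */
    if (delay > 0) sleep(delay);                  /* back off before retrying    */
  }
}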
[ Team LiB ]
[ Team LiB ]
8.9 Performing Password-Based Authentication with
crypt( )
8.9.1 Problem
You need to use the standard Unix crypt( ) function for password-based authentication.
8.9.2 Solution
The standard Unix crypt( ) function typically uses a weak one-way algorithm to perform its
encryption, and it is usually also slow and insecure. You should, therefore, use crypt( ) only for
compatibility reasons.
If you do need crypt( ) for compatibility, encrypt a password by choosing a random salt and calling
crypt( ) with the plaintext password and the chosen salt.
To verify a password encrypted with crypt( ), encrypt the plaintext password using the already
encrypted password as the salt, then compare the result with the already encrypted password. If
they match, the password is correct.
8.9.3 Discussion
What we are doing here isn't really encrypting a password. Actually, we are
creating a password validator. We use the term encryption because it is in
common use and is a more concise way to explain the process.
The crypt( ) function is normally found in use only on older Unix systems that still exclusively use
the /etc/passwd file for storing user information. Modern Unix systems typically use stronger
algorithms and alternate storage methods for user information, such as the Lightweight Directory
Access Protocol (LDAP), Kerberos (see Recipe 8.13), NIS, or some other type of directory service.
The traditional implementation of crypt( ) uses DES (see Recipe 5.2 for a discussion of symmetric
ciphers, including DES) to perform its encryption. DES is a symmetric cipher, which essentially means
that if you have the key used to encrypt, you can decrypt the encrypted data. To make the function
one-way, crypt( ) encrypts the key with itself.[1]
[1] Some older versions encrypt a string of zeros instead.
The DES algorithm requires a salt, which crypt( ) limits to 12 bits. It also prepends the salt to the
resulting ciphertext, which is base64-encoded. DES is a weak block cipher to start, and the crypt( )
function traditionally limits passwords to a single block, which serves to further weaken its capabilities
because the block size is 64 bits, or 8 bytes.
Because DES is a weak cipher and crypt( ) limits the plaintext to a single DES block, we strongly
recommend against using crypt( ) in new authentication systems. You should use it only if you
have a need to maintain compatibility with an older system that uses it.
Encrypting a password with crypt( ) is a simple operation, but programmers often get it wrong. The
most common mistake is to use the plaintext password as the salt, but recall that crypt( ) stores
the salt as the first two bytes of its result. Because passwords are limited to eight bytes, using the
plaintext password as the salt reveals at least a quarter of the password and makes dictionary
attacks easier.
The crypt( ) function has the following signature:
char *crypt(const char *key, const char *salt);
This function has the following arguments:
key
Password to encrypt.
salt
Buffer containing the salt to use. Remember that crypt( ) will use only 12 bits for the salt, so
it will use only the first two bytes of this buffer; passing in a larger salt will have no effect. For
maximum compatibility, the salt should contain only alphanumeric characters, a period, or a
forward slash.
The following function, spc_crypt_encrypt( ), will generate a suitable random salt and return the
result from calling crypt( ) with the password and generated salt. The crypt( ) function returns a
pointer to a statically allocated buffer, so you should not call crypt( ) more than once without using
the results from earlier calls because the data returned from earlier calls will be overwritten.
#include <string.h>
#include <unistd.h>
char *spc_crypt_encrypt(const char *password) {
char salt[3];
static char *choices = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
"0123456789./";
salt[0] = choices[spc_rand_range(0, strlen(choices) - 1)];
salt[1] = choices[spc_rand_range(0, strlen(choices) - 1)];
salt[2] = 0;
return crypt(password, salt);
}
Verifying a password encrypted with crypt( ) involves encrypting the plaintext password to be
verified and comparing it with the already encrypted password, which would normally be obtained
from the passwd structure returned by getpwnam( ) or getpwuid( ). (See Recipe 8.2.)
Recall that crypt( ) stores the salt as the first two bytes of its result. For purposes of verification,
you will not want to generate a random salt. Instead, you should use the already encrypted password
as the salt.
You can use the following function, spc_crypt_verify( ), to verify a password; however, we're
really only providing an example of how crypt( ) should be called to verify a password. It does little
more than call crypt( ) and compare its result with the encrypted password.
#include <string.h>
#include <unistd.h>
int spc_crypt_verify(const char *plain_password, const char *cipher_password) {
return !strcmp(cipher_password, crypt(plain_password, cipher_password));
}
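Putting the pieces together, verifying a login against the encrypted password stored in the passwd
structure (see Recipe 8.2) might look like the following sketch. Note that on systems that use
shadow passwords, pw_passwd typically does not hold the real encrypted password, and it must be
obtained another way (for example, with getspnam( )).
#include <sys/types.h>
#include <pwd.h>

int check_unix_password(const char *username, const char *password) {
  struct passwd *pw;

  if (!(pw = getpwnam(username))) return 0;  /* no such user */
  return spc_crypt_verify(password, pw->pw_passwd);
}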
8.9.4 See Also
Recipe 5.2, Recipe 8.2, Recipe 8.13
[ Team LiB ]
[ Team LiB ]
8.10 Performing Password-Based Authentication with
MD5-MCF
8.10.1 Problem
You want to use MD5 as a method for encrypting passwords.
8.10.2 Solution
Many modern systems support the use of MD5 for encrypting passwords. An encoding known as
Modular Crypt Format (MCF) is used to allow the use of the traditionalcrypt( ) function to handle
the old DES encryption as well as MD5 and any number of other possible algorithms.
On systems that support MCF through crypt( ),[2] you can simply use crypt( ) as discussed in
Recipe 8.9 with some modification to the required salt. Otherwise, you can use the implementation in
this recipe.
[2] FreeBSD, Linux, and OpenBSD support MCF via crypt( ). Darwin, NetBSD, and Solaris do not. Windows
also does not, because it does not support crypt( ) at all.
8.10.3 Discussion
What we are doing here isn't really encrypting a password. Actually, we are
creating a password validator. We use the term encryption because it is in
common use and is a more concise way to explain the process.
MCF is a 7-bit encoding that allows for encoding multiple fields into a single string. A dollar sign
delimits each field, with the first field indicating the algorithm to use by way of a predefined number.
At present, only two well-known algorithms are defined: 1 indicates MD5 and 2 indicates Blowfish.
The contents of the first field also dictate how many fields should follow and the type of data each
one contains. The first character in an MCF string is always a dollar sign, which technically leaves the
0th field empty.
For encoding MD5 in MCF, the first field must contain a 1, and two additional fields must follow: the
first is the salt, and the second is the MD5 checksum that is calculated from a sequence of MD5
operations based on a nonintuitive process that depends on the value of the salt and the password.
The intent behind this process was to slow down brute-force attacks; however, we feel that the
algorithm is needlessly complex, and there are other, better ways to achieve the same goals.
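For example, an MD5-MCF string has the following general form, with the salt and the encoded
checksum both drawn from the crypt-style base64 alphabet:
$1$<salt>$<22-character encoded MD5 checksum>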
As with the traditional DES-based crypt( ), we do not recommend that you
use MD5-MCF in new authentication systems. You should use it only when you
must maintain compatibility with existing systems. We recommend that you
consider using something like PBKDF2 instead. (See Recipe 8.11.)
The function spc_md5_encrypt( ) implements a crypt( )-like function that uses the MD5-MCF
method that we've described. If it is successful (the only error that should ever occur is an
out-of-memory error), it will return a dynamically allocated buffer that contains the encrypted
password in MCF.
In this recipe, we present two versions of spc_md5_encrypt( ) in their entirety. The first uses
OpenSSL and standard C runtime functions; the second uses the native Win32 API and CryptoAPI.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <openssl/md5.h>
static char *crypt64_encode(const unsigned char *buf) {
  int           i;
  char          *out, *ptr;
  unsigned long l;
  static char   *crypt64_set = "./0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"
                               "abcdefghijklmnopqrstuvwxyz";

  if (!(out = ptr = (char *)malloc(23))) return 0;
#define CRYPT64_ENCODE(x, y, z)                                        \
  for (i = 0,  l = (buf[(x)] << 16) | (buf[(y)] << 8) | buf[(z)];      \
       i++ < 4;  l >>= 6) *ptr++ = crypt64_set[l & 0x3F]
  CRYPT64_ENCODE(0, 6, 12);
  CRYPT64_ENCODE(1, 7, 13);
  CRYPT64_ENCODE(2, 8, 14);
  CRYPT64_ENCODE(3, 9, 15);
  CRYPT64_ENCODE(4, 10, 5);
  for (i = 0,  l = buf[11];  i++ < 2;  l >>= 6) *ptr++ = crypt64_set[l & 0x3F];
  *ptr = 0;
#undef CRYPT64_ENCODE
  return out;
}
static void compute_hash(unsigned char *hash, const char *key,
const char *salt, size_t salt_length) {
int     i, length;
size_t  key_length;
MD5_CTX ctx, ctx1;
key_length = strlen(key);
MD5_Init(&ctx);
MD5_Update(&ctx, key, key_length);
MD5_Update(&ctx, salt, salt_length);
MD5_Init(&ctx1);
MD5_Update(&ctx1, key, key_length);
MD5_Update(&ctx1, salt, salt_length);
MD5_Update(&ctx1, key, key_length);
MD5_Final(hash, &ctx1);
for (length = key_length; length > 0; length -= 16)
MD5_Update(&ctx, hash, (length > 16 ? 16 : length));
memset(hash, 0, 16);
for (i = key_length; i; i >>= 1)
if (i & 1) MD5_Update(&ctx, hash, 1);
else MD5_Update(&ctx, key, 1);
MD5_Final(hash, &ctx);
for (i = 0; i < 1000; i++) {
MD5_Init(&ctx);
if (i & 1) MD5_Update(&ctx, key, key_length);
else MD5_Update(&ctx, hash, 16);
if (i % 3) MD5_Update(&ctx, salt, salt_length);
if (i % 7) MD5_Update(&ctx, key, key_length);
if (i & 1) MD5_Update(&ctx, hash, 16);
else MD5_Update(&ctx, key, key_length);
MD5_Final(hash, &ctx);
}
}
char *spc_md5_encrypt(const char *key, const char *salt) {
char          *base64_out, *base64_salt, *result, *salt_end, *tmp_string;
size_t        result_length, salt_length;
unsigned char out[16], raw_salt[16];
base64_out = base64_salt = result = 0;
if (!salt) {
salt_length = 8;
spc_rand(raw_salt, sizeof(raw_salt));
if (!(base64_salt = crypt64_encode(raw_salt))) goto done;
if (!(tmp_string = (char *)realloc(base64_salt, salt_length + 1)))
goto done;
base64_salt = tmp_string;
} else {
if (strncmp(salt, "$1$", 3) != 0) goto done;
if (!(salt_end = strchr(salt + 3, '$'))) goto done;
salt_length = salt_end - (salt + 3);
if (salt_length > 8) salt_length = 8; /* maximum salt is 8 bytes */
if (!(base64_salt = (char *)malloc(salt_length + 1))) goto done;
memcpy(base64_salt, salt + 3, salt_length);
}
base64_salt[salt_length] = 0;
compute_hash(out, key, base64_salt, salt_length);
if (!(base64_out = crypt64_encode(out))) goto done;
result_length = strlen(base64_out) + strlen(base64_salt) + 5;
if (!(result = (char *)malloc(result_length + 1))) goto done;
sprintf(result, "$1$%s$%s", base64_salt, base64_out);
done:
/* cleanup */
if (base64_salt) free(base64_salt);
if (base64_out) free(base64_out);
return result;
}
We have named the Windows version of spc_md5_encrypt( ) SpcMD5Encrypt( ) to adhere to
Windows naming conventions. In addition, the implementation uses only Win32 API and
CryptoAPI functions, rather than relying on the standard C runtime for string and memory handling.
#include <windows.h>
#include <wincrypt.h>
static LPSTR Crypt64Encode(BYTE *pBuffer) {
  int   i;
  DWORD dwTemp;
  LPSTR lpszOut, lpszPtr;
  static LPSTR lpszCrypt64Set = "./0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"
                                "abcdefghijklmnopqrstuvwxyz";

  if (!(lpszOut = lpszPtr = (LPSTR)LocalAlloc(LMEM_FIXED, 23))) return 0;
#define CRYPT64_ENCODE(x, y, z)                                             \
  for (i = 0,  dwTemp = (pBuffer[(x)] << 16) | (pBuffer[(y)] << 8) |        \
       pBuffer[(z)];  i++ < 4;  dwTemp >>= 6)                               \
    *lpszPtr++ = lpszCrypt64Set[dwTemp & 0x3F]
  CRYPT64_ENCODE(0, 6, 12);
  CRYPT64_ENCODE(1, 7, 13);
  CRYPT64_ENCODE(2, 8, 14);
  CRYPT64_ENCODE(3, 9, 15);
  CRYPT64_ENCODE(4, 10, 5);
  for (i = 0,  dwTemp = pBuffer[11];  i++ < 2;  dwTemp >>= 6)
    *lpszPtr++ = lpszCrypt64Set[dwTemp & 0x3F];
  *lpszPtr = 0;
#undef CRYPT64_ENCODE
  return lpszOut;
}
static BOOL ComputeHash(BYTE *pbHash, LPCSTR lpszKey, LPCSTR lpszSalt,
DWORD dwSaltLength) {
int        i, length;
DWORD      cbHash, dwKeyLength;
HCRYPTHASH hHash, hHash1;
HCRYPTPROV hProvider;
dwKeyLength = lstrlenA(lpszKey);
if (!CryptAcquireContext(&hProvider, 0, MS_DEF_PROV, 0, CRYPT_VERIFYCONTEXT))
return FALSE;
if (!CryptCreateHash(hProvider, CALG_MD5, 0, 0, &hHash)) {
CryptReleaseContext(hProvider, 0);
return FALSE;
}
CryptHashData(hHash, (BYTE *)lpszKey, dwKeyLength, 0);
CryptHashData(hHash, (BYTE *)lpszSalt, dwSaltLength, 0);
if (!CryptCreateHash(hProvider, CALG_MD5, 0, 0, &hHash1)) {
CryptDestroyHash(hHash);
CryptReleaseContext(hProvider, 0);
return FALSE;
}
CryptHashData(hHash1, lpszKey, dwKeyLength, 0);
CryptHashData(hHash1, lpszSalt, dwSaltLength, 0);
CryptHashData(hHash1, lpszKey, dwKeyLength, 0);
cbHash = 16; CryptGetHashParam(hHash1, HP_HASHVAL, pbHash, &cbHash, 0);
CryptDestroyHash(hHash1);
for (length = dwKeyLength; length > 0; length -= 16)
CryptHashData(hHash, pbHash, (length > 16 ? 16 : length), 0);
SecureZeroMemory(pbHash, 16);
for (i = dwKeyLength; i; i >>= 1)
if (i & 1) CryptHashData(hHash, pbHash, 1, 0);
else CryptHashData(hHash, lpszKey, 1, 0);
cbHash = 16; CryptGetHashParam(hHash, HP_HASHVAL, pbHash, &cbHash, 0);
CryptDestroyHash(hHash);
for (i = 0; i < 1000; i++) {
if (!CryptCreateHash(hProvider, CALG_MD5, 0, 0, &hHash)) {
CryptReleaseContext(hProvider, 0);
return FALSE;
}
if (i & 1) CryptHashData(hHash, lpszKey, dwKeyLength, 0);
else CryptHashData(hHash, pbHash, 16, 0);
if (i % 3) CryptHashData(hHash, lpszSalt, dwSaltLength, 0);
if (i % 7) CryptHashData(hHash, lpszKey, dwKeyLength, 0);
if (i & 1) CryptHashData(hHash, pbHash, 16, 0);
else CryptHashData(hHash, lpszKey, dwKeyLength, 0);
cbHash = 16; CryptGetHashParam(hHash, HP_HASHVAL, pbHash, &cbHash, 0);
CryptDestroyHash(hHash);
}
CryptReleaseContext(hProvider, 0);
return TRUE;
}
LPSTR SpcMD5Encrypt(LPCSTR lpszKey, LPCSTR lpszSalt) {
BYTE pbHash[16], pbRawSalt[16];
DWORD dwResultLength, dwSaltLength;
LPSTR lpszBase64Out, lpszBase64Salt, lpszResult, lpszTemp;
LPCSTR lpszSaltEnd;
lpszBase64Out = lpszBase64Salt = lpszResult = 0;
if (!lpszSalt) {
dwSaltLength = 8;   /* 16 random bytes feed Crypt64Encode( ); only 8 salt characters are kept */
spc_rand(pbRawSalt, sizeof(pbRawSalt));
if (!(lpszBase64Salt = Crypt64Encode(pbRawSalt))) goto done;
if (!(lpszTemp = (LPSTR)LocalReAlloc(lpszBase64Salt, dwSaltLength + 1, 0)))
goto done;
lpszBase64Salt = lpszTemp;
} else {
if (lpszSalt[0] != '$' || lpszSalt[1] != '1' || lpszSalt[2] != '$') goto done;
for (lpszSaltEnd = lpszSalt + 3; *lpszSaltEnd != '$'; lpszSaltEnd++)
if (!*lpszSaltEnd) goto done;
dwSaltLength = (lpszSaltEnd - (lpszSalt + 3));
if (dwSaltLength > 8) dwSaltLength = 8; /* maximum salt is 8 bytes */
if (!(lpszBase64Salt = (LPSTR)LocalAlloc(LMEM_FIXED,dwSaltLength + 1)))
goto done;
CopyMemory(lpszBase64Salt, lpszSalt + 3, dwSaltLength);
}
lpszBase64Salt[dwSaltLength] = 0;
if (!ComputeHash(pbHash, lpszKey, lpszBase64Salt, dwSaltLength)) goto done;
if (!(lpszBase64Out = Crypt64Encode(pbHash))) goto done;
dwResultLength = lstrlenA(lpszBase64Out) + lstrlenA(lpszBase64Salt) + 5;
if (!(lpszResult = (LPSTR)LocalAlloc(LMEM_FIXED, dwResultLength + 1)))
goto done;
wsprintfA(lpszResult, "$1$%s$%s", lpszBase64Salt, lpszBase64Out);
done:
/* cleanup */
if (lpszBase64Salt) LocalFree(lpszBase64Salt);
if (lpszBase64Out) LocalFree(lpszBase64Out);
return lpszResult;
}
Verifying a password encrypted using MD5-MCF works the same way as verifying a password
encrypted with crypt( ): encrypt the plaintext password with the already encrypted password as
the salt, and compare the result with the already encrypted password. If they match, the password is
correct.
For the sake of both consistency and convenience, you can use the function spc_md5_verify( ) to
verify a password encrypted using MD5-MCF.
int spc_md5_verify(const char *plain_password, const char *crypt_password) {
int match = 0;
char *md5_result;
if ((md5_result = spc_md5_encrypt(plain_password, crypt_password)) != 0) {
match = !strcmp(md5_result, crypt_password);
free(md5_result);
}
return match;
}
8.10.4 See Also
Recipe 8.9, Recipe 8.11
8.11 Performing Password-Based Authentication with
PBKDF2
8.11.1 Problem
You want to use a stronger encryption method than crypt( ) and MD5-MCF (see Recipe 8.9 and
Recipe 8.10).
8.11.2 Solution
Use the PBKDF2 method of converting passwords to symmetric keys. See Recipe 4.10 for a more
detailed discussion of PBKDF2.
8.11.3 Discussion
What we are doing here isn't really encrypting a password. Actually, we are
creating a password validator. We use the term encryption because it is in
common use and is a more concise way to explain the process.
The PBKDF2 algorithm provides a way to convert an arbitrary-sized password or passphrase into an
arbitrary-sized key. This method fits perfectly with the need to store passwords in a way that does
not allow recovery of the actual password. The PBKDF2 algorithm requires two extra pieces of
information besides the password: an iteration count and a salt. The iteration count specifies how
many times to run the underlying operation; this is a way to slow down the algorithm to thwart
brute-force attacks. The salt provides the same function as the salt in MD5 or DES-basedcrypt( )
implementations.
Storing a password using this method is simple; store the result of the PBKDF2 operation, along with
the iteration count and the salt. When verification of a password is required, retrieve the stored
values and run the PBKDF2 using the supplied password, saved iteration count, and salt. Compare
the output of this operation with the stored result, and if the two are equal, the password is correct;
otherwise, the passwords do not match.
The function spc_pbkdf2_encrypt( ) implements a crypt( )-like function that uses the PBKDF2
method that we've described, and it assumes the implementation found in Recipe 4.10. If it is
successful (the only error that should ever occur is an out-of-memory error), it will return a
dynamically allocated buffer that contains the encrypted password in MCF, which encodes the salt
and encrypted password in base64 as well as includes the iteration count.
MCF delimits the information it encodes with dollar signs. The first field is a digit that identifies the
algorithm represented, which also dictates what the other fields contain. As of this writing, only two
algorithms are defined for MCF: 1 indicates MD5 (see Recipe 8.9), and 2 indicates Blowfish. We have
chosen to use 10 for PBKDF2 so that it is unlikely that it will conflict with anything else.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <limits.h>
#include <errno.h>
char *spc_pbkdf2_encrypt(const char *key, const char *salt) {
int           error;
char          *base64_out, *base64_salt, *result, *salt_end, *tmp_string;
size_t        length, result_length, salt_length;
unsigned int iterations, tmp_uint;
unsigned char out[16], *raw_salt;
unsigned long tmp_ulong;
raw_salt = 0;
base64_out = base64_salt = result = 0;
if (!salt) {
if (!(raw_salt = (unsigned char *)malloc((salt_length = 8)))) return 0;
spc_rand(raw_salt, salt_length);
if (!(base64_salt = spc_base64_encode(raw_salt, salt_length, 0))) {
free(raw_salt);
return 0;
}
iterations = 10000;
} else {
if (strncmp(salt, "$10$", 4) != 0) return 0;
if (!(salt_end = strchr(salt + 4, '$'))) return 0;
if (!(base64_salt = (char *)malloc(salt_end - (salt + 4) + 1))) return 0;
memcpy(base64_salt, salt + 4, salt_end - (salt + 4));
base64_salt[salt_end - (salt + 4)] = 0;
tmp_ulong = strtoul(salt_end + 1, &tmp_string, 10);
if ((tmp_ulong == ULONG_MAX && errno == ERANGE) || tmp_ulong > UINT_MAX ||
!tmp_string || *tmp_string != '$') {
free(base64_salt);
return 0;
}
iterations = (unsigned int)tmp_ulong;
raw_salt = spc_base64_decode(base64_salt, &salt_length, 1, &error);
if (!raw_salt || error) {
free(base64_salt);
return 0;
}
}
spc_pbkdf2((char *)key, strlen(key), raw_salt, salt_length, iterations,
out, sizeof(out));
if (!(base64_out = spc_base64_encode(out, sizeof(out), 0))) goto done;
for (tmp_uint = iterations, length = 1; tmp_uint; length++) tmp_uint /= 10;
result_length = strlen(base64_out) + strlen(base64_salt) + length + 6;
if (!(result = (char *)malloc(result_length + 1))) goto done;
sprintf(result, "$10$%s$%u$%s", base64_salt, iterations, base64_out);
done:
/* cleanup */
if (raw_salt) free(raw_salt);
if (base64_salt) free(base64_salt);
if (base64_out) free(base64_out);
return result;
}
Verifying a password encrypted using PBKDF2 works the same way as verifying a password
encrypted with crypt( ): encrypt the plaintext password with the already encrypted password as
the salt, and compare the result with the already encrypted password. If they match, the password is
correct.
For the sake of both consistency and convenience, you can use the following function,
spc_pbkdf2_verify( ), to verify a password encrypted using PBKDF2.
int spc_pbkdf2_verify(const char *plain_password, const char *crypt_password) {
int match = 0;
char *pbkdf2_result;
if ((pbkdf2_result = spc_pbkdf2_encrypt(plain_password, crypt_password)) != 0) {
match = !strcmp(pbkdf2_result, crypt_password);
free(pbkdf2_result);
}
return match;
}
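The following hypothetical sketch shows how the two functions fit together; the function name
example_pbkdf2_usage( ) is ours, and the MCF string it prints is what you would store in your password
database:
#include <stdio.h>
#include <stdlib.h>

int example_pbkdf2_usage(const char *password) {
  int  ok;
  char *stored;

  /* Passing 0 for the salt makes spc_pbkdf2_encrypt( ) generate a new one. */
  if (!(stored = spc_pbkdf2_encrypt(password, 0))) return 0;
  printf("store this string: %s\n", stored);

  ok = spc_pbkdf2_verify(password, stored);   /* 1 if the password matches */
  free(stored);
  return ok;
}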
8.11.4 See Also
Recipe 4.10, Recipe 8.9, Recipe 8.10
8.12 Authenticating with PAM
8.12.1 Problem
You need to perform authentication in your application, but you do not want to tie your application to
any specific authentication system. Instead, you want to allow the system administrator to configure
an authentication system that is appropriate for the environment in which the application will run.
8.12.2 Solution
Use Pluggable Authentication Modules (PAM), which provides an API that is independent of the
underlying authentication system. PAM allows the system administrator to configure the
authentication system or systems to use, and it supports a wide variety of existing systems, such as
traditional Unix password-based authentication, Kerberos, Radius, and many others.
8.12.3 Discussion
We do not discuss building your own PAM modules in this book, but there is a
recipe on that topic on the book's web site.
Most modern Unix systems provide support for PAM and even use it for system-wide authentication
(for example, for interactive user login for shell access). Many popular and widely deployed services
that use authentication are also capable of using PAM.
Every application that makes use of PAM uses a service name, such as "login" or "ftpd". PAM uses the
service name along with a configuration file (often /etc/pam.conf) or files (one for each service,
named after the service, and usually located in /etc/pam.d). PAM uses configuration information
gleaned from the appropriate configuration file to determine which modules to use, how to treat
successes and failures, and other miscellaneous information.
Modules are implemented as shared libraries that are dynamically loaded into your application as
required. Each module is expected to export several standard functions in order to interact with the
PAM infrastructure. Implementation of PAM modules is outside the scope of this book, but our web
site contains more information on this topic.
PAM and its modules handle the drudgery of obtaining passwords from users if required, exchanging
keys, or doing whatever must be done to authenticate. All that you need to do in your code is make
the proper sequence of calls with the necessary information to PAM, and the details of authentication
are handled for you, allowing you to concentrate on the rest of your application.
Unfortunately, the PAM API is somewhat clumsy, and the steps necessary for performing basic
authentication with PAM are not necessarily as straightforward as they could be. The functions
presented in this recipe, spc_pam_login( ) and spc_pam_logout( ), work together to perform the
necessary steps properly.
To use PAM in your own code, you will need to include the header files security/pam_appl.h and
security/pam_misc.h in your program, and link against the PAM library, usually by specifying -lpam
on the linker command line.
To authenticate a user, call spc_pam_login( ), which has the following signature:
pam_handle_t *spc_pam_login(const char *service, const char *user, int *rc);
This function has the following arguments:
service
Name of the service to use. PAM uses the service name to find the appropriate module
configuration information in its configuration file or files. You will typically want to use a service
name that does not conflict with anything else, though if you are writing an FTP server, for
example, you will want to use "ftpd" as the service.
user
Name of the user to authenticate.
rc
Pointer to an integer that will receive the PAM error code if an error occurs.
If the user is authenticated successfully, spc_pam_login( ) will return a non-NULL pointer to a
pam_handle_t context object. Otherwise, it will return NULL, and you should consult the rc argument
for the error code.
#include <security/pam_appl.h>
#include <security/pam_misc.h>
static struct pam_conv spc_pam_conv = { misc_conv, 0 };
pam_handle_t *spc_pam_login(const char *service, const char *user, int *rc) {
pam_handle_t *hndl;
if (!service || !user || !rc) {
if (rc) *rc = PAM_ABORT;
return 0;
}
if ((*rc = pam_start(service, user, &spc_pam_conv, &hndl)) != PAM_SUCCESS) {
pam_end(hndl, *rc);
return 0;
}
if ((*rc = pam_authenticate(hndl, PAM_DISALLOW_NULL_AUTHTOK)) != PAM_SUCCESS) {
pam_end(hndl, *rc);
return 0;
}
*rc = pam_acct_mgmt(hndl, 0);
if (*rc == PAM_NEW_AUTHTOK_REQD) {
pam_chauthtok(hndl, PAM_CHANGE_EXPIRED_AUTHTOK);
*rc = pam_acct_mgmt(hndl, 0);
}
if (*rc != PAM_SUCCESS) {
pam_end(hndl, *rc);
return 0;
}
if ((*rc = pam_setcred(hndl, PAM_ESTABLISH_CRED)) != PAM_SUCCESS) {
pam_end(hndl, *rc);
return 0;
}
if ((*rc = pam_open_session(hndl, 0)) != PAM_SUCCESS) {
pam_end(hndl, *rc);
return 0;
}
/* no need to set *rc to PAM_SUCCESS; we wouldn't be here if it weren't */
return hndl;
}
After the authentication is successful, you should maintain thepam_handle_t object returned by
spc_pam_login( ) until the user logs out from your application, at which point you should call
spc_pam_logout( ) to allow PAM to perform anything it needs to do to log the user out.
void spc_pam_logout(pam_handle_t *hndl) {
if (!hndl) return;
pam_close_session(hndl, 0);
pam_end(hndl, PAM_SUCCESS);
}
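Here is a hypothetical usage sketch; the service name "myapp" is illustrative and requires a matching PAM
configuration (for example, /etc/pam.d/myapp) on the system:
#include <security/pam_appl.h>
#include <security/pam_misc.h>

int example_authenticate(const char *username) {
  int          rc;
  pam_handle_t *hndl;

  if (!(hndl = spc_pam_login("myapp", username, &rc))) {
    /* Authentication failed; rc holds the PAM error code. */
    return 0;
  }
  /* ... the user is logged in; do application work here ... */
  spc_pam_logout(hndl);
  return 1;
}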
8.12.4 See Also
"Pluggable Authentication Modules" by A. G. Morgan:
http://www.kernel.org/pub/linux/libs/pam/pre/doc/current-draft.txt
OpenPAM home page: http://openpam.sourceforge.net
Linux PAM home page: http://www.kernel.org/pub/linux/libs/pam/
Solaris PAM home page: http://wwws.sun.com/software/solaris/pam/
8.13 Authenticating with Kerberos
8.13.1 Problem
You need to authenticate using Kerberos.
8.13.2 Solution
If the client and the server are operating within the same Kerberos realm (or in separate realms, but
cross-realm authentication is possible), you can use the user's credentials to authenticate from the
client with the server. Both the client and the server must support this authentication method.
The code presented in this recipe assumes you are using either the Heimdal or the MIT Kerberos
implementation. It further assumes you are using Version 5, which we consider reasonable because
Version 4 has been obsolete for so many years. We do not cover the Windows interface to Kerberos
in this book because of the significant difference in the API compared to Heimdal and MIT
implementations, as well as the complexity of the SSPI API that is required on Windows. We do,
however, present an equivalent recipe for Windows on the book's web site.
8.13.3 Discussion
First, we define a structure primarily for convenience. After a successful authentication, several
pieces of information are passed back from the Kerberos API. We store each of these pieces of
information in a single structure rather than adding several additional arguments to our
authentication functions.
#include <krb5.h>
typedef struct {
  krb5_context      ctx;
  krb5_auth_context auth_ctx;
  krb5_ticket       *ticket;
} spc_krb5bundle_t;
On the client side, only the ctx and auth_ctx fields will be initialized. On the server side, all three
fields will be initialized. Before passing an spc_krb5bundle_t object to either spc_krb5_client( )
or spc_krb5_server( ), you must ensure that auth_ctx and ticket are initialized to NULL. If the
ctx field is not NULL, it should be a valid krb5_context object, which will be used instead of creating
a new one.
Both the client and the server must be able to handle using Kerberos authentication. The code
required for each side of the connection is very similar. On the client side, spc_krb5_client( ) will
attempt to authenticate with the server. The code assumes that the user has already obtained a
ticket-granting ticket from the appropriate Key Distribution Center (KDC), and that a credentials
cache exists.
The function spc_krb5_client( ) has the following signature:
krb5_error_code spc_krb5_client(int sockfd, spc_krb5bundle_t *bundle,
char *service, char *host, char *version);
This function has the following arguments:
sockfd
Socket descriptor over which the authentication should be performed. The connection to the
server should already be established, and the socket should be in blocking mode.
bundle
spc_krb5bundle_t object that will be loaded with information if the authentication with the
server is successful. Before calling spc_krb5_client( ), you should be sure to zero the
contents of this structure. If the structure contains a pointer to a Kerberos context object,
spc_krb5_client( ) will use it instead of creating a new one.
service
Name component of the server's principal. It is combined with the server's hostname or
instance to build the principal for the server. The server's principal will be of the form
service/host@REALM. The realm is assumed to be the user's default realm.
host
Hostname of the server. It is used as the instance component of the server's principal.
version
Version string that is sent to the server. This string is generally used to indicate a version of
the protocol that the client and server will speak to each other. It does not have anything to do
with the Kerberos protocol or the version of Kerberos in use. The string may be anything you
want, but both the client and server must agree on the same string for authentication to
succeed.
If authentication is successful, the return value from spc_krb5_client( ) will be 0, and the relevant
fields in the spc_krb5bundle_t object will be filled in. The client may then proceed to use other
Kerberos API functions to exchange encrypted and authenticated information with the server. Of
particular interest is that a key suitable for use with a symmetric cipher is now available. (See Recipe
9.6 for an example of how to use the key effectively.)
If any kind of error occurs while attempting to authenticate with the server, the return value from the
following spc_krb5_client( ) function will be the error code returned by the Kerberos API function
that failed. Complete lists of error codes are available in the Heimdal and MIT Kerberos header files.
krb5_error_code spc_krb5_client(int sockfd, spc_krb5bundle_t *bundle,
char *service, char *host, char *version) {
int free_context = 0;
krb5_principal server = 0;
krb5_error_code rc;
if (!bundle->ctx) {
if ((rc = krb5_init_context(&(bundle->ctx))) != 0) goto error;
free_context = 1;
}
if ((rc = krb5_sname_to_principal(bundle->ctx, host, service,
KRB5_NT_SRV_HST, &server)) != 0) goto error;
rc = krb5_sendauth(bundle->ctx, &(bundle->auth_ctx), &sockfd, version,
0, server, AP_OPTS_MUTUAL_REQUIRED, 0, 0, 0, 0, 0, 0);
if (!rc) {
krb5_free_principal(bundle->ctx, server);
return 0;
}
error:
if (server) krb5_free_principal(bundle->ctx, server);
if (bundle->ctx && free_context) {
krb5_free_context(bundle->ctx);
bundle->ctx = 0;
}
return rc;
}
The code for the server side of the connection is similar to the client side, although it is somewhat
simplified because most of the information in the exchange comes from the client. The function
spc_krb5_server( ), listed later in this section, performs the server-side part of the authentication.
It ultimately calls krb5_recvauth( ), which waits for the client to initiate an authenticate request.
The function spc_krb5_server( ) has the following signature:
krb5_error_code spc_krb5_server(int sockfd, spc_krb5bundle_t *bundle,
char *service, char *version);
This function has the following arguments:
sockfd
Socket descriptor over which the authentication should be performed. The connection to the
client should already be established, and the socket should be in blocking mode.
bundle
spc_krb5bundle_t object that will be loaded with information if the authentication with the
server is successful. Before calling spc_krb5_server( ), you should be sure to zero the
contents of this structure. If the structure contains a pointer to a Kerberos context object,
spc_krb5_server( ) will use it instead of creating a new one.
service
Name component of the server's principal. It is combined with the server's hostname or
instance to build the principal for the server. The server's principal will be of the form
service/host@REALM.
On the client side, an additional argument is required to specify the hostname of the server,
but on the server side, the hostname of the machine on which the program is running will be
used.
version
Version string that is generally used to indicate a version of the protocol that the client and
server will speak to each other. It does not have anything to do with the Kerberos protocol or
the version of Kerberos in use. The string may be anything you want, but both the client and
server must agree on the same string for authentication to succeed.
If authentication is successful, the return value from spc_krb5_server( ) will be 0, and the relevant
fields in the spc_krb5bundle_t object will be filled in. If any kind of error occurs while attempting to
authenticate with the server, the return value from spc_krb5_server( ) will be the error code
returned by the Kerberos API function that failed.
krb5_error_code spc_krb5_server(int sockfd, spc_krb5bundle_t *bundle,
char *service, char *version) {
int free_context = 0;
krb5_principal server = 0;
krb5_error_code rc;
if (!bundle->ctx) {
if ((rc = krb5_init_context(&(bundle->ctx))) != 0) goto error;
free_context = 1;
}
if ((rc = krb5_sname_to_principal(bundle->ctx, 0, service,
KRB5_NT_SRV_HST, &server)) != 0) goto error;
rc = krb5_recvauth(bundle->ctx, &(bundle->auth_ctx), &sockfd, version,
server, 0, 0, &(bundle->ticket));
if (!rc) {
krb5_free_principal(bundle->ctx, server);
return 0;
}
error:
if (server) krb5_free_principal(bundle->ctx, server);
if (bundle->ctx && free_context) {
krb5_free_context(bundle->ctx);
bundle->ctx = 0;
}
return rc;
}
When a successful authentication is completed, an spc_krb5bundle_t object is filled with information
resulting from the authentication. This information should eventually be cleaned up, of course. You
may safely keep the information around as long as you need it, or you may clean it up at any time.
If, once the authentication is complete, you don't need to retain any of the resulting information for
further communication, you may even clean it up immediately.
Call the function spc_krb5_cleanup( ) when you no longer need any of the information contained in
an spc_krb5bundle_t object. It will free all of the allocated resources in the proper order.
void spc_krb5_cleanup(spc_krb5bundle_t *bundle) {
if (bundle->ticket) {
krb5_free_ticket(bundle->ctx, bundle->ticket);
bundle->ticket = 0;
}
if (bundle->auth_ctx) {
krb5_auth_con_free(bundle->ctx, bundle->auth_ctx);
bundle->auth_ctx = 0;
}
if (bundle->ctx) {
krb5_free_context(bundle->ctx);
bundle->ctx = 0;
}
}
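As a hypothetical client-side usage sketch, assuming the definitions above are in scope, that sockfd is a
connected blocking TCP socket, that the user already holds a ticket-granting ticket, and that the service
name, hostname, and version string shown are placeholders:
#include <string.h>

void example_client_auth(int sockfd) {
  krb5_error_code  rc;
  spc_krb5bundle_t bundle;

  memset(&bundle, 0, sizeof(bundle));   /* ctx, auth_ctx, and ticket all NULL */
  rc = spc_krb5_client(sockfd, &bundle, "myservice", "server.example.com",
                       "MYPROTO_V1");
  if (rc) return;                       /* rc is the Kerberos error code */
  /* ... exchange data using bundle.ctx and bundle.auth_ctx (see Recipe 9.6) ... */
  spc_krb5_cleanup(&bundle);
}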
8.13.4 See Also
Recipe 9.6
8.14 Authenticating with HTTP Cookies
8.14.1 Problem
You are developing a CGI application for the Web and need to store data on the client's machine
using a cookie, but you want to prevent the client from viewing the data or modifying it without your
application being able to detect the change.
8.14.2 Solution
Web cookies are implemented by setting a value in the MIME headers sent to the client in a server
response. If the client accepts the cookie, it will present the cookie back to the server every time the
specified conditions are met. The cookie is stored on the client's computer, typically in a plaintext file
that can be modified with any editor. Many browsers even provide an interface for viewing and
editing cookies that have been stored.
A single MIME header is a header name followed by a colon, a space, and the header value. The
format of the header value depends on the header name. Here, we're concerned with only two
headers: the Set-Cookie header, which can be sent to the client when presenting a web page, and
the Cookie header, which the client presents to the server when the user browses to a site which
stores a cookie.
To ensure the integrity of the data that we store on the client's computer with our cookie, we should
encrypt and MAC the data. The server performs the encryption and MAC'ing when setting a cookie, then
decrypts and validates the result whenever the cookie comes back. The server does not share its keys with
any other entity; it alone uses them to ensure that the data has not been read or modified since it
originally left the server.
8.14.3 Discussion
When encrypting and MAC'ing the data stored in a cookie, we encounter a problem: we can use only
a limited character set in cookie headers, yet the output of our cryptographic algorithms is always
binary. To solve this problem, we encode the binary data into the base64 character set. The base64
character set uses the uppercase letters, the lowercase letters, the numbers, and a few pieces of
punctuation to represent data. Out of necessity, the length of data grows considerably when base64-encoded. We can use the spc_base64_encode( ) function from Recipe 4.5 for base64 encoding to
suit our purposes.
The first thing that the server must do is call spc_cookie_init( ), which will initialize a context
object that we'll use for both encoding and decoding cookie data. To simplify the encryption and
MAC'ing process, as well as reduce the complexity of sending and processing received cookies, we'll
use CWC mode from Recipe 5.10.
Initialization requires a key to use for encrypting and MAC'ing the data in cookies. The
implementation of CWC described in Recipe 5.10 can use keys that are 128, 192, or 256 bits in size.
Before calling spc_cookie_init( ), you should create a key using spc_rand( ), as defined in
Recipe 11.2. If the cookies you are sending to the client are persistent, you should store the key on
the server so that the same key is always used, rather than generating a new one every time the
server starts up. You can either hardcode the key into your program or store it in a file somewhere
that is inaccessible through the web server so that you are sure it cannot be compromised.
#include <stdlib.h>
#include <string.h>
#include <cwc.h>
static cwc_t spc_cookie_cwc;
static unsigned char spc_cookie_nonce[11];
int spc_cookie_init(unsigned char *key, size_t keylen) {
memset(spc_cookie_nonce, 0, sizeof(spc_cookie_nonce));
return cwc_init(&spc_cookie_cwc, key, keylen * 8);
}
To encrypt and MAC the data to send in a cookie, use the following spc_cookie_encode( ) function,
which requires two arguments:
cookie
Data to be encrypted and MAC'd. spc_cookie_encode( ) expects the data to be a C-style
string, which means that it should not contain binary data and should be NULL-terminated.
nonce
11-byte buffer that contains the nonce to use (see Recipe 4.9 for a discussion of nonces). If
you specify this argument as NULL, a default buffer that contains all NULL bytes will be used for
the nonce.
The problem with using a nonce with cookies is that the same nonce must be used for decrypting and
verifying the integrity of the data received from the client. To be able to do this, you need a second
plaintext cookie that allows you to recover the nonce before decrypting and verifying the encrypted
cookie data. Typically, this would be the user's name, and the server would maintain a list of nonces
that it has encoded for each logged-in user.
If you do not use a nonce, your system will be susceptible to capture replay
attacks. It is worth expending the effort to use a nonce.
The return from spc_cookie_encode( ) will be a dynamically allocated buffer that contains the
base64-encoded ciphertext and MAC of the data passed into it. You are responsible for freeing the
memory by calling free( ).
char *spc_cookie_encode(char *cookie, unsigned char *nonce) {
size_t        cookielen;
unsigned char *out;
cookielen = strlen(cookie);
if (!(out = (unsigned char *)malloc(cookielen + 16))) return 0;
if (!nonce) nonce = spc_cookie_nonce;
cwc_encrypt_message(&spc_cookie_cwc, 0, 0, cookie, cookielen, nonce, out);
cookie = spc_base64_encode(out, cookielen + 16, 0);
free(out);
return cookie;
}
When the cookies are received by the server from the client, you can pass the encrypted and MAC'd
data to spc_cookie_decode( ), which will decrypt the data and verify its integrity. If there is any
error, spc_cookie_decode( ) will return NULL; otherwise, it will return the decrypted data in a
dynamically allocated buffer that you are responsible for freeing withfree( ).
char *spc_cookie_decode(char *data, unsigned char *nonce) {
int           error;
char          *out;
size_t        cookielen;
unsigned char *cookie;
if (!(cookie = spc_base64_decode(data, &cookielen, 1, &error))) return 0;
if (!(out = (char *)malloc(cookielen - 16 + 1))) {
free(cookie);
return 0;
}
if (!nonce) nonce = spc_cookie_nonce;
error = !cwc_decrypt_message(&spc_cookie_cwc, 0, 0, cookie, cookielen,
nonce, out);
free(cookie);
if (error) {
free(out);
return 0;
}
out[cookielen - 16] = 0;
return out;
}
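The following hypothetical sketch ties the three cookie functions together. It generates a fresh key with
spc_rand( ) (see Recipe 11.2), which a real server would instead load from secure storage, uses the default
all-NULL nonce (which, as warned above, leaves you open to capture replay attacks), and assumes
spc_cookie_init( ) returns nonzero on success:
#include <stdio.h>
#include <stdlib.h>

int example_cookie_roundtrip(void) {
  char          *enc, *dec;
  unsigned char key[16];

  spc_rand(key, sizeof(key));                   /* see Recipe 11.2 */
  if (!spc_cookie_init(key, sizeof(key))) return 0;

  if (!(enc = spc_cookie_encode("user=alice;theme=dark", 0))) return 0;
  printf("Set-Cookie: data=%s\n", enc);

  dec = spc_cookie_decode(enc, 0);              /* NULL if tampered with or garbled */
  if (dec) printf("decoded: %s\n", dec);

  free(enc);
  free(dec);
  return dec != 0;
}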
8.14.4 See Also
Recipe 4.5, Recipe 4.6, Recipe 4.9, Recipe 5.10, Recipe 11.2
8.15 Performing Password-Based Authentication and Key
Exchange
8.15.1 Problem
You want to establish a secure channel without using public key cryptography at all. You want to
avoid tunneling a traditional authentication protocol over a protocol like SSL, instead preferring to
build your own secure channel with a good protocol.
8.15.2 Solution
SAX (Symmetric Authenticated eXchange) is a protocol for creating a secure channel that does not
use public key cryptography.
PAX (Public key Authenticated eXchange) is similar to SAX, but it uses public key cryptography to
protect against client spoofing if the attacker manages to get the server-side authentication
database. The public key cryptography also makes PAX a bit slower.
8.15.3 Discussion
The SAX and PAX protocols both perform authentication and key exchange. The protocols are
generic, so they work in any environment. However, in this recipe we'll show you how to use SAX and
PAX in the context of the Authenticated eXchange (AX) library, available from
http://www.zork.org/ax/. This library implements SAX and PAX over TCP/IP using a single API.
Let's take a look at how these protocols are supposed to work from the user's point of view. The
server needs to have authentication information associated with the user. The account setup must be
done over a preexisting secure channel. Perhaps the user sits down at a console, or the system
administrator might do the setup on behalf of the user while they are talking over the phone.
Account setup requires the user's password for that server. The password is used to compute some
secret information stored on the server; then the actual password is thrown away.
At account creation time, the server picks a salt value that is used to thwart a number of attacks.
The server can choose to do one of two things with this salt:
Tell it to the user, and have the user type it in the first time she logs in from any new machine
(the machine can then cache the salt value for subsequent connections). This solution prevents
attackers from learning anything significant by guessing a password, because the attacker has
to guess the salt as well. The salt effectively becomes part of the password.
Let the salt be public, in which case the attacker can try out passwords by attempting to
authenticate with the server.
8.15.3.1 The server
The first thing the server needs to be able to do is create accounts for users. User credential
information is stored in objects of type AX_CRED. To compute credentials, use the following function:
void AX_compute_credentials(char *user, size_t ulen, char *pass, size_t plen,
size_t ic, size_t pksz, size_t minkl, size_t maxkl,
size_t public_salt, size_t saltlen, AX_CRED *out);
This function has the following arguments:
user
Arbitrary binary string representing the unique login ID of the user.
ulen
Length of the username.
pass
The password, an arbitrary binary string.
plen
Length of the password in bytes.
ic
Iteration count to be used in the internal secret derivation function. SeeRecipe 4.10 for
recommendations on setting this value (AX uses the derivation function from that recipe).
pksz
Determines whether PAX credentials or SAX credentials should be computed. If you are using
PAX, the value specifies the length of the modulus of the public key in bits, which must be
1,024, 2,048, 4,096, or 8,192. If you are using SAX, set this value to 0.
minkl
Minimum key length we will allow the client to request when doing an exchange, in bytes. We
recommend 16 bytes (128 bits).
maxkl
Maximum key length we will allow the client to request when doing an exchange, in bytes.
Often, the protocol you use will only want a single fixed-size key (and not give the client the
option to choose), in which case, this should be the same value asminkl.
public_salt
If this is nonzero, the server will give out the user's salt value when requested. Otherwise, the
server should print out the salt at account creation time and have the user enter it on first login
from a new client machine.
saltlen
Length of the salt that will be used. The salt value is not actually entirely random. Three bytes
of the salt are used to encode the iteration count and the public key size. The rest of it is
random. We recommend that, if the salt is public, you use 16-byte salts. If the salt is kept
private, you will not want to make them too large, because you will have to convert them into
a printable format that the user has to carry around and enter. The minimum size AX allows is
11 bytes, which base64-encodes to 15 characters.
out
Pointer to a container into which credentials will be placed. You are expected to allocate this
object.
AX provides an API for serializing and deserializing credential objects:
char    *AX_CRED_serialize(AX_CRED *c, size_t *outlen);
AX_CRED *AX_CRED_deserialize(char *buf, size_t buflen);
These two functions each allocate their result with malloc( ) and return 0 on error.
In addition, if the salt value is to stay private, you will need to retrieve it so that you can encode it
and show it to the user. AX provides the following function for doing that:
char *AX_get_salt(AX_CRED *creds, size_t *saltlen);
The result is allocated by malloc( ). The size of the salt is placed into the memory pointed to by the
second argument.
Now that we can set up account information and store credentials in a database, we can look at how
to actually set up a server to handle connections. The high-level AX API does most of the work for
you. There's an actual server abstraction, which is of type AX_SRV.
You do need to define at least one callback, two if you want to log errors. In the first callback, you
must return a credential object for the associated user. The callback should be a pointer to a function
with the following signature:
AX_CRED *AX_get_credentials_callback(AX_SRV *s, char *user, size_t ulen,
char *extra, size_t elen);
This function has the following arguments:
s
Pointer to the server object. If you have multiple servers in a single program, you can use this
pointer to determine which server produced the request.
user
Username given to the server.
ulen
Length of the username.
extra
Additional application-specific information the client passed to the server. You can use this for
whatever purpose you want. For example, you could use this field to encode the server name
the client thinks it's connecting to, in order to implement virtual servers.
elen
Length of the application-specific data.
If the user does not exist, you must return 0 from this callback.
The other callback allows you to log errors when a key exchange fails. You do not have to define this
callback. If you do define it, the signature is the same as in the previous callback, except that it takes
an extra parameter of type size_t that encodes the error, and it does not return anything. As of this
writing, there are only two error conditions that might get reported:
AX_SOCK_ERR
Indicates that a generic socket error occurred. You can use your platform's standard API to
retrieve more specific information.
AX_CAUTH_ERR
Indicates that the server was unable to authenticate the client.
The first error can represent a large number of failures. In most cases, the connection will close
unexpectedly, which can indicate many things, including loss of connectivity or even the client's
failing to authenticate the server.
To initialize a server, we use the following function:
AX_SRV *AX_srv_listen(char *if, unsigned short port, size_t protocol,
AX_get_creds_cb cf, AX_exchange_status_cb sf);
This function has the following arguments:
if
String indicating the interface on which to bind. If you want to bind on all interfaces a machine
has, use "0.0.0.0".
port
Port on which to bind.
protocol
Indication of which protocol you're using. As of this writing, the only valid values are
SAX_PROTOCOL_v1 and PAX_PROTOCOL_v1.
cf
Callback for retrieving credentials discussed above.
sf
Callback for error reporting discussed above. Set this to NULL if you don't need it.
This function returns a pointer to an object of type AX_SRV. If there's an error, an exception is thrown
using the XXL exception-handling API (discussed in Recipe 13.1). All possible exceptions are standard
POSIX error codes that would indicate some sort of failure when calling the underlying socket API.
To close down the server and deallocate associated memory, pass the object toAX_srv_close( ).
Once we have a server object, we need to wait for a connection to come in. Once a connection comes
in, we can tell the server to perform a key exchange with that connection. To wait for a connection to
come in, use the following function (which will always block):
AX_CLIENT *AX_srv_accept(AX_SRV *s);
This function returns a pointer to an AX_CLIENT object when there is a connection. Again, if there's
an error, an exception gets thrown, indicating an error caught by the underlying socket API.
At this point, you should launch a new thread or process to deal with the connection, to prevent an
attacker from launching a denial of service by stalling the key exchange.
Once we have received a client object, we can perform a key exchange with the following function:
int AX_srv_exchange(AX_CLIENT *c, char *key, size_t *kl, char *uname, size_t *ul,
                    char *x, size_t *xl);
This function has the following arguments:
c
Pointer to the client object returned by AX_srv_accept( ). This object will be deallocated
automatically during the call.
key
Agreed-upon key.
kl
Pointer into which the length of the agreed-upon key in bytes is placed.
uname
Pointer to memory allocated by malloc( ) that stores the username of the entity on the other
side. You are responsible for freeing this memory withfree( ).
ul
Pointer into which the length of the username in bytes is placed.
x
Pointer to dynamically allocated memory representing application-specific data. The memory is
allocated with malloc( ), and you are responsible for deallocating this memory as well.
xl
Pointer into which the length of the application-specific data is placed.
On success, AX_srv_exchange( ) will return a connected socket descriptor in blocking mode that you
can then use to talk to the client. On failure, an XXL exception will be raised. The value of the
exception will be either AX_CAUTH_ERR if we believe the client refused our credentials or
AX_SAUTH_ERR if we refused the client's credentials. In both cases, it is possible that an attacker's
tampering with the data stream caused the error. On the other hand, it could be that the two parties
could not agree on the protocol version or key size.
With a valid socket descriptor in hand, you can now use the exchanged key to set up a secure
channel, as discussed in Recipe 9.12. When you are finished communicating, you may simply close
the socket descriptor.
Note that whether or not the exchange with the client succeeds, AX_srv_exchange( ) will free the
AX_CLIENT object passed into it. If the exchange fails, the socket descriptor will be closed, and the
client will have to reconnect in order to attempt another exchange.
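To show how the server-side calls fit together, here is a rough sketch built only from the functions
described above. The header name ax.h, the port number, the fixed 16-byte key buffer, and the stubbed-out
credential lookup are assumptions, and XXL exception handling (Recipe 13.1), threading, and cleanup of the
returned buffers are omitted:
#include <stdlib.h>
#include <ax.h>      /* assumed header name; see the AX distribution */

/* Stub: a real server would look up the user's serialized credentials and
 * return AX_CRED_deserialize( ) of them; returning 0 means "no such user". */
static AX_CRED *get_creds(AX_SRV *s, char *user, size_t ulen,
                          char *extra, size_t elen) {
  return 0;
}

void example_sax_server(void) {
  int       sd;
  char      key[16], *uname, *extra;
  size_t    kl, ul, xl;
  AX_SRV    *srv;
  AX_CLIENT *client;

  srv = AX_srv_listen("0.0.0.0", 4000, SAX_PROTOCOL_v1, get_creds, 0);
  for (;;) {
    client = AX_srv_accept(srv);
    /* In production, hand the client off to a new thread or process here. */
    sd = AX_srv_exchange(client, key, &kl, &uname, &ul, &extra, &xl);
    /* ... use sd and key to run a secure channel (Recipe 9.12), then free
       uname and extra and close sd ... */
  }
}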
8.15.3.2 The client
The client side is a bit less work. We first connect to the server with the following function:
AX *AX_connect(char *addr, unsigned short port, char *uname, size_t ulen,
char *extra, size_t elen, size_t protocol);
This function has the following arguments:
addr
IP address (or DNS name) of the server as a NULL-terminated string.
port
Port to which we should connect on the remote machine.
uname
Username.
ulen
Length of the username in bytes.
extra
Application-specific data discussed above.
elen
Length of the application-specific data in bytes.
protocol
Indication of the protocol you're using to connect. As of this writing, the only valid values are
SAX_PROTOCOL_v1 and PAX_PROTOCOL_v1.
This call will throw an XXL exception if there's a socket error. Otherwise, it will return an object
dynamically allocated with malloc( ) that contains the key exchange state.
If the user is expected to know the salt (i.e., if the server will not send it over the network), you
must enter it at this time, with the following function:
void AX_set_salt(AX *p, char *salt, size_t saltlen);
AX_set_salt( ) expects the binary encoding that the server-side API produced. It is your
responsibility to make sure the user can enter this value. Note that this function copies a reference to
the salt and does not copy the actual value, so do not modify the memory associated with your salt
until the AX context is deallocated (which happens as a side effect of the key exchange process; see
the following discussion).
Note that, the first time you make the user type in the salt on a particular client machine, you should
save the salt to disk. We strongly recommend encrypting the salt with the user's supplied password,
using an authenticated encryption mode and the key derivation function fromRecipe 4.10.
Once the client knows the salt, it can initiate key exchange using the following function:
int AX_exchange(AX *p, char *pw, size_t pwlen, size_t keylen, char *key);
This function has the following arguments:
p
Pointer to the context object that represents the connection to the server.
pw
Password, treated as a binary string (i.e., not NULL-terminated).
pwlen
Length of the associated password in bytes.
keylen
Key length the client desires in the exchange. The server must be prepared to serve up keys of
this length; otherwise, the exchange will fail.
key
Buffer into which the key will be placed if authentication and exchange are successful.
On success, AX_exchange( ) will return a connected socket descriptor in blocking mode that you can
then use to talk to the server. On failure, an XXL exception will be raised. The value of the exception
will be either AX_CAUTH_ERR if we believe the server refused our credentials or AX_SAUTH_ERR if we
refused the server's credentials. In both cases, it is possible that an attacker's tampering with the
data stream caused the error. On the other hand, it could be that the two parties could not agree on
the protocol version or key size.
With a valid socket descriptor in hand, you can now use the exchanged key to set up a secure
channel, as discussed in Recipe 9.12. When you are finished communicating, you may simply close
the socket descriptor.
Whether or not the connection succeeds, AX_exchange( ) automatically deallocates the AX object
passed into it. If the exchange does fail, the connection to the server will need to be reestablished by
calling AX_connect( ) a second time.
8.15.4 See Also
AX home page: http://www.zork.org/ax/
Recipe 4.10, Recipe 9.12, Recipe 13.1
8.16 Performing Authenticated Key Exchange Using RSA
8.16.1 Problem
Two parties in a network communication want to communicate using symmetric encryption. At least
one party has the RSA public key of the other, which was either transferred in a secure manner or
will be validated by a trusted third party.
You want to do authentication and key exchange without any of the information leakage generally
associated with password-based protocols.
8.16.2 Solution
Depending on your authentication requirements, you can do one-way authenticating key transport,
two-way authenticating key transport, or two-way authenticating key agreement.
8.16.3 Discussion
Instead of using this recipe to build your own key establishment protocols, it is
much better to use a preexisting network protocol such as SSL/TLS (see Recipe
9.1 and Recipe 9.2) or to use PAX (Recipe 8.15) alongside the secure channel
code from Recipe 9.12.
With key transport, one entity in a system chooses a key and sends it to the entity with which it
wishes to communicate, generally by encrypting it with the RSA public key of that entity.
In such a scenario, the sender has to have some way to ensure that it really does have the public key
of the entity with which it wants to communicate. It can do this either by using a trusted third party
(see Chapter 10) or by arranging to transport the public key in a secure manner, such as on a CD-R.
If the recipient can send a message back to the sender using the session key, and that message
decrypts correctly, the sender can be sure that an entity possessing the correct private key has
received the session key. That is, the sender has authenticated the receiver, as long as the receiver's
public key is actually correct.
Such a protocol can be modified so that both parties can authenticate each other. In such a scheme,
the sender generates a secret key, then securely signs and encrypts the key.
It is generally insecure to sign the unencrypted value and encrypt that,
particularly in a public key-based system. In such a system, it is not even a
good idea to sign encrypted values. There are several possible solutions to this
issue, discussed in detail in Recipe 7.14. For now, we are assuming that you will
be using one of the techniques in that recipe.
Assuming that the recipient has some way to receive and validate the sender's public key, the
recipient can now validate the sender as well.
The major limitation of key transport is that the machine initiating a connection may have a weak
source of entropy, leading to an insecure connection. Instead, you could build a key agreement
protocol, where each party sends the other a significant chunk of entropy and derives a shared secret
from the information. For example, you might use the following protocol:
1. The client picks a random 128-bit secret.
2. The client uses a secure technique to sign the secret and encrypt it with the server's public key.
(See Recipe 7.14 for how to do this securely.)
3. The client sends the signed, encrypted key to the server.
4. The server decrypts the client's secret.
5. The server checks the client's signature, and fails if the client isn't authenticated. (The server
must already have a valid public key for the client.)
6. The server picks a random 128-bit secret.
7. The server uses a secure technique to sign the secret and encrypt it with the client's public key
(again, see Recipe 7.14).
8. The server sends its signed, encrypted secret to the client.
9. The client decrypts the server's secret.
10. The client checks the server's signature, and fails if the server isn't authenticated. (The client
must already have a valid public key for the server.)
11. The client and the server compute a master secret by concatenating the client secret and the
server secret, then hashing that with SHA1, truncating the result to 128 bits (see the sketch following this list).
12. Both the client and the server generate derived keys for encryption and MAC'ing, as necessary.
13. The client and the server communicate using their new agreed-upon keys.
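As an illustration of step 11, here is a minimal sketch that computes the master secret using OpenSSL's
SHA1( ). The function name and the fixed 16-byte secret sizes are our own illustrative choices, not part of
any standard:
#include <string.h>
#include <openssl/sha.h>

static void example_master_secret(const unsigned char client_secret[16],
                                  const unsigned char server_secret[16],
                                  unsigned char master_secret[16]) {
  unsigned char concat[32], digest[SHA_DIGEST_LENGTH];

  memcpy(concat, client_secret, 16);        /* client secret first */
  memcpy(concat + 16, server_secret, 16);   /* then the server secret */
  SHA1(concat, sizeof(concat), digest);
  memcpy(master_secret, digest, 16);        /* truncate SHA1 output to 128 bits */
}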
Incorporating either key transport or key exchange into a protocol that involves algorithm negotiation
is more complex. In particular, after keys are finally agreed upon, the client must MAC all the
messages received, then send that MAC to the server. The server must reconstruct the messages the
client received and validate the MAC. The server must then MAC the messages it received (including
the client's MAC), and the client must validate that MAC.
This MAC'ing is necessary to ensure that an attacker doesn't maliciously modify negotiation messages
before full encryption starts. For example, consider a protocol where the server tells the client which
encryption algorithms it supports, and the client chooses one from the list that it also supports. An
attacker might intercept the server's list and instead send only the subset of algorithms the attacker
knows how to break, forcing the client to select an insecure algorithm. Without the MAC'ing, neither
side would detect the modification of the server's message.
The client's public key is a weak point. If it gets stolen, other people can
impersonate the user. You should generally use PKCS #5 to derive a key from
a password (as shown in Recipe 4.10), then encrypt the public key (e.g., using
AES in CWC mode, as discussed in Recipe 5.10).
The SSL/TLS protocol handles all of the above concerns for you. It provides either one-way or two-way authenticating key exchange. (Note that in one-way, the server does not authenticate the client
using public key cryptography, if at all.) It is usually much better to use that protocol than to create
your own, particularly if you're not going to hardcode a single set of algorithms.
If you do not want to use a PKI, but would still like an easy off-the-shelf construction, combine PAX
(Recipe 8.15) with the secure channel from Recipe 9.12.
8.16.4 See Also
Recipe 4.10, Recipe 5.10, Recipe 7.14, Recipe 8.15, Recipe 9.1, Recipe 9.2, Recipe 9.12
8.17 Using Basic Diffie-Hellman Key Agreement
8.17.1 Problem
You want a client and a server to agree on a shared secret such as an encryption key, and you need
or want to use the Diffie-Hellman key exchange protocol.
8.17.2 Solution
Your cryptographic library should have an implementation of Diffie-Hellman. If it does not, be aware
that Diffie-Hellman is easy to implement on top of any arbitrary precision math library. You will need
to choose parameters in advance, as we describe in Section 8.17.3.
Once you have a shared Diffie-Hellman secret, use a key derivation function to derive an actual
secret for use in other cryptographic operations. (See Recipe 4.11.)
8.17.3 Discussion
Diffie-Hellman is a very simple way for two entities to agree on a key without an eavesdropper's
being able to determine the key. However, room remains for a man-in-the-middle attack. Instead of
determining the shared key, the attacker puts himself in the middle, performing key agreement with
the client as if he were the server, and performing key agreement with the server as if he were the
client. That is, when you're doing basic Diffie-Hellman, you don't know who you're exchanging keys
with; you just know that no one else has calculated the agreed-upon key by snooping the network.
(See Recipe 7.1 for more information about such attacks.)
To solve the man-in-the-middle problem, you generally need to introduce some
sort of public key authentication mechanism. With Diffie-Hellman, it is common
to use DSA (see Recipe 7.15 and Recipe 8.18).
Basic Diffie-Hellman key agreement is detailed in PKCS (Public Key Cryptography Standard) #3.[3]
It's a much simpler standard than the RSA standard, in part because there is no authentication
mechanism to discuss.
[3]
See http://www.rsasecurity.com/rsalabs/pkcs/pkcs-3/.
The first thing to do with Diffie-Hellman is to come up with aDiffie-Hellman modulus n that is shared
by all entities in your system. This parameter should be a large prime number, at least 1,024 bits in
length (see the considerations in Recipe 8.17). The prime can be generated using Recipe 7.5, with
the additional stipulation that you should throw away any value where (n - 1)/2 is not also prime.
Some people like to use a fixed modulus shared across all users. We don't
recommend that approach, but if you insist on using it, be sure to read RFCs
2631 and 2785.
Diffie-Hellman requires another parameter g, the "generator," which is a value that we'll be
exponentiating. For ease of computation, use either 2 or 5.[4] Note that not every {prime,
generator} pair will work, and you will need to test the generator to make sure that it has the
mathematical properties that Diffie-Hellman requires.
[4]
It's possible (but not recommended) to use a nonprime value for n, in which case you need to compute a
suitable value for g. See Applied Cryptography for an algorithm.
OpenSSL expects that 2 or 5 will be used as a generator. To select a prime for the modulus, you can
use the function DH_generate_parameters( ), which has the following signature:
DH *DH_generate_parameters(int prime_len, int g,
void (*callback)(int, int, void *), void *cb_arg);
This function has the following arguments:
prime_len
Size in bits of the prime number for the modulus (n) to be generated.
g
Generator you want to use. It should be either 2 or 5.
callback
Pointer to a callback function that is passed directly to BN_generate_prime( ), as discussed in
Recipe 7.4. It may be specified as NULL, in which case no progress will be reported.
cb_arg
Application-specific argument that is passed directly to the callback function, if one is specified.
The result will be a new DH object containing the generated modulus (n) and generator (g)
parameters. When you're done with the DH object, free it with the function DH_free( ).
Once parameters are generated, you need to check to make sure the prime and the generator will
work together properly. In OpenSSL, you can do this with DH_check( ):
int DH_check(DH *ctx, int *err);
This function has the following arguments:
ctx
Pointer to the Diffie-Hellman context object to check.
err
Pointer to an integer to which is written an indication of any error that occurs.
This function returns 1 even if the parameters are bad. The 0 return value indicates that the
generator is not 2 or 5, as OpenSSL is not capable of checking parameter sets that include other
generators. Any error is always passed through the err parameter. The errors are as follows:
DH_CHECK_P_NOT_SAFE_PRIME
DH_NOT_SUITABLE_GENERATOR
DH_UNABLE_TO_CHECK_GENERATOR
The first two errors can occur at the same time, in which case the value pointed to by err will be the
logical OR of both constants.
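As a brief illustration, the following sketch combines the two calls: it generates a 1,024-bit parameter set with 2 as the generator and then verifies it with DH_check( ). The function name and the error reporting are ours, not part of the OpenSSL API; also be aware that generating a safe prime of this size can take a noticeable amount of time.

#include <stdio.h>
#include <openssl/dh.h>

/* Sketch: generate a 1,024-bit safe prime with 2 as the generator, then make
 * sure OpenSSL considers the parameter set usable.
 */
DH *generate_and_check_params(void) {
  DH  *dh;
  int err = 0;

  if (!(dh = DH_generate_parameters(1024, 2, 0, 0))) return 0;
  if (!DH_check(dh, &err) || err) {
    if (err & DH_CHECK_P_NOT_SAFE_PRIME)
      fprintf(stderr, "modulus is not a safe prime\n");
    if (err & DH_NOT_SUITABLE_GENERATOR)
      fprintf(stderr, "generator is not suitable for this modulus\n");
    if (err & DH_UNABLE_TO_CHECK_GENERATOR)
      fprintf(stderr, "the generator could not be checked\n");
    DH_free(dh);
    return 0;
  }
  return dh;
}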
Once both sides have the same parameters, they can send each other a message; each then
computes the shared secret. If the client initiates the connection, the client chooses a random value
x, where x is less than n. The client computes A = g^x mod n, then sends A to the server. The server
chooses a random value y, where y is less than n. The server computes B = g^y mod n, then sends B
to the client.
The server calculates the shared secret by computing k = A^y mod n. The client calculates the same
secret by computing B^x mod n.
Generating the message to send with OpenSSL is done with a call to the function DH_generate_key( ):
int DH_generate_key(DH *ctx);
The function returns 1 on success. The value to send to the other party is stored in ctx->pub_key.
Once one side receives the public value from the other, it can generate the shared secret with the
function DH_compute_key( ):
int DH_compute_key(unsigned char *secret, BIGNUM *pub_value, DH *dh);
This function has the following arguments:
secret
Buffer into which the resulting secret will be written, which must be large enough to hold the
secret. The size of the secret can be determined with a call toDH_size(dh).
pub_value
Public value received from the other party.
dh
DH object containing the parameters and public key.
Once both sides have agreed on a secret, it generally needs to be turned into some sort of fixed-size
key, or a set of fixed-size keys. A reasonable way is to represent the secret in binary and
cryptographically hash the binary value, truncating if necessary. Often, you'll want to generate a set
of keys, such as an encryption key and a MAC key. (See Recipe 4.11 for a complete discussion of key
derivation.)
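The following sketch shows how one side might put these calls together once the shared parameters are in place: generate the key pair, compute the shared secret from the peer's public value, and hash the secret down to a 128-bit key. The function name is ours, and the SHA1-and-truncate step is only a stand-in for the key derivation discussed in Recipe 4.11.

#include <stdlib.h>
#include <string.h>
#include <openssl/dh.h>
#include <openssl/sha.h>

/* Sketch: dh already holds the shared parameters (p and g).  Generate our key
 * pair, compute the shared secret from the peer's public value, and hash it
 * down to a 128-bit key.  In a real protocol, derive separate encryption and
 * MAC keys with a proper key derivation function (Recipe 4.11).
 */
int agree_and_derive(DH *dh, BIGNUM *peer_pub, unsigned char out_key[16]) {
  int           secret_len;
  unsigned char *secret, digest[SHA_DIGEST_LENGTH];

  if (!DH_generate_key(dh)) return 0;          /* dh->pub_key is what we send  */
  if (!(secret = (unsigned char *)malloc(DH_size(dh)))) return 0;
  if ((secret_len = DH_compute_key(secret, peer_pub, dh)) <= 0) {
    free(secret);
    return 0;
  }
  SHA1(secret, secret_len, digest);            /* hash the raw shared secret   */
  memcpy(out_key, digest, 16);                 /* truncate to a 128-bit key    */
  memset(secret, 0, secret_len);               /* don't leave the secret around */
  free(secret);
  return 1;
}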
Key exchange with Diffie-Hellman isn't secure unless you have some secure
way of authenticating the other end. Generally, you should digitally sign
messages in this protocol with DSA or RSA, and be sure that both sides
securely authenticate the signature-for example, through a public key
infrastructure.
Once a key or keys are established, the two parties try to communicate. If both sides are using
message integrity checks, they'll quickly know whether or not the exchange was successful (if it's
not, nothing will validate on decryption).
If you don't want to use an existing API, here's an example of generating a random secret and
computing the value to send to the other party (we use the OpenSSL arbitrary precision math
library):
#include <openssl/bn.h>

typedef struct {
  BIGNUM *n;
  BIGNUM *g;  /* use a BIGNUM even though g is usually small. */
  BIGNUM *private_value;
  BIGNUM *public_value;
} DH_CTX;

/* This function assumes that all BIGNUMs are already allocated, and that n and g
 * have already been chosen and properly initialized.  After this function
 * completes successfully, use BN_bn2bin( ) on ctx->public_value to get a binary
 * representation you can send over a network.  See Recipe 7.4 for more info on
 * BN<->binary conversions.
 */
int DH_generate_keys(DH_CTX *ctx) {
  BN_CTX *tmp_ctx;

  if (!(tmp_ctx = BN_CTX_new( ))) return 0;
  if (!BN_rand_range(ctx->private_value, ctx->n)) {
    BN_CTX_free(tmp_ctx);
    return 0;
  }
  if (!BN_mod_exp(ctx->public_value, ctx->g, ctx->private_value, ctx->n, tmp_ctx)) {
    BN_CTX_free(tmp_ctx);
    return 0;
  }
  BN_CTX_free(tmp_ctx);
  return 1;
}
When one side receives the Diffie-Hellman message from the other, it can compute the shared secret
from the DH_CTX object and the message as follows:
BIGNUM *DH_compute_secret(DH_CTX *ctx, BIGNUM *received) {
  BIGNUM *secret;
  BN_CTX *tmp_ctx;

  if (!(secret = BN_new( ))) return 0;
  if (!(tmp_ctx = BN_CTX_new( ))) {
    BN_free(secret);
    return 0;
  }
  if (!BN_mod_exp(secret, received, ctx->private_value, ctx->n, tmp_ctx)) {
    BN_CTX_free(tmp_ctx);
    BN_free(secret);
    return 0;
  }
  BN_CTX_free(tmp_ctx);
  return secret;
}
You can turn the shared secret into a key by converting the BIGNUM object returned by
DH_compute_secret( ) to binary (see Recipe 7.4) and then hashing it with SHA1, as discussed
above.
Traditional Diffie-Hellman is sometimes called ephemeral Diffie-Hellman, because the algorithm can
be seen as generating key pairs for one-time use. There are variants of Diffie-Hellman that always
use the same values for each client. There are some hidden "gotchas" when doing that, so we don't
particularly recommend it. However, if you wish to explore it, see RFC 2631 and RFC 2785 for more
information.
8.17.4 See Also
RFC 2631: Diffie-Hellman Key Agreement Method
RFC 2785: Methods for Avoiding the "Small-Subgroup" Attacks on the Diffie-Hellman Key
Agreement Method for S/MIME
Recipe 4.11, Recipe 7.1, Recipe 7.4, Recipe 7.5, Recipe 7.15, Recipe 8.17, Recipe 8.18
[ Team LiB ]
[ Team LiB ]
8.18 Using Diffie-Hellman and DSA Together
8.18.1 Problem
You want to use Diffie-Hellman for key exchange, and you need some secure way to authenticate the
key agreement to protect against a man-in-the-middle attack.
8.18.2 Solution
Use the station-to-station protocol for two-way authentication. A simple modification provides one-way authentication. For example, the server may not care to authenticate the client using public key
cryptography.
8.18.3 Discussion
Remember, authentication requires a trusted third party or a secure channel
for exchange of public DSA keys. If you'd prefer a password-based protocol
that can achieve all the same properties you would get from Diffie-Hellman and
DSA, see the discussion of PAX in Recipe 8.15.
Given a client initiating a connection with a server, the station-to-station protocol is as follows:
1. The client generates a random Diffie-Hellman secret x and the corresponding public value A.
2. The client sends A to the server.
3. The server generates a random Diffie-Hellman secret y and the corresponding public value B.
4. The server computes the Diffie-Hellman shared secret.
5. The server signs a string consisting of the public values A and B with the server's private DSA
key.
6. The server sends B and the signature to the client.
7. The client computes the shared secret.
8. The client validates the signature, failing if it isn't valid.
9. The client signs A concatenated with B using its private DSA key, and it encrypts the result
using the shared secret (the secret can be postprocessed first, as long as both sides do the
same processing).
10. The client sends the encrypted signature to the server.
11. The server decrypts the signature and validates it.
The station-to-station protocol works only if your Diffie-Hellman keys are always one-time values. If
you need a protocol that doesn't expose the private values of each party, use Recipe 8.16. That basic
protocol can be adapted from RSA to Diffie-Hellman with DSA if you so desire.
Unless you allow for anonymous connection establishment, the client needs to identify itself as part of
this protocol. The client can send its public key (or a digital certificate containing the public key) at
Step 2. The server should already have a record of the client based on its public key, or else it should
fail. Alternatively, you can drop the client validation steps (9-11) and use a traditional login
mechanism after the encrypted link is established.
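To make step 5 concrete, here is a sketch of how the server's signature over the two public values might be produced with OpenSSL's DSA functions. The function below is our illustration, not code from the book: it hashes the binary encodings of A and B with SHA1 and signs the digest. The client would check the result with the corresponding DSA_verify( ) call over the same hash.

#include <stdlib.h>
#include <openssl/bn.h>
#include <openssl/dsa.h>
#include <openssl/sha.h>

/* Sketch of step 5: sign the concatenation of the two public Diffie-Hellman
 * values A and B with the server's long-term DSA key.  The caller frees *sig.
 */
int sign_dh_values(DSA *dsa, BIGNUM *A, BIGNUM *B,
                   unsigned char **sig, unsigned int *sig_len) {
  int           a_len, b_len;
  unsigned char *buf, digest[SHA_DIGEST_LENGTH];

  a_len = BN_num_bytes(A);
  b_len = BN_num_bytes(B);
  if (!(buf = (unsigned char *)malloc(a_len + b_len))) return 0;
  BN_bn2bin(A, buf);
  BN_bn2bin(B, buf + a_len);
  SHA1(buf, a_len + b_len, digest);
  free(buf);
  if (!(*sig = (unsigned char *)malloc(DSA_size(dsa)))) return 0;
  if (!DSA_sign(0, digest, sizeof(digest), *sig, sig_len, dsa)) {
    free(*sig);
    return 0;
  }
  return 1;
}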
In many circumstances, the client won't have the server's public key in
advance. In such a case, the server will often send a copy of its public key (or a
digital certificate containing the public key) at Step 6. In this case, the client
can't assume that the public signing key is valid; there's nothing to distinguish
it from an attacker's public key! Therefore, the key needs to be validated using
a trusted third party before the client trusts that the party on the other end is
really the intended server. (We discuss this problem in Recipe 7.1 and Recipe
10.1.)
8.18.4 See Also
Recipe 7.1, Recipe 8.15, Recipe 8.16, Recipe 10.1
[ Team LiB ]
[ Team LiB ]
8.19 Minimizing the Window of Vulnerability When
Authenticating Without a PKI
8.19.1 Problem
You have an application (typically a client) that is likely to receive identifying information from a
server, such as a certificate or key, that cannot necessarily be verified automatically-for example,
because there is no PKI.
Without a way to absolutely defend against man-in-the-middle attacks in an automated fashion, you
want to do the best that you can, either by having the user manually do certificate validation or by
limiting the window of vulnerability to the first connection.
8.19.2 Solution
Either provide the user with trusted certificate information over a secure channel and allow him to enter
that information, or prompt the user the first time you see a certificate, and remember it for subsequent
connections.
These solutions push the burden of authentication off onto the user.
8.19.3 Discussion
It is common for small organizations to host some kind of a server that is SSL-enabled without a
certificate that has been issued by a third-party CA such as VeriSign. Most often, such an organization
issues its own certificate using its own CA. A prime example would be an SSL-enabled POP3 or SMTP
server. Unfortunately, when this is the case, your software needs to have some way of allowing the
client to indicate that the certificate presented by the server is acceptable.
There are two basic ways to do this:
Provide the user with some way to add the CA's certificate to a list of trusted certificates. This is
certainly a good idea, and any program that verifies certificates should support this capability.
Prompt the user, asking if the certificate is acceptable. If the user answers yes, the certificate
should be remembered, and the user is never prompted again. This approach could conceivably be
something of an automated way of performing the first solution. In this way, the user need not go
looking for the certificate and add it manually. It is not necessarily the most secure of solutions,
but for many applications, the risk is acceptable.
Prompting the user works for other things besides certificates. Public keys are a good example of
another type of identifying information that works well; in fact, public keys are employed by many SSH
clients. When connecting to an SSH server for the first time, many SSH clients present the user with the
fingerprint of the server's key and ask whether to terminate the connection, remember the key for
future connections, or allow it for use only this one time. Often, the key is associated with the server's
IP address, so if the key is remembered and the same server ever presents a different key, the user is
notified that the key has changed, and that there is some possibility that the server has been
compromised.
Be aware that the security provided by this recipe is not as strong as that provided by using a PKI
(described in Chapter 10). There still exists the possibility that an attacker might mount a man-in-the-middle
attack, particularly if the client has never connected to the server before and has no record of
the server's credentials. Even if the client has the server's credentials, and they do not match, the client
may opt to continue anyway, thinking that perhaps the server has regenerated its certificate or public
key. The most common scenario, though, is that the user will not understand the warnings presented
and the implications of proceeding when a change in server credentials is detected.
All of the work required for this recipe is on the client side. First, some kind of store is required to
remember the information that is being presented by the server. Typically, this would be some kind of
file on disk. For this recipe, we are going to concentrate on certificates and keys.
For certificates, we will store the entire certificate in Privacy Enhanced Mail (PEM) format (see Recipe
7.17). We will put one certificate in one file, and name that file in such a manner that OpenSSL can use
it in a directory lookup. This entails computing the hash of the certificate's subject name and using it for
the filename. You will generally want to provide a verify callback function in an spc_x509store_t object
(see Recipe 10.5) that will ask the user whether to accept the certificate if OpenSSL has failed to verify
it. The user could be presented with an option to reject the certificate, accept it this once, or accept and
remember it. In the latter case, we'll save the certificate in an spc_x509store_t object in the directory
identified in the call to spc_x509store_setcapath( ).
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <openssl/pem.h>
#include <openssl/ssl.h>
#include <openssl/x509.h>
char *spc_cert_filename(char *path, X509 *cert) {
  int  length;
  char *filename;

  length = strlen(path) + 11;
  if (!(filename = (char *)malloc(length + 1))) return 0;
  snprintf(filename, length + 1, "%s/%08lx.0", path, X509_subject_name_hash(cert));
  return filename;
}

int spc_remember_cert(char *path, X509 *cert) {
  int  result;
  char *filename;
  FILE *fp;

  if (!(filename = spc_cert_filename(path, cert))) return 0;
  if (!(fp = fopen(filename, "w"))) {
    free(filename);
    return 0;
  }
  result = PEM_write_X509(fp, cert);
  fclose(fp);
  if (!result) remove(filename);
  free(filename);
  return result;
}
int spc_verifyandmaybesave_callback(int ok, X509_STORE_CTX *store) {
  int             err;
  SSL             *ssl_ptr;
  char            answer[80], name[256];
  X509            *cert;
  SSL_CTX         *ctx;
  spc_x509store_t *spc_store;

  if (ok) return ok;
  cert = X509_STORE_CTX_get_current_cert(store);
  printf("An error has occurred with the following certificate:\n");
  X509_NAME_oneline(X509_get_issuer_name(cert), name, sizeof(name));
  printf("    Issuer Name:  %s\n", name);
  X509_NAME_oneline(X509_get_subject_name(cert), name, sizeof(name));
  printf("    Subject Name: %s\n", name);
  err = X509_STORE_CTX_get_error(store);
  printf("    Error Reason: %s\n", X509_verify_cert_error_string(err));
  for (;;) {
    printf("Do you want to [r]eject this certificate, [a]ccept and remember it, "
           "or allow\nits use for only this [o]ne time? ");
    if (!fgets(answer, sizeof(answer), stdin)) continue;
    if (answer[0] == 'r' || answer[0] == 'R') return 0;
    if (answer[0] == 'o' || answer[0] == 'O') return 1;
    if (answer[0] == 'a' || answer[0] == 'A') break;
  }
  ssl_ptr   = (SSL *)X509_STORE_CTX_get_app_data(store);
  ctx       = SSL_get_SSL_CTX(ssl_ptr);
  spc_store = (spc_x509store_t *)SSL_CTX_get_app_data(ctx);
  if (!spc_store->capath || !spc_remember_cert(spc_store->capath, cert))
    printf("Error remembering certificate!  It will be accepted this one time "
           "only.\n");
  return 1;
}
For keys, we will store the base64-encoded key in a flat file, much as OpenSSH does. We will also
associate the IP address of the server that presented the key so that we can determine when the
server's key has changed and warn the user. When we receive a key that we'd like to check to see
whether we already know about it, we can call spc_lookup_key( ) with the filename of the key store,
the IP number we received the key from, and the key we've just received. If we do not know anything
about the key or if some kind of error occurs, 0 is returned. If we know about the key, and everything
matches-that is, the IP numbers and the keys are the same-1 is returned. If we have a key stored for
the IP number and it does not match the key we have just received, -1 is returned.
If you have multiple servers running on the same system, you need to make sure
that they each keep separate caches so that the keys and IP numbers do not
collide.
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <openssl/bio.h>
#include <openssl/evp.h>
#include <sys/types.h>
#include <netinet/in.h>
#include <arpa/inet.h>
static int get_keydata(EVP_PKEY *key, char **keydata) {
  BIO  *b64 = 0, *bio = 0;
  int  keytype, length;
  char *dummy;

  *keydata = 0;
  keytype = EVP_PKEY_type(key->type);
  if (!(length = i2d_PublicKey(key, 0))) goto error_exit;
  if (!(dummy = *keydata = (char *)malloc(length))) goto error_exit;
  i2d_PublicKey(key, (unsigned char **)&dummy);
  if (!(bio = BIO_new(BIO_s_mem( )))) goto error_exit;
  if (!(b64 = BIO_new(BIO_f_base64( )))) goto error_exit;
  BIO_set_flags(b64, BIO_FLAGS_BASE64_NO_NL);
  if (!(bio = BIO_push(b64, bio))) goto error_exit;
  b64 = 0;
  BIO_write(bio, *keydata, length);
  BIO_flush(bio);   /* push any buffered base64 output into the memory BIO */
  free(*keydata);  *keydata = 0;
  if (!(length = BIO_get_mem_data(bio, &dummy))) goto error_exit;
  if (!(*keydata = (char *)malloc(length + 1))) goto error_exit;
  memcpy(*keydata, dummy, length);
  (*keydata)[length] = '\0';
  BIO_free_all(bio);   /* done with the BIO chain */
  return keytype;

error_exit:
  if (b64) BIO_free_all(b64);
  if (bio) BIO_free_all(bio);
  if (*keydata) free(*keydata);
  *keydata = 0;
  return EVP_PKEY_NONE;
}
static int parse_line(char *line, char **ipnum, int *keytype, char **keydata) {
  char *end, *p, *tmp;

  /* we expect leading and trailing whitespace to be stripped already */
  for (p = line;  *p && !isspace(*p);  p++);
  if (!*p) return 0;
  *ipnum = line;
  for (*p++ = '\0';  *p && isspace(*p);  p++);
  for (tmp = p;  *p && !isspace(*p);  p++);
  *keytype = (int)strtol(tmp, &end, 0);
  if (*end && !isspace(*end)) return 0;
  for (p = end;  *p && isspace(*p);  p++);
  for (tmp = p;  *p && !isspace(*p);  p++);
  if (*p) return 0;
  *keydata = tmp;
  return 1;
}
int spc_lookup_key(char *filename, char *ipnum, EVP_PKEY *key) {
  int  bufsize = 0, length, keytype, lineno = 0, result = 0, store_keytype;
  char *buffer = 0, *keydata, *line, *store_ipnum, *store_keydata, tmp[1024];
  FILE *fp = 0;

  keytype = get_keydata(key, &keydata);
  if (keytype == EVP_PKEY_NONE || !keydata) goto end;
  if (!(fp = fopen(filename, "r"))) goto end;
  while (fgets(tmp, sizeof(tmp), fp)) {
    length = strlen(tmp);
    buffer = (char *)realloc(buffer, bufsize + length + 1);
    memcpy(buffer + bufsize, tmp, length + 1);
    bufsize += length;
    if (buffer[bufsize - 1] != '\n') continue;
    while (bufsize && (buffer[bufsize - 1] == '\r' || buffer[bufsize - 1] == '\n'))
      bufsize--;
    buffer[bufsize] = '\0';
    bufsize = 0;
    lineno++;

    for (line = buffer;  isspace(*line);  line++);
    for (length = strlen(line);  length && isspace(line[length - 1]);  length--);
    line[length] = '\0';

    /* blank lines and lines beginning with # or ; are ignored */
    if (!length || line[0] == '#' || line[0] == ';') continue;
    if (!parse_line(line, &store_ipnum, &store_keytype, &store_keydata)) {
      fprintf(stderr, "%s:%d: parse error\n", filename, lineno);
      continue;
    }
    if (inet_addr(store_ipnum) != inet_addr(ipnum)) continue;
    if (store_keytype != keytype || strcasecmp(store_keydata, keydata))
      result = -1;
    else result = 1;
    break;
  }

end:
  if (buffer) free(buffer);
  if (keydata) free(keydata);
  if (fp) fclose(fp);
  return result;
}
If spc_lookup_key( ) returns 0, indicating that we do not know anything about the key, the user
should be prompted in much the same way we did for certificates. If the user elects to remember the
key, the spc_remember_key( ) function will add the key information to the key store so that the next
time spc_lookup_key( ) is called, it will be found.
int spc_remember_key(char *filename, char *ipnum, EVP_PKEY *key) {
  int  keytype, result = 0;
  char *keydata;
  FILE *fp = 0;

  keytype = get_keydata(key, &keydata);
  if (keytype == EVP_PKEY_NONE || !keydata) goto end;
  if (!(fp = fopen(filename, "a"))) goto end;
  fprintf(fp, "%s %d %s\n", ipnum, keytype, keydata);
  result = 1;

end:
  if (keydata) free(keydata);
  if (fp) fclose(fp);
  return result;
}
int spc_accept_key(char *filename, char *ipnum, EVP_PKEY *key) {
  int  result;
  char answer[80];

  result = spc_lookup_key(filename, ipnum, key);
  if (result == 1) return 1;

  if (result == -1) {
    for (;;) {
      printf("FATAL ERROR!  A different key has been received from the server "
             "%s\nthan we have on record.  Do you wish to continue? ", ipnum);
      if (!fgets(answer, sizeof(answer), stdin)) continue;
      if (answer[0] == 'Y' || answer[0] == 'y') return 1;
      if (answer[0] == 'N' || answer[0] == 'n') return 0;
    }
  }

  for (;;) {
    printf("WARNING!  The server %s has presented a key for which we have no\n"
           "prior knowledge.  Do you want to [r]eject the key, [a]ccept and "
           "remember it,\nor allow its use for only this [o]ne time? ", ipnum);
    if (!fgets(answer, sizeof(answer), stdin)) continue;
    if (answer[0] == 'r' || answer[0] == 'R') return 0;
    if (answer[0] == 'o' || answer[0] == 'O') return 1;
    if (answer[0] == 'a' || answer[0] == 'A') break;
  }
  if (!spc_remember_key(filename, ipnum, key))
    printf("Error remembering the key!  It will be accepted this one time only "
           "instead.\n");
  return 1;
}
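As an example of how you might wire this into a client, the following sketch pulls the server's public key out of the certificate it presented during an SSL handshake and runs it through spc_accept_key( ). The function name is ours, and it assumes you already have the peer's IP address in dotted-quad string form.

#include <openssl/ssl.h>
#include <openssl/x509.h>

/* Sketch: after an SSL handshake, extract the server's public key from its
 * certificate and run it through spc_accept_key( ).  Returns 1 if the key
 * cache or the user accepts the key, 0 otherwise.
 */
int check_server_key(SSL *ssl, char *keyfile, char *ipnum) {
  int      result = 0;
  X509     *cert;
  EVP_PKEY *key;

  if (!(cert = SSL_get_peer_certificate(ssl))) return 0;
  if ((key = X509_get_pubkey(cert)) != 0) {
    result = spc_accept_key(keyfile, ipnum, key);
    EVP_PKEY_free(key);
  }
  X509_free(cert);
  return result;
}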
8.19.4 See Also
Recipe 7.17, Recipe 10.5
[ Team LiB ]
[ Team LiB ]
8.20 Providing Forward Secrecy in a Symmetric System
8.20.1 Problem
When using a series of (session) keys generated from a master secret, as described in the previous
recipe, we want to limit the scope of a key compromise. That is, if a derived key is stolen, or even if
the master key is stolen, we would like to ensure that no data encrypted by previous session keys
can be read by attackers as a result of the compromise. If our system has such a property, it is said
to have perfect forward secrecy.
8.20.2 Solution
Use a separate base secret for each entity in the system. For any given client, derive a new key
called K1 from the base secret key, as described in Recipe 4.11. Then, after you're sure that
communicating parties have correctly agreed upon a key, derive another key from K1 in the exact
same manner, calling it K2. Erase the base secret (on both the client and the server), replacing it
with K1. Use K2 as the session key.
8.20.3 Discussion
In Recipe 4.11, we commented on how knowledge of a properly created derived key would give no
information about any parent keys. We can take advantage of that fact: because the base secret is
thrown away after each derivation, a compromise of the current key does not affect previous
sessions, since old session keys cannot be regenerated. The security depends on the cryptographically strong
one-way property of the hash function used to generate the derived keys.
Remember that when deriving keys, every key derivation needs to include
some kind of unique value that is never repeated (see Recipe 4.11 for a
detailed discussion).
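To make the ratchet concrete, here is a small sketch in which HMAC-SHA1 stands in for the key derivation function of Recipe 4.11; the function name, the 20-byte secret size, and the distinguisher argument are ours. In a real system, use the derivation function from Recipe 4.11 and make sure each derivation includes its own unique, never-repeated data.

#include <string.h>
#include <openssl/evp.h>
#include <openssl/hmac.h>

#define SPC_KEY_LEN 20   /* SHA1 output size; illustrative only */

/* Sketch of the ratchet described above, with HMAC-SHA1 standing in for the
 * key derivation function of Recipe 4.11.
 */
void spc_fs_ratchet(unsigned char base[SPC_KEY_LEN],
                    const unsigned char *distinguisher, unsigned int dlen,
                    unsigned char session_key[SPC_KEY_LEN]) {
  unsigned char k1[SPC_KEY_LEN];
  unsigned int  len;

  /* K1 is derived from the base secret. */
  HMAC(EVP_sha1(), base, SPC_KEY_LEN, distinguisher, dlen, k1, &len);
  /* K2 is derived from K1 in exactly the same manner; it is the session key. */
  HMAC(EVP_sha1(), k1, SPC_KEY_LEN, distinguisher, dlen, session_key, &len);
  /* Erase the old base secret, replacing it with K1. */
  memcpy(base, k1, SPC_KEY_LEN);
  memset(k1, 0, sizeof(k1));
}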
8.20.4 See Also
Recipe 4.11
[ Team LiB ]
[ Team LiB ]
8.21 Ensuring Forward Secrecy in a Public Key System
8.21.1 Problem
In a system using public key cryptography, you want to ensure that a compromise of one of the
entities in your system won't compromise old communications that took place with different session
keys (symmetric keys).
8.21.2 Solution
When using RSA, generate new public keys for each key agreement, ensuring that the new key
belongs to the right entity by checking the digital signature using a long-term public key.
Alternatively, use Diffie-Hellman, being sure to generate new random numbers each time. Throw
away all of the temporary material once key exchange is complete.
8.21.3 Discussion
When discarding key material, be sure to zero it from memory, and use a
secure deletion technique if the key may have been swapped to disk (see
Recipe 13.2).
Suppose that you have a client and a server that communicate frequently, and they establish
connections using a set of fixed RSA keys. Suppose that an attacker has been recording all data
between the client and the server since the beginning of time. All of the key exchange messages and
data encrypted with symmetric keys have been captured.
Now, suppose that the attacker eventually manages to break into the client and the server, stealing
all the private keys in the system. Certainly, future communications are insecure, but what about
communications before the break-in? In this scenario, the attacker would be able to decrypt all of the
data ever sent by either party because all of the old messages used in key exchange can be
decrypted with all of the public keys in the system.
The easiest way to fix this problem is to use static (long-term) key pairs for establishing identity (i.e.,
digital signatures), but use randomly generated, one-time-use key pairs for performing key
exchange. This procedure is called ephemeral keying (in the context of Diffie-Hellman, it's called
ephemeral Diffie-Hellman, which we discussed in Recipe 8.17). It doesn't have a negative
impact on security because you can still establish identities by checking signatures that are generated
by the static signing key. The upside is that as long as you throw away the temporary key pairs after
use, the attacker won't be able to decrypt old key exchange messages, and thus all data for
connections that completed before the compromise will be secure from the attacker.
The only reason not to use ephemeral keying with RSA is that key generation
can be expensive.
The standard way of using Diffie-Hellman key exchange provides forward secrecy. With that protocol,
the client and server both pick secret random numbers for each connection, and they send a public
value derived from their secrets. The public values, intended for one-time use, are akin to public
keys. Indeed, it is possible to reuse secrets in Diffie-Hellman, thus creating a permanent key pair.
However, there is significant risk if this is done naïvely (see Recipe 8.17).
When using RSA, if you're doing one-way key transport, the client need not have a public key. Here's
a protocol:
1. The client contacts the server, requesting a one-time public key.
2. The server generates a new RSA key pair and signs the public key with its long-term key. The
server then sends the public key and the signature. If necessary, the server also sends the
client its certificate for its long-term key.
3. The client validates the server's certificate, if appropriate.
4. The client checks the server's signature on the one-time public key to make sure it is valid.
5. The client chooses a random secret (the session key) and encrypts it using the one-time public
key.
6. The encrypted secret is sent to the server.
7. The parties attempt to communicate using the session key.
8. The server securely erases the one-time private key.
9. When communication is complete, both parties securely erase the session key.
In two-way authentication, both parties generate one-time keys and sign them with their long-term
private key.
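As an illustration of steps 4 through 6 of this protocol from the client's point of view, the following sketch verifies the server's signature over the DER encoding of the one-time public key and then encrypts a session key to it. The function name and argument layout are ours; decoding the DER into an RSA object (for example, with d2i_RSAPublicKey( )) is omitted.

#include <stdlib.h>
#include <openssl/objects.h>
#include <openssl/rsa.h>
#include <openssl/sha.h>

/* Sketch of the client side of steps 4-6: check the server's signature over the
 * DER-encoded one-time public key, then encrypt a freshly generated session key
 * to that one-time key using OAEP padding.  The caller frees *out and sends it
 * to the server (step 6).
 */
int client_key_transport(RSA *long_term, RSA *one_time,
                         unsigned char *der, int der_len,
                         unsigned char *sig, unsigned int sig_len,
                         unsigned char *session_key, int key_len,
                         unsigned char **out, int *out_len) {
  unsigned char digest[SHA_DIGEST_LENGTH];

  SHA1(der, der_len, digest);
  if (!RSA_verify(NID_sha1, digest, sizeof(digest), sig, sig_len, long_term))
    return 0;                                   /* step 4 failed */
  if (!(*out = (unsigned char *)malloc(RSA_size(one_time)))) return 0;
  *out_len = RSA_public_encrypt(key_len, session_key, *out, one_time,
                                RSA_PKCS1_OAEP_PADDING);   /* step 5 */
  if (*out_len <= 0) {
    free(*out);
    return 0;
  }
  return 1;
}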
8.21.4 See Also
Recipe 8.17, Recipe 13.2
[ Team LiB ]
[ Team LiB ]
8.22 Confirming Requests via Email
8.22.1 Problem
You want to allow users to confirm a request via email while preventing third parties from spoofing or
falsifying confirmations.
8.22.2 Solution
Generate a random identifier, associate it with the email address to be confirmed, and save it for
verification later. Send an email that contains the random identifier, along with instructions for
responding to confirm receipt and approval. If a response is received, compare the identifier in the
response with the saved identifier for the email address from which the response was received. If the
identifiers don't match, ignore the response and do nothing; otherwise, the confirmation was
successful.
8.22.3 Discussion
The most common use for confirmation requests is to ensure that an email address actually belongs
to the person requesting membership on some kind of mass mailing list (whether it's a mailing list,
newsletter, or some other type of mass mailing). Joining a mass mailing list typically involves either
sending mail to an automated recipient or filling out a form on a web page.
The problem with this approach is that it is trivial for someone to register someone else's email
address with a mailing list. For example, suppose that Alice wants to annoy Bob. If mailing lists
accepted email addresses without any kind of confirmation, Alice could register Bob's email address
with as many mailing lists as she could find. Suddenly, Bob would begin receiving large amounts of
email from mailing lists with which he did not register. In extreme cases, this could lead to denial of
service because Bob's mailbox could fill up with unwanted email, or if Bob has a slow network
connection, it could take an unreasonable amount of time for him to download his email.
The solution to this problem is to confirm with Bob that he really made the requests for membership
with the mailing lists. When a request for membership is sent for a mailing list, the mailing list
software can send an email to the address for which membership was requested. This email will ask
the recipient to respond with a confirmation that membership is truly desired.
The simplest form of such a confirmation request is to require the recipient to reply with an email
containing some nonunique content, such as the word "subscribe" or something similar. This method
is easiest for the mailing list software to deal with because it does not have to keep any information
about what requests have been made or confirmed. It simply needs to respond to confirmation
responses by adding the sender's email address to the mailing list roster.
Unfortunately, this is not an acceptable solution either, because Alice might know what response
needs to be sent back to the confirmation request in order for the mailing list software to add Bob to
its roster. If Alice knows what needs to be sent, she can easily forge a response email, making it
appear to the mailing list software as if it came from Bob's email address.
Sending a confirmation request that requires an affirmative acknowledgement is a step in the right
direction, but as we have just described it, it is not enough. Instead of requiring a nonunique
acknowledgment, the confirmation request should contain a unique identifier that is generated at the
time that the request for membership is made. To confirm the request, the recipient must send back
a response that also contains the same unique identifier.
Because a unique identifier is used, it is not possible for Alice to know what she would need to send
back to the mailing list software to get Bob's email address on the roster, unless she somehow had
access to Bob's email. That would allow her to see the confirmation request and the unique identifier
that it contains. Unfortunately, this is a much more difficult problem to solve, and it is one that
cannot be easily solved in software, so we will not give it any further consideration.
To implement such a scheme, the mailing list software must maintain some state information. In
particular, upon receipt of a request for membership, the software needs to generate the unique
identifier to include in the confirmation requests, and it must store that identifier along with the email
address for which membership has been requested. In addition, it is a good idea to maintain some
kind of a timestamp so that confirmation requests will eventually expire. Expiring confirmation
requests significantly reduces the likelihood that Alice can guess the unique identifier; more
importantly, it also helps to reduce the amount of information that must be remembered to be able
to confirm requests.
We define two functions in this recipe that provide the basic implementation for the confirmation
request scheme we have just described. The first, spc_confirmation_create( ), creates a new
confirmation request by generating a unique identifier and storing it with the email address for which
confirmation is to be requested. It stores the confirmation request information in an in-memory list of
pending confirmations, implemented simply as a dynamically allocated array. For use in a production
environment, a hash table or binary tree might be a better solution for an in-memory data structure.
Alternatively, the information could be stored in a database.
The function spc_confirmation_create( ) (SpcConfirmationCreate() on Windows) will return 0 if
some kind of error occurs. Possible errors include memory allocation failures and attempts to add an
address to the list of pending confirmations that already exists in the list. If the operation is
successful, the return value will be 1. Two arguments are required by spc_confirmation_create( ):
address
Email address that is to be confirmed.
id
Pointer to a buffer that will be allocated by spc_confirmation_create( ). If the function
returns successfully, the buffer will contain the unique identifier to send as part of the
confirmation request email. It is the responsibility of the caller to free the buffer using free( )
on Unix or LocalFree( ) on Windows.
You may adjust the SPC_CONFIRMATION_EXPIRE macro from the default presented here. It controls
how long pending confirmation requests will be honored and is specified in seconds.
Note that the code we are presenting here does not send or receive email at all. Programmatically
sending and receiving email is outside the scope of this book.
#include <stdlib.h>
#include <string.h>
#include <time.h>
/* Confirmation receipts must be received within one hour (3600 seconds) */
#define SPC_CONFIRMATION_EXPIRE 3600
typedef struct {
  char   *address;
  char   *id;
  time_t expire;
} spc_confirmation_t;

static unsigned long       confirmation_count, confirmation_size;
static spc_confirmation_t *confirmations;

static int new_confirmation(const char *address, const char *id) {
  unsigned long      i;
  spc_confirmation_t *tmp;

  /* first make sure that the address isn't already in the list */
  for (i = 0;  i < confirmation_count;  i++)
    if (!strcmp(confirmations[i].address, address)) return 0;
  if (confirmation_count == confirmation_size) {
    tmp = (spc_confirmation_t *)realloc(confirmations,
                    sizeof(spc_confirmation_t) * (confirmation_size + 1));
    if (!tmp) return 0;
    confirmations = tmp;
    confirmation_size++;
  }
  confirmations[confirmation_count].address = strdup(address);
  confirmations[confirmation_count].id      = strdup(id);
  confirmations[confirmation_count].expire  = time(0) + SPC_CONFIRMATION_EXPIRE;
  if (!confirmations[confirmation_count].address ||
      !confirmations[confirmation_count].id) {
    if (confirmations[confirmation_count].address)
      free(confirmations[confirmation_count].address);
    if (confirmations[confirmation_count].id)
      free(confirmations[confirmation_count].id);
    return 0;
  }
  confirmation_count++;
  return 1;
}

int spc_confirmation_create(const char *address, char **id) {
  unsigned char buf[16];

  if (!spc_rand(buf, sizeof(buf))) return 0;
  if (!(*id = (char *)spc_base64_encode(buf, sizeof(buf), 0))) return 0;
  if (!new_confirmation(address, *id)) {
    free(*id);
    return 0;
  }
  return 1;
}
Upon receipt of a response to a confirmation request, the address from which it was sent and the
unique identifier contained within it should be passed as arguments to spc_confirmation_receive( )
(SpcConfirmationReceive( ) on Windows). If the address and unique identifier are in the list of
pending requests, the return from this function will be 1; otherwise, it will be 0. Before the list is
checked, expired entries will automatically be removed.
int spc_confirmation_receive(const char *address, const char *id) {
  time_t        now;
  unsigned long i;

  /* Before we check the pending list of confirmations, prune the list to
   * remove expired entries.
   */
  now = time(0);
  for (i = 0;  i < confirmation_count;  i++) {
    if (confirmations[i].expire <= now) {
      free(confirmations[i].address);
      free(confirmations[i].id);
      if (confirmation_count > 1 && i < confirmation_count - 1)
        confirmations[i] = confirmations[confirmation_count - 1];
      i--;
      confirmation_count--;
    }
  }

  for (i = 0;  i < confirmation_count;  i++) {
    if (!strcmp(confirmations[i].address, address)) {
      if (strcmp(confirmations[i].id, id) != 0) return 0;
      free(confirmations[i].address);
      free(confirmations[i].id);
      if (confirmation_count > 1 && i < confirmation_count - 1)
        confirmations[i] = confirmations[confirmation_count - 1];
      confirmation_count--;
      return 1;
    }
  }
  return 0;
}
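Here is a brief illustration of how the two functions might be used together. The email address and the identifier shown are placeholders, and the actual sending and receiving of mail is reduced to printf( ) calls because, as noted above, it is outside the scope of this book.

#include <stdio.h>
#include <stdlib.h>

/* Illustrative use of the two functions above. */
void example_confirmation_flow(void) {
  char *id;

  /* A membership request arrives for bob@example.com */
  if (spc_confirmation_create("bob@example.com", &id)) {
    printf("To confirm, reply with this identifier: %s\n", id);
    free(id);
  }

  /* Later, a response arrives claiming to be from bob@example.com */
  if (spc_confirmation_receive("bob@example.com", "identifier-from-reply"))
    printf("Confirmed; add the address to the roster.\n");
  else
    printf("No matching pending confirmation; ignore the response.\n");
}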
The Windows versions of spc_confirmation_create( ) and spc_confirmation_receive( ) are
named SpcConfirmationCreate( ) and SpcConfirmationReceive( ), respectively. The arguments
and return values for each are the same; however, there are enough subtle differences in the
underlying implementation that we present an entirely separate code listing for Windows instead of
using the preprocessor to have a single version.
#include <windows.h>
/* Confirmation receipts must be received within one hour (3600 seconds) */
#define SPC_CONFIRMATION_EXPIRE 3600
typedef struct {
  LPTSTR        lpszAddress;
  LPSTR         lpszID;
  LARGE_INTEGER liExpire;
} SPC_CONFIRMATION;

static DWORD            dwConfirmationCount, dwConfirmationSize;
static SPC_CONFIRMATION *pConfirmations;

static BOOL NewConfirmation(LPCTSTR lpszAddress, LPCSTR lpszID) {
  DWORD            dwIndex;
  LARGE_INTEGER    liExpire;
  SPC_CONFIRMATION *pTemp;

  /* first make sure that the address isn't already in the list */
  for (dwIndex = 0;  dwIndex < dwConfirmationCount;  dwIndex++) {
    if (CompareString(LOCALE_USER_DEFAULT, NORM_IGNORECASE,
                      pConfirmations[dwIndex].lpszAddress, -1,
                      lpszAddress, -1) == CSTR_EQUAL) return FALSE;
  }
  if (dwConfirmationCount == dwConfirmationSize) {
    if (!pConfirmations)
      pTemp = (SPC_CONFIRMATION *)LocalAlloc(LMEM_FIXED, sizeof(SPC_CONFIRMATION));
    else
      pTemp = (SPC_CONFIRMATION *)LocalReAlloc(pConfirmations,
                      sizeof(SPC_CONFIRMATION) * (dwConfirmationSize + 1), 0);
    if (!pTemp) return FALSE;
    pConfirmations = pTemp;
    dwConfirmationSize++;
  }
  pConfirmations[dwConfirmationCount].lpszAddress = (LPTSTR)LocalAlloc(
                  LMEM_FIXED, sizeof(TCHAR) * (lstrlen(lpszAddress) + 1));
  if (!pConfirmations[dwConfirmationCount].lpszAddress) return FALSE;
  lstrcpy(pConfirmations[dwConfirmationCount].lpszAddress, lpszAddress);
  pConfirmations[dwConfirmationCount].lpszID = (LPSTR)LocalAlloc(LMEM_FIXED,
                                                       lstrlenA(lpszID) + 1);
  if (!pConfirmations[dwConfirmationCount].lpszID) {
    LocalFree(pConfirmations[dwConfirmationCount].lpszAddress);
    return FALSE;
  }
  lstrcpyA(pConfirmations[dwConfirmationCount].lpszID, lpszID);

  /* File Times are 100-nanosecond intervals since January 1, 1601 */
  GetSystemTimeAsFileTime((LPFILETIME)&liExpire);
  liExpire.QuadPart += (SPC_CONFIRMATION_EXPIRE * (__int64)10000000);
  pConfirmations[dwConfirmationCount].liExpire = liExpire;

  dwConfirmationCount++;
  return TRUE;
}

BOOL SpcConfirmationCreate(LPCTSTR lpszAddress, LPSTR *lpszID) {
  BYTE pbBuffer[16];

  if (!spc_rand(pbBuffer, sizeof(pbBuffer))) return FALSE;
  if (!(*lpszID = (LPSTR)spc_base64_encode(pbBuffer, sizeof(pbBuffer), 0)))
    return FALSE;
  if (!NewConfirmation(lpszAddress, *lpszID)) {
    LocalFree(*lpszID);
    return FALSE;
  }
  return TRUE;
}
BOOL SpcConfirmationReceive(LPCTSTR lpszAddress, LPCSTR lpszID) {
  DWORD         dwIndex;
  LARGE_INTEGER liNow;

  /* Before we check the pending list of confirmations, prune the list to
   * remove expired entries.
   */
  GetSystemTimeAsFileTime((LPFILETIME)&liNow);
  for (dwIndex = 0;  dwIndex < dwConfirmationCount;  dwIndex++) {
    if (pConfirmations[dwIndex].liExpire.QuadPart <= liNow.QuadPart) {
      LocalFree(pConfirmations[dwIndex].lpszAddress);
      LocalFree(pConfirmations[dwIndex].lpszID);
      if (dwConfirmationCount > 1 && dwIndex < dwConfirmationCount - 1)
        pConfirmations[dwIndex] = pConfirmations[dwConfirmationCount - 1];
      dwIndex--;
      dwConfirmationCount--;
    }
  }

  for (dwIndex = 0;  dwIndex < dwConfirmationCount;  dwIndex++) {
    if (CompareString(LOCALE_USER_DEFAULT, NORM_IGNORECASE,
                      pConfirmations[dwIndex].lpszAddress, -1,
                      lpszAddress, -1) == CSTR_EQUAL) {
      if (lstrcmpA(pConfirmations[dwIndex].lpszID, lpszID) != 0) return FALSE;
      LocalFree(pConfirmations[dwIndex].lpszAddress);
      LocalFree(pConfirmations[dwIndex].lpszID);
      if (dwConfirmationCount > 1 && dwIndex < dwConfirmationCount - 1)
        pConfirmations[dwIndex] = pConfirmations[dwConfirmationCount - 1];
      dwConfirmationCount--;
      return TRUE;
    }
  }
  return FALSE;
}
8.22.4 See Also
Recipe 11.2
[ Team LiB ]
[ Team LiB ]
Chapter 9. Networking
Today, most applications perform some type of network activity. Unfortunately, many programmers
don't know how to access a network securely. The recipes in this chapter aim to help you use a
network in your application. To many developers, network security from the application standpoint
means using the Secure Sockets Layer (SSL), but SSL isn't a magic solution. SSL can be difficult to
use properly; in many cases, it is overkill, and in a few cases, it is insufficient. This chapter presents
recipes for using OpenSSL to build SSL-enabled clients and servers and recipes for network and
interprocess communication without SSL.
On the Windows platform, with the exception of SSL over HTTP (which we cover in Recipe 9.4), we've
chosen to limit the SSL-specific recipes to OpenSSL, which is freely available and portable to a wide
range of platforms, Windows included.
On Windows systems, Microsoft provides access to its SSL implementation through the Security
Support Provider Interface (SSPI). SSPI is well documented, but unfortunately, the use of SSL is not.
What's more unfortunate is that implementing an SSL-enabled client or server with SSPI on Windows
is considerably more complex than using OpenSSL (which is saying quite a lot). The SSPI interface to
SSL is surprisingly low-level, requiring programs that use it to do much of the work of exchanging
protocol messages themselves. Because SSL is difficult to use properly, it is desirable to mask
protocol details with a high-level implementation (such as OpenSSL). We therefore avoid the SSPI
interface to SSL altogether.
If you are interested in finding out more about SSPI and the SSL interface, we recommend that you
consult the Microsoft Developer's Network (MSDN) and the samples that are included with the
Microsoft Windows Platform SDK, which is available from Microsoft on the Internet at
http://www.microsoft.com/msdownload/platformsdk/sdkupdate/. The relevant example code can be
found in the directory Microsoft SDK\Samples\Security\SSPI\SSL from wherever you install it on your
system (normally in \Program Files on your boot drive).
Additionally, over time, SSPI-specific recipes may end up on the book's companion web site,
particularly if submitted by readers such as you.
[ Team LiB ]
[ Team LiB ]
9.1 Creating an SSL Client
9.1.1 Problem
You want to establish a connection from a client to a remote server using SSL.
9.1.2 Solution
Establishing a connection to a remote server using SSL is not entirely different from establishing a
connection without using SSL-at least it doesn't have to be. Establishing an SSL connection requires
a little more setup work, consisting primarily of building an spc_x509store_t object (see Recipe
10.5) that contains the information necessary to verify the server. Once this is done, you need to
create an SSL_CTX object and attach it to the connection. OpenSSL will handle the rest.
Before reading this recipe, make sure you understand the basics of public key
infrastructure (see Recipe 10.1).
9.1.3 Discussion
Once you've created an spc_x509store_t object by loading it with the appropriate certificates and
CRLs (see Recipe 10.10 and Recipe 10.11 for information on obtaining CRLs), connecting to a remote
server over SSL can be as simple as making a call to the following function, spc_connect_ssl( ).
You can optionally create an SSL_CTX object yourself using spc_create_sslctx( ) or the OpenSSL
API. Alternatively, you can share one that has already been created for other connections, or you can
let spc_connect_ssl( ) do it for you. In the latter case, the connection will be established and the
SSL_CTX object that was created will be returned by way of a pointer to the SSL_CTX object pointer in
the function's argument list.
#include <openssl/bio.h>
#include <openssl/ssl.h>
BIO *spc_connect_ssl(char *host, int port, spc_x509store_t *spc_store,
SSL_CTX **ctx) {
BIO *conn = 0;
int our_ctx = 0;
if (*ctx) {
CRYPTO_add(&((*ctx)->references), 1, CRYPTO_LOCK_SSL_CTX);
if (spc_store && spc_store != SSL_CTX_get_app_data(*ctx)) {
SSL_CTX_set_cert_store(*ctx, spc_create_x509store(spc_store));
SSL_CTX_set_app_data(*ctx, spc_store);
}
} else {
*ctx = spc_create_sslctx(spc_store);
our_ctx = 1;
}
if (!(conn = BIO_new_ssl_connect(*ctx))) goto error_exit;
BIO_set_conn_hostname(conn, host);
BIO_set_conn_int_port(conn, &port);
if (BIO_do_connect(conn) <= 0) goto error_exit;
if (our_ctx) SSL_CTX_free(*ctx);
return conn;
error_exit:
if (conn) BIO_free_all(conn);
if (*ctx) SSL_CTX_free(*ctx);
if (our_ctx) *ctx = 0;
return 0;
}
We're providing an additional function here that will handle the differences between connecting to a
remote server using SSL and connecting to a remote server not using SSL. In both cases, aBIO
object is returned that can be used in the same way regardless of whether there is an SSL connection
in place. If the ssl flag to this function is zero, the spc_store and ctx arguments will be ignored
because they're only applicable to SSL connections.
OpenSSL makes heavy use of BIO objects, and many of the API functions require BIO arguments.
What are these objects? Briefly, BIO objects are an abstraction for I/O that provides a uniform,
medium-independent interface. BIO objects exist for file I/O, socket I/O, and memory. In addition,
special BIO objects, known as BIO filters, can be used to filter data prior to writing to or reading from
the underlying medium. BIO filters exist for operations such as base64 encoding and encryption using
a symmetric cipher.
The OpenSSL SSL API is built on BIO objects, and a special filter handles the details of SSL. The SSL
BIO filter is most useful when employed with a socket BIO object, but it can also be used for directly
linking two BIO objects together (one for reading, one for writing) or to wrap pipes or some other
type of connection-oriented communications primitive.
BIO *spc_connect(char *host, int port, int ssl, spc_x509store_t *spc_store,
SSL_CTX **ctx) {
BIO *conn;
SSL *ssl_ptr;
if (ssl) {
if (!(conn = spc_connect_ssl(host, port, spc_store, ctx))) goto error_exit;
BIO_get_ssl(conn, &ssl_ptr);
if (!spc_verify_cert_hostname(SSL_get_peer_certificate(ssl_ptr), host))
goto error_exit;
if (SSL_get_verify_result(ssl_ptr) != X509_V_OK) goto error_exit;
return conn;
}
*ctx = 0;
if (!(conn = BIO_new_connect(host))) goto error_exit;
BIO_set_conn_int_port(conn, &port);
if (BIO_do_connect(conn) <= 0) goto error_exit;
return conn;
error_exit:
if (conn) BIO_free_all(conn);
return 0;
}
As written, spc_connect( ) will attempt to perform post-connection verification of the remote peer's
certificate. If you instead want to perform whitelist verification or no verification at all, you'll need to
make the appropriate changes to the code using Recipe 10.9 for whitelist verification.
If a connection is successfully established, a BIO object will be returned regardless of whether you
used spc_connect_ssl( ) or spc_connect( ) to establish the connection. With this BIO object, you
can then use BIO_read( ) to read data, and BIO_write( ) to write data. You can also use other BIO
functions, such as BIO_printf( ), for example. When you're done and want to terminate the
connection, you should always use BIO_free_all( ) instead of BIO_free( ) to dispose of any
chained BIO filters. When you've obtained an SSL-enabled BIO object from either of these functions,
there will always be at least two BIO objects in the chain: one for the SSL filter and one for the
socket connection.
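As a usage sketch, here is what a minimal client built on spc_connect( ) might look like. The hostname, port, and request are placeholders, and we assume that OpenSSL has already been initialized and that the spc_x509store_t object has been set up as described in Recipe 10.5.

#include <stdio.h>
#include <openssl/bio.h>
#include <openssl/ssl.h>

/* Illustrative client: connect over SSL, send a request, read the reply. */
void example_ssl_client(spc_x509store_t *store) {
  int     n;
  BIO     *conn;
  char    buf[1024];
  SSL_CTX *ctx = 0;

  if (!(conn = spc_connect("www.example.com", 443, 1, store, &ctx))) return;
  BIO_printf(conn, "GET / HTTP/1.0\r\n\r\n");
  while ((n = BIO_read(conn, buf, sizeof(buf))) > 0)
    fwrite(buf, 1, n, stdout);
  BIO_free_all(conn);   /* frees the SSL filter and the socket BIO together */
}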
9.1.4 See Also
OpenSSL home page: http://www.openssl.org/
Recipe 10.1, Recipe 10.5, Recipe 10.9, Recipe 10.10, Recipe 10.11
[ Team LiB ]
[ Team LiB ]
9.2 Creating an SSL Server
9.2.1 Problem
You want to write a network server that can accept SSL connections from clients.
9.2.2 Solution
Creating a server that speaks SSL is not that different from creating a client that speaks SSL (see
Recipe 9.1). A small amount of additional setup work is required for servers. In particular, you need
to create an spc_x509store_t object (see Recipe 10.5) with a certificate and a private key. The
information contained in this object is sent to clients during the initial handshake. In addition, the
SPC_X509STORE_USE_CERTIFICATE flag needs to be set in the spc_x509store_t object. With the
spc_x509store_t created, calls need to be made to create the listening BIO object, put it into a
listening state, and accept new connections. (See Recipe 9.1 for a brief discussion regarding BIO
objects.)
9.2.3 Discussion
Once an spc_x509store_t object has been created and fully initialized, the first step in creating an
SSL server is to call spc_listen( ). The hostname may be specified as NULL, which indicates that
the created socket should be bound to all interfaces. Anything else should be specified in string form
as an IP address for the interface to bind to. For example, "127.0.0.1" would cause the server BIO
object to bind only to the local loopback interface.
#include <stdlib.h>
#include <string.h>
#include <openssl/bio.h>
#include <openssl/ssl.h>
BIO *spc_listen(char *host, int port) {
  BIO  *acpt = 0;
  int  addr_length;
  char *addr;

  if (port < 1 || port > 65535) return 0;
  if (!host) host = "*";
  addr_length = strlen(host) + 6;  /* 5 for int, 1 for colon */
  if (!(addr = (char *)malloc(addr_length + 1))) return 0;
  snprintf(addr, addr_length + 1, "%s:%d", host, port);
  if ((acpt = BIO_new(BIO_s_accept( ))) != 0) {
    BIO_set_accept_port(acpt, addr);
    if (BIO_do_accept(acpt) <= 0) {
      BIO_free_all(acpt);
      acpt = 0;
    }
  }
  free(addr);
  return acpt;
}
The call to spc_listen( ) will create a BIO object that has an underlying socket that is in a listening
state. There isn't actually any SSL work occurring here because an SSL connection will only come into
being when a new socket connection is established. The spc_listen( ) call is nonblocking and will
return immediately.
The next step is to call spc_accept( ) to establish a new socket and possibly an SSL connection
between the server and an incoming client. This function should be called repeatedly in order to
continually accept connections. However, be aware that it will block if there are no incoming
connections pending. The call to spc_accept( ) will either return a new BIO object that is the
connection to the new client, or return NULL indicating that there was some failure in establishing the
connection.
The spc_accept( ) function will automatically create an SSL_CTX object for
you in the same manner spc_connect( ) does (see Recipe 9.1); however,
because of the way that spc_accept( ) works (it is called repeatedly using the
same parent BIO object for accepting new connections), you should call
spc_create_sslctx( ) yourself to create a single SSL_CTX object that will be
shared among all accepted connections.
BIO *spc_accept(BIO *parent, int ssl, spc_x509store_t *spc_store, SSL_CTX **ctx) {
  BIO *child = 0, *ssl_bio = 0;
  int our_ctx = 0;
  SSL *ssl_ptr = 0;

  if (BIO_do_accept(parent) <= 0) return 0;
  if (!(child = BIO_pop(parent))) return 0;
  if (ssl) {
    if (*ctx) {
      CRYPTO_add(&((*ctx)->references), 1, CRYPTO_LOCK_SSL_CTX);
      if (spc_store && spc_store != SSL_CTX_get_app_data(*ctx)) {
        SSL_CTX_set_cert_store(*ctx, spc_create_x509store(spc_store));
        SSL_CTX_set_app_data(*ctx, spc_store);
      }
    } else {
      *ctx = spc_create_sslctx(spc_store);
      our_ctx = 1;
    }
    if (!(ssl_ptr = SSL_new(*ctx))) goto error_exit;
    SSL_set_bio(ssl_ptr, child, child);
    if (SSL_accept(ssl_ptr) <= 0) goto error_exit;
    if (!(ssl_bio = BIO_new(BIO_f_ssl( )))) goto error_exit;
    BIO_set_ssl(ssl_bio, ssl_ptr, 1);
    child   = ssl_bio;
    ssl_bio = 0;
  }
  return child;

error_exit:
  if (child) BIO_free_all(child);
  if (ssl_bio) BIO_free_all(ssl_bio);
  if (ssl_ptr) SSL_free(ssl_ptr);
  if (*ctx) SSL_CTX_free(*ctx);
  if (our_ctx) *ctx = 0;
  return 0;
}
When a new socket connection is accepted, SSL_accept( ) is called to perform the SSL handshake.
The server's certificate (and possibly its chain, depending on how you configure the
spc_x509store_t object) is sent to the peer, and if a client certificate is requested and received, it
will be verified. If the handshake is successful, the returnedBIO object behaves exactly the same as
the BIO object that is returned by spc_connect( ) or spc_connect_ssl( ). Regardless of whether a
new connection was successfully established, the listening BIO object passed into spc_accept( ) will
be ready for another call to spc_accept( ) to accept the next connection.
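Putting the two functions together, a minimal accept loop might look like the following sketch. The port number is arbitrary, and handle_client( ) is a hypothetical stand-in for whatever per-connection processing your server performs; note that we create the shared SSL_CTX object ourselves, as recommended above.

#include <openssl/bio.h>
#include <openssl/ssl.h>

/* Illustrative accept loop: one shared SSL_CTX for all connections. */
void example_ssl_server(spc_x509store_t *store) {
  BIO     *child, *parent;
  SSL_CTX *ctx;

  if (!(parent = spc_listen(0, 4433))) return;   /* bind to all interfaces */
  ctx = spc_create_sslctx(store);
  for (;;) {
    if (!(child = spc_accept(parent, 1, store, &ctx))) continue;
    handle_client(child);        /* hypothetical per-connection handler */
    BIO_free_all(child);
  }
}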
9.2.4 See Also
Recipe 9.1, Recipe 10.5
[ Team LiB ]
[ Team LiB ]
9.3 Using Session Caching to Make SSL Servers More
Efficient
9.3.1 Problem
You have a client and server pair that speak SSL to each other. The same client often makes several
connections to the same server in a short period of time. You need a way to speed up the process of
the client's reconnecting to the server without sacrificing security.
9.3.2 Solution
The terms SSL session and SSL connection are often confused or used interchangeably, but they are,
in fact, two different things. An SSL session refers to the set of parameters and encryption keys
created by performing an SSL handshake. An SSL connection is an active conversation between two
peers that uses an SSL session. Normally, when an SSL connection is established, the handshake
process negotiates the parameters that become a session. It is this negotiation that causes
establishment of SSL connections to be such an expensive operation.
Luckily, it is possible to cache sessions. Once a client has connected to the server and successfully
completed the normal handshake process, both the client and the server can save the session
parameters so that the next time the client connects to the server, it can simply reuse the session,
thus avoiding the overhead of negotiating new parameters and encryption keys.
9.3.3 Discussion
Session caching is normally not enabled by default, but enabling it is a relatively painless process.
OpenSSL does most of the work for you, although you can override much of the default behavior (for
example, you might build your own caching mechanism on the server side). By default, OpenSSL
uses an in-memory session cache, but if you will be caching a large number of sessions, or if you
want sessions to persist across boots, you may be better off using some kind of disk-based cache.
Most of the work required to enable session caching has to be done on the server side, but there's
not all that much that needs to be done:
1. Set a session ID context. The purpose of the session ID context is to make sure the session is
reused for the same purpose for which it was created. For instance, a session created for an
SSL web server should not be automatically allowed for an SSL FTP server. A session ID context
can be any arbitrary binary data up to 32 bytes in length. There are no requirements for what
the data should be, other than that it should be unique for the purpose your server serves-you
don't want to find your server getting sessions from other servers.
2. Set a session timeout. The OpenSSL default is 300 seconds, which is probably a reasonable
default for most applications. When a session times out, it is not immediately purged from the
server's cache, but it will not be accepted when presented by the client. If a client attempts to
use an expired session, the server will remove it from its cache.
3. Set a caching mode. OpenSSL supports a number of possible mode options, specified as a bit
mask:
SSL_SESS_CACHE_OFF
Setting this mode disables session caching altogether. If you want to disable session
caching, you should specify this flag by itself; you do not need to set a session ID context
or a timeout.
SSL_SESS_CACHE_SERVER
Setting this mode causes sessions that are generated by the server to be cached. This is
the default mode and should be included whenever you're setting any of the other flags
described here, except for SSL_SESS_CACHE_OFF.
SSL_SESS_CACHE_NO_AUTO_CLEAR
By default, the session cache is checked for expired entries once for every 255
connections that are established. Sometimes this can cause an undesirable delay, so it
may be desirable to disable this automatic flushing of the cache. If you set this mode, you
should make sure that you periodically call SSL_CTX_flush_sessions( ) yourself.
SSL_SESS_CACHE_NO_INTERNAL_LOOKUP
If you want to replace OpenSSL's internal caching mechanism with one of your own
devising, you should set this mode. We do not include a recipe that demonstrates the use
of this flag in the book, but you can find one on the book's companion web site.
You can use the following convenience function to enable session caching on the server side. If you
want to use it with the SSL server functions presented in Recipe 9.2, you should create an SSL_CTX
object using spc_create_sslctx( ) yourself. Then call spc_enable_sessions( ) using that
SSL_CTX object, and pass the SSL_CTX object to spc_accept( ) so that a new one will not be
created automatically for you. Whether you enable session caching or not, it's a good idea to create
your own SSL_CTX object before calling spc_accept( ) anyway, so that a fresh SSL_CTX object isn't
created for each and every client connection.
#include <openssl/bio.h>
#include <openssl/ssl.h>
void spc_enable_sessions(SSL_CTX *ctx, unsigned char *id, unsigned int id_len,
                         long timeout, int mode) {
  SSL_CTX_set_session_id_context(ctx, id, id_len);
  SSL_CTX_set_timeout(ctx, timeout);
  SSL_CTX_set_session_cache_mode(ctx, mode);
}
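For example, a server might enable caching as follows before it starts accepting connections. This is
a minimal sketch: the session ID string is an arbitrary illustrative value, the timeout and mode flag
are reasonable but not mandatory choices, and spc_store is assumed to be an spc_x509store_t that you
have already configured.
SSL_CTX       *ctx;
unsigned char id[] = "spc-example-server";   /* any application-specific, unique ID */

ctx = spc_create_sslctx(spc_store);
spc_enable_sessions(ctx, id, sizeof(id) - 1, 300, SSL_SESS_CACHE_SERVER);
/* ... pass ctx to spc_accept( ) for each incoming connection ... */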
Enabling session caching on the client side is even easier than it is on the server side. All that's
required is setting the SSL_SESSION object on the connection's SSL object before actually establishing
the connection. The following function, spc_reconnect( ), is a re-implementation of
spc_connect_ssl( ) with the necessary changes to enable client-side session caching.
BIO *spc_reconnect(char *host, int port, SSL_SESSION *session,
                   spc_x509store_t *spc_store, SSL_CTX **ctx) {
  BIO *conn = 0;
  int our_ctx = 0;
  SSL *ssl_ptr;

  if (*ctx) {
    CRYPTO_add(&((*ctx)->references), 1, CRYPTO_LOCK_SSL_CTX);
    if (spc_store && spc_store != SSL_CTX_get_app_data(*ctx)) {
      SSL_CTX_set_cert_store(*ctx, spc_create_x509store(spc_store));
      SSL_CTX_set_app_data(*ctx, spc_store);
    }
  } else {
    *ctx = spc_create_sslctx(spc_store);
    our_ctx = 1;
  }

  if (!(conn = BIO_new_ssl_connect(*ctx))) goto error_exit;
  BIO_set_conn_hostname(conn, host);
  BIO_set_conn_int_port(conn, &port);
  if (session) {
    BIO_get_ssl(conn, &ssl_ptr);
    SSL_set_session(ssl_ptr, session);
  }
  if (BIO_do_connect(conn) <= 0) goto error_exit;
  if (!our_ctx) SSL_CTX_free(*ctx);
  if (session) SSL_SESSION_free(session);
  return conn;

error_exit:
  if (conn) BIO_free_all(conn);
  if (*ctx) SSL_CTX_free(*ctx);
  if (our_ctx) *ctx = 0;
  return 0;
}
Establishing an SSL connection as a client may be as simple as setting the SSL_SESSION object on the
connection's SSL object, but where does this mysterious SSL_SESSION come from? When a connection is
established, OpenSSL creates an SSL session object and tucks it away in the SSL object that is
normally hidden away in the BIO object that is returned by spc_connect_ssl( ). You can retrieve it
by calling spc_getsession( ).
SSL_SESSION *spc_getsession(BIO *conn) {
  SSL *ssl_ptr;

  BIO_get_ssl(conn, &ssl_ptr);
  if (!ssl_ptr) return 0;
  return SSL_get1_session(ssl_ptr);
}
The SSL_SESSION object that is returned by spc_getsession( ) has its reference count
incremented, so you must be sure to call SSL_SESSION_free( ) at some point to release the
reference. You can obtain the SSL_SESSION object as soon as you've successfully established a
connection, but because the value can change between the time the connection is first established
and the time it's terminated due to renegotiation, you should always get theSSL_SESSION object just
before the connection is terminated. That way, you can be sure you have the most recent session
object.
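Putting the pieces together, a client that reconnects frequently might do something like the
following. This is a minimal sketch: the hostname and port are placeholders, and spc_store is assumed
to be an spc_x509store_t that you have already configured.
SSL_CTX     *ctx = 0;
SSL_SESSION *session = 0;
BIO         *conn;

/* First connection: there is no session to resume yet. */
if ((conn = spc_reconnect("www.example.com", 443, 0, spc_store, &ctx)) != 0) {
  /* ... exchange data ... */
  session = spc_getsession(conn);   /* grab the session just before closing */
  BIO_free_all(conn);
}

/* A later connection: resume the cached session. */
if (session && (conn = spc_reconnect("www.example.com", 443, session, spc_store, &ctx)) != 0) {
  /* spc_reconnect( ) releases the session reference for us */
  /* ... exchange data ... */
  BIO_free_all(conn);
}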
9.3.4 See Also
Recipe 9.2
9.4 Securing Web Communication on Windows Using the
WinInet API
9.4.1 Problem
You are developing a Windows program that needs to connect to an HTTP server with SSL enabled.
You want to use the Microsoft WinInet API to communicate with the HTTP server.
9.4.2 Solution
The Microsoft WinInet API was introduced with Internet Explorer 3.0. It provides a set of functions
that allow programs easy access to FTP, Gopher, HTTP, and HTTPS servers. For HTTPS servers, the
details of using SSL are hidden from the programmer, allowing the programmer to concentrate on
the data that needs to be exchanged, rather than protocol details.
9.4.3 Discussion
The Microsoft WinInet API is a rich API that makes client-side interaction with FTP, Gopher, HTTP,
and HTTPS servers easy; as with most Windows APIs, however, a sizable amount of code is still
required. Because of the wealth of options available, we won't provide fully working code for a
WinInet API wrapper here. Instead, we'll discuss the API and provide code samples for the parts of
the API that are interesting from a security standpoint. We encourage you to consult Microsoft's
documentation on the API to learn about all that the API can do.
If you're going to establish a connection to a web server using SSL with WinInet, the first thing you
need to do is create an Internet session by calling InternetOpen( ). This function initializes and
returns an object handle that is needed to actually establish a connection. It takes care of such
details as presenting the user with the dial-in UI if the user is not connected to the Internet and the
system is so configured. Although any number of calls may be made to InternetOpen( ) by a single
application, it generally needs to be called only once. The handle it returns can be reused any number
of times.
#include <windows.h>
#include <wininet.h>
HINTERNET hInternetSession;
LPSTR     lpszAgent       = "Secure Programming Cookbook Recipe 9.4";
DWORD     dwAccessType    = INTERNET_OPEN_TYPE_PROXY;
LPSTR     lpszProxyName   = 0;
LPSTR     lpszProxyBypass = 0;
DWORD     dwFlags         = 0;

hInternetSession = InternetOpen(lpszAgent, dwAccessType, lpszProxyName,
                                lpszProxyBypass, dwFlags);
If you set dwAccessType to INTERNET_OPEN_TYPE_PROXY, lpszProxyName to 0, and
lpszProxyBypass to 0, the system defaults for HTTP access are used. If the system is configured to
use a proxy, it will be used as required. The lpszAgent argument is passed to servers as the client's
HTTP agent string. It may be set as any custom string, or it may be set to the same string a specific
browser might send to a web server when making a request.
The next step is to connect to the server. You do this by calling InternetConnect( ), which will
return a new handle to an object that stores all of the relevant connection information. The two
obvious requirements for this function are the name of the server to connect to and the port on which
to connect. The name of the server may be specified as either a hostname or a dotted-decimal IP
address. You can specify the port as a number or use the constant INTERNET_DEFAULT_HTTPS_PORT
to connect to the default SSL-enabled HTTP port 443.
HINTERNET     hConnection;
LPSTR         lpszServerName = "www.amazon.com";
INTERNET_PORT nServerPort    = INTERNET_DEFAULT_HTTPS_PORT;
LPSTR         lpszUsername   = 0;
LPSTR         lpszPassword   = 0;
DWORD         dwService      = INTERNET_SERVICE_HTTP;
DWORD         dwFlags        = 0;
DWORD         dwContext      = 0;

hConnection = InternetConnect(hInternetSession, lpszServerName, nServerPort,
                              lpszUsername, lpszPassword, dwService, dwFlags,
                              dwContext);
The call to InternetConnect( ) actually establishes a connection to the remote server. If the
connection attempt fails for some reason, the return value is NULL, and the error code can be
retrieved via GetLastError( ). Otherwise, the new object handle is returned. If multiple requests to
the same server are necessary, you should use the same handle, to avoid the overhead of
establishing multiple connections.
Once a connection to the server has been established, a request object must be constructed. This
object is a container for various information: the resource that will be requested, the headers that
will be sent, a set of flags that dictate how the request is to behave, header information returned by
the server after the request has been submitted, and other information. A new request object is
constructed by calling HttpOpenRequest( ).
HINTERNET hRequest;
LPSTR     lpszVerb        = "GET";
LPSTR     lpszObjectName  = "/";
LPSTR     lpszVersion     = "HTTP/1.1";
LPSTR     lpszReferer     = 0;
LPSTR     lpszAcceptTypes = 0;
DWORD     dwFlags         = INTERNET_FLAG_SECURE |
                            INTERNET_FLAG_IGNORE_REDIRECT_TO_HTTP |
                            INTERNET_FLAG_IGNORE_REDIRECT_TO_HTTPS;
DWORD     dwContext       = 0;

hRequest = HttpOpenRequest(hConnection, lpszVerb, lpszObjectName, lpszVersion,
                           lpszReferer, lpszAcceptTypes, dwFlags, dwContext);
The lpszVerb argument controls the type of request that will be made, which can be any valid HTTP
request, such as GET or POST. The lpszObjectName argument is the resource that is to be
requested, which is normally the part of a URL that follows the server name, starting with the
forward slash and ending before the query string (which starts with a question mark). Specifying
lpszAcceptTypes as 0 tells the server that we can accept any kind of text document; it is equivalent
to a MIME type of "text/*".
The most interesting argument passed to HttpOpenRequest( ) is dwFlags. A large number of flags
are defined, but only five deal specifically with HTTP over SSL:
INTERNET_FLAG_IGNORE_CERT_CN_INVALID
Normally, as part of verification of the server's certificate, WinInet will verify that the hostname
is contained in the certificate's commonName field or subjectAltName extension. If this flag is
specified, the hostname check will not be performed. (See Recipe 10.4 and Recipe 10.8 for
discussions of the importance of performing hostname checks on certificates.)
INTERNET_FLAG_IGNORE_CERT_DATE_INVALID
An important part of verifying the validity of an X.509 certificate involves checking the dates for
which a certificate is valid. If the current date is outside the certificate's valid date range, the
certificate should be considered invalid. If this flag is specified, the certificate's validity dates
are not checked. This option should never be used in a released version of a product.
INTERNET_FLAG_IGNORE_REDIRECT_TO_HTTP
If this flag is specified and the server attempts to redirect the client to a non-SSL URL, the
redirection will be ignored. You should always include this flag so you can be sure you are not
transferring data in the clear that you expect to be protected.
INTERNET_FLAG_IGNORE_REDIRECT_TO_HTTPS
If this flag is specified and the server attempts to redirect the client to an SSL-protected URL,
the redirection will be ignored. If you're expecting to be communicating only with servers under
your own control, it's safe to omit this flag; if not, you might want to consider including it so
you're not transferred somewhere other than expected.
INTERNET_FLAG_SECURE
This is the all-important flag. When this flag is included, the use of SSL on the connection is
enabled. Without it, SSL is not used, and all data is transferred in the clear. Obviously, you
want to include this flag.
Once the request object has been constructed, the request needs to be sent to the server. This is
done by calling HttpSendRequest( ) with the request object. Additional headers can be included
with the request submission, as well as any optional data to be sent after the headers. You will want
to send optional data when performing a POST operation. Additional headers and optional data are
both specified as strings and the lengths of the strings.
BOOL  bResult;
LPSTR lpszHeaders      = 0;
DWORD dwHeadersLength  = 0;
LPSTR lpszOptional     = 0;
DWORD dwOptionalLength = 0;

bResult = HttpSendRequest(hRequest, lpszHeaders, dwHeadersLength, lpszOptional,
                          dwOptionalLength);
After sending the request, the server's response can be retrieved. As part of sending the request,
WinInet will retrieve the response headers from the server. Information about the response can be
obtained using the HttpQueryInfo( ) function. A complete list of the information that may be
available can be found in the WinInet documentation, but for our purposes, the only information
we're concerned with is the content length. The server is not required to send a content length
header back as part of its response, so we must also be able to handle the case where it is not sent.
Response data sent by the server after its response headers can be obtained by calling
InternetReadFile( ) as many times as necessary to retrieve all of the data.
DWORD  dwContentLength, dwIndex, dwInfoLevel;
DWORD  dwBufferLength, dwNumberOfBytesRead, dwNumberOfBytesToRead;
LPVOID lpBuffer, lpFullBuffer, lpvBuffer;

dwInfoLevel    = HTTP_QUERY_CONTENT_LENGTH | HTTP_QUERY_FLAG_NUMBER; /* return the length as a DWORD */
lpvBuffer      = (LPVOID)&dwContentLength;
dwBufferLength = sizeof(dwContentLength);
dwIndex        = 0;
if (HttpQueryInfo(hRequest, dwInfoLevel, lpvBuffer, &dwBufferLength, &dwIndex)) {
  /* Content length is known.  Read only that much data. */
  lpBuffer = GlobalAlloc(GMEM_FIXED, dwContentLength);
  InternetReadFile(hRequest, lpBuffer, dwContentLength, &dwNumberOfBytesRead);
} else {
  /* Content length is not known.  Read until EOF is reached. */
  dwContentLength       = 0;
  dwNumberOfBytesToRead = 4096;
  lpFullBuffer = lpBuffer = GlobalAlloc(GMEM_FIXED, dwNumberOfBytesToRead);
  while (InternetReadFile(hRequest, lpBuffer, dwNumberOfBytesToRead,
                          &dwNumberOfBytesRead)) {
    dwContentLength += dwNumberOfBytesRead;
    if (dwNumberOfBytesRead != dwNumberOfBytesToRead) break;
    lpFullBuffer = GlobalReAlloc(lpFullBuffer, dwContentLength +
                                 dwNumberOfBytesToRead, 0);
    lpBuffer = (LPVOID)((LPBYTE)lpFullBuffer + dwContentLength);
  }
  lpFullBuffer = lpBuffer = GlobalReAlloc(lpFullBuffer, dwContentLength, 0);
}
After the data has been read with InternetReadFile( ), the buffer (lpBuffer when the content length
was known in advance, lpFullBuffer otherwise) will hold the contents of the server's response, and
the variable dwContentLength will hold the number of bytes
contained in the response data buffer. At this point, the request has been completed, and the request
object should be destroyed by calling InternetCloseHandle( ). If additional requests to the same
connection are required, a new request object can be created and used with the same connection
handle from the call to InternetConnect( ). When no more requests are to be made on the same
connection, InternetCloseHandle( ) should be used to close the connection. Finally, when no more
WinInet activity is to take place using the Internet session object created by InternetOpen( ),
InternetCloseHandle( ) should be called to clean up that object as well.
InternetCloseHandle(hRequest);
InternetCloseHandle(hConnection);
InternetCloseHandle(hInternetSession);
9.4.4 See Also
Recipe 10.4, Recipe 10.8
9.5 Enabling SSL without Modifying Source Code
9.5.1 Problem
You have an existing client or server that is not SSL-enabled, and you want to make it so without
modifying its source code to add SSL support.
9.5.2 Solution
Stunnel is a program that uses OpenSSL to create SSL tunnels between clients and servers that do
not natively support SSL. At the time of this writing, the latest release is 4.04, and it is available for
Unix and Windows from http://www.stunnel.org. For servers, it listens on another socket for SSL
connections and forwards data bidirectionally to the real server over a non-SSL connection.
SSL-enabled clients can then connect to Stunnel's listening port and communicate with the server that is
not SSL-enabled. For clients, it listens on a socket for non-SSL connections and forwards data
bidirectionally to the server over an SSL-enabled connection.
Stunnel has existed for a number of years and has traditionally used command-line switches to
control its behavior. Version 4.00 changed that. Stunnel now uses a configuration file to control its
behavior, and all formerly supported command-line switches have been removed. We'll cover the
latest version, 4.04, in this recipe.
9.5.3 Discussion
While this recipe does not actually contain any code, we've included this section because we consider
Stunnel a tool worth discussing, particularly if you are developing SSL-enabled clients and servers. It
can be quite a frustrating experience to attempt to develop and debug SSL-enabled clients and
servers together from the ground up, especially if you do not have any prior experience programming
with SSL. Stunnel will help you debug your SSL code.
A Stunnel configuration file is organized in sections. Each section contains a set of keys, and each key
has an associated value. Sections and keys are both named and case-insensitive. A configuration file
is parsed from top to bottom with sections delimited by a line containing the name of the section
surrounded by square brackets. The other lines contain key and value pairs that belong to the most
recently parsed section delimiter. In addition, an optional global section that is unnamed occurs
before the first named section in the file. Keys are separated from their associated value by an equal
sign (=).
Comments may appear only on lines that begin with a hash mark (#), optionally preceded by
whitespace; the whole line is treated as a comment. Any leading or trailing
whitespace surrounding a key or a value is stripped. Any other whitespace is significant, including
leading or trailing whitespace surrounding a section name (as it would occur between the square
brackets). For example, "[ my_section ]" is not the same as "[my_section]". The documentation
included with Stunnel describes the supported keys sufficiently well, so we won't duplicate it here.
One nice advantage of the configuration files over the old command-line interface is that each section
in the configuration file defines either a client or a server, so a single instance of Stunnel can be used
to run multiple clients or servers. If you want to run both clients and servers, you still need two
instances of Stunnel running because the flag that determines which mode to run in is a global
option. With the command-line interface, multiple instances of Stunnel used to be required, one for
each client or server that you wanted to run. Therefore, if you wanted to use Stunnel for POP3, IMAP,
and SMTPS servers, you needed to run three instances of Stunnel.
Each section name defines the name of the service that will be used with TCP Wrappers and for
logging purposes. For both clients and servers, specify the accept and connect keys. The accept
key specifies the port on which Stunnel will listen for incoming connections, and theconnect key
specifies the port that Stunnel will attempt to connect to for outgoing connections. At a minimum,
these two keys must specify a port number, but they may also optionally include a hostname or IP
address. To include a hostname or IP address, precede the port number with the hostname or IP
address, and separate the two with a colon (:).
You enable the mode for Stunnel as follows:
Server mode
To enable server mode, set the global option key client to no. When running in server mode,
Stunnel expects incoming connections to speak SSL and makes outgoing connections without
SSL. You will also need to set the two global options cert and key to the names of files
containing the certificate and key to use.
Client mode
To enable client mode, set the global option key client to yes. In client mode, Stunnel
expects incoming connections to be operating without SSL and makes outgoing connections
using SSL. A certificate and key may be specified, but they are not required.
The following example starts up two servers. The first is for IMAP over SSL, which will listen for SSL
connections on port 993 and redirect traffic without SSL to a connection on port 143. The second is
for POP3 over SSL, which will listen for SSL connections on port 995 for the localhost (127.0.0.1)
interface only. Outgoing connections will be made to port 110 on the localhost interface.
client = no
cert   = /home/mmessier/ssl/servercert.pem
key    = /home/mmessier/ssl/serverkey.pem

[imaps]
accept  = 993
connect = 143

[pop3]
accept  = localhost:995
connect = localhost:110
In the following example, Stunnel operates in client mode. It listens for connections on the localhost
interface on port 25, and it redirects traffic to port 465 on smtp.secureprogramming.com. This
example would be useful for a mail client that does not support SMTP over SSL.
client = yes
[smtp]
accept = localhost:25
connect = smtp.secureprogramming.com:465
9.5.4 See Also
Stunnel web page: http://www.stunnel.org
9.6 Using Kerberos Encryption
9.6.1 Problem
You need to use encryption in code that already uses Kerberos for authentication.
9.6.2 Solution
Kerberos is primarily an authentication service employed for network services. As a side effect of the
requirements to perform authentication, Kerberos also provides an API for encryption and decryption,
although it supports considerably fewer ciphers than other cryptographic protocols do. Authentication
yields a cryptographically strong session key that can be used as
a key for encryption.
This recipe works on Unix and Windows with the Heimdal and MIT Kerberos implementations. The code
presented here will not work on Windows systems that are Kerberos-enabled with the built-in Windows
support, because Windows does not expose the Kerberos API in such a way that the code could be made
to work. In particular, the encryption and decryption functions used in this recipe are not present on
Windows unless you are using either Heimdal or MIT Kerberos. Instead, you should use CryptoAPI on
Windows (see Recipe 5.25 ).
9.6.3 Discussion
Kerberos provides authentication between clients and servers, communicating over an established data
connection. The Kerberos API provides no support for establishing, terminating, or passing arbitrary
data over a data connection, whether pipes, sockets, or otherwise. Once its job has been successfully
performed, a cryptographically strong session key that can be used as a key for encryption is "left
behind."
We present a discussion of how to authenticate using Kerberos in Recipe 8.13. In this recipe, we pick up
at the point where Kerberos authentication has completed successfully. At this point, you'll be left with
at least a krb5_context object and a krb5_auth_context object. Using these two objects, you can
obtain a krb5_keyblock object that contains the session key by calling
krb5_auth_con_getremotesubkey( ) . The prototype for this function is as follows:
krb5_error_code krb5_auth_con_getremotesubkey(krb5_context context,
                                              krb5_auth_context auth_context,
                                              krb5_keyblock **key_block);
Once you have the session key, you can use it for encryption and decryption.
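For example, immediately after authentication you might retrieve the session key as shown below. This
is a minimal sketch; ctx and auth_ctx are assumed to be the krb5_context and krb5_auth_context objects
left over from the authentication code in Recipe 8.13.
krb5_keyblock *key = 0;

if (krb5_auth_con_getremotesubkey(ctx, auth_ctx, &key) != 0 || !key) {
  /* No session key is available; treat this as an error. */
}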
Kerberos supports only a limited number of symmetric ciphers, which may vary depending on the
version of Kerberos that you are using. For maximum portability, you are limited primarily to DES and
3-key Triple-DES in CBC mode. The key returned from krb5_auth_con_getremotesubkey( ) will have
an algorithm already associated with it, so you don't even have to choose. As part of the authentication
process, the client and server will negotiate the strongest cipher that both are capable of supporting,
which will (we hope) be Triple-DES (or something stronger) instead of DES, which is actually rather
weak. In fact, if DES is negotiated, you may want to consider refusing to proceed.
Many different implementations of Kerberos exist today. The most prominent among the free
implementations is the MIT implementation, which is distributed with Darwin and many Linux
distributions. Another popular implementation is the Heimdal implementation, which is distributed with
FreeBSD and OpenBSD. Unfortunately, while the two implementations share much of the same API,
there are differences. In particular, the API for encryption services that we will be using in this recipe
differs between the two. To determine which implementation is being used, we test for the existence of
the KRB5_GENERAL__ preprocessor macro, which will be defined by the MIT implementation but not the
Heimdal implementation.
Given a krb5_keyblock object, you can determine whether DES was negotiated using the following
function:
#include <krb5.h>
int spc_krb5_isdes(krb5_keyblock *key) {
#ifdef KRB5_GENERAL__
  if (key->enctype == ENCTYPE_DES_CBC_CRC || key->enctype == ENCTYPE_DES_CBC_MD4 ||
      key->enctype == ENCTYPE_DES_CBC_MD5 || key->enctype == ENCTYPE_DES_CBC_RAW)
    return 1;
#else
  if (key->keytype == ETYPE_DES_CBC_CRC    || key->keytype == ETYPE_DES_CBC_MD4  ||
      key->keytype == ETYPE_DES_CBC_MD5    || key->keytype == ETYPE_DES_CBC_NONE ||
      key->keytype == ETYPE_DES_CFB64_NONE || key->keytype == ETYPE_DES_PCBC_NONE)
    return 1;
#endif
  return 0;
}
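For example, assuming key is the krb5_keyblock obtained from krb5_auth_con_getremotesubkey( ), a
program that considers single DES unacceptable might check the key before using it:
if (spc_krb5_isdes(key)) {
  /* Single DES was negotiated; refuse to proceed rather than use a weak cipher. */
}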
The krb5_context object and the krb5_keyblock object can then be used together as arguments to
spc_krb5_encrypt( ) , which we implement below. The function also requires a buffer that holds the
data to be encrypted along with the size of the buffer, as well as a pointer to receive a dynamically
allocated buffer that will hold the encrypted data on return, and a pointer to receive the size of the
encrypted data buffer.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <krb5.h>
int spc_krb5_encrypt(krb5_context ctx, krb5_keyblock *key, void *inbuf,
                     size_t inlen, void **outbuf, size_t *outlen) {
#ifdef KRB5_GENERAL__
  size_t        blksz, newlen;
  krb5_data     in_data;
  krb5_enc_data out_data;

  if (krb5_c_block_size(ctx, key->enctype, &blksz)) return 0;
  if (!(inlen % blksz)) newlen = inlen + blksz;
  else newlen = ((inlen + blksz - 1) / blksz) * blksz;

  in_data.magic  = KV5M_DATA;
  in_data.length = newlen;
  in_data.data   = malloc(newlen);
  if (!in_data.data) return 0;
  memcpy(in_data.data, inbuf, inlen);
  spc_add_padding((unsigned char *)in_data.data + inlen, inlen, blksz);

  if (krb5_c_encrypt_length(ctx, key->enctype, in_data.length, outlen)) {
    free(in_data.data);
    return 0;
  }

  out_data.magic             = KV5M_ENC_DATA;
  out_data.enctype           = key->enctype;
  out_data.kvno              = 0;
  out_data.ciphertext.magic  = KV5M_ENCRYPT_BLOCK;
  out_data.ciphertext.length = *outlen;
  out_data.ciphertext.data   = malloc(*outlen);
  if (!out_data.ciphertext.data) {
    free(in_data.data);
    return 0;
  }

  if (krb5_c_encrypt(ctx, key, 0, 0, &in_data, &out_data)) {
    free(in_data.data);
    free(out_data.ciphertext.data);
    return 0;
  }

  *outbuf = out_data.ciphertext.data;
  free(in_data.data);
  return 1;
#else
  int         result = 0;
  void        *tmp;
  size_t      blksz, newlen;
  krb5_data   edata;
  krb5_crypto crypto;

  if (krb5_crypto_init(ctx, key, 0, &crypto) != 0) return 0;
  if (krb5_crypto_getblocksize(ctx, crypto, &blksz)) {
    krb5_crypto_destroy(ctx, crypto);
    return 0;
  }

  if (!(inlen % blksz)) newlen = inlen + blksz;
  else newlen = ((inlen + blksz - 1) / blksz) * blksz;

  if (!(tmp = malloc(newlen))) {
    krb5_crypto_destroy(ctx, crypto);
    return 0;
  }
  memcpy(tmp, inbuf, inlen);
  spc_add_padding((unsigned char *)tmp + inlen, inlen, blksz);

  /* Encrypt the padded buffer (newlen bytes) so the padding survives the round trip. */
  if (!krb5_encrypt(ctx, crypto, 0, tmp, newlen, &edata)) {
    if ((*outbuf = malloc(edata.length)) != 0) {
      result = 1;
      memcpy(*outbuf, edata.data, edata.length);
      *outlen = edata.length;
    }
    krb5_data_free(&edata);
  }

  free(tmp);
  krb5_crypto_destroy(ctx, crypto);
  return result;
#endif
}
The decryption function works identically to the encryption function. Remember that DES and Triple-DES
are block mode ciphers, so padding may be necessary if the data you're encrypting is not an exact
multiple of the block size. While the Kerberos library will do any necessary padding for you, it does so by
padding with zero bytes, which is a poor way to pad out the block. Therefore, we do our own padding
using the code from Recipe 5.11 to perform PKCS block padding.
#include <stdlib.h>
#include <string.h>
#include <krb5.h>
int spc_krb5_decrypt(krb5_context ctx, krb5_keyblock *key, void *inbuf,
                     size_t inlen, void **outbuf, size_t *outlen) {
#ifdef KRB5_GENERAL__
  int           padding;
  size_t        blksz;
  krb5_data     out_data;
  krb5_enc_data in_data;

  in_data.magic             = KV5M_ENC_DATA;
  in_data.enctype           = key->enctype;
  in_data.kvno              = 0;
  in_data.ciphertext.magic  = KV5M_ENCRYPT_BLOCK;
  in_data.ciphertext.length = inlen;
  in_data.ciphertext.data   = inbuf;

  out_data.magic  = KV5M_DATA;
  out_data.length = inlen;
  out_data.data   = malloc(inlen);
  if (!out_data.data) return 0;

  if (krb5_c_block_size(ctx, key->enctype, &blksz)) {
    free(out_data.data);
    return 0;
  }
  if (krb5_c_decrypt(ctx, key, 0, 0, &in_data, &out_data)) {
    free(out_data.data);
    return 0;
  }

  if ((padding = spc_remove_padding((unsigned char *)out_data.data +
                                    out_data.length - blksz, blksz)) == -1) {
    free(out_data.data);
    return 0;
  }
  *outlen = out_data.length - (blksz - padding);
  if (!(*outbuf = realloc(out_data.data, *outlen))) *outbuf = out_data.data;
  return 1;
#else
  int         padding, result = 0;
  void        *tmp;
  size_t      blksz;
  krb5_data   edata;
  krb5_crypto crypto;

  if (krb5_crypto_init(ctx, key, 0, &crypto) != 0) return 0;
  if (krb5_crypto_getblocksize(ctx, crypto, &blksz) != 0) {
    krb5_crypto_destroy(ctx, crypto);
    return 0;
  }

  if (!(tmp = malloc(inlen))) {
    krb5_crypto_destroy(ctx, crypto);
    return 0;
  }
  memcpy(tmp, inbuf, inlen);

  if (!krb5_decrypt(ctx, crypto, 0, tmp, inlen, &edata)) {
    if ((padding = spc_remove_padding((unsigned char *)edata.data +
                                      edata.length - blksz, blksz)) != -1) {
      *outlen = edata.length - (blksz - padding);
      if ((*outbuf = malloc(*outlen)) != 0) {
        result = 1;
        memcpy(*outbuf, edata.data, *outlen);
      }
    }
    krb5_data_free(&edata);
  }

  free(tmp);
  krb5_crypto_destroy(ctx, crypto);
  return result;
#endif
}
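As a quick illustration of how the two functions fit together, the following sketch encrypts a short
message with the negotiated session key and then decrypts it again. The ctx and key variables are
assumed to come from the authentication and key-retrieval code described earlier in this recipe.
void   *ciphertext, *plaintext;
size_t ctlen, ptlen;

if (spc_krb5_encrypt(ctx, key, "secret message", 14, &ciphertext, &ctlen)) {
  if (spc_krb5_decrypt(ctx, key, ciphertext, ctlen, &plaintext, &ptlen)) {
    /* ptlen should again be 14, and plaintext holds the original (non-NUL-terminated) message. */
    free(plaintext);
  }
  free(ciphertext);
}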
9.6.4 See Also
Recipe 5.11 , Recipe 5.25 , Recipe 8.13
9.7 Performing Interprocess Communication Using
Sockets
9.7.1 Problem
You have two or more processes running on the same machine that need to communicate with each
other.
9.7.2 Solution
Modern operating systems support a variety of interprocess communications primitives that vary
from system to system. If you intend to make your program portable among different platforms and
even different implementations of Unix, your best bet is to use sockets. All modern operating systems
support the Berkeley socket interface for TCP/IP at a minimum, while most, if not all, Unix
implementations also support Unix domain sockets.
9.7.3 Discussion
Many operating systems support various methods of allowing two or more processes to communicate
with each o