CCTV Surveillance
Analog and Digital Video Practices and Technology
Second Edition

by Herman Kruegle

AMSTERDAM • BOSTON • HEIDELBERG • LONDON • NEW YORK • OXFORD
PARIS • SAN DIEGO • SAN FRANCISCO • SINGAPORE • SYDNEY • TOKYO

Butterworth–Heinemann is an imprint of Elsevier

Senior Acquisitions Editor: Mark Listewnik
Assistant Editor: Kelly Weaver
Marketing Manager: Christian Nolin
Project Manager: Jeff Freeland
Cover Designer: Eric DeCicco
Compositor: Integra Software Services Pvt. Ltd.
Cover Printer: Phoenix Color Corp.
Text Printer/Binder: The Maple-Vail Book Manufacturing Group

Elsevier Butterworth–Heinemann
30 Corporate Drive, Suite 400, Burlington, MA 01803, USA
Linacre House, Jordan Hill, Oxford OX2 8DP, UK

Copyright © 2007, Elsevier, Inc. All rights reserved.

Exceptions: Copyright for many of the photos is not held by the publisher. Please see the Photo Credits section for copyright information on these photos.

No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior written permission of the publisher.

Permissions may be sought directly from Elsevier’s Science & Technology Rights Department in Oxford, UK: phone: (+44) 1865 843830, fax: (+44) 1865 853333, e-mail: [email protected] You may also complete your request on-line via the Elsevier homepage (http://elsevier.com), by selecting “Support & Contact” then “Copyright and Permission” and then “Obtaining Permissions.”

Recognizing the importance of preserving what has been written, Elsevier prints its books on acid-free paper whenever possible.

Library of Congress Cataloging-in-Publication Data
Kruegle, Herman.
CCTV surveillance : analog and digital video practices and technology / Herman Kruegle—2nd ed.
p. cm.
ISBN-13: 978-0-7506-7768-4 (casebound : alk.
paper)
ISBN-10: 0-7506-7768-6 (casebound : alk. paper)
1. Closed-circuit television—Design and construction. 2. Television in security systems. I. Title.
TK6680.K78 2005
621.389’28—dc22
2005022280

British Library Cataloguing-in-Publication Data
A catalogue record for this book is available from the British Library.

ISBN-13: 978-0-7506-7768-4
ISBN-10: 0-7506-7768-6

For information on all Butterworth–Heinemann publications visit our Web site at www.books.elsevier.com

Printed in the United States of America

Working together to grow libraries in developing countries
www.elsevier.com | www.bookaid.org | www.sabre.org

Photo Credits

The publisher and author would like to thank the listed manufacturers for the photographs used in the figures.

Accele Electronics: 8-9A, 8-9B
Allan Broadband: 25-14A
American Dynamics: 12-1, 17-1E
Avida: 2-7C, 2-7E, 2-7G, 2-7H, 2-16A, 2-16B, 2-17A, 2-17B, 2-17C, 2-17D, 2-17E, 2-17F, 4-18A, 4-27C, 4-27D, 4-27E, 4-30, 4-33A, 4-33B, 4-36, 4-37, 4-38, 4-40, 15-2A, 15-2C, 15-8A, 15-8C, 15-10B, 15-12, 15-15A, 15-15B, 16-7, 18-5A, 18-6A, 18-6B, 18-7, 18-10, 18-11A, 18-11B, 18-14A, 18-14B, 18-20A, 18-23D, 18-24, 19-22A, 19-22B, 21-2A, 21-2B, 21-4A, 21-4B, 21-4C, 22-4A, 22-4C, 22-5, 22-10B, 22-10C, 22-23A, 22-23B, 22-25, 22-26, 22-27
Axis Communications: 5-14B, 7-28A, 7-34A, 7-34B, 7-35A, 7-35B
CBC America: 15-9A
Canon USA: 4-14A
Casio USA: 7-36A
Cohu, Inc.: 2-10A, 2-10F
Controp USA: 17-24
COP-USA: 18-19B
Dell Star: 6-35A
D-Link: 7-36B
Digispec: 13-8A
FM Systems: 25-13B
Global Solar: 23-11A, 23-11C
Gossen: 25-15A
Greenlee: 25-21A, 25-21B
Gyrozoom: 4-14B
Hitachi: 2-26D, 17-22A
Honeywell Security: 9-12C, 15-2D, 15-7D, 15-10D, 15-13, 22-10B
ICU: 13-8C, 13-8D
IFS/GE Security: 6-28, 6-30
Ikegami Electronics (U.S.A. Inc.): 2-10C, 4-38, 8-5A
Integral Tech: 7-36C
Intellicom: 25-22A, 25-22B
International Fiber Systems: 6-28, 6-30
Ipix: 2-15B
Instrumentation Tech Systems: 16-6A, 16-6B
Keithley: 25-14B
Leader Instruments: 25-1A, 25-2A, 25-2B, 25-2C, 25-6A, 25-10, 25-11
Lowering Systems: 14-8C, 14-8D
Mace: 15-10C
Mannix: 25-15B
Marshall: 8-16A
Mitsubishi: 2-28A, 10-1
NVT: 6-9A, 6-9B
Omniscope: 2-15A
Panasonic Security Systems: Cover image (bottom), 2-10B, 2-26A, 2-26C, 2-27B, 2-27C, 5-14A, 8-9D, 14-4B, 14-5B, 14-6A, 15-2B, 17-2, 18-20B, 20-4A, 20-4B, 20-5B
Parkut: 22-26, 22-27
Pelco: 14-5C, 15-7C, 15-14B, 15-17, 17-1A, 17-1C, 17-11B
Pentax: 2-7A, 2-14, 4-12A
Radiant: 13-8B
Rainbow: 4-12B, 4-19, 4-22A, 4-22B
RF-Links: 18-25B
Remote Video Surveillance: 9-12B
Sanyo Security Products: Cover image (middle right), 2-27A, 5-14C, 8-5B, 8-9C, 9-12A, 14-1A, 15-6C, 15-6D, 15-10A, 17-1D
Sagebrush Technology: 17-14
Selectronic: 8-10A
Semco: 6-38C
SOHOware: 7-10
Sony Electronics: 4-26, 4-31, 5-22, 7-28B, 14-1B, 14-4A, 17-22B
Smarter Systems: 23-13
Tektronix: 25-1B, 25-1C, 25-6B, 25-13A
Thorlabs: 25-17
Trango: 6-35B, 6-35C
Uni-Solar Ovonic: 23-11B
Vicon: 2-26B, 2-30A, 2-30C, 14-3, 14-5A, 14-5D, 14-6B, 15-1A, 15-1B, 15-5, 15-6A, 15-6B, 15-9B, 15-11, 15-14A, 15-14C, 15-19B, 17-1B, 17-10A, 17-11A
Videolarm: 2-29E, 14-7A, 14-7B, 14-7C, 14-8A, 14-8B, 15-7A, 15-8B, 15-8D, 15-9C, 15-14D, 15-19A, 17-13, 17-15A, 17-15B, 22-4B
Watec: 18-19A, 18-19C
Winsted: 20-2A, 20-2B, 20-3A, 20-3B

For Carol

Contents

Foreword
Preface
Acknowledgments

Part 1
Chapter 1 Video’s Critical Role in the Security Plan
Chapter 2 Video Technology Overview

Part 2
Chapter 3 Natural and Artificial Lighting
Chapter 4 Lenses and Optics
Chapter 5 Cameras—Analog, Digital, and Internet
Chapter 6 Analog Video, Voice, and Control Signal Transmission
Chapter 7 Digital Transmission—Video, Communications, Control
Chapter 8 Analog Monitors and Digital Displays
Chapter 9 Analog, Digital Video Recorders
Chapter 10 Hard Copy Video Printers
Chapter 11 Video Switchers
Chapter 12 Quads and Multiplexers
Chapter 13 Video Motion Detectors
Chapter 14 Dome Cameras
Chapter 15 Integrated Cameras, Camera Housings, and Accessories
Chapter 16 Electronic Video Image Splitting, Reversal, and Annotation
Chapter 17 Camera Pan/Tilt Mechanisms
Chapter 18 Covert Video Surveillance
Chapter 19 Low-Light-Level Cameras, Thermal Infrared Imagers
Chapter 20 Control Room/Console Design
Chapter 21 Rapid Deployment Video Systems
Chapter 22 Applications and Solutions—Sample Scenarios
Chapter 23 System Power Sources
Chapter 24 Video-Security Systems Integration
Chapter 25 Video System Test Equipment
Chapter 26 Video Check List
Chapter 27 Education, Standards, Certification
Chapter 28 New Video Technology

Glossary
Bibliography
Index

Foreword

A few years ago I had the privilege of addressing a Congressional Subcommittee on Technology and Procurement Policy, chaired by Congressman Tom Davis. In addition to examining GSA’s efforts to secure federal buildings, the Subcommittee was interested in hearing and learning about new physical security technology.
When I leaf through the pages of this book, I again realize the enormity of the task undertaken by the Subcommittee, the necessity for doing so, and the importance of this type of information not only to security professionals, but now to IT professionals as well.

Closed circuit television (CCTV) and other related video security and surveillance technology advanced further and faster in the period from 2001 to 2005 than in any prior comparable time period. IP cameras, mapping, servers, platforms, LANs, WANs, VPNs, wireless, digital migration, algorithms, etc. are all converging along with other related security system technologies such as access control, life safety, and intrusion alarms, with the intent to configure fully integrated systems. This is the new direction for the security industry as digital technology has become pervasive across all product lines, opening the door to more software-oriented control platforms on the enterprise level.

So who better to chronicle, explain, and put these terms and technologies into perspective than Herman Kruegle, one of the industry’s foremost experts on video surveillance and related technologies? I have had the privilege of knowing and working with Herman for many years. He is a consummate professional who has the innate ability to explain the technical aspects of this emerging technology in a manner we can all understand and put into practice. Herman’s first book, CCTV Surveillance – Video Practices and Technology, is considered by most of us in the industry to be the bible of CCTV, and I fully expect this revised edition will rise to even greater popularity.

In the pages following, readers will find concise and intelligent descriptions of the analog and digital video practices and technology we have all grown up with.
But more important, Herman has included in this revised edition his explanation of the newest audio/video information technology (AV/IT) developments, products utilizing the technology, and applications for same. Security professionals, system integrators, architects and engineers, IT managers, or end users who are looking for a resource to help them navigate this complex field of IP video security will not be disappointed. The material is well researched and thoughtfully laid out to help ensure the reader’s understanding and to allow them to go on to designing, installing, and using digital video surveillance to its fullest capacity.

Frank Abram

Preface

Following the same philosophy contained in the first edition, the second edition is written for and contains information valuable to the end user as well as the technical practitioner. Each chapter begins with an overview and then presents the available equipment with its characteristics, features, and applications.

The first edition of CCTV Surveillance in 1995 asked the question “why write a CCTV surveillance book?” At that time, analog CCTV had progressed from a vacuum-tube to a solid-state technology that provided reliable, long-life small cameras produced at prices affordable for most security applications. A decade later, significant enough advances have been made in camera sensors, computers, and digital transmission technology to warrant a complete review of CCTV’s role in the security industry. The migration from legacy analog components to digital technology and the emergence of the Internet have accelerated the utilization of Internet protocol (IP) video and remote monitoring in security. The Internet has permitted the widespread interconnection of other technologies including fire and intrusion alarm systems, access control, and other communications and control.
The ease of interconnection afforded by digital transmission of video and other pertinent security data anywhere in a facility, a local environment, or globally gives new meaning to video transmission and remote viewing. The explosion of high-capacity magnetic disk, solid-state, and optical data storage has permitted the creation of new products, including digital video recorders (DVRs) and data compression algorithms that compress and store video images and replace the time-honored magnetic video cassette recorder (VCR).

In this second edition of CCTV Surveillance, I have attempted to add these new technologies to the “non-changing” basic technologies covered in the first edition. Physics does not change—only the technology and products do. This new revised edition of CCTV Surveillance includes the new digital video technology and contains eight new chapters:

Chapter 7  Digital Transmission—Video, Communications, Control
Chapter 10 Hard Copy Video Printers
Chapter 12 Quads and Multiplexers
Chapter 14 Dome Cameras
Chapter 20 Control Room/Console Design
Chapter 21 Rapid Deployment Video Systems
Chapter 24 Video-Security Systems Integration
Chapter 25 Video System Test Equipment

Chapter 7—Wired and wireless digital transmission represents possibly the most significant technology advancement in the video security industry. It makes use of the Internet and intranets for remote video, data, and audio communication over existing hard-wire communication links. Chapter 7 includes an analysis of digital wireless video transmission using the 802.11x family of protocols and spread spectrum technology (SST). Prior to 1995–98 the Internet was not available for commercial use, and remote video monitoring and control were accomplished primarily over existing telephone lines or expensive satellite links with limited functionality. Ease of installation, camera addressing, and identification using IP cameras has opened a new vista in video transmission and remote monitoring.
Chapter 10—This chapter describes the new technological advances made in hard-copy printers that improve the quality and reduce the cost of monochrome and color video printouts. The advances in ink-jet and laser printer technologies using inexpensive, large solid-state memories and high-resolution linear CCD imagers have been driven by the consumer and business markets, and have given the security industry access to low-cost, color, hard-copy prints rivaling photographic resolution and quality.

Chapter 12—While available in 1995, multiplexers have taken on new importance because of the significant increase in the number of cameras used in a typical security installation and their ability to be integrated into DVRs that were not available five years ago.

Chapter 14—Dome cameras are now everywhere in security systems. In 1995 they were used primarily in selected locations: casinos, department stores, supermarkets, malls, and outdoor parking lots. The public at large has accepted their presence almost everywhere. Domes are easy to install and can be small and aesthetic. Dome cameras are adjustable in pointing direction (manual or motorized pan and tilt), and many have motorized zoom lenses to change the camera field of view (FOV). The use of small dome cameras has exploded because of significant cost reduction and the sophistication of pointing and zooming capabilities. Fast pan/tilt camera modules with remote control via analog or digital communications over two-wire or wireless links are reasons for their popularity.

Chapter 20—Consoles and control rooms have become more complex and require more design attention for their successful implementation. This chapter analyzes the console and security control room with regard to lighting, monitor locations, operator control placement, and the other human factors required for guard efficiency and comfort.
Chapter 21—There has always been a requirement for transportable Rapid Deployment Security (RDS) systems having video and alarm intrusion equipment for protecting personnel and assets. The post-9/11 era, with real terror threats, has created the need for RDS equipment to protect military, government, business, and other personnel on travel. The majority of these systems consist of an alarm intrusion system and an analog or digital video viewing system. These RDS systems are carried from one location to another and deployed quickly to set up an alarm perimeter and real-time video monitoring and recording. Analog or digital transmission allows local or remote monitoring. After use, the RDS equipment is disassembled and stored in its carrying case, ready for another deployment. The much smaller size of the video and alarm equipment has accelerated its use and acceptance.

Chapter 22—The video applications chapter has been updated and expanded to include digital video applications, including the combination of legacy analog and IP cameras. One video monitoring application uses on-site local networks, and a second uses the Internet with IP cameras, signal routers, and servers for remote-site video monitoring. Security applications require complete integration of communication, video, alarm, access control, and fire functions to provide monitoring by the local security force and by corporate executives at local or remote sites. The integration of these security functions provides the safety and security necessary to protect personnel and assets at any facility.

Chapter 25—Installation and maintenance of video equipment requires the use of video and computer test equipment. Prior to the widespread use of digital technology in security systems, a limited range of test equipment was used. Now, with the many computer interfaces, Internet protocols, and connections to the Internet, more sophisticated test equipment and some knowledge of software and computer programming are necessary.
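As a toy illustration of the software side of such testing: analog composite video is nominally 1 V peak-to-peak across a 75-ohm termination (0.7 V picture plus 0.3 V sync), so a measured level can be validated against that spec. This minimal Python sketch uses illustrative function names and a hypothetical ±10% tolerance, not any instrument's actual interface:

```python
# Check an analog composite video level against the nominal 1 V
# peak-to-peak (0.7 V picture + 0.3 V sync) across 75-ohm termination.
# Names and the tolerance value are illustrative assumptions.

NOMINAL_PP_VOLTS = 1.0   # full composite signal: picture + sync
PICTURE_VOLTS = 0.7      # white level above blanking
SYNC_VOLTS = 0.3         # sync tip depth below blanking

def level_ok(measured_pp_volts, tolerance=0.1):
    """True if the measured peak-to-peak level is within +/- tolerance
    (as a fraction) of the nominal 1 V p-p composite level."""
    return abs(measured_pp_volts - NOMINAL_PP_VOLTS) <= NOMINAL_PP_VOLTS * tolerance

def diagnose(measured_pp_volts):
    """Rough triage: a level near 0.5 V p-p often indicates a
    double-terminated line (two 75-ohm loads); near 2 V p-p, a
    missing termination."""
    if level_ok(measured_pp_volts):
        return "within spec"
    if measured_pp_volts < NOMINAL_PP_VOLTS:
        return "low - check for double termination or cable loss"
    return "high - check for missing 75-ohm termination"
```

For example, `diagnose(0.52)` would flag a low level consistent with an accidentally double-terminated run.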
Parameters to be tested and monitored include: (a) video signal level and quality; (b) control data signals for pan, tilt, zoom, and focus; and (c) digital signal protocols for multiplexers, IP cameras, signal routers and servers, DVRs, etc.

Acknowledgments

Over the years I have had opportunities to speak with many individuals who provided technical insight into video technology and electro-optics. I particularly appreciate the discussions with Stanley Dolin and Lee Gallagher on the subjects of optics, the physics of lighting, lenses, and optical sensors. I found very helpful the technical discussions on cameras with Frank Abram, Sanyo Security Products, and Victor Houk. I thank Dr. Gerald Herskowitz, Stevens Institute of Technology, for contributing to the fiber-optic section in Chapter 6 and reviewing other sections on video transmission. I thank Robert Wimmer and Fredrick Nilsson for their excellent technical articles in security journals and company publications, as well as technical seminars on many aspects of video security. Thanks to Charlie Pierce for his interest in my book over the years and his enthusiasm and excellence in presenting stimulating educational video seminars.

Eric Kruegle, Avida Inc., contributed his expertise on various aspects of digital video. In particular I appreciate his help in wired and wireless video transmission, compression, and encryption in Chapter 7. Eric was also instrumental in keeping my computer alive, and I thank him for rescuing me late at night from missing files and software surprises.

I acknowledge the initial encouragement of Kevin Kopp and the editorial advice of Greg Franklin at Butterworth (now Elsevier) during the formative stages of the first edition of CCTV Surveillance in 1995.
I thank all the staff at Elsevier for bringing out this second edition successfully: Pam Chester for her assistance in the formulation of this edition; Mark Listewnik for his constant encouragement, professional suggestions, and diligence in bringing this large project to a successful conclusion; and Jeff Freeland for providing the meticulous final editing and effort in completing this large endeavor.

I gratefully acknowledge the dedication, patience, and skill of my wife, Carol, in assisting in the preparation of this book.

I would like to thank the manufacturers for the use of the many photographs that illustrate the components used in video security applications. Each of them contributes to the education of the security professional and assists the consultant, systems integrator, and end user in designing and implementing the best security system possible.

PART I

Chapter 1 Video’s Critical Role in the Security Plan

CONTENTS
1.1 Protection of Assets
  1.1.1 Overview
  1.1.2 Background
1.2 The Role of Video in Asset Protection
  1.2.1 Video as Part of the Emergency and Disaster Plan
    1.2.1.1 Protecting Life and Minimizing Injury
    1.2.1.2 Reducing Exposure of Physical Assets and Optimizing Loss Control
    1.2.1.3 Restoring Normal Operations Quickly
    1.2.1.4 Documenting an Emergency
    1.2.1.5 Emergency Shutdown and Restoration
    1.2.1.6 Testing the Plan
    1.2.1.7 Standby Power and Communications
  1.2.2 Security Investigations
  1.2.3 Safety
  1.2.4 The Role of the Guard
  1.2.5 Employee Training and Education
1.3 Synergy through Integration
  1.3.1 Integrated Functions
  1.3.2 System Hardware
1.4 Video’s Role and Its Applications
  1.4.1 Video System Solutions
  1.4.2 Overt vs. Covert Video
  1.4.3 Security Surveillance Applications
  1.4.4 Safety Applications
  1.4.5 Video Access Control
1.5 The Bottom Line

1.1 PROTECTION OF ASSETS

The protection of personnel and assets is a management function.
Three key factors govern the planning of an assets protection program: (1) an adequate plan designed to prevent losses from occurring, (2) adequate countermeasures to limit unpreventable losses, and (3) support of the protection plan by top management.

1.1.1 Overview

Most situations today require a complete safety/security plan. The plan should contain requirements for intrusion detection, video assessment, fire detection, access control, and full two-way communication. Critical functions and locations must be monitored using wired and wireless backup communications.

The most significant driving force behind the explosion in the use of closed-circuit television (CCTV) has been the worldwide increase in theft and terrorism and the commensurate concern and need to protect personnel and assets. The terrorist attack on September 11, 2001, brought about a quantum jump and a complete reevaluation of the personnel and asset security requirements needed to safeguard a facility. To meet this new threat, video security has taken on the lead role in protecting personnel and assets. Today every state-of-the-art security system must include video as a key component to provide the “remote eyes” for security, fire, and safety.

The fateful day of September 11, 2001, dramatized the importance of reliable communications and remote visualization of images via remote video cameras. Many lives were saved (and lost) as a consequence of the voice, video, alarm, and fire equipment in place and in use at the time of the attack on the World Trade Center in New York. The availability of operational wired and wireless two-way communication between command and control headquarters and responders (police, fire, emergency) played a crucial role in life and death.
The availability (or absence) at command posts of real-time video images at crucial locations in the Twin Towers during the attack and evacuation contributed to the action taken by command personnel during the tragedy. The use (or absence) of wireless transmission from the remote video cameras in the Twin Towers clearly had an impact on the number of survivors and casualties.

During the 1990s, video component technology (cameras, recorders, monitors, etc.) matured from legacy analog to digital imaging technology, became compatible with computers, and now forms an essential part of the security solution. In the late 1990s, digital cameras were introduced into the consumer market, significantly reducing prices, and as a result found widespread use in the security industry. Simultaneously, powerful microprocessors, large hard disk computer memory storage, and random access memory (RAM) became available from the personal computer/laptop industry, providing the computing power necessary to control, view, record, and play back digital CCTV cameras in the security system.

The home run came with the availability and explosive acceptance of the Internet (and intranet) as a new means of long-distance two-way communication of voice, data, and, most importantly, video. For over a decade the long-distance transmission of video was limited to slow telephone transmission of video images—snapshots (slow-scan video). The use of dedicated high-speed (expensive) land lines or expensive satellite communications was limited to government and large-clientele users. Now the Internet provides near-live (near real-time) video transmission over an inexpensive, easily accessible worldwide transmission network. The application and integration of video into safety and security systems has come of age as a reliable, cost-effective means for assessing and responding to terrorist attacks and other life-threatening situations.
Video is an effective means for deterring crimes, protecting assets, and apprehending and prosecuting offenders. Security personnel today have responsibility for multifaceted security and safety systems in which video often plays the key role. With today’s increasing labor costs and the need for each security officer to provide more functionality, video more than ever before is earning its place as a cost-effective means for improving security and safety while reducing security budgets.

Loss of assets and time due to theft is a growing cancer on our society that eats away at the profits of every organization or business, be it government, retail, service, or manufacturing. The size of the organization makes no difference to the thief. The larger the organization, the more theft occurs and the greater the opportunity for losses. The more valuable the product, the greater the temptation for a thief to steal it. A properly designed and applied video system can be an extremely profitable investment for an institution seeking to cut losses. The prime objective of the video system should not be the apprehension of thieves but rather the deterrence of crime through security. A successful thief needs privacy—a video system can deny that privacy.

As a security by-product, video has emerged as an effective training tool for managers and security personnel. Every installation/establishment should have a security plan in place prior to an incident. Video-based training is easy to implement using the abundance of inexpensive camcorders and playback equipment available and the commercial video production training services available. The use of training videos results in standardized procedures and improved employee efficiency and productivity.

The public at large has accepted the use of video systems in most public facilities. Video is being applied to reduce asset losses and increase corporate profits and the bottom line.
Many case histories show that after the installation of video, shoplifting and employee thefts drop sharply. The number of thefts cannot be counted exactly, but shrinkage can be measured. It has been shown that video is an effective psychological deterrent to crime and an effective tool for criminal prosecution.

Theft is not only the unauthorized removal of valuable property but also the removal of information, such as computer software, CDs, magnetic tape and disks, optical disks, microfilm, and hard copy. Video surveillance systems provide a means for successfully deterring such thievery and/or detecting or apprehending offenders. The use of video helps prevent the destruction of property: vandalizing of buildings, defacing of elevator interiors, painting of graffiti on art objects and facilities, stealing of computers, and demolishing of furniture or other valuable equipment.

Video offers the greatest potential benefit when integrated with other sensing systems and used to view remote areas. Video provides the “eyes” for many security devices and functions, such as: (1) fire sensors and smoke detector alarms; (2) watching for the presence (or absence) of personnel in an area; (3) evacuation of personnel—determining the route for evacuation; and (4) access events (emergency or intruder)—determining the response, responding, and monitoring the response. When combined with fire and smoke detectors, CCTV cameras in inaccessible areas can be used to give advance warning of a fire.

Video is the critical link in the overall security of a facility, but organizations must develop a complete security plan rather than adopt piecemeal protection measures. To optimize the use of video technology, the practitioner and end user must understand all of its aspects—from light sources to video monitors and recorders. The capabilities and limitations of video during daytime and nighttime operation must also be understood.

1.1.2 Background

Throughout history, humans have valued their own lives and the lives of their loved ones above all else.
Next in value has been their property. Over the centuries many techniques have been developed to protect property against invaders or aggressors threatening to take or destroy it. In the past as in the present, manufacturing, industrial, and government organizations have hired “watchmen” to protect their facilities. These private security personnel, wearing uniforms and using equipment much like the police do, are hired to prevent crime and bodily harm, and to deter or prevent theft on the premises. The very early guard companies were Pinkerton’s and Burns. Contract protection organizations were hired to safeguard employees and assets in emergency and personal threat situations.

A significant increase in guard use came with the start of World War II. Many guards were employed to secure industrial work sites manufacturing military equipment and doing classified work, and to guard government facilities. Private corporations obtained such protection through contract agencies to guard classified facilities and work.

In the early 1960s, as electronic technology advanced, alarm systems and video were introduced. Radio Corporation of America (RCA), Motorola, and General Electric were the pioneering companies that began manufacturing vacuum-tube television cameras for the security industry. The use of video cameras during the 1960s and 1970s grew rapidly because of increased reliability, lower cost, and technological improvements in tube-type camera technology. In the 1980s growth continued at a more modest level with further improvements in functions and the availability of other accessories for video security systems. The most significant advance in video technology during the 1980s was the invention and introduction of the solid-state video camera.
By the early 1990s the solid-state camera using the charge-coupled device (CCD) image sensor was the choice for new security installations and was rapidly replacing tube cameras. In the past, the camera—in particular, the vidicon tube sensor—was the critical component in the video system. The camera determined the overall performance and quality of visual intelligence obtainable from the security system. The vidicon tube was the weakest link in the system and was subject to degradation with age and usage. The complexity and variability of the image tube and its analog electrical nature made it less reliable than the other solid-state components. Performance varied considerably between different camera models and manufacturers, and as a function of temperature and age. By contrast, the solid-state CCD sensor and the newer metal oxide semiconductor (MOS) and complementary MOS (CMOS) sensor cameras have long life and are stable over all operating conditions.

Another factor in the explosive use of video in security systems has been the rapid improvement in equipment capability at affordable prices. This has been the result of the widespread consumer use of solid-state camcorders (which lowered manufacturing costs), and the availability of low-cost video cassette recorders (VCRs), digital video recorders (DVRs), and personal computer (PC)-based equipment.

The 1990s saw the integration of computer technology with video security technology. All components were solid state. Digital video technology needed large-scale digital memories to manipulate and store video images, and the computer industry had them. To achieve satisfactory video image transmission and storage, the video signal had to be “compressed” to transmit it over the existing narrowband phone line networks. The video-computer industry already had compression for broadcast, industrial, and government requirements.
The video industry needed a fast and low-cost means to transmit video images to remote locations, and the US government’s Defense Advanced Research Projects Agency (DARPA) had already developed the Internet, the predecessor of the World Wide Web (WWW). The Internet (and intranet) communications channels and the WWW now provide this extraordinary worldwide ability to transmit and receive video and audio, and to communicate and control data anywhere.

1.2 THE ROLE OF VIDEO IN ASSET PROTECTION

Video provides multiple functions in the overall security plan. It provides the function of asset protection by monitoring the location of assets and the activity at that location. It is used to detect unwanted entry into a facility, beginning at a perimeter location and following an unauthorized person throughout the facility. Figure 1-1 shows a typical single site video system using legacy analog, digital, or a combination of both technologies.

In a perimeter protection role, video is used with intrusion-detection alarm devices as well as video motion detection to alert the guard at the security console that an intrusion has occurred. If an intrusion occurs, multiple CCTV cameras located throughout the facility follow the intruder so that there is a proper response by guard personnel or designated employees. Management must determine whether specific guard reaction is required and what the response will be. Video monitoring allows the guard to be more effective, but it also improves security by permitting the camera scene to be transmitted to other control centers or personnel. The video image can be documented with a VCR or DVR, and/or printed out on a hard copy video printer.

The video system for the multiple site application is best implemented using a combination of analog/digital or an all-digital solution (Figure 1-2). Local site installations already using analog video cameras, monitors, etc.
can be retained and integrated with new digital Internet Protocol (IP) cameras, local area networks (LANs), intranets, and the Internet to facilitate remote site video monitoring.

FIGURE 1-1 Single-site video security system (perimeter, parking-lot, loading-dock, and lobby surveillance along the fence line and facility entrance; security room with CCTV monitors/recorders, audio communications, and command and control; an intruder path is shown)

The digital transmission network provides two-way communication of audio and control signals and excellent video image transmission to remote sites. The digital signals can be encrypted to prevent eavesdropping by unauthorized outside personnel. Using a digital signal backbone makes it simple to add cameras to the network or to change their configuration in the system. In the relatively short history of CCTV and video there have been great innovations in the permanent recording of video images. These new technologies have been brought about by consumer demand for video camcorders, the television broadcast industry, and government requirements for military and aerospace hardware and software. One result of these requirements was the development of the VCR and DVR. The ability to record video images gave the video security industry a new dimension, going beyond real-time camera surveillance. The availability of VCR and DVR technology resulting from the consumer market has made possible the excellent time-lapse VCRs and large-storage PC-based DVR systems. These technologies provide permanent documentation of the video images on analog (magnetic tape) and digital (solid-state and hard-disk-drive) storage media. The use of time-lapse recorders, computer hard disks, and video printers gives management the tools to present hard evidence for criminal prosecution. This ability to provide a permanent record of evidence is of prime importance to personnel responsible for providing security.
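Sizing the storage behind a DVR reduces to simple arithmetic on bit rate, camera count, and retention time. A minimal sketch (the function name and all example figures are assumptions for illustration, not values from the text):

```python
def dvr_storage_gb(cameras, bitrate_mbps, days, duty_cycle=1.0):
    """Estimate DVR disk space in decimal gigabytes.

    duty_cycle: fraction of time actually recorded (1.0 = continuous;
    lower values model motion-triggered or time-lapse recording).
    """
    bytes_per_sec = bitrate_mbps * 1e6 / 8          # per-camera stream in bytes/s
    total_bytes = cameras * bytes_per_sec * 86_400 * days * duty_cycle
    return total_bytes / 1e9

# Example: 4 cameras at 1 Mb/s (e.g., an MPEG-4 stream), 30-day retention,
# motion-triggered recording active 25% of the time.
print(round(dvr_storage_gb(4, 1.0, 30, duty_cycle=0.25), 1))  # -> 324.0 GB
```

The duty-cycle parameter captures why time-lapse and motion-triggered recording were so valuable: they stretch a fixed disk budget over a far longer retention period.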
Prior to the mid-1990s the CCTV security industry primarily used monochrome solid-state cameras. In the 1990s the widespread use of color camcorders in the consumer video market accelerated the availability of these reliable, stable, long-life cameras for the security industry. While monochrome cameras are still specified in low-light-level (LLL) and nighttime security applications, color is now the norm in most security applications. The increased sensitivity and resolution of color cameras and their significant decrease in cost have resulted in their widespread use. Many monochrome cameras being used for LLL applications are being augmented with active infrared (IR) illuminators. Also coming into use is a new generation of passive monochrome thermal IR imaging cameras that detect differences in temperature between objects in the scene and the scene background. These cameras operate in total darkness. There has also been an explosion in the use of covert video surveillance through small, inexpensive color cameras. The development of smaller solid-state cameras has resulted in a decrease in the size of ancillary video equipment. Camera lenses, dome cameras, housings, pan/tilt mechanisms, and brackets are smaller in size and weight, resulting in lower costs and more aesthetic installations. The small cameras and lenses satisfy covert video applications and are easy to conceal.

FIGURE 1-2 Multiple-site system using analog/digital video (analog and digital IP cameras at each site connect through servers and routers to the Internet, intranet, LAN, WAN, or wireless (WiFi) network; compressed digital video (MJPEG, MPEG-2, MPEG-4) is stored on a network video recorder with sufficient storage to support all sites, security authentication, and a RAID level 5 controller for expanded capacity, and is viewed at a monitoring station; alarm input/output devices are included)

The potential importance of color in surveillance applications can be illustrated very clearly: turn off the color on a television monitor so that it displays a monochrome scene. It is obvious how much information is lost when the colors in the scene change to shades of gray. Objects that were easily identified in the color scene become difficult to identify in the monochrome scene. It is much easier to pick out a person with a red shirt in the color image than in a monochrome image. The security industry has long recognized the value of color in enhancing personnel and article identification in video surveillance and access control. One reason why we can identify subjects more easily in color is that we are used to seeing color, both in the real world and on our TV at home. When we see a monochrome scene we must make an additional effort to recognize certain information (beyond the actual missing colors), thereby decreasing the intelligence available. Color provides more accurate identification of personnel and objects and leads to a higher degree of apprehension and conviction of criminals.

1.2.1 Video as Part of the Emergency and Disaster Plan
Every organization, regardless of size, should have an emergency and disaster control plan that includes video as a critical component. Depending on the organization, an anti-terrorist plan may take highest priority. Part of the plan should be a procedure for succession of personnel in the event one or more members of top management are unavailable when disaster strikes.
In large organizations the plan should include the designation of alternate headquarters if possible, a safe document-storage facility, and remote (off-site if possible) video operations capability. The plan must provide for medical aid and assure the welfare of all employees in the organization. Using video as a source of information, security should have a method to alert employees to a dangerous condition and a plan to provide for quick police and emergency response. There should be an emergency shutdown plan and restoration procedures, with designated employees acting as leaders. There should be CCTV cameras stationed along evacuation routes and instructions for practice tests. The evacuation plan should be prepared in advance and tested. A logical and effective disaster control plan should do the following:
• Define emergencies and disasters that could occur as they relate to the particular organization.
• Establish an organization and specific tasks, with personnel designated to carry out the plan immediately before, during, and immediately following a disaster.
• Establish a method for utilizing the organization's resources, in particular video, to analyze the disaster situation and bring to bear all available resources.
• Provide a plan for changing from normal operations into and out of the disaster emergency mode as quickly as possible.
Video plays a very important role in any emergency, disaster, and anti-terrorist plan:
• Video helps protect human life by enabling security or safety officials to see remote locations and view first hand what is happening, where it is happening, what is most critical, and what areas must be attended to first.
• Video aids in minimizing personal injury by permitting "remote eyes" to reach those people who require immediate attention, to send personnel to the hardest-hit area to remove them from it, or to bring in equipment to protect them.
• Video reduces the exposure of physical assets to an oncoming disaster, such as fire or flood, and helps prevent, or at least document, the removal of assets by intruders or other unauthorized personnel.
• Video documents the equipment and assets that were in place prior to the disaster, recording them on a VCR or DVR or storing them on an enterprise network so they can be compared with the remaining assets after the disaster has occurred. It also documents personnel and their activities before, during, and after an incident.
• Probably more than any other part of a security system, video will aid management and the security force in minimizing any disaster or emergency. It is useful in restoring an organization to normal operation by determining that no additional emergencies are in progress and that procedures and traffic flow are normal in the restored areas.

1.2.1.1 Protecting Life and Minimizing Injury
Through the intelligence gathered from the video system, security and disaster control personnel should move all personnel to places of safety and shelter. Personnel assigned to disaster control who remain in a threatened area should be protected by using video to monitor their safety and the access and egress at their locations. Such monitoring provides advance notice and a means of support and assistance for persons who are injured or who must be rescued or relieved.

1.2.1.2 Reducing Exposure of Physical Assets and Optimizing Loss Control
Assets should be stored or secured properly before an emergency so that they will be less vulnerable to theft or loss. Video is an important tool for continually monitoring safe areas during and after a disaster to ensure that material is not removed. In an emergency or disaster, the well-documented plan will call for specific personnel to locate highly valued assets, secure them, and evacuate personnel.
1.2.1.3 Restoring Normal Operations Quickly
After an emergency situation has been brought under control, security personnel can monitor and maintain the security of assets and help determine that employees are safe and have returned to their normal work routine.

1.2.1.4 Documenting an Emergency
For purposes of (1) future planning, (2) liability and insurance, and (3) evaluation by management and security personnel, video coverage of critical areas and operations during an emergency is an excellent tool and can reduce financial losses significantly. Video recordings of assets lost or stolen, or of personnel injured or killed, can support a company's claim that it was not negligent and that it initiated a prudent emergency and disaster plan prior to the event. Although video can provide crucial documentation of an event, it should be supplemented with high-resolution photographs of specific instances or events. If perimeter fences or walls were destroyed or damaged in a disaster, video can help prevent and document intrusion or looting by employees, spectators, or other outsiders.

1.2.1.5 Emergency Shutdown and Restoration
In the overall disaster plan, shutting down equipment such as machinery, utilities, and processes must be considered. If furnaces, gas generators, electrical power equipment, boilers, high-pressure air or oil systems, chemical equipment, or rapidly rotating machinery could cause damage if left unattended, they should be shut down as soon as possible. Again, video surveillance can be crucial in determining whether the equipment has been shut down properly, whether personnel must enter the area to do so, or whether it must be shut down by other means.

1.2.1.6 Testing the Plan
While a good emergency plan is essential, it should not be tested for the first time in an actual disaster situation. Deficiencies are always discovered during testing.
Also, a test serves to train the personnel who will carry out the plan if necessary. Video can help evaluate the plan to identify shortcomings and show personnel what they did right and wrong. Through such peer review a practical and efficient plan can be put in place to minimize losses to the organization.

1.2.1.7 Standby Power and Communications
During any emergency or disaster, primary power and communications between locations will probably be disrupted. Therefore, a standby power-generation system should be provided for emergency monitoring and response. This standby power, comprising a backup gas-powered generator or an uninterruptible power supply (UPS) with DC batteries to extend backup operation time, will keep emergency lighting, communications, and strategic video equipment online as needed. Most installations use a power-sensing device that monitors the normal supply of power at various locations. When the device senses that power has been lost, the backup equipment automatically switches to the emergency power source. A prudent security plan anticipating an emergency will include a means to power vital audio, video, and other sensor equipment to ensure its operation during the event. Since emergency video and audio communications must be maintained over remote distances, alternative communication pathways should be supplied in the form of either auxiliary hard-wired cable (copper wire or fiber optics) or a wireless (RF, microwave, infrared) transmission system. It is usually practical to provide a backup path to only the critical cameras, not all of them. The standby generator supplying power to the video, safety, and emergency equipment must be sized properly. For equipment that normally operates on 120 volts AC, inverters are used to convert the low voltage from the backup DC batteries (typically 12 or 24 volts DC) to the required 120 volts AC (or 230 volts AC).
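The battery-and-inverter sizing mentioned above reduces to simple arithmetic on watt-hours. A rough sketch (the function name, the 85% inverter efficiency, and the 80% usable-capacity figure are assumptions for illustration, not values from the text):

```python
def backup_runtime_hours(battery_ah, battery_volts, load_watts,
                         inverter_efficiency=0.85, usable_fraction=0.8):
    """Rough runtime of a DC battery bank feeding an AC load through an inverter.

    usable_fraction models the depth of discharge the batteries tolerate;
    inverter_efficiency models conversion loss from DC to 120/230 V AC.
    """
    usable_wh = battery_ah * battery_volts * usable_fraction   # stored energy
    return usable_wh * inverter_efficiency / load_watts

# Example: a 24 V, 100 Ah battery bank feeding a 200 W camera/monitor load.
print(round(backup_runtime_hours(100, 24, 200), 2))  # -> 8.16 hours
```

Working the calculation in reverse (required runtime to needed amp-hours) is how the battery bank, and in turn the generator that recharges it, would be sized for the critical video and communications load.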
1.2.2 Security Investigations
Security investigators have used video very successfully to safeguard company assets and prevent theft, negligence, outside intrusion, and so on. By using small, low-cost, covert CCTV (a hidden camera and lens), it is easy to positively identify a person or to document an event without being noticed. Better video image quality, smaller lenses and cameras, wireless video transmission, and easier installation and removal of such equipment have led to this high success rate. Many lenses and cameras that can be hidden in rooms, hallways, or stationary objects are available today. Equipment to provide such surveillance is available for indoor or outdoor locations, in bright sunlight or in no light (IR-illuminated or thermal cameras).

1.2.3 Safety
Closed-circuit television equipment is installed not only for security reasons but also for safety purposes. Security personnel can be alerted to unsafe practices or accidents that require immediate attention. An attentive guard can use CCTV cameras distributed throughout a facility (in stairwells, at loading docks, around machinery, etc.) to observe and immediately document any safety violations or incidents.

1.2.4 The Role of the Guard
Security guards are employed to protect plant assets and personnel. Security and corporate management are aware that guards are only one element of an organization's complete security plan. As such, the cost to implement the guard force and its ability to protect assets and personnel are analyzed in relation to the costs and roles of other technological security solutions. In this respect video has much to contribute: increased security for a relatively low capital investment and low operating cost compared with a guard. Guards using video can increase the security coverage and protection of a facility.
Alternatively, installing new CCTV equipment enables guards to monitor remote sites, allowing guard count and security costs to be reduced significantly.

1.2.5 Employee Training and Education
Video can be used as a powerful training tool. It is used widely in education and in the training of security personnel because it can demonstrate lessons and examples vividly to the trainee. In this post-9/11 era, security personnel should receive professional training by all available means, including real video footage. Video is an important tool for the security trainer. Example procedures of all types can be shown conveniently in a short time period, with instructions given during the presentation. Videotaped real-life situations (not simulations or performances) can demonstrate the consequences of misapplied procedures and the benefits of proper planning and execution by trained and knowledgeable personnel. Every organization can supplement live training with either professional training videos or actual scenes from its own video system, demonstrating good and poor practices as well as proper guard reaction in real cases of intrusion, unacceptable employee behavior, and so on. Such internal video systems can also be used in training exercises: trainees may take part in videotaped simulations, which are later critiqued by their supervisor. Trainees can then observe their own actions to find ways to improve and become more effective. Finally, such internal video systems are very important tools during rehearsals or tests of an emergency or disaster plan. After the run-through, all team members can monitor their own reactions, and managers or other professionals can critique them.

1.3 SYNERGY THROUGH INTEGRATION
Video equipment is most effective when integrated with other security hardware and procedures to form a coherent security system. When video is combined with the other security sensors, the total security system is more than the sum of the individual subsystems. Synergy obtains when video assessment is combined with intrusion and motion alarm sensors, electronic access control, fire alarms, communications, and security guard personnel (Figure 1-3).

FIGURE 1-3 Integrated security system (access control: electronic, video, biometric; fire and safety alarms; intrusion detection; video surveillance; communications; security personnel; synergy: maximize asset and personnel protection, provide disaster control, optimize the recovery plan)

1.3.1 Integrated Functions
Functionally, the integrated security system is designed as a coordinated combination of equipment, personnel, and procedures that (a) uses each component in a way that enhances the use of every other component and (b) optimally achieves the system's stated objective. In designing a security system, each element's potential contribution to loss prevention, asset protection, or personnel safety must be considered. The security plan must specify as a minimum: (a) where and when unusual behavior should be detected, (b) what the response should be, and (c) how it should be reported and recorded. If the intruder has violated a barrier or fence, the intrusion-detection system should be able to determine that a person—not an animal, bird, insect, leaf, or other object—passed through the barrier. Video provides the most positive means for establishing this information. This breach in security must then be communicated by some means to security personnel so that a reaction force has sufficient information to permit an appropriate response. In another scenario, if material is being removed by an unauthorized person in an interior location, a video surveillance system activated by a video motion detector (VMD) alarm should alert a guard and transmit the video information to security personnel for appropriate action. In both cases a guard force would be dispatched and the event recorded on a VCR, DVR, or network storage and/or printed as hard copy for guard response, documentation, and prosecution. In summary, it is the combination of sensors, communication channels, monitoring displays, documentation equipment, and a guard force that provides the synergy to maximize the security function. The integration of video, intrusion-detection alarms, access control, and security guards increases the overall asset protection and employee safety at a facility.

1.3.2 System Hardware
Since a complete video security system may be assembled from components manufactured by different companies, all equipment must be compatible. The video equipment should be specified by one consulting or architecture/engineering firm, and the system and service should be purchased, installed, and maintained through a single system integrator, dealer/installer, or general contractor. If a major supplier provides a turnkey system, including all equipment, training, and maintenance, the responsibility for system operation resides with one vendor, which is easier to control. Buying from one source also permits management to go back to one installer or general contractor if there are any problems, instead of having to point fingers or negotiate for service among several vendors. Choosing a single supplier obviously requires thorough analysis to determine that the supplier: (1) will provide a system that meets the requirements of the facility, (2) will be available for maintenance when required, and (3) will still be in business in 5 or 10 years.
There are many companies that can supply complete video systems, including cameras and housings, lenses, pan/tilt mechanisms, multiplexers, time-lapse VCRs or DVRs, analog and digital networks, and the other security equipment required for an integrated video system. If the end user chooses components from various manufacturers, the system designer and installer must be aware of the differences and interface the equipment properly. If the security plan calls for a simple system with potential for later expansion, the equipment should be modular and ready to accept new technology as it becomes available. Many larger manufacturers of security equipment anticipate this integration and expansion requirement and design their products accordingly. Service is a key ingredient of successful system operation. If one component fails, repair or replacement must be done quickly so that the system is not shut down. Near-continuous operation is accomplished by the direct-replacement method, immediate maintenance by an in-house service organization, or quick-response service calls from the installer/contractor. Service considerations should be addressed during the planning and initial design stages, as they affect the choice of manufacturer and service provider. Most vendors use the replacement technique to maintain and service equipment. If part of the system fails, the vendor replaces the defective equipment and sends it to the factory for repair. This service policy decreases security system downtime. The key to a successful security plan is to choose the right equipment and service company, one that is customer oriented and knowledgeable about reliable, technologically superior products that satisfy the customer's needs.

1.4 VIDEO'S ROLE AND ITS APPLICATIONS
In its broadest sense, the purpose of CCTV in any security plan is to provide remote eyes for a security operator: to create live-action displays from a distance.
The video system should have recording means—either a VCR or a DVR, or other storage media—to maintain permanent records for training or evidence. Following are some applications for which video provides an effective solution:
• When overt visual observation of a scene or activity is required from a remote location.
• When an area to be observed contains hazardous material or some activity that may kill or injure personnel. Such areas may contain toxic chemicals, biological or radioactive material, substances with high potential for fire or explosion, or items that emit X-ray or other nuclear radiation.
• When visual observation of a scene must be covert. It is much easier to hide a small camera and lens in a target location than to station a person in the area.
• When there is little activity to watch in an area, as in an intrusion-detection location or a storage room, but significant events must be recorded when they occur. Integration of video with alarm sensors and a time-lapse/real-time VCR or DVR provides an extremely powerful solution.
• When many locations must be observed simultaneously by one person from a central security location.
• When a person or vehicle must be traced from an entrance into a facility to a final destination, so that the security force can predict where the person or vehicle can be interdicted.
• When a guard or security officer need only review a scene for activity periodically. The use of video eliminates the need for a guard to make rounds to remote locations, which is wasteful of the guard's time.
• When a crime has been committed, capturing the scene with the video camera and recorder provides a permanent record and hard-copy printout of the activity and event. The proliferation of high-quality printed images from VCR/DVR equipment has clearly made the case for using video to create permanent records.
1.4.1 Video System Solutions
The most effective way to determine that a theft has occurred, and when, where, and by whom, is to use video for detection and recording. The particular event can be identified, stored, and later reproduced for display or hard copy. Personnel can be identified on monochrome or color CCTV monitors. Most security installations use color CCTV cameras, which provide sufficient information to document the activity and event or to identify personnel or articles. The color camera permits easier identification of personnel and objects. If there is an emergency or disaster and security personnel must see whether personnel are in a particular area, video can provide an instantaneous assessment of personnel location and availability. In many cases during normal operations, security personnel can help ensure the safety of personnel in a facility, determine that employees or visitors have not entered the facility, or confirm that personnel have exited the facility. Such functions are used, for example, where dangerous jobs are performed or hazardous material is handled. The synergistic combination of audio and video information from a remote site provides for effective security. Several camera manufacturers and installers combine video and audio (one-way or duplex) using an external microphone or one installed directly in the camera. The video and audio signals are transmitted over the same coaxial, unshielded-twisted-pair (UTP), or fiber-optic cable to the security monitoring location, where the scene is viewed live and/or recorded. When there is activity in the camera area the video and audio signals are switched to the monitor, and the guard sees and hears the activity in the scene and initiates a response.

1.4.2 Overt vs. Covert Video
Most video installations use both overt and covert (hidden) CCTV cameras, with more cameras overt than covert.
Overt installations are designed to deter crime and provide general surveillance of remote areas such as parking lots, perimeter fence lines, warehouses, entrance lobbies, hallways, or production areas. When CCTV cameras and lenses are exposed, all managers, employees, and visitors realize that the premises are under constant video surveillance. When the need arises, covert installations are used to detect and observe clandestine activity. While overt video equipment is often large and not meant to be concealed, covert equipment is usually small and designed to be hidden in objects in the environment or behind a ceiling or wall. Overt cameras are usually installed permanently, whereas covert cameras are usually designed to be installed quickly, left in place for a few hours, days, or weeks, and then removed. Since minimizing installation time is desirable when installing covert cameras, video signal transmission is often wireless rather than wired.

1.4.3 Security Surveillance Applications
Many video applications fall broadly into two types, indoor and outdoor. This division sets a natural boundary between equipment types: those suitable for controlled indoor environments and those suitable for harsher outdoor environments. The two primary parameters are environmental factors and lighting factors. The indoor system requires artificial lighting that may or may not be augmented by daylight. The indoor system is subject to only mild indoor temperature and humidity variations, dirt, dust, and smoke. The outdoor system must withstand extreme temperatures, precipitation (fog, rain, and snow), wind, dirt, dust, sand, salt, and smoke. Outdoor systems use natural daylight and, at night, artificial lighting supplied either by parking-lot lights or by a colocated infrared (IR) source. Some cameras can automatically switch from color operation during daylight to monochrome when the lighting decreases below some specified level for nighttime operation. Most video security applications use fixed, permanently installed video equipment. These systems are installed for months or years and left in place until they are superseded by new equipment or are no longer required. There are many cases, however, where there is a requirement for rapid deployment of video equipment to be used for a short period of time (days, weeks, or sometimes months) and then removed to be used again in another application. Chapter 21 describes some of these transportable rapid-deployment video systems.

1.4.4 Safety Applications
In public, government, industrial, and other facilities, a safety, security, and personnel protection plan must guard personnel from harm caused by accident, human error, sabotage, or terrorism. Security forces are expected to monitor the conditions and activities at all locations in the facility through the use of CCTV cameras. In a hospital room or hallway the video cameras may serve a dual function: monitoring patients while also determining the status and location of employees, visitors, and others. A guard can watch entrance and exit doors, hallways, operating rooms, drug dispensaries, and other vital areas. Safety personnel can use video for evacuation and to determine whether all personnel have left the area and are safe. Security personnel can use video for remote traffic monitoring and control and to ascertain high-traffic locations and how best to control them. Video plays a critical role in public safety, as a tool for monitoring vehicular traffic on highways and city streets, in truck and bus depots, and at public rail and subway facilities, airports, and power plants, to name just a few.

1.4.5 Video Access Control
As security requirements become more complex and demanding, video access control and electronic access control equipment should work synergistically with each other.
For medium- to low-level access control security requirements, electronic card-reading systems are adequate after a person has first been identified at some exterior perimeter location. For higher security, personal biometric descriptors (iris scanning, fingerprint, etc.) and/or video identification are necessary. Video surveillance is often used with electronic or video access control equipment. Video access control uses video to identify a person requesting access at a remote location, on foot or in a vehicle. On a video monitor a guard can compare the live image with the photo ID carried by the person and then either allow or deny entry. For the highest level of access control security, the guard uses a system to compare the live image of the person with an image retrieved from a video image database or one stored in a smart card. The two images are displayed side by side on a split-screen monitor along with other pertinent information. The video access control system can be combined with an electronic access control system to increase security and provide a means to track all attempted entries. There are several biometric video access control systems that can positively identify a person enrolled in the system using iris, facial, or retina identification.

1.5 THE BOTTOM LINE
The synergy of a CCTV security system implies the following functional scenario:
• An intrusion alarm sensor or VMD detects an unauthorized intrusion or entry, or an attempt to remove equipment from an area.
• A video camera located somewhere in the alarm area is already viewing the location or can be pointed manually or automatically (from the guard site) to view the alarm area.
• The information from the alarm sensor and/or camera is transmitted immediately to the security console, monitored by personnel, and/or recorded for permanent documentation.
• The security operator receiving the alarm information has a plan to dispatch personnel to the location or to take some other appropriate action.
• After dispatching a security person to the alarm area, the guard continues to view the response, give additional instructions, and monitor any further events.
• After a reasonable amount of time the person dispatched should neutralize the intrusion or other event. The security guard monitors the situation to bring it to a successful conclusion and then resumes monitoring the facility.

Video plays a crucial role in the overall security system plan. During an intrusion, disaster, or theft, the video system provides information to the guard, who must identify the perpetrator, assess the problem, and respond appropriately. An installation containing suitable and sufficient alarm sensors and video cameras permits the guard to follow the progress of the event and assist the response team in countering the attack. Using video and the VMD capability to track an intruder is highly effective. With an intrusion alarm and visual video information, all the elements are in place for a timely, reliable transfer of information to the security officer. For maximum effectiveness, all parts of the security system must work together synergistically. If an intrusion alarm fails, the command post may not see the intruder with sufficient advance notice. If the video fails, the guard cannot identify the perpetrator or evaluate the extent of the security breach, even though he may know that an intrusion has occurred. The security officer must be alert, and proper audio and visual cues must be provided to alert the guard when an alarm has occurred. If inadequate alarm annunciation is provided and the guard misses or misinterprets the alarm and video input, the data from either or both are not acted upon and the system fails.
In an emergency such as a terrorist attack, fire, flood, malfunctioning machinery, or burst utility pipeline, the operation of video, safety sensors, and human response at the console are all required. Video is an inexpensive investment for preventing accidents and minimizing damage when an accident occurs. Since reaction time to a terrorist attack, fire, or other disaster is critical, having cameras at the critical locations before personnel arrive is very important. Closed-circuit television cameras act as real-time eyes at the emergency location, permitting security and safety personnel to send the appropriate reaction force with adequate equipment to provide an optimum response. In the case of a fire, while a sprinkler may activate or a fire sensor may produce an alarm, a CCTV camera can quickly show whether the event is a false alarm, a minor alarm, or a major event. The automatic sprinkler and fire alarm system might alert the guard to the event, but the video "eyes" viewing the actual scene prior to the emergency team's dispatch often save lives and reduce asset losses. In the case of a security violation, if a sensor detects an intrusion the guard monitoring the video cameras can determine whether the intrusion requires the dispatch of personnel or some other response. In the event of a major, well-planned attack on a facility by a terrorist organization or other intruders, a diversionary tactic such as a false alarm can quickly be discovered through the use of video, thereby preventing an inappropriate response. To justify expenditures on security and safety equipment an organization must expect a positive return on investment. The value of the assets protected must be greater than the amount spent on security, and the security system must adequately protect personnel and visitors. An effective security system reduces theft, saves money, and saves lives.
Chapter 2 Video Technology Overview

CONTENTS
2.1 Overview
2.2 The Video System
2.2.1 The Role of Light and Reflection
2.2.2 The Lens Function
2.2.3 The Camera Function
2.2.4 The Transmission Function
2.2.5 The Monitor Function
2.2.6 The Recording Function
2.3 Scene Illumination
2.3.1 Natural Light
2.3.2 Artificial Light
2.4 Scene Characteristics
2.4.1 Target Size
2.4.2 Reflectivity
2.4.3 Effects of Motion
2.4.4 Scene Temperature
2.5 Lenses
2.5.1 Fixed-Focal-Length Lens
2.5.2 Zoom Lens
2.5.3 Vari-Focal Lens
2.5.4 Panoramic—360° Lens
2.5.5 Covert Pinhole Lens
2.5.6 Special Lenses
2.6 Cameras
2.6.1 The Scanning Process
2.6.1.1 Raster Scanning
2.6.1.2 Digital and Progressive Scan
2.6.2 Solid-State Cameras
2.6.2.1 Analog
2.6.2.2 Digital
2.6.2.3 Internet
2.6.3 Low-Light-Level Intensified Camera
2.6.4 Thermal Imaging Camera
2.6.5 Panoramic 360° Camera
2.7 Transmission
2.7.1 Hard-Wired
2.7.1.1 Coaxial Cable
2.7.1.2 Unshielded Twisted Pair
2.7.1.3 LAN, WAN, Intranet and Internet
2.7.2 Wireless
2.7.3 Fiber Optics
2.8 Switchers
2.8.1 Standard
2.8.2 Microprocessor-Controlled
2.9 Quads and Multiplexers
2.10 Monitors
2.10.1 Monochrome
2.10.2 Color
2.10.3 CRT, LCD, Plasma Displays
2.10.4 Audio/Video
2.11 Recorders
2.11.1 Video Cassette Recorder (VCR)
2.11.2 Digital Video Recorder (DVR)
2.11.3 Optical Disk
2.12 Hard-copy Video Printers
2.13 Ancillary Equipment
2.13.1 Camera Housings
2.13.1.1 Standard-rectangular
2.13.1.2 Dome
2.13.1.3 Specialty
2.13.1.4 Plug and Play
2.13.2 Pan/Tilt Mounts
2.13.3 Video Motion Detector (VMD)
2.13.4 Screen Splitter
2.13.5 Camera Video Annotation
2.13.5.1 Camera ID
2.13.5.2 Time and Date
2.13.6 Image Reversal
2.14 Summary

2.1 OVERVIEW

The second half of the 1990s witnessed a quantum jump in video security technology. This technology manifested itself in a new generation of video components: digital cameras, multiplexers, DVRs, etc.
A second significant activity has been the integration of security systems with computer-based LANs, wide area networks (WANs), wireless networks (WiFi), intranets, and Internet and World Wide Web (WWW) communications systems. Although today's video security system hardware is based on new technology that takes advantage of the great advances in microprocessor computing power, solid-state and magnetic memory, digital processing, and wired and wireless video signal transmission (analog, digital over the Internet, etc.), the basic video system still requires the lens, camera, transmission medium (wired cable, wireless), monitor, recorder, etc. This chapter describes current video security system components and serves as an introduction to their operation.

The primary function of any video security or safety system is to provide remote eyes for the security force located at a central control console or remote site. The video system includes the illumination source, the scene to be viewed, the camera lens, the camera, and the means of transmission to the remote monitoring and recording equipment. Other equipment often necessary to complete the system includes video switchers, multiplexers, VMDs, housings, scene combiners and splitters, and character generators. This chapter describes the technology used to: (1) capture the visual image, (2) convert it to a video signal, (3) transmit the signal to a receiver at a remote location, (4) display the image on a video monitor, and (5) record and print it for a permanent record.

Figure 2-1 shows the simplest video application, requiring only one video camera and monitor. The printer and video recorder are optional. The camera may be used to monitor employees, visitors, or people entering or leaving a building. The camera could be located in the lobby ceiling and pointed at the reception area, the front door, or an internal access door.
The monitor might be located hundreds or thousands of feet away, in another building, city, or country, with the security personnel viewing that same lobby, front door, or reception area. The video camera/monitor system effectively extends the eyes, reaching from the observer location to the observed location. The basic one-camera system shown in Figure 2-1 includes the following hardware components.

• Lens. Light from the illumination source reflects off the scene. The lens collects the light from the scene and forms an image of the scene on the light-sensitive camera sensor.
• Camera. The camera sensor converts the visible scene formed by the lens into an electrical signal suitable for transmission to the remote monitor, recorder, and printer.
• Transmission Link. The transmission medium carries the electrical video signal from the camera to the remote monitor. Hard-wired media choices include: (a) coaxial cable, (b) two-wire unshielded twisted-pair (UTP), (c) fiber-optic cable, (d) LAN, (e) WAN, (f) intranet, and (g) Internet network. Wireless choices include: (a) radio frequency (RF), (b) microwave, or (c) optical infrared (IR). Signals can be analog or digital.
• Monitor. The video monitor or computer display (CRT, LCD, or plasma) converts the electrical video signal back into a visible image on the monitor screen.
• Recorder. The camera scene is permanently recorded by a real-time or TL VCR onto a magnetic tape cassette or by a DVR using a magnetic hard disk drive.
• Hard-copy Printer. The video printer produces a hard-copy paper printout of any live or recorded video image, using thermal, inkjet, laser, or other printing technology.

FIGURE 2-1 Single camera video system

The first four components are required to make a simple video system work. The recorder and/or printer is needed only if a permanent record is required.

Figure 2-2 shows a block diagram of a multi-camera analog video security system using these components plus additional hardware and options to expand the capability of the single-camera system to multiple cameras, monitors, recorders, etc., providing a more complex video security system. Additional ancillary equipment for more complex systems includes: camera switchers, quads, multiplexers, environmental camera housings, camera pan/tilt mechanisms, image combiners and splitters, and scene annotators.

• Camera Switcher, Quad, Multiplexer. When a CCTV security system has multiple cameras, an electronic switcher, quad, or multiplexer is used to select different cameras automatically or manually and display the images on one or more monitors, as individual or multiple scenes. The quad can digitally combine four cameras. The multiplexer can digitally combine 4, 9, 16, and even 32 separate cameras.

FIGURE 2-2 Comprehensive video security system

• Housings. The many varieties of camera/lens housings fall into three categories: indoor, outdoor, and integral camera/housing assemblies. Indoor housings protect the camera and lens from tampering and are usually constructed from lightweight materials. Outdoor housings protect the camera and lens from the environment: precipitation, extremes of heat and cold, dust, dirt, and vandalism.
• Dome Housing. The dome camera housing uses a hemispherical clear or tinted plastic dome enclosing a fixed camera or a camera with pan/tilt and zoom-lens capability.
• Plug-and-Play Camera/Housing Combination. To simplify surveillance camera installations many manufacturers now package the camera, lens, and housing as a complete assembly. These plug-and-play cameras are ready to mount in a wall or ceiling and to connect the power in and the video out.
• Pan/Tilt Mechanism. When a camera must view a large area, a pan/tilt mount is used to rotate it horizontally (panning) and to tilt it, providing large angular coverage.
• Splitter/Combiner/Inserter. An optical or electronic image combiner or splitter is used to display more than one camera scene on a single monitor.
• Annotator. A time and date generator annotates the video scene with chronological information. A camera identifier puts a camera number (or name—FRONT DOOR, etc.) on the monitor screen to identify the scene displayed by the camera.

The digital video surveillance system includes most of the devices in the analog video system. The primary differences are the use of digital electronics and digital processing within the video devices. Digital video components use digital signal processing (DSP), digital video signal compression, and digital transmission, recording, and viewing. Figure 2-3 illustrates these devices, their signal paths, and the overall system block diagram for the digital video system.
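As a sketch of what the quad described above does, the following Python fragment digitally combines four camera scenes into one 2 x 2 split-screen picture. It is illustrative only: the tiny frame sizes, the NumPy representation, and the naive decimation are assumptions, not details from the text (real quads operate on full video frames).

```python
# Sketch of a quad processor: combine four camera frames into one
# 2 x 2 split-screen frame of the same size as a single input frame.
import numpy as np

def quad_combine(frames):
    """Downscale four monochrome frames by 2 in each axis and tile them."""
    assert len(frames) == 4
    # naive 2x decimation: keep every second row and column
    small = [f[::2, ::2] for f in frames]
    top = np.hstack([small[0], small[1]])       # cameras 1 and 2
    bottom = np.hstack([small[2], small[3]])    # cameras 3 and 4
    return np.vstack([top, bottom])

# four dummy 8 x 8 "camera" frames, each filled with a distinct level
cams = [np.full((8, 8), n, dtype=np.uint8) for n in (10, 20, 30, 40)]
quad = quad_combine(cams)
print(quad.shape)   # (8, 8): same size as one input frame
```

A multiplexer generalizes the same idea to 3 x 3 or 4 x 4 tilings (9 or 16 cameras).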
FIGURE 2-3 Networked digital video system block diagram (compressed digital video: MJPEG, MPEG-2, MPEG-4; over LAN, WAN, intranet, Internet, or WiFi; networked video recorder with RAID storage and security authentication)

2.2 THE VIDEO SYSTEM

Figure 2-4 shows the essentials of the CCTV camera environment: illumination source, camera, lens, and the camera–lens combined field of view (FOV), that is, the scene the camera–lens combination sees.

2.2.1 The Role of Light and Reflection

A scene or target area to be viewed is illuminated by natural or artificial light sources. Natural sources include the sun, the moon (reflected sunlight), and starlight. Artificial sources include incandescent, sodium, metal-arc, mercury, fluorescent, infrared, and other man-made lights. Chapter 3 describes all of these light sources in detail.

The camera lens receives the light reflected from the scene. Depending on the scene to be viewed, the amount of light reflected from objects in the scene can vary from 5 or 10% to 80 or 90% of the light incident on the scene. Typical values of reflected light for normal scenes such as foliage, automobiles, personnel, and streets fall in the range of about 25–65%. Snow-covered scenes may reach 90%. The amount of light received by the lens is a function of the brightness of the light source, the reflectivity of the scene, and the transmission characteristics of the intervening atmosphere. In outdoor applications there is usually a considerable optical path from the source to the scene and back to the camera; therefore the transmission through the atmosphere must be considered. When atmospheric conditions are clear, there is generally little or no attenuation of the reflected light from the scene. However, when there is precipitation (rain, snow, or sleet), intervening fog, or a dusty, smoky, or sand-blown environment, this attenuation may be substantial and must be considered.
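The three factors just named (source brightness, scene reflectivity, and atmospheric transmission) multiply together to first order. A minimal Python sketch, with illustrative values that are assumptions rather than figures from the text:

```python
# First-order estimate of the reflected scene light available to the
# lens. Assumed example values: 1,000 fc full daylight, 40% scene
# reflectivity, 90% clear-air transmission (10% in heavy fog).
def light_at_lens(source_fc, reflectivity, transmission):
    """Reflected scene illumination reaching the lens, in foot-candles."""
    return source_fc * reflectivity * transmission

print(round(light_at_lens(1000, 0.40, 0.90), 1))   # clear day: 360.0 fc
print(round(light_at_lens(1000, 0.40, 0.10), 1))   # heavy fog: 40.0 fc
```

The fog case shows why a scene that is easily viewable in clear air can fall below a camera's sensitivity when transmission collapses.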
Likewise, in hot climates thermal effects (heat waves) and humidity can cause severe attenuation and/or distortion of the scene. Complete attenuation of the reflected light from the scene (zero visibility) can occur, in which case no scene image is formed. Since most solid-state cameras operate in the visible and near-infrared wavelength region, the general rule of thumb with respect to visibility is that if the human eye cannot see the scene, neither can the camera. Under this condition no amount of increased lighting will help; however, if the visible light can be filtered out of the scene and only the IR portion used, scene visibility might be increased somewhat. This problem can often be overcome by using a thermal infrared (IR) imaging camera that works outside the visible wavelength range. These thermal IR cameras produce a monochrome display with reduced image quality and are much more expensive than charge-coupled device (CCD) or complementary metal oxide semiconductor (CMOS) cameras (see Section 2.6.4).

FIGURE 2-4 Video camera, scene, and source illumination

FIGURE 2-5 Video scene and sensor geometry (sensor: h = 4 units wide, v = 3 units high, d = diagonal; scene: H = horizontal width, V = vertical height, D = distance from scene to lens)

Figure 2-5 illustrates the relationship between the viewed scene and the scene image on the camera sensor. The lens located on the camera forms an image of the scene and focuses it onto the sensor. Almost all video systems used in security have a 4-by-3 aspect ratio (4 units wide by 3 units high) for both the image sensor and the field of view.
The horizontal width is designated h at the sensor and H at the scene, and the vertical height v and V, respectively. Some cameras have the 16-units-wide by 9-units-high high-definition television (HDTV) format.

2.2.2 The Lens Function

The camera lens is analogous to the lens of the human eye (Figure 2-6) and collects reflected radiation from the scene much like the lens of your eye or a film camera. The function of the lens is to collect reflected light from the scene and focus it as an image onto the CCTV camera sensor. A fraction of the light reaching the scene from the natural or artificial illumination source is reflected toward the camera and intercepted and collected by the camera lens. As a general rule, the larger the lens diameter, the more light is gathered, the brighter the image on the sensor, and the better the final image on the monitor. This is why larger-aperture (diameter) lenses, having a higher optical throughput, are better (and more expensive) than smaller-diameter lenses that collect less light. Under good lighting conditions (bright indoor lighting, or outdoors under sunlight) large-aperture lenses are not required, and there is sufficient light to form a bright image on the sensor using small-diameter lenses.

Most video applications use a fixed-focal-length (FFL) lens. The FFL lens, like the human eye lens, covers a constant angular field of view (FOV) and images a scene with constant, fixed magnification. A large variety of CCTV camera lenses are available with different focal lengths (FLs) that provide different FOVs. Wide-angle, medium-angle, and narrow-angle (telephoto) lenses produce different magnifications and FOVs. Zoom and vari-focal lenses can be adjusted to have variable FLs and FOVs. Most CCTV lenses have an iris diaphragm (as does the human eye) to adjust the open area of the lens and change the amount of light passing through it to the sensor. Depending on the application, manual or automatic-iris lenses are used.
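The sensor/scene geometry of Figure 2-5 follows the similar-triangles lens relation H/h = V/v = D/FL. A short sketch: the formula is standard thin-lens optics rather than a quotation from this excerpt, and the 4.8 mm x 3.6 mm sensor dimensions (a common 1/3-inch format) are an assumed example.

```python
# Scene coverage of a fixed-focal-length lens at a given distance,
# from H = h * D / FL and V = v * D / FL (similar triangles).
def scene_coverage(sensor_w_mm, sensor_h_mm, focal_mm, distance_ft):
    """Return (H, V): scene width and height covered, in the distance units."""
    H = sensor_w_mm * distance_ft / focal_mm
    V = sensor_h_mm * distance_ft / focal_mm
    return H, V

# assumed example: 1/3" sensor, 8 mm lens, scene 100 ft from the camera
H, V = scene_coverage(4.8, 3.6, 8.0, 100)
print(H, V)   # 60.0 45.0 -> a 60 ft x 45 ft field of view, the 4:3 ratio
```

Doubling the focal length halves both H and V (narrower FOV, higher magnification), which is the trade-off between the wide-angle and telephoto lenses described above.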
In an automatic-iris CCTV lens, as in the human eye, the iris closes automatically when the illumination is too high and opens automatically when it is too low, thereby maintaining the optimum illumination on the sensor at all times.

FIGURE 2-6 Comparing the human eye to the video camera lens (eye lens focal length = 17 mm (0.67"), eye magnification = 1)

Figure 2-7 shows representative samples of CCTV lenses, including FFL, vari-focal, zoom, pinhole, and a large catadioptric lens for long-range outdoor use (which combines both mirror and glass optical elements). Chapter 4 describes CCTV lens characteristics in detail.

2.2.3 The Camera Function

The lens focuses the scene onto the camera image sensor, which acts like the retina of the eye or the film in a photographic camera. The video camera sensor and electronics convert the visible image into an equivalent electrical signal suitable for transmission to a remote monitor. Figure 2-8 is a block diagram of a typical analog CCTV camera. The camera converts the optical image produced by the lens into a time-varying electrical signal that changes (modulates) in accordance with the light-intensity distribution in the scene. Other camera electronic circuits produce synchronizing pulses so that the time-varying video signal can later be displayed on a monitor or recorder, or printed out as hard copy on a video printer. While cameras may differ in size and shape depending on specific type and capability, the scanning process used by most cameras is essentially the same. Almost all cameras must scan the scene, point by point, as a function of time. (An exception is the image intensifier.) Solid-state CCD or CMOS color and monochrome cameras are used in most applications. In scenes with low illumination, sensitive CCD cameras with infrared (IR) illuminators are used.
In scenes with very low illumination, and where no active illumination is permitted (i.e. covert applications), low-light-level (LLL) intensified CCD (ICCD) cameras are used. These cameras are complex and expensive (Chapter 19). Figure 2-9 shows block diagrams of (a) the analog camera with digital signal processing (DSP) and (b) the all-digital Internet protocol (IP) video camera.

In the early 1990s the non-broadcast, tube-type color cameras available for security applications lacked long-term stability, sensitivity, and high resolution. Color cameras did not find much use in security applications until solid-state color CCTV cameras became available through the development of solid-state color sensor technology and the widespread use of consumer color CCD cameras in camcorders. Color cameras have now become standard in security systems, and most CCTV security cameras in use today are color. Figure 2-10 shows representative CCTV cameras, including monochrome and color solid-state CCD and CMOS cameras, a small single-board camera, and a miniature remote-head camera. Chapters 5, 14, 15 and 19 describe standard and LLL security CCTV cameras in detail.

FIGURE 2-7 Representative video lenses: (A) motorized zoom, (B) catadioptric long FFL, (C) flexible fiber optic, (D) wide-FOV FFL, (E) rigid fiber optic, (F) narrow-FOV (telephoto) FFL, (G) mini-lens, (H) straight and right-angle pinhole lenses

2.2.4 The Transmission Function

Once the camera has generated an electrical video signal representing the scene image, the signal is transmitted to a remote security monitoring site via some transmission means: coaxial cable, two-wire twisted-pair, LAN, WAN, intranet, Internet, fiber optics, or wireless techniques. The choice of transmission medium depends on factors such as distance, environment, and facility layout. If the distance between the camera and the monitor is short (10–500 feet), coaxial cable, UTP, fiber optics, or wireless is used.
For longer distances (500 to several thousand feet), or where there are electrical disturbances, fiber-optic cable and UTP are preferred. For very long distances, in harsh environments (frequent lightning storms), or between separated buildings where no electrical grounding between buildings is in place, fiber optics is the choice. In applications where the camera and monitor are separated by roadways, or where there is no right-of-way, wireless systems using RF, microwave, or optical transmission are used. For transmission over many miles or from city to city the only choice is the digital or Internet IP camera using compression techniques and transmitting over the Internet and WWW. Images from these Internet systems are not real-time but sometimes come close to real-time. Chapters 6 and 7 describe all of these video transmission media.

FIGURE 2-8 Analog CCTV camera block diagram

FIGURE 2-9 Analog camera with DSP and all-digital camera block diagram

FIGURE 2-10 Representative video cameras: (A) intensified CCD camera (ICCD), (B) 1/3" format CS-mount color camera, (C) 1/2" format CS-mount monochrome camera, (D) miniature camera, (E) remote-head camera, (F) thermal

2.2.5 The Monitor Function

At the monitoring site a cathode ray tube (CRT), LCD, or plasma monitor converts the video signal back into a visual image on the monitor face via electronic circuitry similar but inverse to that in the camera. The final scene is produced by a scanning electron beam in the CRT in the video monitor. This beam activates the phosphor on the cathode-ray tube, producing a representation of the original image on the faceplate of the monitor. Alternatively, the video image is displayed point by point on an LCD or plasma screen. Chapter 8 describes monitor and display technology and hardware. A permanent record of the monitor video image is made using a VCR tape or DVR hard-disk magnetic recorder, and a permanent hard copy is printed with a video printer.
2.2.6 The Recording Function

For decades the VCR has been used to record monochrome and color video images. The real-time and TL VCR magnetic tape systems have been a reliable and efficient means of recording security scenes. Beginning in the mid-1990s the DVR was developed, using a computer hard disk drive and digital electronics to provide video image recording. The availability of large-capacity hard disks made these machines suitable for long-duration security recording. Significant advantages of the DVR over the VCR are the high reliability of the disk as compared with the cassette tape, its ability to perform high-speed searches (retrieval of images) anywhere on the disk, and the absence of image deterioration after many copies are made.

2.3 SCENE ILLUMINATION

A scene is illuminated by either natural or artificial illumination. Monochrome cameras can operate with any type of light source. Color cameras need light that contains all the colors in the visible spectrum, with a reasonable balance among them, to produce a satisfactory color image.

2.3.1 Natural Light

During daytime the amount of illumination and the spectral distribution (color) of light reaching a scene depend on the time of day and atmospheric conditions. The color spectrum of the light reaching the scene is important if color CCTV is being used. Direct sunlight produces the highest-contrast scene, allowing maximum identification of objects. On a cloudy or overcast day, less light is received by the objects in the scene, resulting in less contrast. To produce an optimum camera picture under the wide variation in light levels (daytime to nighttime), an automatic-iris camera system is required. Table 2-1 shows the light levels for outdoor illumination from bright sun, through partial clouds and overcast day, down to overcast night. Scene illumination is measured in foot candles (Fc) and can vary over a range of 10,000 to 1 (or more). This exceeds the dynamic operating range of most camera sensors for producing a good-quality video image.
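The foot-candle/lux relationship underlying these light levels can be checked in a few lines of Python. One foot-candle is one lumen per square foot; converting square feet to square meters gives 1 Fc = 10.764 lux, whose reciprocal is the 1 lux = .093 Fc factor quoted with the table (the table's own entries round the factor to 10.75).

```python
# Converting between foot-candles (lumens/ft^2) and lux (lumens/m^2).
# (1 m / 1 ft)^2 = 3.2808^2 = 10.7639, so 1 fc = 10.7639 lux.
FC_TO_LUX = 10.7639

def fc_to_lux(fc):
    return fc * FC_TO_LUX

def lux_to_fc(lux):
    return lux / FC_TO_LUX

print(round(fc_to_lux(1000), 0))   # full daylight: about 10,764 lux
print(round(lux_to_fc(1.0), 3))    # 1 lux is about 0.093 fc
```

The same two functions reproduce every row of the daylight/nighttime level table to within rounding.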
After the sun has gone below the horizon, and if the moon is overhead, reflected sunlight from the moon illuminates the scene and may be detected by a sensitive monochrome camera.

Table 2-1 Light Levels under Daytime and Nighttime Conditions

  CONDITION         (FtCd)     (lux)       COMMENTS
  DIRECT SUNLIGHT   10,000     107,500     daylight range
  FULL DAYLIGHT      1,000      10,750
  OVERCAST DAY         100       1,075
  VERY DARK DAY         10         107.5
  TWILIGHT               1          10.75
  DEEP TWILIGHT           .1         1.075  low light level range
  FULL MOON               .01         .1075
  QUARTER MOON            .001        .01075
  STARLIGHT               .0001       .001075
  OVERCAST NIGHT          .00001      .0001075
  NOTE: 1 lux = .093 FtCd

Detection of information in a scene under this condition requires a very sensitive camera, since very little light is reflected from the scene into the camera lens. In the extreme, when the moon is not overhead or is obscured by cloud cover, the only light received is ambient light from: (1) local man-made lighting sources, (2) night-glow caused by distant ground lighting reflecting off particulates (pollution), clouds, and aerosols in the lower atmosphere, and (3) direct starlight. This is the most severe lighting condition and requires either: (1) an ICCD camera, (2) a monochrome camera with IR LED illumination, or (3) a thermal IR camera. Table 2-2 summarizes the light levels occurring under daylight and these LLL conditions and the operating ranges of typical cameras. The equivalent metric measure of light level (lux) is given alongside the foot candle (Fc). One Fc is equivalent to approximately 10.75 lux (1 lux is approximately .093 Fc).

2.3.2 Artificial Light

Artificial illumination is often used to augment outdoor lighting to obtain adequate video surveillance at night. The light sources used are: tungsten, tungsten-halogen, metal-arc, mercury, sodium, xenon, IR lamps, and light-emitting diode (LED) IR arrays. Figure 2-11 illustrates several examples of these lamps. The type of lighting chosen depends on architectural requirements and the specific application.
Often a particular lighting design is used for safety reasons, so that personnel at the scene can see better, as well as for improving the video picture. Tungsten and tungsten-halogen lamps have by far the most balanced color and are best for color cameras. The most efficient visual outdoor light types are the low- and high-pressure sodium-vapor lamps, to which the human eye is most sensitive. These lamps, however, do not produce all colors (missing blue and green) and therefore are not good light sources for color cameras. Metal-arc lamps have excellent color rendition. Mercury-arc lamps provide good security illumination but are missing the color red and therefore are not as good as metal-arc lamps at producing excellent-quality color video images. Long-arc xenon lamps, having excellent color rendition, are often used in outdoor sports arenas and large parking areas.

Light-emitting diode IR illumination arrays, either mounted in monochrome video cameras or located near the camera, are used to illuminate scenes when sufficient lighting is not available. Since they emit energy only in the IR spectrum, they can be used only with monochrome cameras. They are used at short ranges (10–25 feet) with wide-angle lenses (50–75° FOV) or at medium-to-long ranges (25–200 feet) with medium- to narrow-FOV lenses (5–20°).

Artificial indoor illumination is similar to outdoor illumination, with fluorescent lighting used extensively in addition to high-pressure sodium, metal-arc, and mercury lamps. Since indoor lighting has a relatively constant light level, automatic-iris lenses are often unnecessary. However, if the CCTV camera views a scene near an outside window or a door where additional light comes in during the day, or if the indoor lighting changes between daytime and nighttime operation, then an automatic-iris lens or electronically shuttered camera is required. The illumination level from most indoor lighting is significantly lower (by 100 to 1000 times) than that of sunlight.
Chapter 3 describes outdoor natural and artificial lighting and indoor man-made lighting systems available for video surveillance use.

Table 2-2  Camera Capability under Natural Lighting Conditions

  ILLUMINATION CONDITION   (FtCd)      (lux)
  OVERCAST NIGHT           .00001      .0001075
  STARLIGHT                .0001       .001075
  QUARTER MOON             .001        .01075
  FULL MOON                .01         .1075
  DEEP TWILIGHT            .1          1.075
  TWILIGHT                 1           10.75
  VERY DARK DAY            10          107.5
  OVERCAST DAY             100         1,075
  FULL DAYLIGHT            1,000       10,750
  DIRECT SUNLIGHT          10,000      107,500
  (The table also charts the operating ranges of typical camera types against these levels: vidicon, CCD, CMOS, ICCD, and ISIT; the vidicon and ISIT are shown for reference only.)

FIGURE 2-11  Representative artificial light sources: (A) tungsten halogen, (B) fluorescent (straight and U), (C) high-pressure sodium, (D) tungsten PAR (spot and flood), (E) xenon long arc, (F) high-intensity discharge metal-arc. (PAR = parabolic aluminized reflector)

2.4 SCENE CHARACTERISTICS

The quality of the video image depends on various scene characteristics that include: (1) the scene lighting level, (2) the sharpness and contrast of objects relative to the scene background, (3) whether objects are in a simple, uncluttered background or in a complicated scene, and (4) whether objects are stationary or in motion. These scene factors will determine whether the system will be able to detect, determine orientation, recognize, or identify objects and personnel. As will be seen later, the scene illumination—via sunlight, moonlight, or artificial sources—and the actual scene contrast play important roles in the type of lens and camera necessary to produce a quality image on the monitor.

2.4.1 Target Size

In addition to the scene’s illumination level and the object’s contrast with respect to the scene background, the object’s apparent size—that is, its angular FOV as seen by the camera—influences a person’s ability to detect it. (Try to find a football referee with a striped shirt in a field of zebras.)
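Table 2-2 can also be read programmatically. A minimal sketch, assuming a hypothetical camera whose data sheet quotes a 0.1 lux minimum scene illumination (the condition values are the lux column of the table):

```python
# Which Table 2-2 lighting conditions a camera can image, given its
# minimum scene illumination spec.  The 0.1 lux spec below is an
# assumed example value, not a figure from the text.
CONDITIONS_LUX = {
    "overcast night": 0.0001075,
    "starlight":      0.001075,
    "quarter moon":   0.01075,
    "full moon":      0.1075,
    "deep twilight":  1.075,
    "twilight":       10.75,
    "very dark day":  107.5,
    "overcast day":   1_075,
    "full daylight":  10_750,
    "direct sunlight": 107_500,
}

def usable_conditions(min_scene_lux):
    """Conditions bright enough for a camera with the given spec."""
    return [c for c, lux in CONDITIONS_LUX.items() if lux >= min_scene_lux]

print(usable_conditions(0.1))  # hypothetical 0.1 lux camera
```

Such a camera bottoms out around full moonlight; reaching the starlight rows is what the ICCD and thermal cameras of Sections 2.6.3 and 2.6.4 are for.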
The requirements of a video system are a function of the application. These include: (1) detection of the object or movement in the scene; (2) determination of the object’s orientation; (3) recognition of the type of object in the scene, that is, adult or child, car or truck; or (4) identification of the object (Who is the person? Exactly what kind of truck is it?). Making these distinctions depends on the system’s resolution, contrast, and signal-to-noise ratio (S/N). In a typical scene the average observer can detect a target about one-tenth of a degree in angle. This can be related to a standard video picture that has 525 horizontal lines (NTSC) and about 350 TV lines of vertical and 500 TV lines of horizontal resolution. Figure 2-12 and Table 2-3 summarize the number of lines required to detect, orient, recognize, or identify an object in a television picture. The number of TV lines required will increase for conditions of poor lighting, highly complex backgrounds, reduced contrast, or fast movement of the camera or target.

FIGURE 2-12  Object size vs. intelligence obtained: detection 1 TV line, orientation 2 TV lines, recognition 5 TV lines, identification 7 TV lines. (1 TV line, a bright and a dark line, = 1 line pair)

Table 2-3  TV Lines vs. Intelligence Obtained

  INTELLIGENCE     MINIMUM TV LINES*
  DETECTION        1 ± 0.25
  ORIENTATION      1.4 ± 0.35
  RECOGNITION      4 ± 0.8
  IDENTIFICATION   6.4 ± 1.5
  * One TV line corresponds to a light and a dark line (one TV line pair).

2.4.2 Reflectivity

The reflectivity of different materials varies greatly depending on their composition and surface texture. Table 2-4 gives some examples of materials and objects viewed by video cameras and their respective reflectivities. Since the camera responds to the amount of light reflected from the scene, it is important to recognize that objects have a large range of reflectivities. The objects with the highest reflectivities produce the brightest images.
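The Table 2-3 criteria can be applied numerically: count how many TV line pairs fall across the target, then look up the highest level of intelligence that count supports. A sketch, assuming a 350-TVL camera and example target/scene sizes (reading the table's "TV lines" as line pairs, per its footnote):

```python
# Detection/orientation/recognition/identification thresholds from
# Table 2-3, in TV line pairs across the target (ascending order).
CRITERIA = {"detection": 1.0, "orientation": 1.4,
            "recognition": 4.0, "identification": 6.4}

def line_pairs_on_target(vertical_resolution_tvl, target_height, scene_height):
    """Line pairs across the target: resolution is in individual TV
    lines, so halve it, then scale by the target's share of the scene."""
    return (vertical_resolution_tvl / 2) * (target_height / scene_height)

def best_intelligence(line_pairs):
    achieved = [name for name, needed in CRITERIA.items() if line_pairs >= needed]
    return achieved[-1] if achieved else "none"

# Assumed example: 350-TVL camera, 6 ft person in a 30 ft-high FOV.
n = line_pairs_on_target(350, 6, 30)
print(round(n, 1), best_intelligence(n))
```

With 35 line pairs on the person, identification is comfortably supported; shrink the person to a small fraction of the field of view and the system falls back to recognition or mere detection.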
To detect one object located within the area of another, the objects must differ in reflectivity, color, or texture. Therefore, if a red box is in front of a green wall and both have the same reflectivity and texture, the box will not be seen on a monochrome video system. In this case, the total reflectivity in the visible spectrum is the same for the green wall and the red box. This is where the color camera shows its advantage over the monochrome camera. The case of a color scene is more complex. While the reflectivity of the red box and the green wall may be the same as averaged over the entire visible spectrum from blue to red, the color camera can distinguish between green and red. It is easier to identify a scene characteristic by a difference in color in a color scene than it is to identify it by a difference in gray scale (intensity) in a monochrome scene. For this reason the target size required to make an identification in a color scene is generally smaller than that required to make the same identification in a monochrome scene.

2.4.3 Effects of Motion

A moving object in a video image is easier to detect, but more difficult to recognize, than a stationary one, provided that the camera can respond to it. Low-light-level tube cameras produce sharp images for stationary scenes but smeared images for moving targets. This is caused by a phenomenon called “lag” or “smear.” Solid-state sensors (CCD, CMOS, and ICCD) do not exhibit smear or lag at normal light levels and can therefore produce sharp images of both stationary and moving scenes. Some image intensifiers exhibit smear when the scene moves fast or when there is a bright light in the FOV of the lens. When the target in the scene moves very fast, the inherent camera scan rate (30 frames per second) causes a blurred image of this moving target in the camera. This is analogous to the blurred image in a still photograph when the shutter speed is too slow for the action.
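The blur just described is simply the distance the target moves during one exposure. A minimal sketch, assuming an example walking speed and comparing the full 1/30-second frame time with a fast shutter (the 1/1000 s value is an assumed example, not a figure from the text):

```python
# Motion blur = target speed x exposure time.
def blur_distance_ft(speed_ft_per_s, exposure_s):
    """How far the target moves during one exposure."""
    return speed_ft_per_s * exposure_s

walking = 4.0  # ft/s, assumed walking speed
print(round(blur_distance_ft(walking, 1 / 30), 3))    # full frame time
print(round(blur_distance_ft(walking, 1 / 1000), 3))  # fast-shuttered CCD
```

At 1/30 s the walker smears over roughly an inch and a half of scene per frame, while the fast shutter cuts that by a factor of about 33, which is why shuttered CCD snapshots of moving targets stay sharp.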
There is no cure for this as long as the standard NTSC (National Television System Committee) television scan rate (30 frames per second) is used. However, CCTV snapshots can be taken without any blurring using fast-shuttered CCD cameras. For special applications in which fast-moving targets must be imaged and tracked, higher-scan-rate cameras are available.

Table 2-4  Reflectivity of Common Materials

  MATERIAL                       REFLECTIVITY (%)*
  SNOW                           85–95
  ASPHALT                        5
  PLASTER (WHITE)                90
  SAND                           40–60
  TREES                          20
  GRASS                          40
  CLOTHES                        15–30
  CONCRETE (NEW)                 40
  CONCRETE (OLD)                 25
  CLEAR WINDOWS                  70
  HUMAN FACE                     15–25
  WOOD                           10–20
  PAINTED WALL (WHITE)           75–90
  RED BRICK                      25–35
  PARKING LOT AND AUTOMOBILES    40
  ALUMINUM BUILDING (DIFFUSE)    65–70
  * Visible spectrum: 400–700 nanometers

2.4.4 Scene Temperature

Scene temperature has no effect on the video image in a CCD, CMOS, or ICCD sensor. These sensors do not respond to temperature changes or temperature differences in the scene. On the other hand, IR thermal imaging cameras do respond to temperature differences and changes in temperature in the scene. Thermal imagers do not respond to visible light or the very near-IR radiation like that produced by IR LEDs. The sensitivity of an IR thermal imager is defined as the smallest change in scene temperature that the thermal camera can detect.

2.5 LENSES

A lens collects reflected light from the scene and focuses it onto the camera image sensor. This is analogous to the lens of the human eye focusing a scene onto the retina at the back of the eye (Figure 2-6). As in the human eye, the camera lens inverts the scene image on the image sensor, but the eye and the camera electronics compensate (invert the image) to perceive an upright scene. The retina of the human eye differs from any CCTV lens in that it focuses a sharp image only in the central 10% of its total 160° FOV. All vision outside the central focused scene is out of focus. This central imaging part of the human eye can be characterized as a medium-FL lens: 16–25 mm. In principle, Figure 2-6 represents the function of any lens in a video system.

Many different lens types are used for video surveillance and safety applications. They range from the simplest FFL manual-iris lenses to the more complex variable-focal-length (vari-focal) and zoom lenses, with an automatic iris being an option for all types. In addition, pinhole lenses are available for covert applications, split-image lenses for viewing multiple scenes on one camera, right-angle lenses for viewing a scene perpendicular to the camera axis, and rigid or flexible fiber-optic lenses for viewing through thick walls, under doors, etc.

2.5.1 Fixed-Focal-Length Lens

Figure 2-13 illustrates three fixed-focal-length (FFL) or fixed-FOV lenses with narrow (telephoto), medium, and wide FOVs and the corresponding FOV obtained when used with a 1/3-inch camera sensor format. Wide-FOV (short-FL) lenses permit viewing a very large scene (wide angle) with low magnification and therefore provide low resolution and low identification capabilities. Narrow-FOV (telephoto) lenses have high magnification, with high resolution and high identification capabilities.

2.5.2 Zoom Lens

The zoom lens is more versatile and complex than the FFL lens. Its FL is variable from wide-angle to narrow-angle (telephoto) FOV (Figure 2-14). The overall camera/lens FOV depends on the lens FL and the camera sensor size, as shown in Figure 2-14. Zoom lenses consist of multiple lens groups that are moved within the lens barrel by means of an external zooming ring (manual or motorized), thereby changing the lens FL and angular FOV without the need to switch lenses or refocus. Zoom focal-length ratios range from 6:1 up to 50:1. Zoom lenses are usually large and are used on pan/tilt mounts viewing over large areas and distances (25–500 feet).

2.5.3 Vari-Focal Lens

The vari-focal lens is a variable-focal-length lens used in applications where an FFL lens would otherwise be used. In general, vari-focal lenses are smaller and cost much less than zoom lenses. Like the zoom lens, the vari-focal lens is used because its focal length (angular FOV) can be changed manually, or automatically using a motor, by rotating the barrel on the lens. This feature makes it convenient to adjust the FOV to a precise angle when the lens is installed on the camera. Typical vari-focal lenses have focal lengths of 3–8 mm, 5–12 mm, and 8–50 mm. With just these three lenses, focal lengths from 3 to 50 mm (91° to 5° horizontal FOV) can be covered on a 1/3-inch format sensor. Unlike zoom lenses, vari-focal lenses must be refocused each time the FL and the FOV are changed. They are not suitable for zoom or pan/tilt applications.

2.5.4 Panoramic—360° Lens

There has always been a need to see “all around,” i.e. an entire room or other location, seeing 360° with one panoramic camera and lens. In the past, 360° FOV camera viewing has only been achieved by using multiple cameras and lenses and combining the scenes on a split-screen monitor. Panoramic lenses have been available for many years but have only recently been combined with digital electronics and sophisticated mathematical transformations to take advantage of their capabilities. Figure 2-15 shows two lenses having a 360° horizontal FOV and a 90° vertical FOV.
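The focal-length/FOV relationship underlying the FFL, zoom, and vari-focal discussions above can be sketched with the thin-lens estimate FOV = 2·atan(w / 2·FL), where w is the sensor width. The sensor widths below are nominal values for the common formats, and real lens designs deviate slightly from this estimate (compare the Figure 2-14 values):

```python
import math

# Nominal active-area widths (mm) for common sensor formats (assumed
# nominal values; actual sensors vary slightly by manufacturer).
SENSOR_WIDTH_MM = {'1/4"': 3.2, '1/3"': 4.8, '1/2"': 6.4, '2/3"': 8.8}

def horizontal_fov_deg(focal_length_mm, fmt):
    """Thin-lens horizontal FOV estimate for a given sensor format."""
    w = SENSOR_WIDTH_MM[fmt]
    return math.degrees(2 * math.atan(w / (2 * focal_length_mm)))

# The 10.5-105 mm zoom lens of Figure 2-14 at both ends of its travel:
for fmt in SENSOR_WIDTH_MM:
    wide = horizontal_fov_deg(10.5, fmt)
    narrow = horizontal_fov_deg(105, fmt)
    print(f"{fmt}: {wide:.1f} deg wide, {narrow:.1f} deg narrow")
```

For the 1/3-inch format this gives roughly 25.8° at the wide end and 2.6° at the telephoto end, close to the 24.8°/2.6° figures tabulated for that lens.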
The panoramic lens collects light from the 360° panoramic scene and focuses it onto the camera sensor as a donut-shaped image. The electronics and mathematical algorithm convert this donut-shaped panoramic image into the rectangular (horizontal and vertical) format for normal monitor viewing (Section 2.6.5).

FIGURE 2-13  Representative FFL lenses and their fields of view (FOV), 1/2-inch sensor format: telephoto (narrow angle) 2.5° to 15°, FL = 150 to 25 mm; normal 15° to 45°, FL = 25 to 8 mm; wide angle 45° to 85°, FL = 8 to 2.1 mm

FIGURE 2-14  Zoom video lens horizontal field of view (FOV) for a 10.5–105 mm FL zoom lens:

  SENSOR FORMAT   HORIZONTAL FOV (DEGREES)
                  WIDE (10.5 mm)   NARROW (105 mm)
  1/4"            18.6             2.0
  1/3"            24.8             2.6
  1/2"            33.0             3.5
  2/3"            45.5             4.8

FIGURE 2-15  Panoramic 360° lens (two examples)

2.5.5 Covert Pinhole Lens

This special security lens is used when the lens and CCTV camera must be hidden. The front lens element or aperture is small (from 1/16 to 5/16 of an inch in diameter). While this is not the size of a pinhead, it nevertheless has been labeled a pinhole lens. Figure 2-16 shows examples of straight and right-angle pinhole lenses used with C or CS mount cameras. The very small mini-pinhole lenses are used on low-cost, small board cameras.

FIGURE 2-16  Pinhole and mini-pinhole lenses

2.5.6 Special Lenses

Some special lenses useful in security applications include split-image, right-angle, relay, and fiber-optic lenses (Figure 2-17).

FIGURE 2-17  Special video lenses: (A) dual split-image lens, (B) tri split-image lens, (C) right-angle lens, (D) rigid fiber optics, (E) relay lens, (F) flexible fiber optics

The dual-split and tri-split lenses use only one camera to produce multiple scenes.
These are useful for viewing the same scene with different magnifications, or different scenes with the same or different magnifications. Using only one camera reduces cost and increases reliability. These lenses are useful when two or three views are required but only one camera is installed. The right-angle lens permits an installed camera to view a scene perpendicular to the camera’s optical axis. There are no restrictions on the focal lengths, so right-angle lenses can be used in wide- or narrow-angle applications. The flexible and rigid coherent fiber-optic lenses are used to mount a camera several inches to several feet away from the front lens, as might be required to view from the opposite side of a wall or in a hazardous environment. The function of the fiber-optic bundle is to transfer the focused visual image from one location to another. This may be useful for: (1) protecting the camera, and (2) locating the lens in one environment (outdoors) and the camera in another (indoors).

2.6 CAMERAS

The camera lens focuses the visual scene image onto the camera sensor area point by point, and the camera electronics transform the visible image into an electrical signal. The camera video signal (containing all picture information) is made up of frequencies from 30 cycles per second, or 30 hertz (Hz), to 4.2 million cycles per second, or 4.2 megahertz (MHz). The video signal is transmitted via a cable (or wirelessly) to the monitor display.

Almost all security cameras in use today are color or monochrome CCD types, with CMOS types rapidly emerging. These cameras are available as low-cost single printed circuit board (PCB) cameras with small lenses already built in, with or without a housing, used for covert and overt surveillance applications. More expensive cameras in a housing are larger and more rugged and have a C or CS mechanical mount for accepting any type of lens.
These cameras have higher resolution and light sensitivity, and other electrical input/output features suitable for multiple-camera CCTV systems. Adding LED IR illumination arrays extends the use of CCD and CMOS cameras to nighttime applications. For LLL applications, the ICCD and IR cameras provide the highest sensitivity and detection capability. Significant advancements in camera technology have been made in the last few years, particularly the use of digital signal processing (DSP) in the camera and the development of the IP camera.

All security cameras manufactured between the 1950s and 1980s were of the vacuum-tube type: vidicon, silicon, or LLL types using the silicon intensified target (SIT) and intensified SIT (ISIT). In the 1980s the CCD and CMOS solid-state video image sensors were developed, and they remain the mainstay in the security industry. Increased consumer demand for video recorders using CCD sensors in camcorders and the CMOS sensor in digital still-frame cameras caused a technology explosion and made these small, high-resolution, high-sensitivity, monochrome and color solid-state cameras available for security systems.

The security industry now has at its disposal both analog and digital surveillance cameras. Up until the mid-1990s analog cameras dominated, with only rare use of DSP electronics, and the digital Internet camera was only being introduced to the security market. Advances in solid-state circuitry, the demand from the consumer market, and the availability of the Internet were responsible for the rapid adoption of digital cameras for security applications.

2.6.1 The Scanning Process

Two methods used in the camera and monitor video scanning process are raster scanning and progressive scanning. In the past, analog video systems all used the raster scanning technique; newer digital systems, however, use progressive scanning. All cameras use some form of scanning to generate the video picture.
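Before looking at the scanning details, it is worth seeing why digital transmission changed the picture so much: the 30 Hz to 4.2 MHz analog signal mentioned above becomes, when digitized pixel by pixel, a very large raw bit stream. A rough sketch (the 640 × 480 size, 24-bit color depth, and 30 fps are assumed example values, not a fixed security-camera standard):

```python
# Raw (uncompressed) bit rate of a digitized video stream.
def raw_bit_rate(width, height, bits_per_pixel, fps):
    """Bits per second for an uncompressed pixel stream."""
    return width * height * bits_per_pixel * fps

bps = raw_bit_rate(640, 480, 24, 30)  # assumed example format
print(f"{bps / 1e6:.0f} Mb/s uncompressed")
```

Over 200 Mb/s of raw data for a single modest camera is why the compressed-video techniques discussed later in this chapter (Section 2.7.1.3) are essential for network transmission.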
A block diagram of the CCTV camera and a brief description of the analog raster scanning process and video signal are shown in Figures 2-8, 2-9, 2-18, and 2-19. The camera sensor converts the optical image from the lens into an electrical signal. The camera electronics process the video signal and generate a composite video signal containing the picture information (luminance and color) and horizontal and vertical synchronizing pulses. Signals are transmitted in what is called a frame of picture video, made up of two fields of information. Each field is transmitted in 1/60 of a second and the entire frame in 1/30 of a second, for a repetition rate of 30 frames per second (fps). In the United States, this format is the Electronic Industries Association (EIA) standard called the NTSC (National Television System Committee) system. The European standard uses 625 horizontal lines, with a field taking 1/50 of a second and a frame 1/25 of a second, for a repetition rate of 25 fps.

FIGURE 2-18  Analog video scanning process and video display signal. (The figure uses a typewriter analogy: field 1 scans the odd lines 1, 3, 5, … and field 2 the even lines 2, 4, 6, …; each 1/60-sec field contains 262½ scan lines, and the two fields together form one 1/30-sec, 525-line frame. The video waveform shows white level, black level, picture information, and sync pulses at −0.4 V for each scan line.)

2.6.1.1 Raster Scanning

In the NTSC system the first picture field is created by scanning 262½ horizontal lines. The second field of the frame contains the second 262½ lines, which are synchronized so that they fall between the gaps of the first field lines, thus producing one completely interlaced picture frame containing 525 lines. The scan lines of the second field fall exactly halfway between the lines of the first field, resulting in a 2-to-1 interlace system.
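The field and line timing implied by this interlace scheme can be checked numerically (using the nominal 30 fps rate quoted in the text; color NTSC actually runs at 30/1.001 ≈ 29.97 fps):

```python
# NTSC scan timing: 525 lines per frame, two interlaced fields per frame.
LINES_PER_FRAME = 525
FPS = 30  # nominal frame rate from the text

fields_per_second = FPS * 2            # two fields per frame
lines_per_field = LINES_PER_FRAME / 2  # 262.5 lines, hence the half line
line_rate_hz = LINES_PER_FRAME * FPS   # horizontal scan rate
line_time_us = 1e6 / line_rate_hz      # duration of one scan line

print(fields_per_second, lines_per_field, line_rate_hz, round(line_time_us, 1))
```

The half-line per field is exactly what makes the second field's lines fall between those of the first, and the roughly 63.5 µs line time is the scan-line duration shown in Figure 2-18.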
As shown in Figure 2-18, the first field starts at the upper-left corner (of the camera sensor or the CRT monitor) and progresses down the sensor (or screen), line by line, until it ends at the bottom center of the scan. Likewise the second field starts at the top center of the screen and ends at the lower-right corner. Each time one line in the field traverses from the left side of the scan to the right, it corresponds to one horizontal line, as shown in the video waveform at the bottom of Figure 2-18. The video waveform consists of negative synchronization pulses and positive picture information. The horizontal and vertical synchronization pulses are used by the video monitor (and VCR, DVR, or video printer) to synchronize the video picture and paint an exact replica in time and intensity of the camera scanning function onto the monitor face. Black picture information is indicated on the waveform at the bottom (approximately 0 volts) and the white picture information at the top (1 volt). The amplitude of a standard NTSC signal is 1.4 volts peak to peak. In the 525-line system the picture information consists of approximately 485 active lines. The lines with no picture information are necessary for vertical blanking, which is the time when the camera electronics or the beam in the monitor CRT moves from the bottom to the top to start a new field.

Random-interlace cameras do not provide complete synchronization between the first and the second fields. The horizontal and the vertical scan frequencies are not locked together, and therefore the fields do not interlace exactly. This condition, however, results in an acceptable picture, and the asynchronous condition is difficult to detect. The 2-to-1 interlace system has an advantage when multiple cameras are used with multiple monitors and/or recorders, in that it prevents jump or jitter when switching from one camera to the next. The scanning process for solid-state cameras is different.
The solid-state sensor consists of an array of very small picture elements (pixels) that are read out serially (sequentially) by the camera electronics to produce the same NTSC format (525 TV lines in 1/30 of a second, 30 fps), as shown in Figure 2-19. The use of digital cameras and digital monitors has changed the way the camera and monitor signals are processed, transmitted, and displayed. The final presentation on the monitor looks similar to the analog method, but instead of seeing 525 horizontal lines (NTSC system), individual pixels are seen in a row and column format. In the digital system the camera scene is divided into rows and columns of individual pixels (small points in the scene), each representing the light intensity and color for one point in the scene. The digitized scene signal is transmitted to the digital display, be it LCD, plasma, or other, and reproduced on the monitor screen pixel by pixel, providing a faithful representation of the original scene.

2.6.1.2 Digital and Progressive Scan

Digital scanning is accomplished either in the 2-to-1 interlace mode, as in the analog system, or in a progressive mode. In the progressive mode each line is scanned in linear sequence: line 1, then line 2, line 3, etc. Solid-state camera sensors and monitor displays can be manufactured with a variety of horizontal and vertical pixel formats. The standard 4:3 aspect ratio (as in the analog system), the wide-screen 16:9, and others are used. Likewise, many different combinations of the number of pixels in the sensor and display are available. Some standard formats for color CCD cameras are 512 h × 492 v (330 TV-line resolution) and 768 h × 494 v (480 TV-line resolution); a common color LCD monitor format is 1280 h × 1024 v.

2.6.2 Solid-State Cameras

Video security cameras have gone through rapid technological change from the last half of the 1980s to the present. For decades the vidicon tube camera was the only security camera available.
In the 1980s the more sensitive and rugged silicon-diode tube camera was the best available. In the late 1980s the invention and development of the digital CCD and later the CMOS cameras replaced the tube camera. This technology coincided with rapid advancement in DSP in cameras, the IP camera, and the use of digital transmission of the video signal over local and wide area networks and the Internet. The two generic solid-state cameras accounting for most security applications are the CCD and the CMOS.

FIGURE 2-19  Digital and progressive scanning process and video display signal. (The progressive-scan camera and display scan lines 1, 2, 3, … 525 in sequence. Typical standard formats: 512 h × 492 v = 330 TV lines; 768 h × 494 v = 480 TV lines; VGA 640 h × 480 v; XGA 1024 h × 768 v; SXGA 1280 h × 1024 v.)

The first generation of solid-state cameras available from most manufacturers had 2/3-inch (sensor diagonal) and 1/2-inch sensor formats. As the technology improved, smaller formats evolved. Most solid-state cameras in use today are available in three image sensor formats: 1/2-, 1/3-, and 1/4-inch. The 1/2-inch format produces higher resolution and sensitivity at a higher cost. The 1/2-inch and smaller formats permitted the use of smaller, less expensive lenses as compared with the larger formats. Many manufacturers now produce 1/3-inch and 1/4-inch format cameras with excellent resolution and light sensitivity. Solid-state sensor cameras are superior to their predecessors because of their: (1) precise, repeatable pixel geometry, (2) low power requirements, (3) small size, (4) excellent color rendition and stability, and (5) ruggedness and long life expectancy. At present, solid-state cameras have settled into three main categories: (1) analog, (2) digital, and (3) Internet.

2.6.2.1 Analog

Analog cameras have been with the industry since CCTV was first used in security.
Their electronics are straightforward, and the technology is still used in many applications.

2.6.2.2 Digital

Since the second half of the 1990s there has been increased use of DSP in cameras. It significantly improves the performance of the camera by: (1) automatically adjusting to large light-level changes (eliminating the automatic iris), (2) integrating the VMD into the camera, and (3) automatically switching the camera from color operation to higher-sensitivity monochrome operation, as well as other features and enhancements.

2.6.2.3 Internet

The most recent camera technology advancement is manifest in the IP camera. This camera is configured with electronics that connect to the Internet (WWW) through an Internet service provider (ISP). Each camera is provided with a registered Internet address and can transmit the video image anywhere on the network. This is really remote video monitoring at its best! The camera site is viewed from anywhere by entering the camera Internet address (ID number) and proper password. Password security is used so that only authorized users can enter the website and view the camera image. Two-way communication is used so that the user can control camera parameters and direct the camera operation (pan, tilt, zoom, etc.) from the monitoring site.

2.6.3 Low-Light-Level Intensified Camera

When a security application requires viewing during nighttime conditions where the available light is moonlight, starlight, or other residual reflected light, and the surveillance must be covert (no active illumination like IR LEDs), LLL intensified CCD cameras are used. The ICCD cameras have sensitivities between 100 and 1000 times higher than the best solid-state cameras. The increased sensitivity is obtained through the use of a light amplifier mounted between the lens and the CCD sensor. LLL cameras cost between 10 and 20 times more than CCD cameras. Chapter 19 describes the characteristics of these cameras.
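The 100–1000× sensitivity gain quoted for the ICCD translates directly into a lower minimum scene illumination. A small sketch, assuming a hypothetical baseline CCD spec of 0.1 lux (an example value, not a figure from the text):

```python
# ICCD sensitivity sketch: the intensifier's gain divides the camera's
# minimum usable scene illumination.  The 0.1 lux CCD spec is assumed.
def iccd_min_illumination(ccd_min_lux, gain):
    """Minimum scene illumination after adding an intensifier stage."""
    return ccd_min_lux / gain

ccd_min = 0.1  # lux, assumed example CCD spec
print(iccd_min_illumination(ccd_min, 100))   # low end of the 100-1000x range
print(iccd_min_illumination(ccd_min, 1000))  # high end of the range
```

The resulting 0.001 to 0.0001 lux reaches the starlight and overcast-night rows of Table 2-2, which is exactly the covert nighttime regime the ICCD is intended for.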
2.6.4 Thermal Imaging Camera

An alternative to the ICCD camera is the thermal IR camera. Visual cameras see only visible light energy from the blue end of the visible spectrum to the red end (approximately 400–700 nanometers). Some monochrome cameras see beyond the visible region into the near-IR region of the spectrum, up to 1000 nanometers (nm). This IR energy, however, is not thermal IR energy. Thermal IR cameras using thermal sensors respond to thermal energy in the 3–5 micrometer (µm) and 8–14 µm ranges. The IR sensors respond to the changes in heat (thermal) energy emitted by the targets in the scene. Thermal imaging cameras can operate in complete darkness. They require no visible or IR illumination whatever. They are truly passive nighttime monochrome imaging sensors. They can detect humans and any other warm objects (animals, vehicle engines, ships, aircraft, warm/hot spots in buildings) against a scene background.

2.6.5 Panoramic 360° Camera

Powerful mathematical techniques combined with the unique 360° panoramic lens (see Section 2.5.4) have made possible a 360° panoramic camera. In operation the lens collects and focuses the 360° horizontal by up to 90° vertical scene (one-half of a sphere, a hemisphere) onto the camera sensor. The image takes the form of a “donut” on the sensor (Figure 2-20). The camera/lens is located at the origin (0). The scene is represented by the surface of the hemisphere. As shown, a small part (slice) of the scene area (A,B,C,D) is “mapped” onto the sensor as a,b,c,d. In this way the full scene is mapped onto the sensor. Direct presentation of the donut-ring video image onto the monitor does not result in a useful picture to work with. That is where a powerful mathematical algorithm comes in. Digital processing in the computer using the algorithm transforms the donut-shaped image into the normal format seen on a monitor, i.e. horizontal and vertical.

FIGURE 2-20  Panoramic 360° camera. (The raw donut image from the camera sensor covers 360° horizontal by 90° vertical; the lens sees a full hemisphere, 360° × 180°. The image transformation converts the donut to rectangular form. Four typical display formats are shown: rectangular half split (two 180° panels), rectangular 4-way split, mixed rectangular and donut, and the raw donut sensor image.)

All of the 0° to 360° horizontal by 90° vertical image cannot be presented on a monitor in a useful way; there is just too much picture “squeezed” into the small screen area. This condition is solved by computer software that looks at only a section of the entire scene at any particular time. The main attributes of the panoramic system are: (1) it captures a full 360° FOV, (2) it can digitally pan/tilt to anywhere in the scene and digitally zoom into any scene area, (3) it has no moving parts (no motors, etc. that can wear out), and (4) multiple operators can view any part of the scene in real time or at a later time. The panoramic camera requires a high-resolution camera since so much scene information is contained in the image. Camera technology has progressed so that these digital cameras are available and can present a good image of a zoomed-in portion of the panoramic scene.

2.7 TRANSMISSION

By definition, the camera must be remotely located from the monitor, and therefore the video signal must be transmitted by some means from one location to another. In security applications, the distance between the camera and the monitor may be from tens of feet to many miles, or perhaps completely around the globe. The transmission path may be inside buildings, outside buildings, above ground, under ground, through the atmosphere, or in almost any environment imaginable. For this reason the transmission means must be carefully assessed and an optimum choice of hardware made to satisfactorily transmit the video signal from the camera to the monitoring site.

There are many ways to transmit the video signal from the camera to the monitoring site. Figure 2-21 shows some examples of transmission cables. The signal can be analog or digital. The signal can be transmitted via electrical conductors using coaxial cable or UTP, by fiber optics, or over a LAN, WAN, intranet, or the Internet. Particular attention should be paid to the transmission means when transmitting color video signals, since the color signal is significantly more complex and susceptible to distortion than monochrome. Chapters 6 and 7 describe and analyze the characteristics, advantages, and disadvantages of all of the transmission means and the hardware available to transmit the video signal.

FIGURE 2-21  Hard-wired copper and fiber-optic transmission means: coaxial cable (copper center conductor, copper sheath, protective outer jacket), insulated two-wire unshielded twisted pair (UTP) with transmitter (TX) and receiver (REC), and fiber-optic cable (glass fiber, dielectric insulator, strengthening member, protective outer jacket)

2.7.1 Hard-Wired

There are several hard-wired means for transmitting a video signal, including coaxial cable, UTP, LAN, WAN, intranet, Internet, and fiber-optic cable. Fiber-optic cable is used for long distances and when there is interfering electrical noise. Local area networks and Internet connections are digital transmission techniques used in larger security systems and where the signal must be transmitted over existing computer networks or over long distances.

2.7.1.1 Coaxial Cable

The most common video signal transmission method is the coaxial cable. This cable has been used since the inception of CCTV and continues to be used to this day. The cable is inexpensive, easy to terminate at the camera and monitor ends, and transmits a faithful video signal with little or no distortion or loss.
This cable has a 75 ohm electrical impedance, which matches the impedance of the camera and monitor, ensuring a distortion-free video image. The coaxial cable has a copper electrical shield and center conductor and works well over distances up to 1000 feet.

2.7.1.2 Unshielded Twisted Pair

In the 1990s unshielded twisted pair (UTP) video transmission came into vogue. The technique uses a transmitter at the camera and a receiver at the monitor with two twisted copper wires connecting them. Several reasons for its increased popularity are: (1) it can be used over longer distances than coaxial cable, (2) it uses inexpensive wire, (3) many locations already have two-wire twisted pair installed, (4) the transmitter and receiver are low cost, and (5) it has higher electrical noise immunity than coaxial cable. The UTP system, using a sophisticated electronic transmitter and receiver, can transmit the video signal up to 2000-3000 feet.

2.7.1.3 LAN, WAN, Intranet and Internet

The evolution of the LAN, WAN, intranet and Internet revolutionized the transmission of video signals in a new form, digital, which significantly expanded the scope and effectiveness of video for security systems. The widespread use of business computers and the consequent use of these networks provided an existing digital network protocol and communications infrastructure suitable for video transmission. The Internet and WWW attained widespread use in the late 1990s and truly revolutionized digital video transmission. This global computer network provides the digital backbone path to transmit digital video, audio, and command signals from anywhere on the globe. The video signal transmission techniques described so far provide a means for real-time transmission of a video signal, requiring a full 4.2 MHz bandwidth to reproduce real-time motion. When these techniques cannot be used for real-time video, alternative digital techniques are used. In these systems, a non-real-time video transmission takes place, so that some scene action is lost.
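The trade-off between link capacity and delivered frame rate in these non-real-time systems can be illustrated with some simple arithmetic. The link speed and compressed frame size below are hypothetical values chosen for the example, not figures from the text.

```python
def achievable_frame_rate(link_bps, compressed_frame_bytes):
    """Estimate how many frames per second a digital link can carry.

    Illustrative only: real systems add protocol overhead, and the
    compressed frame size varies with scene activity.
    """
    frame_bits = compressed_frame_bytes * 8
    return link_bps / frame_bits

# A hypothetical 384 kbps link carrying 8 kB compressed frames
# delivers about 6 fps: closer to slow-scan than to real-time video.
fps = achievable_frame_rate(384_000, 8_000)
```

This is why, as the text notes, some scene action is lost: frames that cannot fit through the link in time are simply never transmitted.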
Depending on the activity in the scene and the resolution required, video images are transmitted at rates from near real-time (15 fps) down to slow-scan (a few frames per second). The digitized and compressed video signal is transmitted over a LAN or Internet network and decompressed and reconstructed at the receiver/monitoring site.

2.7.2 Wireless

In legacy analog video surveillance systems, it is often more economical or beneficial to transmit the real-time video signal without cable (wireless) from the camera to the monitor using a radio frequency (RF) or IR atmospheric link. In digital video systems using digital transmission, the use of wireless networks (WiFi) permits routing the video and control signals to any remote location. In both the analog and the digital systems some form of video scrambling or encryption is often used to remove the possibility of eavesdropping by unauthorized personnel outside the system. Three important applications for wireless transmission are: (1) covert and portable rapid-deployment video installations, (2) building-to-building transmission over a roadway, and (3) parking-lot light poles to a building. The Federal Communications Commission (FCC) restricts some wireless transmitting devices using microwave frequencies or RF to government and law enforcement use, but has given approval for many RF and microwave transmitters for general security use. These FCC-approved devices operate above the normal television frequency bands at approximately 920 MHz, 2.4 GHz, and 5.8 GHz. The atmospheric IR link is used when a high-security link is required. This link does not require FCC approval and transmits a video image over a narrow beam of visible light or near-IR energy. The beam is very difficult to intercept (tap). Figure 2-22 illustrates some of the wireless transmission techniques available today.
2.7.3 Fiber Optics

Fiber-optic transmission technology has advanced significantly in the last 5-10 years and represents a highly reliable, secure means of transmission. Fiber-optic transmission holds several significant advantages over other hard-wired systems: (1) very long transmission paths, up to many miles, without any significant degradation of the monochrome or color video signal, (2) immunity to external electrical disturbances from weather or electrical equipment, (3) very wide bandwidth, permitting one or more video, control, and audio signals to be multiplexed on a single fiber, and (4) resistance to tapping (eavesdropping), making it a very secure transmission means. While the installation and termination of fiber-optic cable requires a more skilled technician, it is well within the capability of qualified security installers. Many hard-wired installations requiring the optimum color and resolution rendition use fiber-optic cable.

FIGURE 2-22 RF, microwave and IR video transmission links. Each link carries the video signal from the camera to the monitor through the atmosphere; the RF link can use a dipole antenna or, for directional longer-range transmission, a Yagi antenna.

2.8 SWITCHERS

The video switcher accepts video signals from many different video cameras and connects them to one or more monitors or recorders. Using manual or automatic activation or an alarm signal input, the switcher selects one or more of the cameras and directs its video signal to a specified monitor, recorder, or some other device or location.

2.8.1 Standard

There are four basic switcher types: manual, sequential, homing, and alarming. Figure 2-23 shows how these are connected into the video security system. The manual switcher connects one camera at a time to the monitor, recorder, or printer. The sequential switcher automatically switches the cameras in sequence to the output device. The operator can override the automatic sequence with the homing sequential switcher. The alarming switcher connects the alarmed camera to the output device automatically, when an alarm is received.

FIGURE 2-23 Basic video switcher types. Each type accepts up to 32 camera inputs (C) and routes them to a monitor (M) and a DVR/VCR (digital video recorder/video cassette recorder); the alarming switcher also accepts alarm inputs.

2.8.2 Microprocessor-Controlled

When the security system requires many cameras in various locations with multiple monitors and other alarm input functions, a microprocessor-controlled switcher and keyboard is used to manage these additional requirements (Figure 2-24). In large security systems the switcher is microprocessor controlled and can switch hundreds of cameras to dozens of monitors, recorders, or video printers via an RS-232 or other communication control link. Numerous manufacturers make comprehensive keyboard-operated, computer-controlled consoles that integrate the functions of the switcher, pan/tilt pointing, automatic scanning, automatic preset pointing for pan/tilt systems, and many other functions. The power of the software-programmable console resides in its flexibility, expandability, and ability to accommodate a large variety of applications and changes in facility design. In place of a dedicated hardware system built for each specific application, this computer-controlled system can be configured via software for the application. Chapter 11 describes types of switchers and their functions and applications.

2.9 QUADS AND MULTIPLEXERS

A quad or a multiplexer is used when multiple camera scenes need to be displayed on one video monitor.
It is interposed between the cameras and the monitor; it accepts multiple camera inputs, memorizes the scenes from each camera, compresses them, and then displays multiple scenes on a single video monitor. Equipment is available to provide 2, 4, 9, 16, and up to 32 separate video scenes on one single monitor. Figure 2-25 shows a block diagram of quad and multiplexer systems. The most popular presentation is the quad screen showing four pictures. This presentation significantly improves camera viewing ability in multi-camera systems, decreases security guard fatigue, and requires three fewer monitors in a four-camera system. There is a loss of resolution when more than one scene is presented on the monitor, with resolution decreasing as the number of scenes increases. One-quarter of the resolution of a full screen is obtained on a quad display (half in horizontal and half in vertical). Quads and multiplexers have front-panel controls so that: (1) a full-screen image of a camera can be selected, (2) multiple cameras can be displayed (quad, 9, etc.), or (3) the full-screen images of all cameras can be sequentially switched, with dwell times for each camera set by the operator. Chapter 12 describes video quads and multiplexers in detail.

2.10 MONITORS

Video monitors can be divided into several categories: (1) monochrome, (2) color, (3) CRT, (4) LCD, (5) plasma, and (6) computer display. Contrary to a popular misconception, larger video monitors do not necessarily have better picture resolution or the ability to increase the amount of intelligence available in the picture. All US NTSC security monitors have 525 horizontal lines, regardless of their size or whether they are monochrome or color; therefore the vertical resolution is about the same regardless of the CRT monitor size. The horizontal resolution is determined by the system bandwidth.
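The resolution arithmetic above can be made concrete with a short sketch. The helper names are mine, and the 80-lines-per-MHz figure is a commonly quoted rule of thumb for NTSC rather than a value from the text.

```python
def split_scene_resolution(full_h, full_v, scenes_per_axis):
    """Per-scene resolution when the screen is split into an
    n x n grid: each axis is divided by n, so a quad (2 x 2)
    leaves half the horizontal and half the vertical resolution,
    i.e. one quarter of the full-screen resolution overall."""
    return full_h // scenes_per_axis, full_v // scenes_per_axis

def horizontal_tv_lines(bandwidth_mhz, lines_per_mhz=80):
    """Rule of thumb (assumed, commonly cited for NTSC): roughly
    80 TV lines of horizontal resolution per MHz of bandwidth."""
    return bandwidth_mhz * lines_per_mhz

# A quad display of a 480 x 480 TV-line picture gives each scene
# 240 x 240 lines; a full 4.2 MHz channel supports roughly
# 80 * 4.2, i.e. about 336 TV lines of horizontal resolution.
```

The same division explains why a 16-way multiplexer display is noticeably softer per scene than a quad: each axis is divided by 4 instead of 2.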
With the NTSC limitation, the best picture quality is obtained by choosing a monitor having resolution equal to or better than the camera or transmission-link bandwidth. With the use of a higher-resolution computer monitor and a corresponding higher-resolution camera and commensurate bandwidth to match, higher-resolution video images are obtained. Chapter 8 gives more detailed characteristics of monochrome and color monitors used in the security industry. Figure 2-26 shows representative examples of video monitors.

FIGURE 2-24 Microprocessor-controlled switcher and keyboard. The matrix switcher accepts up to 2048 camera and alarm inputs and routes them to up to 128 monitors, DVR/VCRs, and up to 32 video printers (VP), controlled from a keyboard via an RS-232/RS-485 communications port.

FIGURE 2-25 Quad and multiplexer block diagrams. The quad monitor displays four compressed pictures, any individual scene full screen, or a sequence through the four scenes; the multiplexer monitor displays 1, 4, 9, or 16 pictures or any individual scene full screen.

FIGURE 2-26 Standard 5- and 9-inch single/multiple CRT, LCD and plasma monitors: (a) triple 5-inch, (b) dual 9-inch, (c) LCD, (d) plasma.

2.10.1 Monochrome

Until the late 1990s the most popular monitor used in CCTV systems was the monochrome CRT monitor. It is still used and is available in sizes ranging from a 1-inch-diagonal viewfinder to a large 27-inch-diagonal CRT. By far the most popular monochrome monitor size is the 9-inch diagonal, which optimizes video viewing for a person seated about 3 feet away.
A second reason for its popularity is that two of these monitors fit into the standard EIA 19-inch-wide rack-mount panel. Figure 2-26b shows two 9-inch monitors in a dual rack-mounted version. A triple rack-mount version of a 5-inch-diagonal monitor is used when space is at a premium. The triple rack-mounted monitor is popular, since three fit conveniently into the 19-inch EIA rack. The optimum viewing distance for the triple 5-inch-diagonal monitor is about 1.5 feet.

2.10.2 Color

Color monitors are now in widespread use, range in size from 3 to 27 inches diagonal, and have required viewing distances and capabilities similar to those of monochrome monitors. Since color monitors require three different-colored dots to produce one pixel of information on the monitor, they have lower horizontal resolution than monochrome monitors. Popular color monitor sizes are 13, 15, and 17 inches diagonal.

2.10.3 CRT, LCD, Plasma Displays

The video security picture is displayed on three basic types of monitor screens: (1) the cathode ray tube (CRT), (2) the liquid crystal display (LCD), and most recently (3) the plasma display (Figure 2-26d). The analog CRT has seen excellent service from the inception of video and continues as a strong contender, providing a low-cost, reliable security monitor. The digital LCD monitor is growing in popularity because of its smaller size (smaller depth): 2-3 inches vs. 12-20 inches for the CRT. The LCD is an all-solid-state display that accepts the VGA computer signal. Most small (3-10 inch diagonal) and many large (10-17 inch diagonal) LCD monitors also accept an analog video input. The most recent monitor entry into the security market is the digital plasma display. This premium display excels in resolution, brightness, and viewing angle, and produces the highest quality image in the industry. It is also the most expensive. Screen sizes range from 20 to 42 inches diagonal. Overall depths are small, ranging from 3 to 4 inches.
They are available in 4:3 and in the HDTV 16:9 format.

2.10.4 Audio/Video

Many monitors have a built-in audio channel with speakers, to produce audio and video simultaneously.

2.11 RECORDERS

The video camera, transmission means, and monitor provide the remote eyes for the security guard, but as soon as the action or event is over the image disappears from the monitor screen forever. When a permanent record of the live video scene is required, a VCR, DVR, network recorder, or optical disk recorder is used (Figure 2-27). The video image can be recorded in real-time, near real-time, or TL. The VCRs record the video signal on a magnetic tape cassette with a maximum real-time recording time of 6 hours and near real-time of 24 hours. When extended periods of recording are required (longer than the 6-hour real-time cassette), a TL recorder is used. In the TL process the video picture is not recorded continuously (real-time); rather, "snapshots" are recorded. These snapshots are spread apart in time by a fraction of a second or even seconds, so that the total elapsed time for the recording can extend for hundreds of hours. Some present TL systems record over an elapsed time of 1280 hours. The DVR records the video image on a computer magnetic HD (hard drive), and the optical disk system stores it on optical disk media. The DVR and optical disk systems have a significant advantage over the VCR with respect to the retrieval time of a particular video frame. The VCR cassette tape is transportable, and the DVR and optical disk systems are available with or without removable disks. This means that the video images (digital data) can be transported to remote locations or stored in a vault for safekeeping. The removable DVR and optical disks are about the same size as Video Home System (VHS) cassettes. Chapter 9 describes analog and digital video recording equipment in detail. The digital DVR technology has all but replaced the analog VCR.
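The time-lapse arithmetic described above is simple to sketch: stretching a tape that holds a fixed number of frames across a longer elapsed time sets the interval between snapshots. The function name and the 30 fps default are mine; the 6-hour and 720-hour figures come from the text.

```python
def snapshot_interval_seconds(realtime_hours, elapsed_hours, fps=30):
    """Interval between recorded snapshots when a tape holding
    `realtime_hours` of continuous video at `fps` is stretched to
    cover `elapsed_hours` of elapsed time (illustrative model)."""
    total_frames = realtime_hours * 3600 * fps
    return elapsed_hours * 3600 / total_frames

# A 6-hour tape stretched over 720 hours of elapsed time records
# one snapshot every 4 seconds (a 120:1 stretch of a 1/30 s frame).
interval = snapshot_interval_seconds(6, 720)
```

The same formula shows why a 1280-hour mode records only one frame roughly every 7 seconds, which is why TL recordings miss brief events between snapshots.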
VCRs take many minutes to fast-forward or fast-rewind the magnetic tape to locate a particular frame on the tape. Retrieval times on DVRs and optical disks are typically a fraction of a second.

FIGURE 2-27 DVR and NVR video disk storage equipment: (a) single-channel DVR, (b) 16-channel DVR, (c) 32-channel NVR.

2.11.1 Video Cassette Recorder (VCR)

Magnetic storage media have been used universally to record the video image. The VCR uses the standard VHS cassette format. The 8 mm Sony format is used in portable surveillance equipment because of its smaller size. Super VHS and Hi-8 formats are used to obtain higher resolution. VCRs can be subdivided into two classes: real-time and TL. The TL recorder has significantly different mechanical and electrical features, permitting it to take snapshots of a scene at predetermined (user-selectable) intervals. It can also record in real-time when activated by an alarm or other input command. Real-time recorders can record up to 6 hours in monochrome or color. Time-lapse VCRs are available for recording time-lapse sequences up to 720 hours.

2.11.2 Digital Video Recorder (DVR)

The DVR has emerged as the new generation of magnetic recorder of choice. A magnetic HD like those used in a microcomputer can store many thousands of images and many hours of video in digital form. The rapid implementation and success of the DVR has resulted from the availability of inexpensive digital magnetic memory storage devices and the advancements made in digital signal compression techniques. Present DVRs are available in single-channel, 4-channel, and 16-channel versions and may be cascaded to provide many more channels. A significant feature of the DVR is the ability to access (retrieve) a particular frame or recorded time period anywhere on the disk in a fraction of a second. The digital technology also allows making many generations (copies) of the stored video images without any errors or degradation of the image.
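How many hours of video a DVR's hard drive holds follows from the compressed frame size and recording rate. All of the numbers below (drive size, frame size, frame rate) are hypothetical values for illustration only; real DVR capacity depends on the compression codec and scene activity.

```python
def recording_hours(disk_gb, frame_kb, fps):
    """Hours of compressed video a disk can hold (illustrative).

    `frame_kb` is an assumed average compressed frame size; real
    DVRs vary it with scene content and compression settings.
    """
    bytes_per_hour = frame_kb * 1024 * fps * 3600
    return disk_gb * 1024**3 / bytes_per_hour

# A hypothetical 250 GB drive storing 15 kB frames at 7.5 fps
# holds roughly 650 hours of video on one channel.
hours = recording_hours(250, 15, 7.5)
```

Halving the frame rate or the frame size doubles the recording time, which is why DVRs expose both as configuration settings.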
2.11.3 Optical Disk

When very large volumes of video images need to be recorded, an optical disk system is used. Optical disks have a much larger video-image database capacity than magnetic disks occupying the same physical space. These disks can record hundreds of times longer than their magnetic counterparts.

2.12 HARD-COPY VIDEO PRINTERS

A hard-copy printout of a video image is often required as evidence in court, as a tool for apprehending a vandal or thief, or as a duplicate record of some document or person. The printout is produced by a hard-copy video printer: a thermal printer that "burns" the video image onto coated paper, or an ink-jet or laser printer. The thermal technique used by many hard-copy printer manufacturers produces excellent-quality images in monochrome or color. Figure 2-28 shows a monochrome thermal printer and a sample of the hard-copy image quality it produces. In operation, the image displayed on the monitor or played back from the recorder is immediately memorized by the printer and printed out in less than 10 seconds. This is particularly useful if an intrusion or unauthorized act has occurred and been observed by a security guard. An automatic alarm or a security guard can initiate printing the image of the alarm area or of the suspect, and the printout can then be given to another guard to take action. For courtroom use, time, date, and any other information can be annotated on the printed image. Chapter 10 describes hard-copy video printer systems in detail.

2.13 ANCILLARY EQUIPMENT

Most video security systems require additional accessories and equipment, including: (1) camera housings, (2) camera pan/tilt mechanisms and mounts, (3) camera identifiers, (4) VMDs, (5) image splitters/inserters, and (6) image combiners. These are described in more detail in Chapters 13, 15, 16, and 17.
The two accessories most often used with the basic camera, monitor, and transmission link described previously are camera housings and pan/tilt mounts. Outdoor housings are used to protect the camera and lens from vandalism and the environment. Indoor housings are used primarily to prevent vandalism and for aesthetic reasons. The motorized pan/tilt mechanisms rotate and point the system camera and lens via commands from a remote control console.

2.13.1 Camera Housings

Indoor and outdoor camera housings protect cameras and lenses from dirt, dust, harmful chemicals, the environment, and vandalism. The most common housings are rectangular metal or plastic products, formed from high-impact indoor or outdoor plastic, painted steel, or stainless steel (Figure 2-29). Other shapes and types include cylindrical (tube), corner-mount, ceiling-mount, and dome housings.

FIGURE 2-28 Thermal monochrome video printer and hard copy: (a) printer, (b) hardcopy.

FIGURE 2-29 Standard indoor/outdoor video housings: (a) corner, (b) elevator corner, (c) ceiling, (d) outdoor environmental rectangular, (e) dome, (f) plug and play.

2.13.1.1 Standard-rectangular

The rectangular-type housing is the most popular. It protects the camera from the environment and provides a window for the lens to view the scene. The housings are available for indoor or outdoor use with a weatherproof and tamper-resistant design. Options include heaters, fans, and window washers.

2.13.1.2 Dome

A significant part of video surveillance is accomplished using cameras housed in the dome housing configuration. The dome camera housing can range from a simple fixed monochrome or color camera in a hemispherical dome to a "speed-dome" housing having a high-resolution color camera with remote-controlled pan/tilt/zoom/focus. Other options include presets and image stabilization. The dome-type housing consists of a plastic hemispherical dome on the bottom half.
The housing can be clear, tinted, or treated with a partially transmitting optical coating that allows the camera to see in any direction. In a freestanding application (e.g. on a pole, pedestal, or overhang), the top half of the housing consists of a protective cover and a means for attaching the dome to the structure. When the dome housing is mounted in a ceiling, a simpler housing cover is provided and mounted above the ceiling level to support the dome.

2.13.1.3 Specialty

There are many other specialty housings for mounting in or on elevators, ceilings, walls, tunnels, pedestals, hallways, etc. These special types include explosion-proof, bullet-proof, and extreme environmental construction for arctic and desert use.

2.13.1.4 Plug and Play

In an effort to reduce installation time for video surveillance cameras, manufacturers have combined the camera, lens, and housing in one assembly ready to be mounted on a ceiling, wall, or pole and plugged into the power source and video transmission cable. These assemblies are available in the form of domes, corner mounts, ceiling mounts, etc., making for easy installation in indoor or outdoor applications. Chapter 15 describes these camera housing assemblies and their specific applications in detail.

2.13.2 Pan/Tilt Mounts

To extend the angle of coverage of a CCTV lens/camera system a motorized pan/tilt mechanism is often used. Figure 2-30 shows three generic pan/tilt types: top-mounted, side-mounted, and dome camera. The pan/tilt motorized mounting platform permits the camera and lens to rotate horizontally (pan) or vertically (tilt) when it receives an electrical command from the central monitoring site. Thus the camera lens is not limited by its inherent FOV and can view a much larger area of a scene. A camera mounted on a pan/tilt platform is usually provided with a zoom lens.
The zoom lens varies the FOV in the pointing direction of the camera/lens on command from the central security console. The combination of the pan/tilt and zoom lens provides the widest angular coverage for video surveillance.

FIGURE 2-30 Video pan/tilt mechanisms: (a) top-mounted, (b) side-mounted, (c) indoor dome.

There is one disadvantage of the pan/tilt/zoom configuration compared with the fixed camera installation. When the camera and lens are pointing in a particular direction via the pan/tilt platform, most of the other scene area the camera is designed to cover is not being viewed. This dead area or dead time is unacceptable in many security applications, and therefore careful consideration should be given to the adequacy of the wide-FOV pan/tilt design. Pan/tilt platforms range from small, indoor, lightweight units that only pan, up to large, outdoor, environmental designs carrying large cameras, zoom lenses, and large housings. Choosing the correct pan/tilt mechanism is important, since it generally requires more service and maintenance than any other part of the video system. Chapter 17 describes several generic pan/tilt designs and their features.

2.13.3 Video Motion Detector (VMD)

Another important component in a video surveillance system is the VMD, which produces an alarm signal based on a change in the video scene. The VMD can be built into the camera, be a separate component inserted between the camera and the monitor, or be implemented as software in a computer. The VMD electronics, either analog or digital, store the video frames, compare subsequent frames to the stored frames, and then determine whether the scene has changed. In operation the VMD digital electronics decides whether the change is significant and whether to call it an alarm to alert the guard or some equipment, or declare it a false alarm. Chapter 13 describes various VMD electronics, their capabilities, and their limitations.
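The store-compare-decide cycle of a digital VMD can be sketched with a simple frame-differencing detector. This is a minimal illustration of the general technique, not any vendor's implementation; the threshold values are placeholders I chose, not figures from the text.

```python
import numpy as np

def motion_alarm(reference, current, pixel_threshold=25, area_fraction=0.01):
    """Minimal frame-differencing VMD sketch.

    Compares the current frame against a stored reference frame and
    declares an alarm when a large enough fraction of pixels changed
    by more than the per-pixel threshold. Both thresholds are
    illustrative placeholders; real VMDs tune them per zone to
    balance sensitivity against false alarms.
    """
    diff = np.abs(current.astype(np.int16) - reference.astype(np.int16))
    changed = diff > pixel_threshold
    return bool(changed.mean() > area_fraction)

# A static scene raises no alarm; a bright intruder-sized region does.
scene = np.zeros((240, 320), dtype=np.uint8)
intruder = scene.copy()
intruder[100:140, 150:200] = 200  # 40 x 50 pixel bright patch
```

Deciding whether a change is "significant", as the text puts it, comes down to these two thresholds: how much a pixel must change, and how much of the frame must change.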
2.13.4 Screen Splitter

The electronic or optical screen splitter takes parts of several camera scenes (two, three, or more), combines the scenes, and displays them on one monitor. The splitters do not compress the image. In an optical splitter the image combining is implemented optically at the camera lens and requires no electronics. The electronic splitter/combiner is located between the camera output and the monitor input. Chapter 16 describes these devices in detail.

2.13.5 Camera Video Annotation

2.13.5.1 Camera ID

When multiple cameras are used in a video system some means must be provided to identify the camera. The system uses a camera identifier component that electronically assigns an alphanumeric code and/or name to each camera displayed on a monitor, recorded on a recorder, or printed on a printer. Alphanumeric and symbol character generators are available to annotate the video signal with the names of cameras, locations in a building, etc.

2.13.5.2 Time and Date

When time and date are required on the video image, a time/date generator is used to annotate the video picture. This information is mandatory for any prosecution or courtroom procedure.

2.13.6 Image Reversal

Occasionally video surveillance systems use a single mirror to view the scene. This mirror reverses the video image from the normal left-to-right to a right-to-left (reversed) image. The image-reversal unit corrects the reversal. Chapter 16 describes this device.

2.14 SUMMARY

Video surveillance serves as the remote eyes for management and the security force. It provides security personnel with advance notice of breaches in security and of hostile and terrorist acts, and is part of the plan to protect personnel and assets. It is a critical subsystem for any comprehensive security plan. In this chapter an introduction to most of the current video technology and equipment has been described.
Lighting plays an important role in determining whether a satisfactory video picture will be obtained with monochrome and color cameras and LLL ICCD cameras. Thermal IR cameras are insensitive to light and only require temperature differences between the target and the background. There are many types of lenses available for video systems: FFL, vari-focal, zoom, pinhole, panoramic, etc. The vari-focal and zoom lenses extend the FOV of the FFL lens. The panoramic 360° lens provides viewing of the entire scene. The proper choice of lens is necessary to maximize the intelligence obtained from the scene. Many types of video cameras are available: color, monochrome (with or without IR illumination), LLL intensified, and thermal IR; analog and digital; simple and full-featured; daytime and nighttime. There are cameras with built-in VMD to alert security guards and improve their ability to detect and locate personnel and be alerted to activity in the scene. An important component of the video system is the analog or digital video signal transmission means from the camera to the remote monitoring and recording site. Hard wire or fiber optics is best if the situation permits. Analog works for short distances and digital for long distances. The Internet works globally. In multiple-camera systems the quad and multiplexers permit multi-camera displays on one monitor. Fewer monitors in the security room can improve guard performance. The CRT monitor is still a good choice for many video applications. The LCD is the solid-state digital replacement for the CRT. The plasma display provides an all-solid-state design that has the highest resolution, brightness, and largest viewing angle, but at the highest cost. Until about the year 2000 the only practical means for recording a permanent image of the scene was the VCR real-time or TL recorder.
Now, new and upgraded systems replace the VCR with the DVR, with its increased reliability and fast search-and-retrieve capabilities, and distribute the recorded video over a LAN, WAN, intranet, or Internet, or wirelessly (WiFi) using one of the 802.11 protocols. Thermal, ink-jet, and laser hard-copy printers produce monochrome and color prints for immediate picture dissemination and permanent records for archiving. All types of camera/lens housings are available for indoor and outdoor applications. Specialty cameras and housings are available for elevators and stairwells, dome housings for public facilities (casinos, shopping malls), extreme outdoor environments, etc. Pan/tilt assemblies for indoor and outdoor scenarios significantly increase the overall FOV of the camera system. Small, compact speed-domes have found widespread use in many indoor and outdoor video surveillance environments. Plug-and-play surveillance cameras permit quick installation and turn-on and are available in almost every housing configuration and camera type. The video components summarized above are used in most video security applications, including: (1) retail stores, (2) manufacturing plants, (3) shopping malls, (4) offices, (5) airports, (6) seaports, (7) bus and rail terminals, and (8) government facilities. There is widespread use of small video cameras and accessories for temporary covert applications. The small size and ease of deployment of many video components and the flexibility in transmission means over short and long distances have made rapid-deployment equipment for portable personnel-protection systems practical and important. Chapters 21 and 22 describe video surveillance systems designed for some of these applications. It is clear that the direction the video security industry is taking is the integration of the video security function with digital computing technology and the other parts of the security system: access control, intrusion alarms, fire, and two-way communications.
Video security is rapidly moving from the legacy analog technology to the digital automatic video surveillance (AVS) technology.

PART II

Chapter 3 Natural and Artificial Lighting

CONTENTS
3.1 Overview
3.2 Video Lighting Characteristics
3.2.1 Scene Illumination
3.2.1.1 Daytime/Nighttime
3.2.1.2 Indoor/Outdoor
3.2.2 Light Output
3.2.3 Spectral Output
3.2.4 Beam Angle
3.3 Natural Light
3.3.1 Sunlight
3.3.2 Moonlight and Starlight
3.4 Artificial Light
3.4.1 Tungsten Lamps
3.4.2 Tungsten-Halogen Lamps
3.4.3 High-Intensity-Discharge Lamps
3.4.4 Low-Pressure Arc Lamps
3.4.5 Compact Short-Arc Lamps
3.4.6 Infrared Lighting
3.4.6.1 Filtered Lamp Infrared Source
3.4.6.2 Infrared-Emitting Diodes
3.4.6.3 Thermal (Heat) IR Source
3.5 Lighting Design Considerations
3.5.1 Lighting Costs
3.5.1.1 Operating Costs
3.5.1.2 Lamp Life
3.5.2 Security Lighting Levels
3.5.3 High-Security Lighting
3.6 Summary

3.1 OVERVIEW

Scene lighting affects the performance of any monochrome or color video security system. Whether the application is indoor or outdoor, daytime or nighttime, the amount of available light and its color (wavelength) energy spectrum must be considered, evaluated, and compared with the sensitivity of the cameras to be used. In bright-sunlight daytime applications some cameras require the use of an automatic-iris lens or electronic shutter. In nighttime applications the light level and characteristics of available and artificial light sources must be analyzed and matched to the camera's spectral and illumination sensitivities to ensure a good video picture. In applications where additional lighting can be installed, the available types of lamps (tungsten, tungsten-halogen, metal-arc, sodium, mercury, and others) must be compared to optimize video performance. In applications where no additional lighting is permissible, the existing illumination level, color spectrum, and beam angle must be evaluated and matched to the video camera/lens combination.
An axiom in video security applications is: the more light the better the picture. The quality of the monitor picture is affected by how much light is available and how well the sensor responds to the colors in the light source. This is particularly true when color cameras are used, since they need more light, and the correct colors of light, than monochrome cameras do. The energy from light radiation is composed of a spectrum of colors, including "invisible light" produced by long-wavelength IR and short-wavelength ultraviolet (UV) energy. Most monochrome CCTV cameras respond to visible and near-IR energy, but color cameras are made to respond to visible light only. Although many consider lighting to be only a decorator's or an architect's responsibility, the type and intensity of lighting are of paramount importance in any video security system, and therefore the security professional must be knowledgeable about them. This chapter analyzes the available natural and artificial light sources and provides information to help in choosing an optimum light source or in determining whether existing light levels are adequate.

3.2 VIDEO LIGHTING CHARACTERISTICS

The illumination present in the scene determines the amount of light ultimately reaching the CCTV camera lens. It is therefore an important factor in the quality of the video image. The illumination can be from natural sources such as the sun, moon, starlight or thermal (heat), or from artificial sources such as tungsten, mercury, fluorescent, sodium, metal-arc, LEDs or other lamps. Considerations about the source illuminating a scene include: (1) source spectral characteristics, (2) beam angle over which the source radiates, (3) intensity of the source, (4) variations in that intensity, and (5) location of the CCTV camera relative to the source.
Factors to be considered in the scene include: (1) reflectance of objects in the scene, (2) complexity of the scene, (3) motion in the scene, and (4) degree of contrast in the scene. 3.2.1 Scene Illumination In planning a video system it is necessary to know the kind of illumination, the intensity of light falling on a surface, and how the illumination varies as a function of distance from the light source. The video camera image sensor responds to reflected light from the scene. To obtain a better understanding of scene and camera illumination, consider Figure 3-1, which shows the illumination source, the scene to be viewed, and the CCTV camera and lens. The radiation from the illuminating source reaches the video camera by first reflecting off the objects in the scene. 3.2.1.1 Daytime/Nighttime Before any camera system is chosen the site should be surveyed to determine whether the area under surveillance will receive direct sunlight and whether the camera will be pointed toward the sun (to the south or the west). Whenever possible, cameras should be pointed away from the sun to reduce glare and potential damage to the camera. Also, when the camera views a bright background or bright source, persons or objects near the camera may be hard to identify since not much light illuminates them from the direction of the camera. The light level from different sources varies from a maximum of 10,000 FtCd for natural bright sunlight to a minimum of 1 FtCd (from artificial lamplight at night), giving a ratio of 10,000 to 1. 
During nighttime, dawn or dusk operation, the camera system may see moonlight and/or starlight, and reflected light from artificial illumination. For nighttime operation the most widely used lamps are tungsten, tungsten-halogen, sodium, mercury, and high-intensity-discharge (HID) metal-arc and xenon types.

FIGURE 3-1 CCTV camera, scene, and source illumination. Source parameters: intensity, spectral intensity (color), beam angle. Scene parameters: absolute reflectance, spectral reflectance, complexity of scene (fine detail), motion in scene.

3.2.1.2 Indoor/Outdoor

For indoor applications, the solid-state CCD and CMOS cameras usually have sufficient sensitivity and dynamic range to produce a good image and can operate with manual-iris lenses. When video surveillance cameras view an outdoor scene, the light source is natural or artificial, depending on the time of day. During the daytime, operating conditions will vary, depending on whether there is bright sun, clouds, overcast sky, or precipitation; the light's color or spectral energy, as well as its intensity, will vary. The CCTV camera for outdoor applications, where the light level and scene contrast range widely, requires automatic light-level adjustment, usually an automatic-iris lens or an electronic shutter in the camera. Most outdoor cameras must have automatic-iris-control lenses or shuttered CCDs to adjust over the large light-level range encountered. Very often an expensive CCTV camera may cost less than having to increase the lighting in a parking lot or exterior perimeter in order to obtain a satisfactory picture with a less expensive camera.

3.2.2 Light Output

The amount of light produced by any light source is defined by a parameter called the "candela"—related to the light from one candle (Figure 3-2). One FtCd of illumination is defined as the amount of light received from a 1-candela source at a distance of 1 foot. A light meter calibrated in FtCd will measure 1 FtCd at a distance of 1 foot from that source. As shown in Figure 3-2, the light falling on a 1-square-foot area at a distance of 2 feet is one-quarter FtCd. This indicates that the light level varies inversely as the square of the distance between the source and observer. Doubling the distance from the source reduces the light level to one-quarter of its original level. Note that exactly four times the area is illuminated by the same amount of light—which explains why each quarter of the area receives only a quarter of the light.

FIGURE 3-2 Illumination defined—the inverse square law. Illuminance (E) at a distance (D) from a source of luminous intensity (I) is proportional to 1/distance squared: E = I/D². Illuminance at the scene is measured in lumens/square foot (FtCd) or lumens/square meter (lux).

3.2.3 Spectral Output

Since different CCTV camera types respond to different colors it is important to know what type of light source is illuminating the surveillance area as well as what type might have to be added to get the required video picture. Figure 3-3 shows the spectral light-output characteristics from standard tungsten, tungsten-halogen, and sodium artificial sources, as well as that from natural sunlight. Superimposed on the figure is the spectral sensitivity of the human eye. Each source produces light at different wavelengths or colors.
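The inverse-square relationship shown in Figure 3-2 (E = I/D²) is easy to check numerically. A minimal sketch in Python, using the 1-candela source from the figure:

```python
# Inverse-square law from Figure 3-2: illuminance E = I / D^2.
# The 1-candela source matches the figure; other values are examples.

def illuminance(candelas: float, distance_ft: float) -> float:
    """Illuminance in foot-candles (FtCd) at distance_ft from a point source."""
    return candelas / distance_ft ** 2

# A 1-candela source produces 1 FtCd at 1 foot ...
print(illuminance(1.0, 1.0))   # 1.0 FtCd
# ... and one-quarter FtCd at 2 feet: doubling the distance spreads
# the same light over four times the area.
print(illuminance(1.0, 2.0))   # 0.25 FtCd
```

The same function applies to any point-like source: quadrupling the distance cuts the illuminance to one-sixteenth.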
To obtain the maximum utility from any video camera it must be sensitive to the light produced by the natural or artificial source. Sunlight, moonlight, and tungsten lamps produce energy in a range in which all video cameras are sensitive. Solid-state CCD sensors are sensitive to visible and near-IR sources but many CCD cameras have IR cut filters which reduce this IR sensitivity.

3.2.4 Beam Angle

Another characteristic important in determining the amount of light reaching a scene is the beam angle over which the source radiates. One parameter used to classify light sources is their light-beam pattern: Do they emit a wide, medium, or narrow beam of light? The requirement for this parameter is determined by the FOV of the camera lens used and the total scene to be viewed. It is best to match the camera lens FOV (including any pan and tilt motion) to the light-beam radiation pattern to obtain the best uniformity of illumination over the scene, and the best picture quality and light efficiency. Most lighting manufacturers have the coefficient of utilization (CU) for specific fixture luminaires. The CU expresses how much light the fixture luminaire (lens) directs to the desired location (example: CU = 75%). Figure 3-4 shows the beam patterns of natural and artificial light sources. The natural sources are inherently wide while artificial sources are available in narrow-beam (a few degrees) to wide-beam (30–90°) patterns. The sun and moon, as well as some artificial light sources operating without a reflector, radiate over an entire scene. Artificial light sources and lamps almost always use lenses and reflectors and are designed or can sometimes be adjusted to produce narrow- or wide-angle beams. If a large area is to be viewed, either a single wide-beam source or multiple sources must be located within the scene to illuminate it fully and uniformly.
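The coefficient of utilization can be folded into a rough average-illuminance estimate using the standard lumen-method relation (useful lumens divided by illuminated area). A sketch under assumed values; the lamp lumens and scene area below are illustrative, not figures from the text:

```python
# Lumen-method estimate of average scene illuminance using the
# coefficient of utilization (CU). Lamp lumens and area are
# illustrative assumptions.

def avg_illuminance_fc(lamp_lumens: float, cu: float, area_sqft: float) -> float:
    """Average illuminance (FtCd) = lumens delivered to the scene / area lit."""
    return lamp_lumens * cu / area_sqft

# A luminaire directing 75% (CU = 0.75) of a hypothetical 10,000-lumen
# lamp onto a 500 sq ft scene delivers about 15 FtCd on average.
print(avg_illuminance_fc(10_000, 0.75, 500))  # 15.0
```

Real designs also apply light-loss factors (lamp depreciation, dirt), so field values run lower than this idealized estimate.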
If a small scene at a long range is to be viewed, it is necessary to illuminate only that part of the scene to be viewed, resulting in a reduction in the total power needed from the source.

FIGURE 3-3 Light and IR output from common illumination sources: sun, tungsten (W), tungsten-halogen (WI), mercury (Hg), HID, and high-pressure sodium (yellow), with the human eye response superimposed (wavelength in nanometers; visible spectrum 380–780 nm).

FIGURE 3-4 Beam patterns from common sources: sun, PAR (parabolic aluminized reflector) spot, PAR flood, fluorescent, and high-intensity discharge (HID): mercury, metal-arc, sodium.

3.3 NATURAL LIGHT

There are two broad categories of light and heat sources: natural and artificial. Natural light sources include the sun, moon (reflected sunlight), stars, and thermal (heat). The visible natural sources contain the colors of the visible spectrum (blue to red) as shown in Figure 3-3. Sunlight and moonlight contain IR radiation in addition to visible light spectra and are classified as broadband light sources, that is, they contain all colors and wavelengths. Far-IR radiation in the 3–5 micrometer (μm) and 8–11 μm spectrum produces heat energy. Only thermal IR imaging cameras are sensitive to this far-IR energy. Artificial light sources can be broadband or narrowband, i.e. containing all of the colors or only a limited number of them. Monochrome video systems cannot perceive the color distribution or spectrum of colors from different light sources. The picture quality of monochrome cameras depends solely on the total amount of energy emitted from the lamp that the camera is sensitive to. When the lamp output spectrum falls within the range of the camera sensor spectral sensitivity then the camera produces the best picture.
For color video systems the situation is more complex and critical. Broadband light sources containing most of the visible colors are necessary for a color camera. To get a good color balance the illumination source should match the sensor sensitivity. For the camera to be able to respond to all the colors in the visible spectrum the light source must contain all the colors of the spectrum. Color cameras have an automatic white-balance control that automatically adjusts the camera electronics to produce the correct color balance. The light source must contain the colors in order for them to be seen on the monitor. Broadband light sources such as the sun, tungsten or tungsten-halogen, and xenon produce the best color pictures because they contain all the colors in the spectrum.

If the scene in Figure 3-1 is illuminated by sunlight, moonlight, or starlight, it will receive uniform illumination. If it is illuminated by several artificial sources, the lighting may vary considerably over the FOV of the camera and lens. For outdoor applications the camera system must operate over the full range from direct sunlight to nighttime conditions, and must have an automatic light control means to compensate for this light-level change. Figure 3-5 summarizes the characteristics of natural sources, i.e. the sun, moon, and starlight, and how different camera types respond to them. Table 3-1 summarizes the overall light-level ranges, from direct sunlight to overcast starlight.

FIGURE 3-5 Spectral characteristics of natural sources and camera sensors: sun energy spectrum (3400 K, 3000 K), human eye response, and CCD and CMOS sensor sensitivities (wavelength in nanometers; visible spectrum 380–780 nm).
3.3.1 Sunlight

The sun is the energy source illuminating an outdoor scene during the daylight hours. The sun emits a continuum of all wavelengths and colors to which monochrome and color television cameras are sensitive. This continuum includes visible radiation in the blue, green, yellow, orange, red, and also in the IR range of the spectrum. The sun also produces long wavelength thermal IR heat energy that is used by thermal (heat) imaging IR cameras. All monochrome and color solid-state cameras are sensitive to the visible spectrum, and some monochrome cameras to the visible and near-IR spectrum. Color cameras are sensitive to all the color wavelengths in the visible spectrum (as is the human eye), but color cameras are purposely designed to be insensitive to near-IR wavelengths.

During the first few hours in the morning and the last few hours in the evening, the sunlight's spectrum is shifted toward the orange–red region, so things look predominantly orange and red. During the midday hours, when the sun is brightest and most intense, blues and greens are brightest and reflect the most balanced white light. For this reason a color camera must have an automatic white-balance control that adjusts for color shift during the day, so that the resulting video picture is color corrected.

Table 3-1 Light-Level Range from Natural Sources

  LIGHTING CONDITION      fc*        lux**
  UNOBSTRUCTED SUN        10,000     100,000
  SUN WITH LIGHT CLOUD    7,000      70,000
  SUN WITH HEAVY CLOUD    2,000      20,000
  SUNRISE, SUNSET         50         500
  TWILIGHT                .4         4
  FULL MOON               .02        .2
  QUARTER MOON            .002       .02
  OVERCAST MOON           .0007      .007
  CLEAR NIGHT SKY         .0001      .001
  AVERAGE STARLIGHT       .00007     .0007
  OVERCAST NIGHT SKY      .000005    .00005

  * LUMENS PER SQUARE FOOT (fc)  ** LUMENS PER SQUARE METER (lux)
  NOTE: 1 fc EQUALS APPROXIMATELY 10 lux

3.3.2 Moonlight and Starlight

After the sun sets in an environment with no artificial lighting, the scene may be illuminated by the moon, the stars, or both.
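The values in Table 3-1, together with its 1 fc ≈ 10 lux rule, can be captured in a small lookup sketch (condition names are taken from the table; the conversion factor is the table's approximation, not the exact 10.76 lux per fc):

```python
# Light levels from Table 3-1, keyed by lighting condition, in
# foot-candles (fc). The table uses 1 fc ~ 10 lux (exact: 10.76).

LIGHT_LEVELS_FC = {
    "unobstructed sun": 10_000,
    "sun with light cloud": 7_000,
    "sun with heavy cloud": 2_000,
    "sunrise, sunset": 50,
    "twilight": 0.4,
    "full moon": 0.02,
    "quarter moon": 0.002,
    "overcast moon": 0.0007,
    "clear night sky": 0.0001,
    "average starlight": 0.00007,
    "overcast night sky": 0.000005,
}

def fc_to_lux(fc: float) -> float:
    """Convert foot-candles to lux using the table's 1 fc ~ 10 lux rule."""
    return fc * 10

print(fc_to_lux(LIGHT_LEVELS_FC["full moon"]))  # 0.2 lux
# Full sunlight to overcast starlight spans roughly nine orders of magnitude.
ratio = LIGHT_LEVELS_FC["unobstructed sun"] / LIGHT_LEVELS_FC["overcast night sky"]
print(f"{ratio:.0e}")  # 2e+09
```

That nine-decade span is why no single fixed-iris camera can cover all natural lighting conditions.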
Since moonlight is the reflected light from the sun it contains most of the colors emitted from the sun. However, the low level of illumination reaching the earth from the moon (or stars) prevents color cameras (and the human eye) from providing good color rendition.

3.4 ARTIFICIAL LIGHT

The following sections describe some of the artificial light sources in use today and how their characteristics affect their use in video security applications. Artificial light sources consist of the several types of lamps used in outdoor parking lots, storage facilities, fence lines, or in indoor environments for lighting rooms, hallways, work areas, elevators, etc. Two types of lamps are common: tungsten or tungsten-halogen lamps having solid filaments, and gaseous or arc lamps containing low- or high-pressure gas in an enclosed envelope. Arc lamps can be further classified into HID, low-pressure, and high-pressure short-arc types. High-intensity-discharge lamps are used most extensively because of their high efficacy (efficiency in converting electrical energy into light energy) and long life. Low-pressure arc lamps include fluorescent and low-pressure sodium types used in many indoor and outdoor installations. Long-arc xenon lamps are used in large outdoor sports arenas. High-pressure short-arc lamps find use in applications that require a high-efficiency, well-directed narrow beam to illuminate a target at long distances (hundreds or thousands of feet). Such lamps include xenon, metal-halide, high-pressure sodium, and mercury. For covert security applications some lamps are fitted with a visible-light-blocking filter so that only invisible IR radiation illuminates the scene. Narrow-band light sources such as mercury-arc or sodium-vapor lamps do not produce a continuous spectrum of colors, so color is rendered poorly. A mercury lamp has little red light output and therefore red objects appear nearly black when illuminated by a mercury arc.
Likewise a high-pressure sodium lamp contains large quantities of yellow, orange, and red light, and therefore a blue or blue–green object will look dark, gray, or brown in its light. A low-pressure sodium lamp produces only yellow light and consequently is unsuitable for color video applications.

A significant advance in tungsten lamp development came with the use of a halogen element (iodine or bromine) in the lamp's quartz envelope, with the lamp operating in what is called the "tungsten-halogen cycle." This operation increases a lamp's rated life significantly even though it operates at a high temperature and light output. Incandescent filament lamps are available with power ratings from a fraction of a watt to 10 kilowatts.

High-intensity-discharge arc lamps comprise a broad class of lamps in which the arc discharge takes place between electrodes contained in a transparent or translucent bulb. The spectral radiation output and intensity are determined principally by the chemical compounds and gaseous elements that fill the bulb. The lamp is started using a high-voltage ignition circuit with some form of electrical ballasting used to stabilize the arc. In contrast, tungsten lamps operate directly from the power source. Compact short-arc lamps are only a few inches in size but emit high-intensity, high-lumen output radiation with a variety of spectral characteristics. Long-arc lamps, such as fluorescent, low-pressure sodium vapor, and xenon, have output spectral characteristics determined by the gas in the arc or the tube-wall emitting material. The fluorescent lamp has a particular phosphor coating on the inside of the glass bulb that determines its spectral output. Power outputs available from arc-discharge lamps range from a few watts up to many tens of kilowatts.

An important aspect of artificial lighting is the consideration of the light-beam pattern from the lamp and the camera lens FOV.
A wide-beam flood lamp will illuminate a large area with a fairly uniform intensity of light and therefore produce a well-balanced picture. A narrow-beam light or spotlight will illuminate a small area, and consequently areas at the edge of the scene and beyond will be darker. A scene that is illuminated non-uniformly (i.e. with high contrast) and having "hotspots" will result in a non-uniform picture. For maximum efficiency the camera–lens combination FOV should match the lamp beam angle. If a lamp illuminates only a particular area of the scene the camera–lens combination FOV should only be viewing that area illuminated by the lamp. This source beam angle problem does not exist for areas lighted by natural illumination such as the sun, which usually uniformly illuminates the entire scene except for shadows.

3.4.1 Tungsten Lamps

The first practical artificial lighting introduced in 1907 took the form of an incandescent filament tungsten lamp. These lamps used a tungsten mixture formed into a filament and produced an efficacy (ratio of light out to power in) of approximately 7 lumens per watt of visible light. This represented a great increase over anything existing at the time, but represents a low efficiency compared to most other present lamp types. In 1913 ductile tungsten wire fabricated into coiled filaments increased efficacy to 20 lumens per watt. Today the incandescent lamp is commonplace and is still used in most homes, businesses, factories, and public facilities. While its efficacy does not measure up to that of the arc lamp, the tungsten and tungsten-halogen incandescent lamps nevertheless offer a low-cost installation for many applications. Since it is an incandescent source, it radiates all the colors in the visible spectrum as well as the near-IR spectrum, providing an excellent light source for monochrome and color cameras.
Its two disadvantages when compared with arc lamps are: (1) relatively low efficacy, which makes it more expensive to operate, and (2) relatively short operating life of several thousand hours. Incandescent filament lamp efficacy increases with filament operating temperature; however, lamp life expectancy decreases rapidly as lamp filament temperature increases. Maximum practical efficacy is about 35 lumens per watt in high-wattage lamps operated at approximately 3500 K color temperature. A tungsten lamp cannot operate at this high temperature since it will last only a few hours. At lower temperatures, life expectancy increases to several thousand hours, which is typical of incandescent lamps used in general lighting.

An incandescent lamp consists of a tungsten filament surrounded by an inert gas sealed inside a transparent or frosted-glass envelope. The purpose of the frosted glass is to increase the apparent size of the lamp, thereby decreasing its peak intensity and reducing glare and hotspots in the illuminated scene. Incandescent lamp filaments are usually coiled to increase their efficiency. The coils are sometimes coiled again (coiled-coil) to further increase the filament area and increase the luminance. Filament configurations are designed to optimize the radiation patterns for specific applications. Sometimes long and narrow filaments are used and mounted into cylindrical reflectors to produce a rectangular beam pattern. Others have small filaments so as to be incorporated into parabolic reflectors to produce a narrow collimated beam (spotlight). Others have larger filament areas and are used to produce a wide-angle beam (such as a floodlight). Figure 3-6 shows several lamp configurations. Figure 3-7 shows some standard lamp luminaires used in industrial, residential, and security applications. The luminaire fixtures house the tungsten, HID, and low-pressure lamps.
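The efficacy figures above translate directly into operating cost: dividing the required lumens by a lamp's lumens-per-watt rating gives input power, and power times hours times energy price gives cost. A hedged sketch, where the lumen target, annual hours, and electricity price are made-up examples (only the rough efficacies come from the text: about 20 lm/W for coiled tungsten, 60 lm/W for fluorescent, up to 140 lm/W for HID):

```python
# Operating-cost comparison from lamp efficacy (lumens per watt).
# Lumen target, hours, and $/kWh are illustrative assumptions.

def watts_needed(target_lumens: float, efficacy_lm_per_w: float) -> float:
    """Input power required to produce target_lumens at a given efficacy."""
    return target_lumens / efficacy_lm_per_w

def energy_cost(watts: float, hours: float, usd_per_kwh: float) -> float:
    """Energy cost for a given load run for `hours` at `usd_per_kwh`."""
    return watts / 1000 * hours * usd_per_kwh

TARGET_LUMENS = 20_000   # hypothetical lot-lighting requirement
HOURS = 4_000            # assumed dusk-to-dawn hours per year
PRICE = 0.12             # assumed $/kWh

for name, efficacy in [("tungsten", 20), ("fluorescent", 60), ("HID sodium", 140)]:
    w = watts_needed(TARGET_LUMENS, efficacy)
    print(f"{name}: {w:.0f} W, ${energy_cost(w, HOURS, PRICE):.2f}/yr")
```

Under these assumptions the tungsten installation draws roughly seven times the power of the high-efficacy HID one for the same light output, which is the cost argument the text makes.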
The tungsten-halogen lamp design is a significant improvement over the incandescent lamp. In conventional gas-filled, tungsten-filament incandescent lamps, tungsten molecules evaporate from the incandescent filament and flow to the relatively cool inner surface of the bulb wall (glass). The tungsten adheres to the glass and forms a thin film that gradually thickens during the life of the lamp and causes the bulb to darken. This molecular action reduces the lumen light output and efficacy in two ways. First, evaporation of tungsten from the filament reduces the filament wire's diameter and increases its resistance, so that light output and color temperature decrease. Second, the tungsten deposited on the bulb wall increases the opacity (reduces transmission of light through the glass) as it thickens.

FIGURE 3-6 Generic tungsten, tungsten-halogen lamp configurations: (A) tungsten-halogen in quartz envelope, (B) tungsten filament, (C) tungsten-halogen in parabolic aluminized reflector (PAR).

FIGURE 3-7 Standard lamp luminaires: (A) tungsten-halogen, (B) sodium, (C) fluorescent, (D) metal-arc mercury.

Figure 3-8 illustrates the relative amount of energy produced by tungsten-filament and halogen-quartz-tungsten lamps as compared with other arc-lamp types, including fluorescent, metal-arc, and sodium, in the visible and near-IR spectral range. On an absolute basis, the energy produced by the tungsten lamp in the visible spectral region is significantly lower than that provided by HID lamps. However, the total amount of energy produced by the tungsten lamp over the entire spectrum is comparable to that of the other lamps. Figure 3-8 also shows the human eye response and spectral sensitivity of standard CCTV camera sensors.

3.4.2 Tungsten-Halogen Lamps

The discovery of the tungsten-halogen cycle significantly increased the operating life of the tungsten lamp.
Tungsten-halogen lamps, like conventional incandescent lamps, use a tungsten filament in a gas-filled light-transmitting envelope and emit light with a spectral distribution similar to that of a tungsten lamp. Unlike the standard incandescent lamp, the tungsten-halogen lamp contains a trace vapor of one of the halogen elements (iodine or bromine) along with the usual inert fill gas. Also, tungsten-halogen lamps operate at much higher gas pressure and bulb temperature than non-halogen incandescent lamps. The higher gas pressure retards the tungsten evaporation, allowing the filament to operate at a higher temperature, resulting in higher efficiencies than conventional incandescent lamps. To withstand these higher temperatures and pressures, the lamps use quartz bulbs or high-temperature "hard" glass. The earliest versions of these lamps used fused quartz bulbs and iodine vapor and were called "quartz iodine lamps." After it was found that other halogens could be used, the more generic term tungsten-halogen lamp came into use. The important result achieved with the addition of halogen was caused by the "halogen regenerative cycle," which maintains a nearly constant light output and color temperature throughout the life of the lamp and significantly extends the life of the lamp.

FIGURE 3-8 Spectral characteristics of lamps and camera sensors: CMOS, CCD with and without IR filter, tungsten-halogen (WI), high-intensity discharge (HID), mercury (Hg), high-pressure sodium (yellow), and fluorescent (wavelength in nanometers; visible spectrum 380–780 nm).
The halogen chemical cycle permits the use of more compact bulbs compared to those of tungsten filament lamps of comparable ratings and permits increasing either lamp life or lumen output and color temperature to values significantly above those of conventional tungsten filament lamps.

Incandescent and xenon lamps are good illumination sources for IR video applications when the light output is filtered with a covert filter (one that blocks or absorbs the transmission of visible radiation) so that they transmit only the near-IR radiation. Figure 3-9, which details the spectral characteristics of these lamps, shows that a significant portion of the emitted radiation falls in the near-IR region, invisible to the human eye but within the range to which solid-state silicon sensor cameras are sensitive. When an IR-transmitting/visible-blocking filter is placed in front of a tungsten-halogen lamp, only the IR energy illuminates the scene and reflects back to the CCTV camera lens. This combination produces an image on the video monitor from an illumination source that is invisible to the eye. This technology is commonly referred to as "seeing in the dark," i.e. there is no visible radiation and yet a video image is discernible. Some monochrome solid-state CCD and CMOS sensors are responsive to this near-IR radiation. Since the IR region has no "color," color cameras are designed to be insensitive to the filtered IR energy. Approximately 90% of the energy emitted by the tungsten-halogen lamp occurs in the IR region. However, only a fraction of this IR light can be used by silicon sensors, since they are responsive only up to approximately 1100 nanometers (nm). The remaining IR energy above 1100 nm manifests as heat, which does not contribute to the image. While the IR source is not visible to the human eye it is detectable by silicon camera devices and other night vision devices (Chapter 19).
3.4.3 High-Intensity-Discharge Lamps

An enclosed arc high-intensity-discharge (HID) lamp is in widespread use for general lighting and security applications. There are three major types of HID lamps, each one having a relatively small arc tube mounted inside a heat-conserving outer jacket and filled with an inert gas to prevent oxidation of the hot arc tube seals (Figure 3-10).

FIGURE 3-9 Filtered tungsten and xenon lamps vs. camera spectral sensitivity: CCD with and without IR filter, CMOS, silicon tube, xenon, and tungsten/tungsten-halogen (W, WI) with IR filter (wavelength in nanometers; visible spectrum 380–780 nm).

The principle in all vapor-arc lighting systems is the same: (1) an inert gas is contained within the tube to spark ignition, (2) the inert gas carries current from one electrode to the other, (3) the current develops heat and vaporizes the solid metal or metallic-oxide inside the tube, and (4) light is discharged from the vaporized substance through the surface of the discharge tube and into the area to be lighted. The three most popular HID lamps are: (1) mercury in a quartz tube, (2) metal halide in a quartz tube, and (3) high-pressure sodium in a translucent aluminum-oxide tube. Each type differs in electrical input, light output, shape, and size. While incandescent lamps require no auxiliary equipment and operate directly from a suitable voltage, discharge sources in HID lamps require a high-voltage starting device and electrical ballast while in operation. The high-voltage ignition provides the voltage necessary to start the lamp; once the lamp is started, the ballast operates the lamp at the rated power (wattage) or current level. The ballast consumes power, which must be factored into calculations of system efficiency.
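Because the ballast consumes power, the efficacy of an HID installation should be computed on total input watts (lamp plus ballast), not lamp watts alone. A small sketch with illustrative, assumed values:

```python
# System efficacy must count ballast losses, not just lamp watts.
# Lamp rating and ballast loss below are illustrative assumptions.

def system_efficacy(lamp_lumens: float, lamp_watts: float,
                    ballast_watts: float) -> float:
    """Lumens delivered per watt of total input power (lamp + ballast)."""
    return lamp_lumens / (lamp_watts + ballast_watts)

# A hypothetical 400 W HID lamp rated at 100 lm/W (40,000 lumens)
# driven by a ballast that dissipates 60 W:
lamp_only = 40_000 / 400                       # 100 lm/W at the lamp
with_ballast = system_efficacy(40_000, 400, 60)
print(round(lamp_only, 1), round(with_ballast, 1))  # 100.0 vs 87.0
```

The roughly 13% drop in this example is why lighting calculations quote "system watts" rather than lamp watts.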
HID lamps, unlike incandescent or fluorescent lamps, require several minutes to warm up before reaching full brightness. If turned off momentarily they take several minutes before they can be turned on again (reignited). The primary overriding advantages of HID lamps are high efficacy and long life, provided they are operated at a minimum of several hours per start. Lamp lifetime is typically 16,000 to 24,000 hours and light efficacy ranges from 60 to 140 lumens per watt. These lamps cannot be electrically dimmed without drastically affecting the starting warm-up, luminous efficiency, color, and life. These lamps are the most widely used lamps for lighting industrial and commercial buildings, streets, sports fields, etc.

One disadvantage of short-arc lamps just mentioned is their significant warm-up time—usually several minutes to ten minutes. If accidentally or intentionally turned off, these lamps cannot be restarted until they have cooled down sufficiently to reignite the arc. This may be 2–5 minutes, and then take an additional 5 minutes to return to full brightness. Dual-HID bulbs are now available, which include two identical HID lamp units, only one of which operates at a time. If the first lamp is extinguished momentarily, the cold lamp may be ignited immediately, eliminating the waiting time to allow the first lamp to cool down.

FIGURE 3-10 High-intensity-discharge lamps: (A) mercury, (B) xenon, (C) metal-arc, (D) sodium.

Mercury HID lamps are available in sizes from 40 to 1500 watts. Spectral output is high in the blue region but extremely deficient in the red region. Therefore they should be used in monochrome but not color video applications (Figure 3-11). A second class of HID lamp is the metal-halide, which is filled with mercury-metallic iodides. These lamps are available with power ratings from 175 to 1500 watts.
The addition of metallic salts to the mercury arc improves the efficacy and color by adding emission lines in the red end of the spectrum. With different metallic additives or different phosphor coatings on the outside of the lamp, the lamp color varies from an incandescent spectrum to a daylight spectrum. The color spectrum from the metal-halide lamp is significantly improved over the mercury lamp and can be used for monochrome or color video applications.

The third class of HID lamp is the high-pressure sodium lamp. This lamp contains a special ceramic-arc tube material that withstands the chemical attack of sodium at high temperatures, thereby permitting high luminous efficiency and yielding a broader spectrum, compared with low-pressure sodium arcs. However, because the gas is only sodium, the spectral output distribution from the high-pressure sodium HID lamp is yellow–orange and has only a small amount of blue and green. For this reason the lamp is not suitable for good color video security applications. The primary and significant advantage of the high-pressure sodium lamp over virtually all other lamps is its high efficacy, approximately 60–140 lumens per watt. It also enjoys a long life, approximately 24,000 hours. The sodium lamp is an extremely good choice for monochrome surveillance applications.

High-intensity-discharge lamps are filled to atmospheric pressure (when not operating) and rise to several atmospheres when operating. This makes them significantly safer than short-arc lamps, which are under much higher pressure at all times. The choice of lamp is often determined by architectural criteria, but the video designer should be aware of the color characteristics of each lamp to ensure their suitability for monochrome or color video.
FIGURE 3-11 Spectral output from HID lamps (mercury, metal halide, high-pressure sodium) over the UV, visible (380–780 nm), and IR spectrum

3.4.4 Low-Pressure Arc Lamps

Fluorescent and low-pressure sodium lamps are examples of low-pressure arc lamp illumination sources. These lamps have tubular bulb shapes and long arc lengths (several inches to several feet). A ballast is necessary for proper operation, and a high-voltage pulse is required to ignite the arc and start the lamp. The most common type is the fluorescent lamp, with a relatively high efficacy of approximately 60 lumens per watt. The large size of the arc tube (diameter as well as length) requires that it be placed in a large luminaire (reflector) to achieve a defined beam shape. For this reason, fluorescent lamps are used for large-area illumination and produce a fairly uniform pattern. The fluorescent lamp system is sensitive to the surrounding air temperature and therefore is used indoors or in moderate climates. When installed outdoors in cold weather, a special low-temperature ballast must be used to ensure that the starting pulse is high enough to start the lamp.

The fluorescent lamp combines a low-pressure mercury arc with a phosphor coating on the interior of the bulb. The arc produces UV radiation, which is converted into visible radiation by the phosphor coating on the inside wall of the outer tube. A variety of phosphor coatings is available to produce almost any color quality (Figure 3-12). Colors range from "cool white," the most popular variety, to daylight, blue white, and so on. Lamps are available with input powers from 4 watts to approximately 200 watts. Tube lengths vary from 6 to 56 inches (15–144 cm).
Fluorescent lamps can be straight, circular, or U-shaped. Fluorescent lamps can emit a continuous spectrum like an incandescent lamp, simulating a daylight spectrum suitable for color cameras.

A second class of low-pressure lamp is the sodium lamp, which emits a single yellow color (nearly monochromatic). These lamps have ratings from 18 to 180 watts. The low-pressure sodium lamp has the highest efficacy of any lamp type built to date, approximately 180 lumens per watt. While the efficacy is high, the lamp's pure yellow light limits it to some monochrome video surveillance applications and to roadway lighting. If used with color cameras, only yellow objects will appear yellow; all other objects will appear brown or black. The low-pressure sodium lamp utilizes pure metallic sodium with an inert neon–argon gas combination enclosed in a discharge tube about 28 inches long. The pressure in the tube is actually below atmospheric pressure, which causes the glass to collapse inward if it is ruptured, a good safety feature.

FIGURE 3-12 Light output from low-pressure arc lamps (low-pressure sodium, daylight fluorescent, warm fluorescent) over the UV, visible (380–780 nm), and IR spectrum

A unique advantage of the low-pressure sodium amber light is its better "modeling" (showing of texture and shape) of any illuminated surface, for both the human eye and the CCTV camera. It provides more contrast, and since the monochrome CCTV camera responds to contrast, images under this light are clearer, according to some reports. The yellow output from the sodium lamp is close to the wavelength region at which the human eye has its peak visual response (560 nanometers).
Some security personnel and police have identified low-pressure sodium as a uniquely advantageous off-hour lighting system for security because the amber-yellow color clearly tells people to keep out. This yellow security lighting also sends the psychological message that the premises are well guarded.

3.4.5 Compact Short-Arc Lamps

Enclosed short-arc lamps comprise a broad class of lamps in which the arc discharge takes place between two closely spaced electrodes, usually tungsten, contained in a rugged transparent or frosted bulb. The spectrum radiated by these lamps is usually determined by the elements and chemical compounds inside. They are called short-arc because the arc is short compared with the electrode size, spacing, and bulb size, and it operates at relatively high current and low voltage. Such lamps are available with power ratings from less than 50 watts to more than 25 kilowatts. These lamps usually operate at less than 100 volts, although they need a high-voltage pulse (several thousand volts) to start. Most short-arc lamps operate on AC or DC power and require some form of current-regulating device (ballast) to maintain uniform output radiation. Several factors limit the useful life of compact lamps compared with HID lamps, especially the high current density required, which reduces electrode lifetime. Compact short-arc lamps generally have a life in the low thousands of hours and operate at internal pressures up to hundreds of atmospheres. Therefore they must be operated in protected enclosures and handled with care. The most common short-arc lamps are mercury, mercury-xenon, and xenon. Figure 3-13 shows the spectral output of mercury-xenon lamps.
FIGURE 3-13 Spectral outputs of mercury–xenon lamps (xenon, mercury, mercury–xenon) over the UV, visible (380–780 nm), and IR spectrum

Short-arc xenon lamps are not common in security applications because of their high cost and short lifetime. However, they play an important role as IR sources for covert surveillance. The light output from the mercury arc lamp is primarily in the blue region of the visible spectrum, so only fair results are obtained with monochrome CCD or CMOS solid-state cameras. Despite mercury lighting's good appearance to the human eye, typical solid-state cameras respond poorly to it. The mercury-xenon lamp, containing a small amount of mercury in addition to xenon gas, offers fair color rendition. Immediately after lamp ignition the output is essentially the same as the spectrum of a xenon lamp; the xenon gas produces a background continuum that improves the color rendition. As the mercury vaporizes over several minutes, the spectral output becomes that of mercury vapor, with light output in the blue, green, yellow, and orange portions of the spectrum. The xenon short-arc's luminous efficiency ranges from 20 to 53 lumens per watt over lamp wattages of 200–7000 watts. The color temperature of the arc is approximately 6000 K, almost identical to that of sunlight. The xenon lamp output consists of specific colors as well as a continuum and some IR radiation, and produces lighting similar in color to that of the sun (Figure 3-13). The large proportion of continuum radiation at all wavelengths closely matches the spectral characteristics of sunlight. Compared with all other short-arc lamps, the xenon lamp is the ideal artificial light choice for accurate color rendition.
The lamp spectral output does not change with lamp life, so color rendition is good over the useful life of the lamp. Color output is virtually independent of operating temperature and pressure, thereby ensuring good color rendition under adverse operating conditions. Xenon lamps are turned on with starting voltage pulses of 10–50 kilovolts (kV). Typical lamps reach full output intensity within a few seconds after ignition. The luminous efficiency of the xenon lamp ranges from 15 to 50 lumens per watt over a wattage range of approximately 75 to 10,000 watts.

A characteristic unique to the compact short-arc lamp is the small size of the radiating source, usually a fraction of a millimeter to a few millimeters in diameter. Due to this optical characteristic, one lamp in a suitable reflector can produce a very concentrated beam of light. Parabolic and spherical reflectors, among others, are used to provide optical control of the lamp output: the parabolic for search- or spotlights and the spherical for floodlights. Compact short-arc lamps are often mounted in a parabolic reflector to produce a highly collimated beam used to illuminate distant objects. This configuration also produces an excellent IR spotlight when an IR-transmitting filter is mounted in front of the lamp. Even when not used for spotlighting, the small arc size of compact short-arc lamps allows the luminaire reflector to be significantly smaller than other lamp reflectors. Mounting orientation can affect the performance of short-arc lamps. Most xenon lamps are designed for vertical or horizontal operation, but many mercury-xenon and mercury lamps must be operated vertically to prevent premature burnout.

3.4.6 Infrared Lighting

A covert IR lighting system is a solution when conventional security lighting is not appropriate, for example when the presence of a security system (1) would attract unwanted attention, (2) would alert intruders to a video surveillance system, or (3) would disturb neighbors. There are two generic techniques for producing IR lighting. One method uses the IR energy from a thermal incandescent or xenon lamp. These IR sources are fitted with optical filters that block the visible radiation so that only IR radiation is transmitted from the lamp housing to illuminate the scene. The second technique uses a non-thermal IR LED or LED array to generate IR radiation through electronic recombination in a semiconductor device. Both techniques produce narrow or wide beams, resulting in excellent images when the scene is viewed with an IR-sensitive camera, such as a solid-state CCD, CMOS, or ICCD camera.

3.4.6.1 Filtered Lamp Infrared Source

Xenon and incandescent lamps can illuminate a scene many hundreds of feet from the camera and produce sufficient IR radiation to be practical for a covert video system (Figure 3-9). Since thermal IR sources (tungsten, xenon lamps) consume significant amounts of power and become hot, they may require a special heat sink or air cooling to operate continuously. Figure 3-14 shows the configuration of several tungsten and xenon lamp IR sources that produce IR beams and that have built-in reflectors and IR-transmitting (visual-blocking) filters. These lamp systems use thin-film dichroic optical coatings (a light-beam splitter) and absorbing filters that direct the very-near-IR rays toward the front of the lamp and out into the beam, while reflecting visible and long-IR radiation to the back of the lamp, where it is absorbed by the housing material. The housing acts as an efficient heat sink that effectively dissipates the heat. The system can operate continuously in hot environments without a cooling fan.

FIGURE 3-14 Thermal IR source configurations: xenon short-arc lamp or spot/flood lamp (PAR, parabolic aluminized reflector) with IR-transmitting filter, parabolic reflector, retaining ring, swivel mount, and finned metal heat-sink housing; visible light energy is reflected or absorbed; powered from 117 VAC, 24 VAC, or 12 VDC

An especially efficient configuration using a tungsten-halogen lamp as the radiating source and a unique filtering and cooling technique is shown in Figure 3-15. The figure shows the functioning parts of a 500-watt IR illuminating source using a type PAR 56 encapsulated tungsten-halogen lamp. The PAR 56 lamp filament operates at a temperature of approximately 3000 K and has an average rated life of 2000–4000 hours. The lamp's optical dichroic mirror coatings on the internal surfaces of the reflector and front cover lens are made of multiple layers of silicon dioxide and titanium dioxide. In addition to this interference filter, there is a "cold" mirror (a quartz-substrate shield) between the tungsten-halogen lamp and the coated cover lens to control direct visible-light output from the filament. The lamp optics have a visible-absorbing filter between the lamp and the front lens that transmits less than 0.1% of all wavelengths shorter than 730 nanometers. This includes the entire visible spectrum.

FIGURE 3-15 High-efficiency thermal IR lamp: tungsten-halogen lamp inside a heat-sink housing with an IR-transmitting, visible-blocking window/filter
The compound effect of this filtering ensures that only IR radiation leaves the front of the lamp and that visible and long-IR radiation (longer than is useful to the silicon camera sensor) cannot leave the front of the lamp. The lamp output is consequently totally invisible to the human eye. The IR lamp system is available with different front lenses to produce beam patterns for a wide range of applications, from wide scene illumination to long-range spotlighting. Table 3-2 summarizes the types of lamp lenses available and the horizontal and vertical beam angles they produce. These beam angles vary from 12° for a narrow beam (spotlight) to 68° for a very wide beam (flood lamp).

3.4.6.2 Infrared-Emitting Diodes

Video security systems for covert and nighttime illumination use IR LEDs consisting of an array of gallium arsenide (GaAs) semiconductor diodes. These LEDs emit a narrow band of deep-red 880 nm or IR 950 nm radiation and no other discernible visible light. These efficient devices typically convert 50% of electrical energy to optical IR radiation. They operate just slightly above room temperature, dissipate little heat, and therefore usually require minimal cooling. The light is generated at the diode's PN junction, which emits IR radiation when electrically biased in the forward direction. The 800–900 nm IR energy is directed through the magnifying dome lens built into each LED emitter and out toward the scene. Adequately illuminating an entire scene requires an array of tens to several hundred diodes connected in series with the power source. The array is powered from a conventional 12 VDC or 117 VAC source. The IR output from the diodes adds up to produce enough radiation to illuminate the scene and target with sufficient IR energy to produce a good video picture with a solid-state CCD or CMOS camera. Figure 3-16 shows an IR GaAs LED array that produces a high-efficiency IR beam for covert and nighttime applications.
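As a rough sizing sketch of the LED-array approach described above: total radiated IR power scales with diode count and the roughly 50% electrical-to-optical conversion cited in the text. The per-diode electrical wattage here is an illustrative assumption, not a manufacturer figure.

```python
# Rough IR LED array sizing sketch (illustrative values, not vendor data).
# Assumes each GaAs LED draws 0.1 W electrical and converts ~50% of it
# to optical IR, per the conversion efficiency cited in the text.

def array_ir_watts(num_leds: int, watts_per_led: float = 0.1,
                   efficiency: float = 0.5) -> float:
    """Total optical IR power radiated by a series-connected LED array."""
    return num_leds * watts_per_led * efficiency

for n in (10, 100, 300):
    print(f"{n:4d} LEDs -> {array_ir_watts(n):.1f} W of IR")
```

The sketch shows why a single LED cannot light a whole scene while an array of a few hundred can: radiated power adds linearly with diode count.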
Table 3-2  Beam Angles for IR Lamps

  SOURCE                                 INPUT POWER         BEAM ANGLE          MAXIMUM
                                         (WATTS) (VOLTAGE)   (DEGREES)           RANGE (ft)
  WIDE FLOOD, FILTERED WI INCANDESCENT   100                 60 HORIZ, 60 VERT     30
  SPOT, FILTERED WI INCANDESCENT         100                 10 HORIZ, 10 VERT    200
  WIDE FLOOD, FILTERED WI INCANDESCENT   500                 40 HORIZ, 16 VERT     90
  SPOT, FILTERED WI INCANDESCENT         500                 12 HORIZ, 8 VERT     450
  FLOOD, FILTERED XENON ARC              400 (AC)            40                   500
  SPOT, FILTERED XENON ARC               400 (AC)            12                  1500
  FLOOD, LED                              50 (12 VDC)        30                   200
  FLOOD, LED                               8 (12 VDC)        40                    70

  WI = tungsten halogen. LED = light-emitting diode (880 nm: deep red glow; 950 nm: invisible IR).
  WI and xenon thermal lamps use visual blocking filters.

3.4.6.3 Thermal (Heat) IR Source

All objects emit light when sufficiently hot. Changing the temperature of an object changes the intensity and color of the light emitted from it. For instance, iron glows dull red when first heated, then red-orange as it becomes hotter, and eventually white hot. In a steel mill, molten iron appears yellow-white because it is hotter than the red-orange, lower-temperature iron. The tungsten filament of an incandescent lamp is hotter yet and emits nearly white light. Any object that is hot enough to glow is said to be incandescent: hence the term for heated-filament bulbs.

A meaningful parameter for describing color is the color temperature, or apparent color temperature, of an object heated to various temperatures. In the laboratory, a special radiating source that emits radiation with 100% efficiency at all wavelengths when heated is called a blackbody radiator. The blackbody radiator emits energy in the ultraviolet, visible, and infrared spectrums following specific physical laws. Tungsten lamps and the sun radiate energy like a blackbody in that they radiate a continuous spectrum, that is, they emit at all wavelengths and colors.
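One of the "specific physical laws" governing blackbody radiation is Wien's displacement law, which gives the wavelength of peak emission for a given temperature. The short sketch below applies it to the approximately 3000 K tungsten filament and a 6000 K sunlike source mentioned earlier in this chapter; it is a textbook relationship, not a lamp-specific formula from this book.

```python
# Wien's displacement law: peak emission wavelength of a blackbody,
# lambda_max = b / T, with Wien's constant b = 2.898e-3 m*K.
# A ~3000 K tungsten filament peaks in the near-IR, which is why a
# filtered incandescent lamp makes a usable covert IR source.

WIEN_B = 2.898e-3  # Wien's displacement constant, metre-kelvins

def peak_wavelength_nm(temp_kelvin: float) -> float:
    """Wavelength of peak blackbody emission, in nanometres."""
    return WIEN_B / temp_kelvin * 1e9  # metres -> nanometres

print(f"Tungsten filament (3000 K): {peak_wavelength_nm(3000):.0f} nm")  # near-IR
print(f"Sunlike source    (6000 K): {peak_wavelength_nm(6000):.0f} nm")  # mid-visible
```

The 3000 K filament peaks near 966 nm, beyond the 780 nm edge of the visible spectrum, while a 6000 K source peaks near 483 nm, consistent with the xenon arc's sunlight-like color described in Section 3.4.5.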
Other sources such as mercury, fluorescent, sodium, and metal-arc lamps do not emit a continuous spectrum but only produce narrow bands of colors: mercury produces a green-blue band; sodium produces a yellow-orange band. Thermal IR cameras are used to view temperature differences in objects in a scene (Chapter 19).

3.5 LIGHTING DESIGN CONSIDERATIONS

The design of the lighting system for video security requires consideration of: (1) initial installation cost, (2) efficiency of the lamp type chosen, (3) cost of operation, (4) maintenance costs, (5) spectral intensity, and (6) beam angle of the lamp and luminaire.

3.5.1 Lighting Costs

The cost of lighting an indoor or outdoor area depends on factors including: (1) initial installation, (2) maintenance, and (3) operating costs (energy usage). Initial installation costs are lowest for incandescent lighting, followed by fluorescent lighting, and then by HID lamps. All incandescent lamps can be connected directly to a voltage supply. They are available for alternating-current supply voltages of 240, 120, and 24 VAC and direct-current 12 VDC, with no need for electrical ballasting or high-voltage starting circuits. All that is required is a suitably designed luminaire that directs the lamp illumination into the desired beam pattern. Some incandescent lamps are pre-focused with built-in luminaires to produce spot or flood beam coverage. Fluorescent lamps are installed in diffuse light reflectors and require only an igniter and a simple ballast for starting and running. HID lamps require more complex ballast networks, which are more expensive, larger and bulkier, consume electrical power, and add to installation and operating costs. All lamps and lamp fixtures are designed for easy lamp replacement.
FIGURE 3-16 Single LED and LED array beam output characteristics: beam dispersions of roughly 20° to 50° over ranges on the order of 60–130 ft; spectral output peaks at 880 and 950 nm, just beyond the visible spectrum (380–780 nm)

Fluorescent and HID lamps that have ballast modules and high-voltage starting circuits require additional maintenance, since these components will fail sometime during the lifetime of the installation. Table 3-3 compares the common lamp types, including the deep-red and IR LEDs.

3.5.1.1 Operating Costs

The energy efficiency of the illumination system must be considered in a video security system. Translated into dollars and cents, this relates to the number of lumens of light output per kilowatt of energy input that additional lighting might cost, or that could be saved if an LLL ICCD video camera or thermal IR camera were installed. The amount of light available directly affects the quality and quantity of intelligence on the video monitor. If the lighting already exists on the premises, the security professional must determine quantitatively whether the lamp type is suitable and the amount of lighting is sufficient. The result of a site survey will determine whether more lighting must be added. Computer design programs are available to calculate the location and size of the lamps necessary to illuminate an area to a specified number of FtCds. If adding lighting is an option, the analysis will compare that cost with the cost of installing more sensitive and expensive video cameras. If the video security system includes color cameras, the choice of lighting becomes even more critical. All color cameras require a higher level of lighting than their monochrome counterparts.
To produce a color image having a signal-to-noise ratio (noise-free picture) as good as a monochrome system's, as much as ten times more lighting is required. To obtain faithful color reproduction of facial tones, objects, and other articles in the scene, the light sources chosen or already installed must produce enough of these colors for the camera to detect and balance them. Since many different generic lighting types are currently installed in industrial and public sites, the security professional must be knowledgeable about the spectral output of such lights.

Table 3-3  Comparison of Lamp Characteristics

                                        EFFICACY*
                           SPECTRAL     (LUMENS/WATT)     LIFETIME         POWER RANGE   WARM-UP/RESTRIKE
  TYPE                     OUTPUT       INITIAL   MEAN    (HOURS)          (WATTS)       (MINUTES)
  MERCURY                  BLUE-GREEN   32–63     25–43   16,000–24,000    50–1,000      5–7 / 3–6
  HIGH PRESSURE SODIUM     YELLOW-WHITE 64–140    58–126  20,000–24,000    35–1,000      3–4 / 1
  METAL ARC: METAL HALIDE, GREEN-YELLOW 80–115    57–92   10,000–20,000    175–1,000     2–4 / 10–15
    MULTI-VAPOR
  FLUORESCENT              WHITE        74–100    49–92   12,000–20,000    28–215        IMMEDIATE
  INCANDESCENT: TUNGSTEN,  YELLOW-WHITE 17–24     15–23   750–1,000;       100–1,500     IMMEDIATE
    TUNGSTEN HALOGEN                                      2,000

  * Referred to as efficacy in lighting (lumens/watt)

Table 3-4  Lamp Cost vs. Lamp Type over Rated Life

  TYPE                       LIFETIME (HOURS)   INITIAL COST   OPERATING COST   TOTAL OWNING AND
                                                                                OPERATING COST
  MERCURY                    16,000–24,000      HIGH           MEDIUM           MEDIUM
  HIGH PRESSURE SODIUM       20,000–24,000      HIGH           LOW              LOW
  METAL ARC: METAL HALIDE,   10,000–20,000      HIGH           LOW              LOW
    MULTI-VAPOR
  FLUORESCENT                12,000–20,000      MEDIUM         MEDIUM           MEDIUM
  INCANDESCENT: TUNGSTEN     750–1,000;         LOW            HIGH             HIGH
    HALOGEN, TUNGSTEN          2,000

Since lamp operating costs often exceed the initial installation and maintenance costs combined, it is important to know the efficacy of each lamp type. To appreciate the significant differences in operating costs for the different lamp types, Table 3-4 compares the relative costs of owning and operating each lamp over its rated life.
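The operating-cost comparison can be sketched numerically. The efficacies below are representative mid-range values drawn from the ranges in Table 3-3; the electricity rate and the 4,000 operating hours per year are assumed figures for illustration only.

```python
# Hedged sketch of annual lamp energy cost for equal light output.
# Efficacies are illustrative mid-range values from Table 3-3;
# the electricity rate and annual hours are assumptions.

HOURS_PER_YEAR = 4000   # assumed annual operating hours
RATE_PER_KWH = 0.12     # assumed electricity cost, USD/kWh

def watts_for_lumens(lumens: float, efficacy_lm_per_w: float) -> float:
    """Electrical input power needed to deliver a given luminous flux."""
    return lumens / efficacy_lm_per_w

def annual_cost(watts: float) -> float:
    """Annual energy cost of running a lamp at the given input power."""
    return watts / 1000 * HOURS_PER_YEAR * RATE_PER_KWH

# Compare lamp types all delivering the same 16,000 lumens:
for name, efficacy in [("high-pressure sodium", 100),
                       ("fluorescent", 75),
                       ("mercury HID", 45),
                       ("incandescent", 20)]:
    w = watts_for_lumens(16_000, efficacy)
    print(f"{name:22s} {w:6.0f} W -> ${annual_cost(w):7.2f}/yr")
```

Running the sketch makes the text's point concrete: for the same delivered lumens, the low-efficacy incandescent lamp costs several times more per year to power than the high-pressure sodium lamp.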
For the various models of incandescent, mercury vapor (HID), fluorescent, and high-pressure sodium lamps, lamp life in hours is compared with input power and operating cost in kilowatt-hours (kWh), based on 4000 hours of annual operation. The comparisons are made for lamps used in different applications, including dusk-to-dawn lighting, wall-mounted aerial lighting, and floodlighting. In each application there is a significant saving in operational (energy) costs for the high-pressure sodium and fluorescent lamps as compared with the mercury vapor and standard incandescent lamps. Choosing the more efficient lamp over the less efficient one can cut operational costs by a factor of two or three, depending on the cost of electricity in a particular location.

3.5.1.2 Lamp Life

Lamp life plays a significant role in determining the cost efficiency of different light sources. Actual lamp replacement costs and labor costs must be considered, as well as the additional risk of interrupted security due to unavailable lighting. Table 3-5 summarizes the average lamp life in hours for most lamp types in use today. At the top of the list are the high- and low-pressure sodium lamps and the HID mercury vapor lamp, each providing approximately 24,000 hours of average lamp life. Next, some fluorescent lamp types have a life of 10,000 hours. At the bottom of the list are the incandescent and quartz-halogen lamps, with rated lives of approximately 1000–2000 hours.

Table 3-5  Lamp Life vs. Lamp Type

  TYPE                       LIFETIME (HOURS)           POWER IN (WATTS)    LUMENS OUT (lm)
  MERCURY                    24,000                     100 / 250 / 1,000   4,100 / 12,100 / 57,500
  HIGH PRESSURE SODIUM       24,000                     50 / 150 / 1,000    4,000 / 16,000 / 140,000
  METAL ARC: METAL HALIDE,   7,500 / 20,000 / 3,000     175 / 400 / 1,500   14,000 / 34,000 / 155,000
    MULTI-VAPOR
  FLUORESCENT                18,000 / 12,000 / 10,000   30 / 60 / 215       1,950 / 5,850 / 15,000
  INCANDESCENT: TUNGSTEN,    2,000                      250                 4,000
    TUNGSTEN HALOGEN
If changing lamps is inconvenient or costly, high-pressure sodium lamps should be used in place of incandescent types. Using high-pressure sodium rather than tungsten will save 12 trips to the site to replace a defective lamp, and having 12 fewer burned-out lamps will reduce the amount of time the video surveillance system is down. High-pressure sodium lamps, however, will not produce good color rendering.

Lamp designs require specification of wattage, voltage, bulb type, base type, efficacy, lumen output, color temperature, life, operating cost, and other special features. Color temperature, power input, and life ratings of a lamp are closely related and cannot be varied independently. For a given wattage, the lumen output and the color temperature decrease as the life expectancy increases. In incandescent lamps, filament power (watts) is roughly proportional to the fourth power of filament temperature, so a lamp operated below its rated voltage has a longer life. A rule of thumb: filament life is doubled for each 5% reduction in voltage; conversely, filament life is halved for each 5% increase in voltage.

3.5.2 Security Lighting Levels

In addition to the lamp parameters and energy requirements, the size and shape of the luminaire, the spacing between lamps, and the height of the lamp above the illuminated surface must be considered. Although each video application has special illumination requirements, primary responsibility for lighting is usually left to architects or illumination engineers. Different lighting designs are needed to provide adequate lighting in industrial security or safety environments such as building hallways, stairwells, outdoor perimeters, and parking lot facilities. Table 3-6 tabulates recommended light-level requirements for locations including parking lots, passenger platforms, building exteriors, and pedestrian walkways.
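The filament-life rule of thumb quoted above (life doubles for each 5% voltage reduction, halves for each 5% increase) can be written as a simple power law; the sketch below is just that rule made explicit, not an additional lamp model.

```python
# Rule of thumb from the text: incandescent filament life doubles for
# each 5% reduction in operating voltage and halves for each 5% increase,
# i.e. life multiplier = 2 ** (-delta_pct / 5).

def life_factor(voltage_change_pct: float) -> float:
    """Multiplier on rated filament life for a given % voltage change."""
    return 2 ** (-voltage_change_pct / 5.0)

print(life_factor(-5))   # 5% under-voltage:  2x rated life
print(life_factor(+5))   # 5% over-voltage:   0.5x rated life
print(life_factor(-10))  # 10% under-voltage: 4x rated life
```

For example, a 2000-hour tungsten-halogen lamp run 10% under rated voltage would, by this rule, last roughly 8000 hours, at the cost of reduced lumen output and color temperature, as the text notes.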
The video system designer or security director often has no option to increase or change installed lighting and must first determine whether the lighting is sufficient for the CCTV application, then make a judicious choice of CCTV camera to obtain a satisfactory picture. If lighting is not sufficient, the existing lighting can sometimes be augmented by "fill-in" lighting at selected locations to provide the extra illumination needed by the camera. Chapters 4, 5, and 19 cover video lenses, cameras, and LLL cameras respectively, and offer some options for video equipment when sufficient lighting is not available.

Table 3-6  Recommended Light Levels for Typical Security Applications

                                                              LIGHT LEVEL
  LOCATION                                         TYPE       FtCd        lux
  PARKING AREA                                     INDOOR     5–50        50–500
  LOADING DOCKS                                    INDOOR     20          200
  GARAGES: REPAIR                                  INDOOR     50–100      500–1000
  GARAGES: ACTIVE TRAFFIC                          INDOOR     10–20       100–200
  PRODUCTION/ASSEMBLY AREA:
    ROUGH MACHINE SHOP/SIMPLE ASSY.                INDOOR     20–50       200–500
    MEDIUM MACHINE SHOP/MODERATE DIFFICULT ASSY.   INDOOR     50–100      500–1000
    DIFFICULT MACHINE WORK/ASSY.                   INDOOR     200–500     2000–5000
    FINE BENCH/MACHINE WORK, ASSY.                 INDOOR     200–500     2000–5000
  STORAGE ROOMS/WAREHOUSES:
    ACTIVE, LARGE/SMALL                            INDOOR     15–30       150–300
    INACTIVE                                       INDOOR     5           50
  STORAGE YARDS                                    OUTDOOR    1–20        10–200
  PARKING, OPEN (HIGH–MEDIUM ACTIVITY)             OUTDOOR    1–2         10–20
  PARKING, COVERED (PARKING, PEDESTRIAN AREA)      OUTDOOR    5           50
  PARKING ENTRANCES                                OUTDOOR    5–50        50–500

  NOTE: 1 FtCd equals approximately 10 lux.

3.5.3 High-Security Lighting

Lighting plays a key role in maintaining high security in correctional facilities. Lighting hardware requires special fixtures to ensure survival under adverse conditions. High-security lamps and luminaires are designed specifically to prevent vandalism and are often manufactured with high-impact molded polycarbonate enclosures to withstand vandalism and punishing weather conditions without breakage or loss of lighting efficiency (Figure 3-17). These luminaires are designed to house incandescent, HID, and other lamp types to provide the necessary light intensity and the full spectrum of color rendition required for monochrome and color video security systems. Most fixtures feature tamper-proof screws that prevent the luminaire from being opened by unauthorized personnel.

FIGURE 3-17 High-security luminaires: (A) high-pressure sodium, (B) fluorescent; unbreakable high-impact polycarbonate prismatic diffuser, cast or welded aluminum housing protecting the electrical components, and tamper-proof screws; polycarbonate shows the highest comparative impact resistance (notched Izod test) against acrylic, polystyrene, butyrate, and impact-resistant ABS

For indoor applications, high-impact polycarbonate fluorescent lamp luminaires offer a good solution. The molded polycarbonate lenses have molded tabs that engage special slots in the steel-backed plate and prevent the luminaire from being opened, thereby minimizing exposure of the fluorescent lamps to vandalism. Applications include prison cells, juvenile-detention facilities, high-security hospital wards, parking garages, public housing hallways, stairwells, and underground tunnels.

3.6 SUMMARY

The quality of the final video picture and the intelligence it conveys depend heavily on the natural and/or artificial light sources illuminating the scene. For optimum results, an analysis of the lamp parameters (spectrum, illumination level, beam pattern) must be made and matched to the spectral and sensitivity characteristics of the camera. Color systems require careful analysis when they are used with natural illumination during daylight hours and with broad-spectrum color-balanced artificial illumination sources. Using multiple light sources having different color balances in the same scene can produce poor color rendition in the video image.
If the illumination level is marginal, measure it with a light meter (Chapter 25) to quantify the actual light reaching the camera from the scene. If there is insufficient light for the standard solid-state video camera, augment the lighting with additional fill-in sources or choose a more sensitive ICCD camera (Chapter 19). As with the human eye, lighting holds the key to clear sight.

Chapter 4
Lenses and Optics

CONTENTS

4.1 Overview
4.2 Lens Functions and Properties
    4.2.1 Focal Length and Field of View
        4.2.1.1 Field-of-View Calculations
            4.2.1.1.1 Tables for Scene Sizes vs. FL for 1/4-, 1/3-, and 1/2-Inch Sensors
            4.2.1.1.2 Tables for Angular FOV vs. FL for 1/4-, 1/3-, and 1/2-Inch Sensor Sizes
        4.2.1.2 Lens and Sensor Formats
    4.2.2 Magnification
        4.2.2.1 Lens–Camera Sensor Magnification
        4.2.2.2 Monitor Magnification
        4.2.2.3 Combined Camera and Monitor Magnification
    4.2.3 Calculating the Scene Size
        4.2.3.1 Converting One Format to Another
    4.2.4 Calculating Angular FOV
    4.2.5 Lens Finder Kit
    4.2.6 Optical Speed: f-number
    4.2.7 Depth of Field
    4.2.8 Manual and Automatic Iris
        4.2.8.1 Manual Iris
        4.2.8.2 Automatic-Iris Operation
    4.2.9 Auto-Focus Lens
    4.2.10 Stabilized Lens
4.3 Fixed Focal Length Lens
    4.3.1 Wide-Angle Viewing
    4.3.2 Narrow-Angle Telephoto Viewing
4.4 Vari-Focal Lens
4.5 Zoom Lens
    4.5.1 Zooming
    4.5.2 Lens Operation
    4.5.3 Optical Speed
    4.5.4 Configurations
    4.5.5 Manual or Motorized
    4.5.6 Adding a Pan/Tilt Mechanism
    4.5.7 Preset Zoom and Focus
    4.5.8 Electrical Connections
    4.5.9 Initial Lens Focusing
    4.5.10 Zoom Pinhole Lens
    4.5.11 Zoom Lens–Camera Module
    4.5.12 Zoom Lens Checklist
4.6 Pinhole Lens
    4.6.1 Generic Pinhole Types
    4.6.2 Sprinkler Head Pinhole
    4.6.3 Mini-Pinhole
4.7 Special Lenses
    4.7.1 Panoramic Lens (360°)
    4.7.2 Fiber-Optic and Bore Scope Optics
    4.7.3 Bi-Focal, Tri-Focal Image Splitting Optics
    4.7.4 Right-Angle Lens
    4.7.5 Relay Lens
4.8 Comments, Checklist and Questions
4.9 Summary

4.1 OVERVIEW

The function of the
camera lens is to collect the reflected light from a scene and focus it onto a camera sensor. Choosing the proper lens is very important, since its choice determines the amount of light received by the camera sensor, the FOV displayed on the monitor, and the quality of the image displayed. Understanding the characteristics of the available lenses and following a step-by-step design procedure simplifies the task and ensures an optimum design.

A CCTV lens functions like the human eye. Both collect light reflected from a scene or emitted by a luminous light source and focus the object scene onto some receptor—the retina or the camera sensor. The human eye has a fixed-focal-length (FFL) lens and a variable iris diaphragm, which compares to an FFL, automatic-iris video lens. The eye has an iris that opens and closes just like an automatic-iris camera lens and automatically adapts to changes in light level. The iris—whether in the eye or in the camera—optimizes the light level reaching the receptor, thereby providing the best possible image. The iris in the eye is a muscle-controlled membrane; the automatic iris in a video lens is a motorized device.

Of the many different kinds of lenses used in video security applications the most common is the FFL lens, which is available in wide-angle (90°), medium-angle (40°), and narrow-angle (5°) FOVs. To cover a wide scene and also obtain a close-up (telephoto) view with the same camera, a variable-FOV vari-focal or zoom lens is used. The vari-focal lens is used to “fine tune” the focal length (FL) to a specific FL for the application. To further increase the camera’s FOV, a zoom lens mounted on a pan/tilt platform is used. The pinhole lens is used for covert video surveillance applications since it has a small front diameter and can easily be hidden. There are many other specialty lenses, including split-image, fiber-optic, right-angle, and automatic-focus types.
A relatively new lens—the panoramic 360° lens—is used to obtain a 360° horizontal by up to 90° vertical FOV. This lens must be used with a digital computer and software algorithm to make use of the donut-shaped image it produces on the camera sensor. The software converts the image to a 360° panoramic display.

4.2 LENS FUNCTIONS AND PROPERTIES

A lens focuses an image of the scene onto the CCTV camera sensor (Figure 4-1). The sensor can be a CCD, CMOS, ICCD, or thermal IR imager. The lens in a human and a camera have some similarities: both collect light and focus it onto a receptor (Figure 4-2). They have one important difference: the human lens has one FFL and the retina is one size, but the camera lens may have many different FLs and the sensor may come in different sizes. The unaided human eye is limited to seeing a fixed and constant FOV, whereas the video system can be modified to obtain a range of FOVs. The eye has an automatic-iris diaphragm to optimize the light level reaching the retina. The camera lens has an iris (either manual or automatic) to control the light level reaching the sensor (Figure 4-3).
[Figure 4-1 CCTV camera/lens, scene, and source illumination: a natural or artificial illumination source lights the scene; reflected light from the scene enters the C- or CS-mount lens and is focused onto the CCD, CMOS, or IR camera sensor. The lens FOV is defined by the scene's horizontal width (H), vertical height (V), and diagonal (D); H × V is the camera sensor FOV.]

[Figure 4-2 Comparing the human eye to a CCTV lens and camera sensor: both image the scene onto a receptor (the retina or the camera sensor). The eye lens focal length is approximately 17 mm (0.67 inch), with eye magnification = 1; camera sensor formats shown are 2/3, 1/2, 1/3, and 1/4 inch.]

[Figure 4-3 Comparing the human eye and CCTV camera lens iris: the iris is nearly closed when viewing a bright scene (sun), half closed when viewing a normal scene (indoors), and wide open when viewing a dark scene (nighttime). In the CCTV automatic iris, metal leaves open and close by moving the lens iris ring, driven by a motor and gear.]

4.2.1 Focal Length and Field of View

In the human eye, magnification and FOV are set by the lens FL and retina size. When the human eye and the video camera lens and sensor see the same basic picture, they are said to have the same FOV and magnification. In practice, a lens that has an FL and FOV similar to that of the human eye is referred to as a normal lens with a magnification M = 1. The human eye's focal length—the distance from the center of the lens at the front of the eye to the retina in the back of the eye—is about 17 mm (0.67 inch) (Figure 4-2). Most people see approximately the same FOV and magnification (M = 1). Specifically, the video lens and camera format corresponding to the M = 1 condition is a 25 mm FL lens on a 1-inch (diagonal) format camera, a 16 mm lens on a 2/3-inch format camera, a 12.5 mm lens on a 1/2-inch camera, an 8 mm lens on a 1/3-inch camera, and a 6 mm lens on a 1/4-inch sensor.
The 1-inch format designation was derived from the development of the original vidicon television tube, which had a nominal tube diameter of 1 inch (25.4 mm) and an actual scanned area (active sensor size) of approximately 16 mm in diameter. Figure 4-4 shows the FOV as seen with a lens having magnifications of 1, 3, and 1/3 respectively.

[Figure 4-4 Lens FOV for magnifications of 3 (narrow angle), 1 (normal), and 1/3 (wide angle), as displayed on the monitor.]

Lenses with much shorter FL used with these sensors are referred to as wide-angle lenses, and lenses with much longer FL are referred to as narrow-angle (telephoto) lenses. Between these two are medium-FL lenses. Telephoto lenses used with video cameras act like a telescope: they magnify the image viewed, narrow the FOV, and effectively bring the object of interest closer to the eye. While there is no device similar to the telescope for the wide-angle example, if there were, the device would broaden the FOV, allowing the eye to see a wider scene than normal and at the same time causing objects to appear farther away from the eye. One can see this condition when looking through a telescope backwards. This also occurs with the automobile passenger side-view mirror, a convex mirror that causes the scene image to appear farther away, and therefore smaller than it actually is (de-magnified).

Just as your own eyes have a specific FOV—the scene you can see—so does the video camera. The camera FOV is determined by the simple geometry shown in Figure 4-5. The scene has a width (W) and a height (H) and is at a distance (D) away from the camera lens. Once the scene has been chosen, three factors determine the correct FL lens to use: (1) the size of the scene (H × W), (2) the distance between the scene and camera lens (D), and (3) the camera image sensor size (1/4-, 1/3-, or 1/2-inch format).
[Figure 4-5 Camera/lens sensor geometry and formats: the tube sensor scans an area 4 units wide by 3 units high; solid-state (CCD) sensors have a horizontal width (h), vertical height (v), and diagonal (d). The camera sensor FOV projects through the lens onto a scene of width W and height H at distance D from the lens.]

SENSOR SIZE

  FORMAT   DIAGONAL (d)     HORIZONTAL (h)   VERTICAL (v)
           mm      inch     mm      inch     mm      inch
  1"       16      0.63     12.8    0.50     9.6     0.38
  2/3"     11      0.43     8.8     0.35     6.6     0.26
  1/2"     8       0.31     6.4     0.25     4.8     0.19
  1/3"     6       0.24     4.8     0.19     3.6     0.14
  1/4"     4       0.16     3.2     0.13     2.4     0.10
  1/6"     3       0.12     2.4     0.09     1.8     0.07

4.2.1.1 Field-of-View Calculations

There are many tables, graphs, nomographs, and linear and circular slide rules for determining the angles and sizes of a scene viewed at varying distances by a video camera with a given sensor format and FL lens. One convenient aid in the form of transparent circular scales, called a "Lens Finder Kit," eliminates the calculations required to choose a video camera lens (Section 4.2.5). Such kits are based on the simple geometry shown in Figure 4-6. Since light travels in straight lines, the action of a lens can be drawn on paper and easily understood. Bear in mind that while commercial video lenses are constructed from multiple lens elements, the single lens shown in Figure 4-6 for the purpose of calculation has the same effective FL as the video lens. By simple geometry, the scene size viewed by the sensor is inversely proportional to the lens FL. Shown in Figure 4-6 is a camera sensor of horizontal width (h) and vertical height (v). For a 1/2-inch CCD sensor, this corresponds to h = 6.4 mm and v = 4.8 mm. The lens FL is the distance behind the lens at which the image of a distant object (scene) focuses. The figure shows the projected area of the sensor on the scene at some distance D from the lens. Using the eye analogy, the sensor and lens project a scene W wide × H high (the eye sees a circle, as did the original vidicon).
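The sensor dimensions in Figure 4-5, together with the projection geometry just described, are enough to pick a focal length for a required scene width. A sketch in Python (the lookup values are transcribed from the table above; the function name is illustrative, not from the text):

```python
# Sensor format geometry from Figure 4-5: format -> (diagonal, horizontal, vertical) in mm.
SENSOR_MM = {
    "1":   (16.0, 12.8, 9.6),
    "2/3": (11.0, 8.8, 6.6),
    "1/2": (8.0, 6.4, 4.8),
    "1/3": (6.0, 4.8, 3.6),
    "1/4": (4.0, 3.2, 2.4),
    "1/6": (3.0, 2.4, 1.8),
}

def required_focal_length_mm(fmt, scene_width, distance):
    """Similar triangles: the sensor width h projects to scene width W at
    distance D, so FL = h * D / W (W and D in the same units)."""
    _, h, _ = SENSOR_MM[fmt]
    return h * distance / scene_width
```

For the lobby example used later in the chapter (a 30-foot-wide scene at 30 feet with a 1/4-inch format camera), this gives 3.2 mm; the text's choice of the nearest standard lens, 3.6 mm FL, sees a slightly narrower scene.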
As with the human eye, the video lens inverts the image; the human brain re-inverts it, and in the camera the electronics re-invert it to provide an upright image. Figure 4-6 shows how to measure or calculate the scene size (W × H) as detected by a rectangular video sensor format and lens with horizontal and vertical angular FOVs θH and θV, respectively.

4.2.1.1.1 Tables for Scene Sizes vs. FL for 1/4-, 1/3-, and 1/2-Inch Sensors

Tables 4-1, 4-2, and 4-3 give scene-size values for the 1/4-, 1/3-, and 1/2-inch sensors, respectively, as a function of the distance from the camera to the object and the lens FL. The tables include scene sizes for most available lenses ranging from 2.1 to 150 mm FL. To find the horizontal scene width W, we use the geometry of similar triangles:

    h/FL = W/D,  so  W = (h × D)/FL    (4-1)

The horizontal angular FOV θH is then derived as follows:

    tan(θH/2) = h/(2 × FL)
    θH/2 = tan⁻¹(h/(2 × FL))
    θH = 2 tan⁻¹(h/(2 × FL))    (4-2)

[Figure 4-6 Sensor, lens and scene geometry: a sensor of horizontal width h and vertical height v behind a fixed focal length lens (FL) at the camera location views a scene of width W and height H at distance D; θH and θV are the horizontal and vertical angles of view.]

1/4-INCH SENSOR FORMAT LENS GUIDE
Width and height of area (W × H) in feet at camera-to-scene distance D in feet.

  FL (mm) | FOV H×V (deg) | D=5       | D=10      | D=20    | D=30      | D=40    | D=50     | D=75     | D=100
  2.1     | 81.2×60.9     | 8.6×6.4   | 17×12.9   | 34×26   | 51×39     | 69×51   | 86×64    | 129×96   | 171×129
  2.2     | 78.6×59.0     | 8.2×6.1   | 16.4×12.2 | 33×25   | 49×37     | 65×49   | 82×63    | 123×92   | 164×123
  2.3     | 76.1×57.1     | 7.8×5.9   | 15.6×11.8 | 31×23   | 47×35     | 62×47   | 78×59    | 117×86   | 157×117
  2.6     | 69.4×52       | 6.9×5.2   | 13.9×10.4 | 28×21   | 42×31     | 55×42   | 69×52    | 104×78   | 138×104
  3.0     | 61.9×46.4     | 6.0×4.5   | 12×9      | 24×18   | 36×27     | 48×36   | 60×45    | 90×68    | 120×90
  3.6     | 53.1×39.8     | 5.0×3.8   | 10×7.5    | 20×15   | 30×23     | 40×30   | 50×38    | 75×57    | 100×76
  3.8     | 50.7×38.0     | 4.7×3.6   | 9.5×7.1   | 19×14   | 28×21     | 38×28   | 47×36    | 71×54    | 94×72
  4.0     | 48.5×36.4     | 4.5×3.4   | 9×6.8     | 18×14   | 27×20     | 36×27   | 45×34    | 68×51    | 90×68
  4.3     | 45.4×34.1     | 4.2×3.1   | 8.4×6.3   | 16.7×12.5 | 25×19   | 33×25   | 42×31    | 63×47    | 84×62
  6.0     | 33.4×25.0     | 3×2.3     | 6×4.5     | 12×9    | 18×13.5   | 24×18   | 30×23    | 45×35    | 60×46
  8.0     | 25.4×19.0     | 2.3×1.7   | 4.5×3.4   | 9×6.8   | 13.5×10.1 | 18×13.5 | 23×17    | 35×26    | 46×34
  12.0    | 17.1×12.8     | 1.5×1.1   | 3×2.2     | 6×4.4   | 9.0×6.8   | 12×9    | 15×11    | 23×17    | 30×23
  16.0    | 12.8×9.6      | 1.1×0.8   | 2.3×1.7   | 4.5×3.4 | 6.8×5.1   | 9×6.8   | 11.2×8.4 | 17×13    | 22×17
  25.0    | 8.2×6.2       | 0.72×0.54 | 1.4×1.1   | 2.9×2.1 | 4.3×3.2   | 5.8×4.3 | 7.2×5.4  | 10.8×8.1 | 14.4×10.8

NOTE: 1/4-INCH LENSES ARE DESIGNED FOR 1/4-INCH SENSOR FORMATS ONLY AND WILL NOT WORK ON 1/3-INCH OR 1/2-INCH SENSORS. LENS FOCAL LENGTHS ARE NOMINAL PER MANUFACTURERS' LITERATURE. ANGULAR FOV AND W × H ARE DERIVED FROM EQUATIONS 4-1 TO 4-4 AND VERTICAL FOV FROM STANDARD 4:3 MONITOR RATIO: V = 0.75H.

Table 4-1 1/4-Inch Sensor FOV and Scene Sizes vs. FL and Camera-to-Scene Distance

1/3-INCH SENSOR FORMAT LENS GUIDE
Width and height of area (W × H) in feet at camera-to-scene distance D in feet.

  FL (mm) | FOV H×V (deg) | D=5       | D=10      | D=20      | D=30      | D=40      | D=50      | D=75      | D=100
  2.3     | 92.4×69.3     | 10.4×7.8  | 20.8×15.6 | 41.6×31.2 | 63×47     | 83×62     | 104×78    | 156×117   | 208×156
  2.6     | 85.4×64.1     | 9.2×6.9   | 18.5×13.8 | 36.8×27.6 | 55×41     | 77×58     | 92×69     | 138×104   | 184×138
  2.8     | 81.2×60.9     | 8.6×6.5   | 17.2×13   | 34.4×26   | 51×39     | 69×52     | 86×65     | 129×98    | 172×130
  3.6     | 67.4×50.5     | 6.7×5.0   | 13.3×10   | 26.7×20   | 40×30     | 53×40     | 67×50     | 101×75    | 134×100
  3.8     | 64.6×48.4     | 6.3×4.7   | 12.6×9.5  | 25×18.9   | 37.9×28.4 | 50.5×37.9 | 63×47     | 95×71     | 123×92
  4.0     | 61.9×46.4     | 6.0×4.5   | 12×9      | 24×18     | 36×27     | 48×36     | 60×45     | 90×68     | 120×90
  4.5     | 56.1×42.1     | 5.3×4.0   | 10.6×8    | 21.2×15.9 | 31.8×23.9 | 42.4×31.8 | 53×40     | 80×60     | 106×80
  6.0     | 43.6×32.7     | 4.0×3.0   | 8.0×6     | 16×12     | 24×18     | 32×24     | 40×30     | 60×45     | 80×60
  8.0     | 33.4×25.0     | 3.0×2.3   | 6×4.5     | 12×9      | 18×13.5   | 24×18     | 30×22.5   | 45×34     | 60×45
  12.0    | 26.6×20.0     | 2.0×1.5   | 4.0×3.0   | 8.0×6.0   | 12.0×9.0  | 16×12     | 20×15     | 30×23     | 40×30
  16.0    | 17.1×12.8     | 1.5×1.2   | 3.0×2.3   | 6.0×4.5   | 9.0×6.8   | 12.0×9.0  | 15.0×11.3 | 23×17     | 30×22.5
  25.0    | 11.0×8.2      | 0.96×0.72 | 1.9×1.4   | 3.8×2.9   | 5.8×4.4   | 7.7×5.8   | 9.6×7.2   | 14.4×10.8 | 19.2×14.4
  50.0    | 5.5×4.1       | 0.48×0.36 | 0.96×0.72 | 1.9×1.4   | 2.9×2.2   | 3.8×2.8   | 4.8×3.6   | 7.2×5.4   | 9.6×7.2
  75.0    | 3.7×2.8       | 0.32×0.24 | 0.64×0.50 | 1.3×0.96  | 1.9×1.4   | 2.6×1.9   | 3.2×2.4   | 4.8×3.6   | 6.4×4.8

NOTE: MOST 1/3-INCH LENSES WILL NOT WORK ON 1/2-INCH SENSORS BUT ALL WILL WORK ON ALL 1/4-INCH SENSORS. LENS FOCAL LENGTHS ARE NOMINAL PER MANUFACTURERS' LITERATURE. ANGULAR FOV AND W × H ARE DERIVED FROM EQUATIONS 4-1 TO 4-4 AND VERTICAL FOV FROM STANDARD 4:3 MONITOR RATIO: V = 0.75H.

Table 4-2 1/3-Inch Sensor FOV and Scene Sizes vs. FL and Camera-to-Scene Distance

1/2-INCH SENSOR FORMAT LENS GUIDE
Width and height of area (W × H) in feet at camera-to-scene distance D in feet.

  FL (mm) | FOV H×V (deg) | D=5       | D=10      | D=20      | D=30     | D=40     | D=50     | D=75     | D=100
  1.4     | 133×100       | 23×17     | 46×34     | 91×69     | 137×103  | 183×137  | 228×171  | 342×257  | 457×348
  2.6     | 101.8×76.4    | 12.3×9.2  | 24.6×18   | 49×37     | 74×55    | 98×74    | 123×92   | 185×138  | 246×184
  3.5     | 84.9×63.7     | 9.1×6.9   | 18.2×13.8 | 37×28     | 55×41    | 73×55    | 91×69    | 137×104  | 182×138
  3.6     | 83.3×62.5     | 8.9×6.7   | 17.8×13.4 | 36×27     | 53×40    | 71×53    | 89×67    | 134×101  | 178×134
  3.7     | 81.7×61.3     | 8.6×6.5   | 17.2×13.0 | 35×26     | 52×39    | 69×52    | 86×65    | 129×98   | 172×130
  4.0     | 77.3×58.0     | 8.0×6.0   | 16.0×12.0 | 32×24     | 48×36    | 64×48    | 80×60    | 120×90   | 160×120
  4.2     | 74.6×56.0     | 7.6×5.7   | 15.2×11.4 | 30×23     | 46×34    | 61×46    | 76×57    | 114×86   | 152×114
  4.5     | 70.8×53.1     | 7.1×5.3   | 14.2×10.6 | 28×21     | 43×32    | 57×43    | 71×53    | 107×80   | 142×107
  4.8     | 67.4×50.5     | 6.7×5.0   | 13.4×10.0 | 27×20     | 40×30    | 53×40    | 67×50    | 101×75   | 134×100
  6.0     | 56.1×42.1     | 5.3×4.0   | 10.6×8.0  | 21×16     | 32×24    | 43×32    | 53×40    | 80×60    | 106×80
  7.5     | 46.2×34.7     | 4.3×3.2   | 8.6×6.4   | 17.1×12.8 | 26×19    | 34×26    | 43×32    | 65×48    | 86×64
  8.0     | 43.6×32.7     | 4.0×3.0   | 8.0×6.0   | 16×12     | 24×18    | 32×24    | 40×30    | 60×45    | 80×60
  12.0    | 29.9×22.4     | 2.7×2.0   | 5.3×4.0   | 10.7×8    | 16×12    | 21.3×16  | 27×20    | 40×30    | 53×40
  16.0    | 22.6×17.0     | 2.0×1.5   | 4.0×3.0   | 8×6       | 12×9     | 16×12    | 20×15    | 30×23    | 40×30
  25.0    | 14.6×10.9     | 1.3×1.0   | 2.6×2.0   | 5.1×3.8   | 7.7×5.8  | 10.2×7.7 | 12.8×9.6 | 19×14    | 25.6×19.2
  50.0    | 7.3×5.5       | 0.64×0.48 | 1.3×1.0   | 2.6×1.9   | 3.8×2.9  | 5.1×3.8  | 6.4×4.8  | 9.6×7.2  | 12.8×9.6
  75.0    | 4.9×3.7       | 0.43×0.32 | 0.85×0.64 | 1.7×1.3   | 2.6×1.9  | 3.4×2.6  | 4.3×3.2  | 6.5×4.8  | 8.6×6.4
  150.0   | 2.4×1.8       | 0.21×0.16 | 0.43×0.32 | 0.85×0.64 | 1.3×0.96 | 1.7×1.3  | 2.1×1.6  | 3.2×2.4  | 4.3×3.2

NOTE: ALL 1/2-INCH FORMAT LENSES WILL WORK ON 1/3- AND 1/4-INCH SENSORS. LENS FOCAL LENGTHS ARE NOMINAL PER MANUFACTURERS' LITERATURE. ANGULAR FOV AND W × H ARE DERIVED FROM EQUATIONS 4-1 TO 4-4.

Table 4-3 1/2-Inch Sensor FOV and Scene Sizes vs. FL and Camera-to-Scene Distance

For the vertical FOV, similar triangles give:

    v/FL = H/D,  so  H = (v × D)/FL    (4-3)

The vertical angular FOV θV is then derived from the geometry:

    tan(θV/2) = v/(2 × FL)
    θV/2 = tan⁻¹(v/(2 × FL))
    θV = 2 tan⁻¹(v/(2 × FL))    (4-4)

4.2.1.1.2 Tables for Angular FOV vs. FL for 1/4-, 1/3-, and 1/2-Inch Sensor Sizes

Table 4-4 shows the angular FOV obtainable with 1/4-, 1/3-, 1/2-, and 2/3-inch sensors with some standard lenses from 1.4 to 150 mm FL. The values of angular FOV in Table 4-4 can be calculated from Equations 4-2 and 4-4.

4.2.1.2 Lens and Sensor Formats

Fixed focal length lenses must be used with either the image sensor size (format) for which they were designed or with a smaller sensor size. They cannot be used with larger sensor sizes because unacceptable image distortion and image darkening (vignetting) at the edges of the image occur. When a lens manufacturer lists a lens for a 1/3-inch sensor format, it can be used on a 1/4-inch sensor but not on a 1/2-inch sensor without producing image vignetting. This problem of incorrect lens choice for a given format size occurs most often when a C or CS mount 1/3-inch format lens is incorrectly used on a 1/2-inch format camera. Since the lens manufacturer does not "over design" the lens, that is, make glass lens element diameters larger than necessary, check the manufacturer's specifications for the proper choice.

4.2.2 Magnification

The overall magnification from a specific camera, lens, and monitor depends on three factors: (1) the lens FL, (2) the camera sensor format, and (3) the monitor size (diagonal). Video magnification is analogous to film magnification: the sensor is equivalent to the film negative, and the monitor is equivalent to the photo print.

4.2.2.1 Lens–Camera Sensor Magnification

The combination of the lens FL and the camera sensor size defines the magnification Ms at the camera location. For a specific camera, the sensor size is fixed. Therefore, no matter how large the image from the lens is at the sensor, the camera will see only as much of the image as will fit onto the sensor. Lens magnification is measured relative to the eye, which is defined as a normal lens. The eye has approximately a 17-mm FL and is equivalent to a 25-mm FL lens on a 1-inch format camera sensor. In general:

    Ms = Lens focal length (mm) / Sensor diagonal (mm)

Therefore, the magnification of a 1-inch (16-mm diagonal) sensor is

    Ms (1 inch) = FL/16 mm    (4-5)

For 2/3 inch (11-mm format) the magnification is

    Ms (2/3 inch) = FL/11 mm    (4-6)

For 1/2 inch (8-mm format) the magnification is

    Ms (1/2 inch) = FL/8 mm    (4-7)

For 1/3 inch (5.5-mm format) the magnification is

    Ms (1/3 inch) = FL/5.5 mm    (4-8)

For 1/4 inch (4-mm format) the magnification is

    Ms (1/4 inch) = FL/4 mm    (4-9)

Example: From Equation 4-7, a 16-mm FL lens on a 1/2-inch format camera would have a magnification of

    Ms (1/2 inch) = FL/8 mm = 16 mm/8 mm = 2

4.2.2.2 Monitor Magnification

When the camera image is displayed on the CCTV monitor, a further magnification of the object scene takes place.
The monitor magnification Mm is equivalent to the ratio of the monitor diagonal (dm) to the sensor diagonal (ds), or

    Mm = dm/ds    (4-10)

Example: From Equation 4-10, for a 9-inch diagonal monitor (dm = 9 inches) and a 1/2-inch sensor format (ds = 8 mm = 0.315 inch),

    Mm = 9/0.315 = 28.57

[Table 4-4 Representative Fixed Lenses: Angular FOV vs. Sensor Format and Lens Focal Length. For each standard lens from 1.4 to 150 mm FL, the table lists the maximum image format (1/4 to 2/3 inch), optical speed (f/1.0 to f/1.6), lens mount type (C or CS), and the horizontal and vertical angular FOV in degrees on 1/4-, 1/3-, 1/2-, and 2/3-inch sensors; the values can be calculated from Equations 4-2 and 4-4. NOTE: ALL FOCAL LENGTHS AND ANGULAR FOVs BASED ON MANUFACTURER'S LITERATURE. ALL THE LARGER FORMAT LENSES CAN BE USED ON SMALLER FORMAT SENSORS. LENSES ARE ALSO AVAILABLE HAVING SMALLER FORMATS AND LOWER f/#s THAN THOSE LISTED.]

4.2.2.3 Combined Camera and Monitor Magnification

The combined lens, sensor, and monitor magnification is

    M = Ms × Mm    (4-11)

For the example above and Equation 4-11, the overall magnification of the 16-mm FL lens, 1/2-inch format camera, and 9-inch monitor is

    M = Ms × Mm = 2 × 28.57 = 57.14

Table 4-5 summarizes the magnification for the entire video system, for a 9- and 17-inch monitor and various lenses and camera formats. It should be noted that increasing the magnification by using a larger monitor does not increase the information in the scene; it only increases the size of the displayed picture and permits viewing the monitor from a greater distance.

4.2.3 Calculating the Scene Size

Equations 4-1 and 4-3 are used to calculate scene size. For example, calculate the horizontal and vertical scene size as seen by a 1/2-inch CCD sensor using a 12.5 mm FL lens at a distance D = 25 ft. A 1/2-inch sensor is 6.4 mm wide and 4.8 mm high. From Equation 4-1, for horizontal scene width:

    W = (h/FL) × D = (6.4 mm/12.5 mm) × 25 ft = 12.8 ft

For vertical scene height, using Equation 4-3:

    H = (v/FL) × D = (4.8 mm/12.5 mm) × 25 ft = 9.6 ft

4.2.3.1 Converting One Format to Another

To obtain scene sizes (width and height) for a 1/6-inch sensor, divide all the scene sizes in the 1/3-inch table (Table 4-2) by 2. For a 2/3-inch sensor, multiply all the scene sizes in the 1/3-inch table (Table 4-2) by 2. Understanding Tables 4-1, 4-2, and 4-3 makes it easy to choose the right lens for the required FOV coverage. As an example, choose a lens for viewing all of a building 15 feet high by 20 feet long from a distance of 40 feet with a 1/2-inch format video camera (Figure 4-7). From Table 4-3, a 12-mm FL lens will just do the job.
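The scene-size, angular-FOV, and magnification arithmetic above can be collected into a small calculator. A sketch in Python reproducing the worked examples from the text (function names are illustrative):

```python
import math

def scene_size(h_mm, v_mm, fl_mm, distance):
    """Equations 4-1 and 4-3: W = h*D/FL, H = v*D/FL (same units as distance)."""
    return (h_mm / fl_mm) * distance, (v_mm / fl_mm) * distance

def angular_fov_deg(h_mm, v_mm, fl_mm):
    """Equations 4-2 and 4-4: full horizontal and vertical angles of view."""
    return (2 * math.degrees(math.atan(h_mm / (2 * fl_mm))),
            2 * math.degrees(math.atan(v_mm / (2 * fl_mm))))

def overall_magnification(fl_mm, sensor_diag_mm, monitor_diag_in):
    """Equations 4-10 and 4-11: M = Ms * Mm."""
    ms = fl_mm / sensor_diag_mm                      # lens-sensor magnification
    mm = monitor_diag_in / (sensor_diag_mm / 25.4)   # monitor/sensor diagonal ratio
    return ms * mm

# Worked example from the text: 1/2-inch sensor (6.4 x 4.8 mm), 12.5 mm FL lens, D = 25 ft
W, H = scene_size(6.4, 4.8, 12.5, 25.0)   # 12.8 ft x 9.6 ft
```

The same helper reproduces the combined-magnification example: a 16 mm lens on a 1/2-inch (8 mm diagonal) sensor shown on a 9-inch monitor gives M ≈ 57.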
If a 1/4-inch format video camera were used, a lens with an FL of 16 mm would be needed (from Table 4-1, a scene 16.7 feet high by 22.5 feet wide would be viewed). If a 1/3-inch format video camera were used, a lens with an FL of 9 mm would be used (from Table 4-2, a scene 15.2 feet high by 20 feet wide would be viewed).

4.2.4 Calculating Angular FOV

Equations 4-2 and 4-4 are used to calculate the horizontal and vertical angular FOV of the lens–camera combination. Table 4-4 shows the angular FOV obtainable with some standard lenses from 2.6 to 75 mm focal length. For the previous example, calculate the horizontal and vertical angular FOVs θH and θV for a 1/2-inch CCD sensor using a 12.5 mm FL lens. The distance need not be supplied, since an angular measure is independent of distance. From Equation 4-2, for the horizontal angular FOV:

    tan(θH/2) = h/(2 × FL) = (6.4 mm/2)/12.5 mm = 0.256
    θH/2 = 14.4°
    θH = 28.7°

From Equation 4-4, for the vertical angular FOV:

    tan(θV/2) = v/(2 × FL) = (4.8 mm/2)/12.5 mm = 0.192
    θV/2 = 10.9°
    θV = 21.8°

Table 4-4 summarizes angular FOV values for some standard lenses from 1.4 to 150 mm FL used on the 1/4-, 1/3-, 1/2-, and 2/3-inch sensors. To obtain the angular FOV for other sensor sizes, multiply or divide the angles by the ratio of the sensor sizes. Rule of thumb: for a given lens, angular FOV increases for a larger sensor size and decreases for a smaller sensor size.

  CAMERA FORMAT          MONITOR SIZE   LENS FOCAL     TOTAL
  inch (diag. inch/mm)   (inch)         LENGTH (mm)    MAGNIFICATION
  1/6 (0.11/2.75)         9              2.4              72.7
                                         30              909.1
                         17              2.4             137.4
                                         30             1717.2
  1/4 (0.15/4.0)          9              2.6              37.3
                                         25.0            358.3
                         17              2.6              70.4
                                         25.0            676.8
  1/3 (0.22/5.5)          9              3.8              28.7
                                         50.0            377.0
                         17              3.8              54.0
                                         50.0            712.2
  1/2 (0.31/8.0)          9              4.8              17.1
                                         75.0            267.8
                         17              4.8              32.4
                                         75.0            506.0

ALL VALUES BASED ON SENSOR AND MONITOR DIAGONALS. MAGNIFICATION M = Ms × Mm, WHERE Ms = LENS FL/SENSOR DIAGONAL AND Mm = MONITOR DIAGONAL/SENSOR DIAGONAL.

EXAMPLE: 1/3-inch FORMAT SENSOR (0.22 inch = 5.5 mm DIAGONAL), 3.8 mm FL LENS, AND 17-inch MONITOR:

    M = (3.8 mm/5.5 mm) × (17 inch/0.22 inch) = 54

Table 4-5 Monitor Magnification vs. Camera/Monitor Size and Lens Focal Length

[Figure 4-7 Calculating the focal length for viewing a building. Side (elevation) view: vertical FOV θV, with v/FL = H/D, covering a scene height H = 22.5 ft at D = 40 ft. Top (plan) view: horizontal FOV θH, with h/FL = W/D, covering a scene width W = 30 ft at D = 40 ft.]

4.2.5 Lens Finder Kit

Tables and slide rules for finding lens angular FOVs abound. Over the years many charts and devices have been available to simplify the task of choosing the best lens for a particular security application. Figure 4-8 shows how to quickly determine the correct lens for an application using the Lens Finder Kit (copyright H. Kruegle). There is a separate scale for each of the three camera-sensor sizes: 1/4-, 1/3-, and 1/2-inch (the 1/4- and 1/3-inch are shown). The scale for each camera format shows the FL of standard lenses and the corresponding angular horizontal and vertical FOVs that the camera will see. To use the kit, the plastic disk is placed on the facility plan drawing and the lens FL giving the desired camera FOV coverage is chosen. For example, a 1/4-inch format camera is to view a horizontal FOV (θH) in a front lobby 30 feet wide at a distance of 30 feet from the camera (Figure 4-9). What FL lens should be used?

[Figure 4-8 Choosing a lens with the Lens Finder Kit (© H. Kruegle). The kit uses three transparent protractor disks to help choose the best lens when using the 1/4-, 1/3-, and 1/2-inch CCTV camera formats with C or CS mounts; the disks are universal and can be used on any scale drawing. How to use: (1) select the disk to match the camera format: 1/4-, 1/3-, or 1/2-inch; (2) using a scale drawing of the floor plan (any scale), place the center hole of the disk at the proposed camera location on the floor plan; (3) rotate the disk until one segment (pie section) totally includes the horizontal field of view required; (4) use the focal length lens designated in that segment on the disk; (5) if the scale drawing includes an elevation view, follow steps 1 through 4 and use the vertical angle designated in each pie segment for the vertical field of view of the lens. For 2/3- and 1/2-inch formats, multiply the 1/3- and 1/4-inch scale FOVs by 2.]

To find the horizontal angular FOV θH, draw the following lines to scale on the plan: one line to a distance 30 feet from the camera to the center of the scene to be viewed, a line 30 feet long and perpendicular to the first line, and two lines from the camera location to the endpoints of the second 30-foot line. Place the 1/4-inch Lens Finder Kit disk on the top-view (plan) drawing with its center at the camera location and choose the FL closest to the horizontal angle required. A 3.6 mm FL lens is closest. This lens will see a horizontal scene width of 30 feet. Likewise for scene height: using the side-view (elevation) drawing, the vertical scene height is 22.5 feet.

4.2.6 Optical Speed: f-number

The optical speed or f-number (f/#) of a lens defines its light-gathering ability: how much light it collects and transmits to the camera sensor. As the FL of a lens becomes longer, its optical aperture or diameter (d) must increase proportionally to keep the f-number the same.
The f-number is related to the FL and the lens diameter (clear aperture) d by the following equation:

    f/# = FL/d    (4-11)

For example, an f/2.0 lens transmits four times as much light as an f/4.0 lens. The f-number relationship is analogous to water flowing through a pipe: if the pipe diameter is doubled, four times as much water flows through it. Likewise, if the f-number is halved (i.e. if the lens diameter is doubled), four times as much light is transmitted through the lens. In practice the transmission obtained is worse than this because of various losses caused by imperfect lens transmission, reflection, absorption, and other lens imaging properties. The amount of light (I) collected and transmitted through the lens system varies inversely as the square of the lens f-number (K = constant):

    I = K/(f/#)²

[Figure 4-9 Determining lobby lens horizontal and vertical FOVs. Plan (top) view: horizontal FOV θH covering a wall W = 30.6 ft wide at D = 30 ft. Side view: vertical FOV θV covering H = 23.4 ft. Camera format: 1/4 inch.]

Long-FL lenses are larger (and costlier) than short-FL lenses, due to the cost of the larger optical elements. It can be seen from Equation 4-11 that the larger d is made, the smaller the f/# is, i.e. the more light gets to the camera sensor. The more light the lens can collect and transfer to the camera image sensor, the better the picture quality: a larger lens permits the camera to operate at lower light levels. This light-gathering ability depends on the size (diameter) of the optics: the larger the optics, the more light can be collected. Most human eyes have the same size lens (approximately 7 mm lens diameter). In video systems, however, the lens size (the diameter of the front lens) varies over a wide range. The optical speed of video lenses varies significantly: it varies as the square of the diameter of the lens.
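Equation 4-11 and the inverse-square light relation are easy to check numerically. A minimal sketch in Python (function names are illustrative):

```python
def f_number(focal_length_mm, aperture_mm):
    """Equation 4-11: f/# = FL / d."""
    return focal_length_mm / aperture_mm

def relative_light(f_a, f_b):
    """I varies as 1/(f/#)^2, so this is the light transmitted at f/f_a
    relative to f/f_b (losses from reflection/absorption ignored)."""
    return (f_b / f_a) ** 2
```

For example, `relative_light(2.0, 4.0)` confirms that an f/2.0 lens passes four times the light of an f/4.0 lens, as stated above.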
This means a lens having a diameter twice that of another will pass four times as much light through it. Like a garden hose, when the diameter is doubled, the flow is quadrupled (Figure 4-10). The more light passing through a lens and reaching the video sensor, the better the contrast and picture image quality. Lenses with low f-numbers, such as f/1.4 or f/1.6, pass more light than lenses with high f-numbers. The lens optical speed is related to the FL and diameter by the equation f/# = focal length/diameter, so the larger the FL for the same lens diameter, the "slower" the lens (less light reaches the sensor). A slow lens might have an f-number of f/4 or f/8. Most lenses have an iris ring, usually marked with numbers such as 1.4, 2.0, 2.8, 4.0, 5.6, 8.0, 11, 16, 22, C, representing optical speed: f-numbers, or f-stops. The difference between adjacent iris settings represents a factor of 2 in the light transmitted by the lens. Opening the lens from, say, f/2.0 to f/1.4 doubles the light transmitted. Only half the light is transmitted when the iris opening is reduced from, say, f/5.6 to f/8. Changing the iris setting by two f-numbers changes the light by a factor of 4 (or 1/4), and so on. Covering the f/# range from f/1.4 to f/22 spans a light-attenuation range of 256 to 1. The C designation on the lens indicates when the lens iris is closed and no light is transmitted. In general, faster lenses collect more light energy from the scene, are larger, and are more expensive. In calculating the overall cost of a video camera lens system, however, a more expensive, fast lens can still cost less than the alternatives: a more sensitive camera or additional installed lighting.
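The iris-ring arithmetic above is easy to verify: each marked stop halves the light, so the f/1.4-to-f/22 range spans eight halvings. A sketch in Python:

```python
# Standard f-stop markings on the iris ring, as listed in the text.
F_STOPS = [1.4, 2.0, 2.8, 4.0, 5.6, 8.0, 11, 16, 22]

def attenuation_ratio(stops):
    """Each step between adjacent f-stop markings halves the transmitted
    light, so the full ring spans 2**(number of one-stop steps)."""
    return 2 ** (len(stops) - 1)
```

With the nine markings above there are eight one-stop steps, giving the 256-to-1 attenuation range stated in the text.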
FIGURE 4-10 Light transmission through a lens (water-flow analogy: doubling the hose diameter quadruples the flow; likewise, going from f/4 to f/2 quadruples the light transmission)

4.2.7 Depth of Field

The depth of field in an optical system is the distance that an object in the scene can be moved toward or away from the lens and still be in good focus. In other words, it is the range of distance toward and away from the camera lens in which objects in the scene remain in focus. Ideally this range would be very large: say, from a few feet from the lens to hundreds of feet, so that essentially all objects of interest in the scene would be in sharp focus. In practice this is not achieved because the depth of field is: (1) inversely proportional to the focal length, and (2) directly proportional to the f-number. Medium to long FFL lenses operating at low f-numbers—say, f/1.2 to f/4.0—do not focus sharp images over their useful range of 2 or 3 feet to hundreds of feet. Long focal length lenses—say, 50–300 mm—have a short depth of field, can produce sharp images only over short distances, and must be refocused manually or automatically (auto-focus) when viewing objects at different scene distances. When these lenses are used with their iris closed down to, say, f/8 to f/16, the depth of field increases significantly and objects are in sharp focus at almost all distances in the scene. Short focal length lenses (2.7–5 mm) have a long depth of field: they can produce sharp images from a few feet to 50–100 feet even when operating at low f-numbers.

4.2.8 Manual and Automatic Iris

The lens iris is either manually or automatically adjusted to optimize the light level reaching the sensor (Figure 4-3). The manual iris is adjusted with a ring on the lens. The auto-iris uses an internal mechanism and motor (or galvanometer) to adjust the iris.

4.2.8.1 Manual Iris

The manual-iris video lens has movable metal "leaves" forming the iris.
The amount of light entering the camera is determined by rotating an external iris ring, which opens and closes these internal leaves. Figure 4-11 shows a manual-iris FFL lens and the change in light transmitted through it at different settings of the iris. Solid-state CCD and CMOS camera sensors can operate over wide light-level changes with manual-iris lenses, but require automatic-iris lenses when used over their full light-level range, that is, from bright sunlight to low-level nighttime lighting. Some solid-state cameras use electronic shuttering (Section 5.5.3) and do not require an automatic-iris lens.

4.2.8.2 Automatic-Iris Operation

Automatic-iris lenses have an electro-optical mechanism whereby the amount of light passing through the lens is adjusted depending on the amount of light available from the scene and the sensitivity of the camera. The camera video signal provides the information used for adjusting the light passing through the lens. The system works something like this: if a scene is too bright for the camera, the video signal will be strong (large in amplitude). This large signal will activate a motor or galvanometer that causes the lens iris circular opening to become smaller in diameter, thereby reducing the amount of light reaching the camera. When the amount of light reaching the camera produces a predetermined signal level, the motor or galvanometer in the lens stops and maintains that light level through the lens. Likewise, if too little light reaches the camera, the video camera signal level is small and the automatic-iris motor or galvanometer opens up the iris diaphragm, allowing more light to reach the camera.
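The feedback behavior just described can be modeled as a simple control loop (an illustrative sketch only; real auto-iris servos are analog and manufacturer-specific):

```python
def auto_iris_step(f_number, scene_light, setpoint=1.0, gain=0.2):
    """One iteration of an idealized auto-iris servo.
    Signal level falls off as 1/(f/#)^2; the servo raises the f-number
    (stops down) when the signal exceeds the setpoint and lowers it
    (opens up) when the signal is too small, within the iris limits."""
    signal = scene_light / f_number ** 2
    error = signal - setpoint          # positive -> too bright -> stop down
    return max(1.4, min(22.0, f_number * (1 + gain * error)))

f = 4.0                  # arbitrary starting iris setting
for _ in range(200):     # bright scene: the servo settles near f/8,
    f = auto_iris_step(f, scene_light=64.0)   # where 64/8**2 = setpoint
print(round(f, 1))       # 8.0
```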
In both the high and the low light level conditions the automatic-iris mechanism produces the best contrast picture.

FIGURE 4-11 Lens f/# vs. light transmission (iris markings f/1 through f/22 plus C; each one-stop increase or decrease in f/# halves or doubles the transmitted light; reference light level = 1 at f/1.4; C = lens iris fully closed, no light transmitted)

Automatic-iris lenses are available to compensate for the full range of light, from bright sunlight to darkness. There are two types of automatic-iris lenses: direct drive and video drive. With the DC (direct) drive method the camera has all the electronics and directly drives the DC motor (or galvanometer) in the lens with a positive or a negative signal to open and close the iris depending on the light level. With the video drive method the camera video signal drives the electronics in the lens, which then drives the DC motor (or galvanometer) in the lens. Figure 4-12 shows some common automatic-iris lenses. A feature available on some automatic-iris lenses, called average-peak response weighting, permits optimizing the picture still further based on the variation in lighting conditions within the scene. Scenes with high-contrast objects (bright headlights, etc.) are better compensated for by setting the automatic-iris control to peak, so that the lens ignores (compensates for) the bright spots and highlights in the scene (see Section 5.5.5). Low-contrast scenes are better compensated by setting the control to average. Figure 4-13 illustrates some actual scenes obtained when these adjustments are made. Automatic-iris lenses should only be used with cameras having a fixed video gain in their system. Automatic-iris lenses are more expensive than their manual counterparts, with the price ratio about two or three to one.
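The difference between average and peak response weighting can be illustrated on a frame of luminance values (a toy model; the 70/30 peak blend is an assumption for illustration, and actual weighting curves are manufacturer-specific):

```python
def iris_reference(luma, mode="average"):
    """Signal level the auto-iris servo regulates: the mean luminance,
    or a blend weighted toward the brightest pixels ('peak')."""
    avg = sum(luma) / len(luma)
    if mode == "average":
        return avg
    return 0.7 * max(luma) + 0.3 * avg   # assumed 70/30 peak blend

# A dark scene containing one bright headlight:
scene = [10] * 99 + [250]
print(iris_reference(scene, "average"))  # 12.4 -> iris opens, highlight blooms
print(iris_reference(scene, "peak"))     # roughly 179 -> iris stops down
```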
4.2.9 Auto-Focus Lens

Auto-focus lenses were originally developed for the consumer camcorder market and are now available to the security market (similar to the solid-state sensor evolution). There are two types of auto-focus techniques in use. One auto-focus system uses a ranging (distance-measuring) means to automatically focus the scene image onto the sensor. A second type analyzes the video signal by means of DSP electronics and forces the lens to focus on the target in the scene. The function of both these types of systems is to keep objects of interest in focus on the camera sensor even though they move toward or away from the lens. These lenses are particularly useful when a person (or vehicle) enters a camera FOV and moves toward or away from the camera. The auto-focus lens automatically changes focus from the surrounding scene to the moving object to keep it in focus. Various types of automatic-focusing techniques are used, including: (1) active IR ranging, (2) ultrasonic wave, (3) solid-state triangulation, and (4) video signal DSP.

FIGURE 4-12 Automatic-iris fixed focal length (FFL) lenses: (A) DC motor in lens, electronics in camera; (B) video electronics in camera
FIGURE 4-13 Automatic-iris enhanced video scenes: (A) high-contrast scenes optimized using medium response weighting; (B) normal-contrast scenes optimized using peak response weighting; (C) low-contrast scenes optimized using average response weighting

4.2.10 Stabilized Lens

A stabilized lens is used when it is necessary to remove unwanted motion of the lens and camera with respect to the scene being viewed. Applications for stabilized lenses include handheld cameras, cameras on moving ground vehicles, airborne platforms, ships, and cameras on towers and buildings. Stabilized lenses can remove significant image vibration (blurring) in pan/tilt-mounted cameras that are buffeted by wind or the motion caused by a moving vehicle.
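Returning briefly to auto-focus technique (4) above: a video-signal DSP system typically maximizes an image-sharpness measure while stepping the focus motor. A minimal sketch of that idea (illustrative only; the lens model and "best position 37" are invented for the example, not any manufacturer's algorithm):

```python
def sharpness(row):
    """Sum of squared differences between neighboring pixels: defocus
    blurs edges, so a focused image scores higher."""
    return sum((a - b) ** 2 for a, b in zip(row, row[1:]))

def autofocus(render, positions):
    """Search the focus motor positions, keeping the sharpest one."""
    return max(positions, key=lambda p: sharpness(render(p)))

# Toy lens model: a step edge that gets softer the farther the focus
# motor is from the (hypothetical) best position 37.
def render(pos):
    blur = abs(pos - 37) + 1
    edge = [0] * 10 + [100] * 10
    return [sum(edge[max(0, i - blur):i + blur]) / (2 * blur)
            for i in range(20)]

print(autofocus(render, range(0, 80)))  # 37
```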
The stabilized lens system has movable optical components and/or active electronics that compensate for (move in the opposite direction to) the relative motion between the camera and the scene. An extreme example of a stabilized video camera system is the video image from a helicopter. The motion compensation results in a steady, vibration-free scene image on the monitor. Figure 4-14 shows a stabilized security lens and samples of pictures taken with and without the stabilization on.

4.3 FIXED FOCAL LENGTH LENS

Video lenses come in many varieties, from the simple, small, and inexpensive mini pinhole and "bottle cap" lenses to complex, large, expensive, motorized, automatic-iris zoom lenses. Each camera application requires a specific scene to be viewed and a specific intelligence to be extracted from the scene if it is to be useful in a security application. Monitoring a small front lobby or room to see if a person is present may require only a simple lens. It is difficult, however, to determine the activity of a person 100–200 feet away in a large showroom. Apprehending a thief or thwarting a terrorist may require a high-quality, long-FL zoom lens and a high-resolution camera mounted on a pan/tilt platform. Covert cameras using pinhole lenses are often used to uncover internal theft, shoplifting, or other inappropriate or criminal activity. They can be concealed in inconspicuous locations, installed quickly, and moved on short notice. The following sections describe common and special lens types used in video surveillance applications. The majority of lenses used in video applications are FFL lenses. Most of these lenses are available with a manual iris to adjust the amount of light passing through the lens and reaching the image sensor. The very short FL lenses (less than 6 mm) often have no manual iris. FFL lenses are the workhorses of the industry. Their attributes include low cost, ease of operation, and long life.
Most FFL lenses are optically fast, ranging in speed from f/1.2 to f/1.8, providing sufficient light for most cameras to produce an excellent quality picture. The manual-iris lenses are suitable for medium light-level applications when used with most solid-state cameras. Most FFL lenses attach to the camera with what the industry calls a C or CS mount (Figure 4-15). The C mount has been a standard in the CCTV industry for many years, while the CS mount was introduced in the mid-1990s to match the trend toward smaller camera sensor formats and their correspondingly smaller lens requirements. The C and CS mounts both use a 1-inch-diameter, 32-threads-per-inch thread. Most security surveillance cameras are now manufactured with a CS mount and supplied with a 5 mm thick spacer adapter ring which allows a C mount lens to be attached to the CS mount camera. The C mount focuses the scene image 0.69 inches (17.526 mm) behind the lens flange onto the camera sensor. The CS mount focuses the scene image 0.492 inches (12.5 mm) behind the lens flange onto the sensor.

FIGURE 4-14 Stabilized lenses and results: (A, B) lenses; (C) unstabilized image; (D) stabilized image
FIGURE 4-15 C and CS mount lens mounting dimensions (both 1"-32 TPI; mechanical back focal distance 17.526 mm (0.69") for the C mount, 12.5 mm (0.492") for the CS mount)

Commonly used CS mount FLs vary from 2.5 mm (wide-angle) to 200 mm (telephoto). The C mount is also used for long-FL lenses having large physical dimensions. Large optics are designed to be used with almost any sensor format size from 1/4 to 1 inch. Lenses with FLs longer than approximately 300 mm are large and expensive. As the FL becomes longer, the diameter of the lens increases and costs escalate accordingly. Most FFL lenses are available in a motorized or automatic-iris version.
These are necessary when they are used with LLL ICCD cameras in daytime and nighttime applications where light level must be controlled via an automatic iris or neutral density filters depending on the scene illumination. With the widespread use of smaller cameras and lenses a new set of lens–camera mounts developed. They were not given any special name but are referred to as 11 mm, 12 mm (the most common), and 13 mm mounts. The dimensions refer to the diameter of the thread on the lens and the camera mount. Figure 4-16 shows these lens mounts. Note that the threads are not all the same (the 13 mm mount is different from the 11 mm and 12 mm). 4.3.1 Wide-Angle Viewing While the human eye has peripheral vision and can detect the presence and movement of objects over a wide angle THREAD: 13 mm DIA. × 1.0 mm PITCH (160 ), the eye sees a focused image in only about the central 10 of its FOV. No video camera has this unique eye characteristic, but a video system’s FOV can be increased (or decreased) by replacing the lens with one having a shorter (or longer) FL. The eye cannot change its FOV without the use of external optics. Choosing different FL lenses brings trade-offs: reducing the FL increases the FOV but reduces the magnification, thereby making objects in the scene smaller and less discernible (i.e. decreasing resolution). Increasing the FL has the opposite effect. To increase the FOV of a CCTV camera, a short-FL lens is used. The FOV obtained with wide-angle lenses can be calculated from Equations 4-1, 4-2, 4-3, and 4-4, or by using Table 4-4, or the Lens Finder Kit. For example, substituting an 8 mm FL, wide-angle lens for a 16 mm lens on any camera doubles the FOV. The magnification is reduced to one-half, and the camera sees “twice as much but half as well.” By substituting a 4 mm FL lens for the 16 mm lens, the FOV quadruples. We see sixteen times as much scene area but one-fourth as well. 
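The angular FOVs referred to above follow from the sensor dimensions and the FL. A common approximation (assumed here, not quoted from the book) is FOV = 2·arctan(sensor dimension / (2·FL)):

```python
import math

# Nominal active-area dimensions (mm) for common sensor formats.
# These are assumed nominal values; actual sensors vary by manufacturer.
FORMATS = {"1/4": (3.2, 2.4), "1/3": (4.8, 3.6), "1/2": (6.4, 4.8)}

def fov_degrees(focal_length_mm, fmt="1/3"):
    """Horizontal and vertical FOV (degrees) for a lens on a given format."""
    w, h = FORMATS[fmt]
    horiz = 2 * math.degrees(math.atan(w / (2 * focal_length_mm)))
    vert = 2 * math.degrees(math.atan(h / (2 * focal_length_mm)))
    return round(horiz, 1), round(vert, 1)

print(fov_degrees(2.8))  # roughly (81, 65): a wide-angle lens on 1/3 inch
print(fov_degrees(16))   # roughly (17, 13): narrow angle on 1/3 inch
```

Note the 2.8 mm result agrees with the 82° × 67° figure quoted in the text to within the tolerance of the nominal sensor dimensions.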
A 2.8 mm FL lens is an example of a wide-angle lens; it has an 82° horizontal by 67° vertical FOV on a 1/3-inch sensor. A super-wide FOV lens for a 1/2-inch sensor is the 3.5 mm FL lens, with an FOV of approximately 90° horizontal by 75° vertical. Using a wide-angle lens reduces the resolution, or ability to discern objects in a scene. Figure 4-17 shows a comparison of the FOV seen on 1/4-, 1/3-, and 1/2-inch format cameras with wide-angle, normal, and telephoto lenses.

FIGURE 4-17 Wide-angle, normal, and narrow-angle (telephoto) FFL lenses vs. format (FL = 4.0 mm wide angle; FL = 8 mm normal for 1/3" sensor; FL = 25 mm normal for 1" sensor; FL = 75 mm narrow angle, telephoto)

4.3.2 Narrow-Angle Telephoto Viewing

When the lens FL increases above the standard M = 1 magnification condition, the FOV decreases and the magnification increases. Such a lens is called a medium- or narrow-angle (telephoto) lens. The lens magnification is determined by Equations 4-5, 4-10, 4-8, and 4-9 for the reference (1-inch) and three commonly used sensor sizes (see also Table 4-4 and the Lens Finder Kit). Outdoor security applications often require viewing scenes hundreds and sometimes thousands of feet away from the camera. To detect and/or identify objects, persons, or activity at these ranges requires very long-FL lenses. Long-FL lenses between 150 and 1200 mm are usually used outdoors to view parking lots or other remote areas. These large lenses require very stable mounts and rugged pan-tilt drives to obtain good picture quality.
The lenses must be large (3–8 inches in diameter) to collect enough light from the distant scene and have usable f-numbers (f/2.5 to f/8) for the video camera to produce a good picture on the monitor. Fixed focal length lenses having FLs from 2.6 mm up to several hundred millimeters are refractive- or glass-type. Above approximately 300 mm FL, refractive glass lenses become too large and expensive, and reflective mirror optics or mirror-and-glass optics are used to achieve optically fast (low f-number) lenses with lower weight and size. These long-FL telephoto lenses, called "Cassegrain" or "catadioptric" lenses, cost hundreds to thousands of dollars. Figure 4-18 shows a schematic of these lenses; a 700 mm f/8.0 and a 300 mm f/5.6 lens are used for long-range outdoor surveillance applications.

FIGURE 4-18 Long-range, long-focal-length catadioptric lenses (M1 = primary mirror, M2 = secondary mirror, L = correcting lenses)

4.4 VARI-FOCAL LENS

The vari-focal lens is a variable focal length lens developed to be used in place of an FFL lens (Figure 4-19). In general it is smaller and costs much less than a zoom lens. The advantage of the vari-focal lens over an FFL lens is that its focal length and FOV can be changed manually by rotating the barrel on the lens. This feature makes it convenient to adjust the lens FOV to a precise angle while installed on the camera. These lenses were developed to be used in place of FFL lenses to "fine-tune" the FL for a particular application. Having the ability to adjust the FL "on the job" makes it easier for the installer and at the same time permits the customer to select the exact FOV necessary to observe the desired scene area. One minor inconvenience of the vari-focal lens is that it must be refocused each time the FL is changed.

FIGURE 4-19 Vari-focal lens configuration (focal length 3–8 mm; FOVs at the 3 mm and 8 mm settings shown)
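Dialing in the barrel setting amounts to inverting the FOV relationship: FL = sensor width / (2·tan(FOV/2)). A small sketch (the 1/3-inch sensor width is an assumed nominal value):

```python
import math

SENSOR_WIDTH_MM = 4.8   # nominal 1/3-inch format horizontal dimension

def focal_length_for_fov(horiz_fov_deg):
    """FL (mm) that yields the requested horizontal FOV on this sensor."""
    return SENSOR_WIDTH_MM / (2 * math.tan(math.radians(horiz_fov_deg) / 2))

fl = focal_length_for_fov(60)   # suppose a 60 degree horizontal view is wanted
print(round(fl, 2))             # about 4.16 mm: within a 3-8 mm vari-focal range
```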
Typical vari-focal lenses are available with focal lengths of 3–8 mm, 5–12 mm, 8–50 mm, and 10–120 mm (Table 4-6). With just a few such lenses, focal lengths from 1.8 to 120 mm, and FOVs from 144° to 16°, can be covered continuously (i.e. any focal length in the range). The vari-focal lenses are a subset and simplified version of zoom lenses, but they are not a suitable replacement for the zoom lens in a variable-FOV pan/tilt application.

Table 4-6 Representative Vari-Focal Lenses—Focal Length, Vari-Focal Zoom Ratio vs. Sensor Format, Horizontal FOV (focal lengths from 1.4–3.1 mm to 20–100 mm; zoom ratios from 2:1 to 15:1; 1/4- to 1/2-inch formats; optical speeds f/1.0 to f/1.8; horizontal angular FOVs from manufacturers' specifications)

4.5 ZOOM LENS

Zoom and vari-focal lenses are variable-FL lenses. The lens components in these assemblies are moved to change their relative physical positions, thereby varying the FL and angle of view through a specified range of magnifications.
Prior to the invention of zoom optics, quick conversion to different FLs was achieved by mounting three or four different FFL lenses on a turret with a common lens mount in front of the CCTV camera sensor and rotating each lens into position, one at a time, in front of the sensor. The lenses usually had short, medium, and long FLs to achieve different angular coverage. This turret lens was obviously not a variable-FL lens and had limited use.

4.5.1 Zooming

Zooming is a lens feature that permits seeing detailed close-up views (high magnification) of a subject (scene target) or a broad (low magnification) overall view of an area. Zoom lenses allow a smooth, continuous change in the angular FOV. The angle of view can be made narrower or wider depending on the zoom setting. As a result, a scene can be made to appear close-up (high magnification) or far away (low magnification), giving the impression of camera movement toward or away from the scene, even though the camera remains in a fixed position. Figure 4-20 shows the continuously variable nature of the zoom lens and how the FOV of the video camera can be changed without replacing the lens. To implement zooming, several elements in the lens are physically moved to vary the FL and thereby vary the angular FOV and magnification. Tables 4-1, 4-2, 4-3, and 4-4, and the Lens Finder Kit can be used to determine the FOV for any zoom lens. By adjusting the zoom ring setting, one can view narrow-, medium-, or wide-angle scenes. This allows a person to view a scene with a wide-angle perspective and then close in on one portion of the scene that is of specific interest. The zoom lens can be made significantly more useful, giving the camera a still wider effective FOV, by mounting it on a pan/tilt platform controlled from a remote console. The pan/tilt positioning and the zoom lens variable FOV, from wide to narrow angle and anywhere in between, provide a large dynamic FOV capability.
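The continuous FOV change over a zoom range can be tabulated directly from the FL (same arctangent approximation as for fixed lenses; the 6.4 mm width for a 1/2-inch sensor is a nominal assumption, so the values are approximate):

```python
import math

SENSOR_WIDTH_MM = 6.4    # nominal 1/2-inch format horizontal dimension

def horizontal_fov(fl_mm):
    """Approximate horizontal FOV (degrees) at a given zoom FL setting."""
    return 2 * math.degrees(math.atan(SENSOR_WIDTH_MM / (2 * fl_mm)))

# An 11-110 mm (10:1) zoom swept across its range: the FOV shrinks
# roughly in proportion to 1/FL as the lens zooms in.
for fl in (11, 24, 55, 110):
    print(fl, round(horizontal_fov(fl), 1))
```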
4.5.2 Lens Operation

The zoom lens is a cleverly designed assembly of lens elements that can be moved to change the FL from a wide angle to a narrow angle (telephoto) while the image on the sensor remains in focus (Figure 4-21). This is a significant difference from the vari-focal lens, which must be re-focused each time its FL is changed (Section 4.4). A zoom lens combines at least three groups of elements:

1. The front focusing objective group, which can be adjusted over a limited distance with an external focus ring to initially fine-focus the image onto the camera sensor.
2. A movable zoom group, located between the front and the rear groups, that moves appreciably (front to back) using a separate external zoom ring. The zoom group also contains corrective elements to optimize the image over the full zoom range. Other lenses are also moved a small amount to automatically adjust and keep the image on the sensor in sharp focus, thereby eliminating subsequent external adjustment of the front focusing group.
3. The rear stationary relay group, at the camera end of the zoom lens, that determines the final image size when it is focused on the camera sensor.

FIGURE 4-20 Zoom lens variable focal length function (wide-angle limit to narrow-angle telephoto limit; horizontal and vertical fields of view)
FIGURE 4-21 Zoom lens configuration (focus, zoom, and iris rings; front focusing objective group, movable zoom group, rear stationary relay group, camera sensor)

4.5.3 Optical Speed

Since the FL of a zoom lens is variable and its entrance aperture is fixed, its f-number is not fixed (see Equation 4-11). For this reason, zoom lens manufacturers often list the f-number for the zoom lens at the wide and narrow FLs, with the f-number at the wide-angle setting being faster (more light throughput, lower f-number) than at the telephoto setting. For example, an 11–110 mm zoom lens may be listed as f/1.8 when set at 11 mm FL and f/4 when set at 110 mm FL. The f-number for any other FL in between the two settings lies between these two values.

4.5.4 Configurations

Each lens group normally consists of several elements. When the zoom group is positioned correctly, it sees the image produced by the objective group and creates a new image from it. The rear relay group picks up the image from the zoom group and relays it to the camera sensor. In a well-designed zoom lens a scene in focus at the wide-angle (short-FL) setting remains in focus at the narrow-angle (telephoto) setting and everywhere in between. Many manufacturers produce a large variety of manual and motorized zoom lenses suitable for a wide variety of applications. Figure 4-22 shows two very different zoom lenses used for surveillance applications. The manual zoom lens shown has an 8.5–51 mm FL (6:1 zoom ratio) and an optical speed of f/1.6. The long-range lens shown has a large zoom ratio of 21:1, an FL range of 30–750 mm, and a speed of f/4.6. Figure 4-23 shows the FOVs obtained from an 11–110 mm FL zoom lens on a 1/2-inch sensor camera at three zoom FL settings. Table 4-7 is a representative list of manual and motorized zoom lenses, from a small, lightweight, inexpensive 8–48 mm FL zoom lens to a large, expensive 13.5–600 mm zoom lens used in high-risk security areas by industry, military, and government agencies. Zoom lenses are available with magnification ratios from 6:1 to 50:1. Many have special features, including remotely controlled preset zoom and focus positions, auto-focus, and stabilization.

FIGURE 4-22 Manual and motorized zoom lenses: (A) manual; (B) motorized
FIGURE 4-23 Zoom lens FOV at different focal length settings (11–110 mm zoom lens; horizontal zoom angles: wide 46°, medium 24°, narrow 5°; the lens has a circular FOV but the sensor 4:3 aspect-ratio FOV is shown)
Table 4-7 Representative Motorized Zoom Lenses—Focal Length, Zoom Ratio vs. Sensor Formats, Horizontal FOV (focal lengths from 4.5–54 mm to 10–500 mm; zoom ratios from 6:1 to 50:1; 1/4- to 1-inch formats; optical speeds f/1.0 to f/4.0; nominal horizontal angular FOVs from manufacturers' specifications)

4.5.5 Manual or Motorized

The FL of a zoom lens is changed by moving an external zoom ring either manually or with an electric motor. When the zoom lens iris, focus, or zoom setting must be adjusted remotely, a motorized lens with a remote controller is used. The operator can control and change these settings remotely using toggle-switch controls on the console or automatically through preprogrammed software. The motor and gear mechanisms effecting these changes are mounted within the zoom lens.
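The optical-speed behavior noted in Section 4.5.3 can be approximated numerically. The text states only that intermediate f-numbers lie between the published endpoint values; linear interpolation between them is an assumption made here purely for illustration:

```python
def zoom_f_number(fl_mm, wide=(11, 1.8), tele=(110, 4.0)):
    """Estimate the effective f-number of an 11-110 mm zoom at a given FL
    by linear interpolation between its published wide and tele ratings."""
    (fl_w, fn_w), (fl_t, fn_t) = wide, tele
    frac = (fl_mm - fl_w) / (fl_t - fl_w)
    return fn_w + frac * (fn_t - fn_w)

print(zoom_f_number(11))            # 1.8 (published wide rating)
print(zoom_f_number(110))           # 4.0 (published tele rating)
print(round(zoom_f_number(60), 2))  # about 2.89, between the two
```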
Manual zoom lenses are not very practical for surveillance, since an operator is not located at the camera and cannot manually adjust the zoom lens.

4.5.6 Adding a Pan/Tilt Mechanism

A zoom lens and camera pointed in a fixed direction provides limited viewing. When the zoom lens is viewing wide angle it sees a large FOV, but when it is zoomed to a narrow angle it magnifies only in one pointing direction—straight ahead of the camera. This is of limited use unless it is pointing at an area of importance such as an entrance door, entry/exit gate, or receptionist, i.e. one single location. To fully utilize a zoom lens it is mounted on a pan/tilt mechanism so that the lens can be pointed in almost any direction (Figure 4-24). By varying the lens zoom control and moving the pan/tilt platform, a wide dynamic FOV is achieved. The pan/tilt and lens controller remotely adjusts pan, tilt, zoom, and focus. The lens usually has an automatic iris, but in some cases the operator can choose a manual- or automatic-iris setting on the lens or the controller. In surveillance applications, one shortcoming of a pan/tilt-mounted zoom lens is the existence of "dead zone" viewing areas, since the lens cannot point and see in all directions at once.

4.5.7 Preset Zoom and Focus

In a computer-controlled surveillance system a motorized zoom lens with electronic preset functions is used. In this mode of operation, the zoom and focus ring positions are monitored electrically and memorized by the computer during system setup. These settings (presets) are then automatically repeated on command by the computer software at a later time. This feature allows the computer to point the camera–lens combination according to a set of predetermined conditions and adjust pointing and the zoom lens FL and focus:
(1) azimuth and elevation angle, (2) focus at a specific distance, and (3) iris set to a specific f-number opening. When a camera needs to turn to another preset set of conditions in response to an alarm sensor or other input, the preset feature eliminates the need for human response and significantly reduces the time to acquire a new target.

FIGURE 4-24 Dynamic FOV of pan/tilt-mounted zoom lens (the zoom lens, from wide to narrow FOV, can be pointed anywhere within the pan/tilt pointing range)

4.5.8 Electrical Connections

The motorized zoom lens contains electronics, motors, and clutches to control the movement of the zoom, focus, and iris adjustment rings, and end-of-travel limit switches to protect the gear mechanism. Since the electrical connections have not been standardized among manufacturers, the manufacturer's lens wiring diagram must be consulted for proper wiring. Figure 4-25 shows a typical wiring schematic for zoom, focus, and iris mechanisms. The zoom, focus, and iris motors are controlled with positive and negative DC voltages from the lens controller, using the polarity specified by the manufacturer.

4.5.9 Initial Lens Focusing

To achieve the performance characteristics designed into a zoom lens, the lens must be properly focused onto the camera sensor during the initial installation. Since the lens operates over a wide range of focal lengths, it must be tested and checked to ensure that it is in focus at the wide-angle and telephoto settings. To perform a critical focusing of the zoom lens, the aperture (iris) must be wide open (set to the lowest f-number) for all back-focus adjustments of the camera sensor. This provides the conditions for a minimum depth of field, and therefore the most critical focusing.
Therefore, adjustments must be performed in subdued lighting, or with optical filters in front of the lens, to reduce the light and allow the lens iris to open fully for minimum depth of field. The following steps should be followed to focus the lens:

1. With the camera operating, view an object at least 50 feet away.
2. Make sure the lens iris is wide open so that focusing is most critical.
3. Set the lens focus control to the extreme far position.
4. Adjust the lens zoom control to the extreme wide-angle position (shortest FL).
5. Adjust the camera sensor position adjustment control to obtain the best focus on the monitor.
6. Move the lens zoom to the extreme telephoto (longest FL) setting.
7. Adjust the lens focus control (on the controller) for the best picture.
8. Re-check the focus at the wide-angle (shortest FL) position.

FIGURE 4-25 Motorized zoom lens electrical configuration (supply voltage depends on manufacturer, typically 8 to 12 VDC; iris remote control requires positive and negative voltage, typically ±6 VDC max; focus and zoom drives may be ±6 or ±12 VDC max depending on the controller)

4.5.10 Zoom Pinhole Lens

Pinhole lenses with a small front lens element are commonplace in covert video surveillance applications. Zoom pinhole lenses, while not as common as FFL pinhole lenses, are available in straight and right-angle configurations. One lens has an FL range of 4–12 mm and an optical speed of f/4.0.

4.5.11 Zoom Lens–Camera Module

The requirement for a compact zoom lens and camera combination has been satisfied with the zoom lens–camera module. This module evolved out of a requirement for a lightweight, low-inertia camera lens for use in high-speed pan/tilt dome installations in casinos and retail stores.
The camera–lens module has a mechanical cube configuration (Figure 4-26) so that it can easily be incorporated into small pan-tilt dome housings and be pointed in any direction at high speeds. The module assembly includes the following components and features: (1) rugged, compact mechanical structure suitable for high-speed pan-tilt platforms, (2) large optical zoom ratio, typically 20:1, and (3) sensitive 1/4- or 1/3-inch solid-state color camera with excellent sensitivity and resolution. Options include: (1) automatic-focus capability, (2) image stabilization, and (3) electronic zoom.

4.5.12 Zoom Lens Checklist

The following should be considered when applying a zoom lens:

• What FOV is required? See Tables 4-1, 4-2, 4-3, 4-4, and the Lens Finder Kit.
• Can a zoom lens cover the FOV, or must a pan-tilt platform be used?
• Is the scene lighting constant or widely varying? Is a manual or automatic iris required?
• What is the camera format: 1/4-, 1/3-, 1/2-inch?
• What is the camera lens mount type: C or CS?
• Is auto-focus or stabilization needed?
• Is electronic zoom required to extend the FL range?

Zoom lenses on pan-tilt platforms significantly increase the viewing capability of a video system by providing a large range of FLs all in one lens. The increased complexity and precision required in the manufacture of zoom lenses make them cost three to ten times as much as an FFL lens.

4.6 PINHOLE LENS

A pinhole lens is a special security lens with a relatively small front diameter so that it can be hidden in a wall, a ceiling, or some object. Covert pinhole lens–camera assemblies have been installed in emergency lights, exit signs, ceiling-mounted lights, and table lamps, and even disguised as building sprinkler head fixtures. Any object that can house the camera and pinhole lens and can disguise or hide the front lens element is a candidate for a covert installation.
In practice the front lens is considerably larger than a pinhole, usually 0.06–0.38 inch in diameter, but it can nevertheless be successfully hidden from view. Variations of the pinhole lens include straight or right-angle, manual or automatic iris, and narrow-taper or stubby-front shape (Figure 4-27). The lenses shown are for use with C or CS mount cameras. Whether to use the straight or right-angle pinhole lens depends on the application. A detailed description and review of covert cameras and pinhole lenses are presented in Chapter 18.

4.6.1 Generic Pinhole Types

FIGURE 4-26 Compact zoom lens–camera cube

A feature that distinguishes the two generic pinhole lens designs from each other is the shape and size of the front taper (Figure 4-28). The slow tapering design permits easier installation than the fast taper and also has a faster optical speed, since its larger front lens collects more light. The optical speed (f-number) of the pinhole lens is important for the successful implementation of a covert camera system. The lower the f-number of the lens, the more light reaches the camera and the better the video picture. An f/2.2 lens transmits 2.5 times more light than an f/3.5 lens. The best theoretical f-number is equal to the FL divided by the entrance lens diameter (d). From Equation 4-11:

f/# = FL/d

FIGURE 4-27 Straight and right-angle pinhole lenses: (A) manual iris, fast taper; (B) automatic iris, fast taper; (C) manual iris, slow taper; (D) right-angle manual iris, slow taper; (E) right-angle automatic iris, slow taper

For a pinhole lens, the light getting through the lens to the camera sensor is limited primarily by the diameter of the front lens or the mechanical opening through which it views. For this reason, the larger the lens entrance diameter, the more light gets through to the image sensor, resulting in better picture quality if all other conditions remain the same.
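These f-number relations are easy to check numerically. A minimal sketch (the function names and the 11 mm / 5 mm example values are illustrative, not from the text); light transmission scales inversely with the square of the f-number, which reproduces the 2.5× figure quoted above:

```python
def f_number(focal_length_mm, entrance_diameter_mm):
    # f/# = FL / d (Equation 4-11)
    return focal_length_mm / entrance_diameter_mm

def relative_light(f_slow, f_fast):
    # Transmitted light scales as 1 / (f/#)^2, so the gain of a
    # faster lens over a slower one is (f_slow / f_fast)^2.
    return (f_slow / f_fast) ** 2

print(f_number(11.0, 5.0))       # an 11 mm FL lens with a 5 mm entrance pupil is f/2.2
print(relative_light(3.5, 2.2))  # f/2.2 passes roughly 2.5x the light of f/3.5
```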
4.6.2 Sprinkler Head Pinhole

There are many types of covert lenses available for the security industry: pinhole lenses, mini-lenses, fiber-optic lenses, and camera–lens combinations concealed in common objects. The sprinkler head camera is a unique pinhole lens hidden in a ceiling sprinkler fixture, which makes it extremely difficult for an observer standing at floor level to detect or identify. This unique device provides an extremely useful covert surveillance system. Figure 4-29 shows the configuration and two versions of the sprinkler head lens, the straight and the right-angle. This pinhole lens and camera combination is concealed in and above a ceiling, using a modified sprinkler head to view the room below the ceiling. For investigative purposes, fixed pinhole lenses pointing in one specific direction are usually suitable. To look in different directions there is a panning sprinkler head version. An integral camera–lens–sprinkler head design is shown in Section 18.3.5.

4.6.3 Mini-Pinhole

Another generic family of covert lenses is the mini-lens group (Figure 4-30). They are available in conventional on-axis and special off-axis versions. Their front mechanical shape can be flat or cone shaped. These lenses are very small, some with a cone-shaped front, typically less than 1/2 inch in diameter by 1/2 inch long, and mount directly onto a small video camera. The front lens in these mini-lenses ranges from 1/16 inch to 3/8 inch in diameter. The cone-shaped mini-lens is easier to install in many applications. These mini-lenses are optically fast, having speeds of f/1.4 to f/1.8, and can be used in places unsuitable for larger pinhole lenses. An f/1.4 mini-pinhole lens transmits more than five times the light of an f/3.5 pinhole lens. Mini-pinhole lenses are available in FLs from 2.1 to 11 mm and, when combined with a good camera, result in the fastest covert cameras available. A useful variation of the standard mini-lens is the off-axis mini-lens. This lens is mounted offset from the camera axis, which causes the camera to look off to one side, up, or down, depending on the offset direction chosen. Chapter 18 describes pinhole and mini-lenses in detail.

FIGURE 4-28 Short vs. long tapered pinhole lenses (fast-taper and slow-taper barrels)

FIGURE 4-29 Sprinkler head pinhole assembly installation: straight and right-angle versions with manual iris and a moveable mirror, mounted above the ceiling tile

FIGURE 4-30 Mini-pinhole lenses (2.1–11 mm FLs, in C mount and offset mounts)

4.7 SPECIAL LENSES

There are several special video security lenses and lens functions that deserve consideration. These include: (1) a new panoramic 360° lens, (2) fiber-optic and bore scope lenses, (3) split-image lenses, (4) right-angle lenses, (5) relay lenses, (6) automatic-focus, (7) stabilized, and (8) long-range lenses. The new panoramic lens must be integral with a camera and used with computer hardware and software (Section 5.10). The other special lenses are used in applications where standard FFL, vari-focal, or zoom lenses are not suitable. The auto-focus and stabilizing functions are used to enhance the performance of zoom, vari-focal, and fixed-focus lenses.

4.7.1 Panoramic Lens—360°

There has always been a need to see "all around," i.e. an entire room or other location, seeing 360° with one panoramic camera and lens. Until recently, a 360° FOV camera system could only be achieved with multiple cameras and lenses, combining the images on a split-screen monitor. The panoramic lens is usually mounted in the ceiling of a room or on a tower.
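In these panoramic systems the lens forms a donut-shaped image on the sensor, which processing electronics unwrap into a conventional rectangular panorama with a polar-to-rectangular transform. A minimal nearest-neighbor sketch of that mapping, assuming a grayscale image held as a nested list and a center/radius calibration supplied externally (all names and inputs here are illustrative assumptions, not the book's implementation):

```python
import math

def unwrap_donut(img, cx, cy, r_inner, r_outer, out_w=360, out_h=90):
    """Map a donut-shaped panoramic image to a rectangle: each output
    column is an azimuth angle (0-360 deg), each output row a radius
    between the outer and inner rings of the donut."""
    out = [[0] * out_w for _ in range(out_h)]
    for row in range(out_h):
        # Row 0 samples the outer ring, the last row the inner ring
        r = r_outer - (r_outer - r_inner) * row / (out_h - 1)
        for col in range(out_w):
            theta = 2 * math.pi * col / out_w
            x = int(round(cx + r * math.cos(theta)))
            y = int(round(cy + r * math.sin(theta)))
            out[row][col] = img[y][x]
    return out
```

A production system would interpolate between pixels and correct the lens distortion profile; this sketch only shows the geometric unwrapping step.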
Panoramic lenses have been available for many years but have only recently been combined with powerful digital electronics, sophisticated mathematical transformations, and compression algorithms to take advantage of their capabilities. The availability of high-resolution solid-state cameras has made it possible to map a 360° by 90° hemispherical FOV onto a standard rectangular monitor format with good resolution. Figure 4-31 shows two panoramic lenses having a 360° horizontal FOV and a 90° vertical FOV. In operation the lens collects light from the 360° panoramic scene and focuses it onto the camera sensor as a donut-shaped image (Figure 4-32). The electronics and mathematical algorithm convert this donut-shaped panoramic image into the rectangular (horizontal and vertical) format for normal monitor viewing. (Section 2.6.5 describes the panoramic camera in detail.)

FIGURE 4-31 Panoramic 360° lens camera module. The lens sees a full hemisphere (360° horizontal by 90° vertical FOV) and forms a raw donut-shaped image on the sensor.

FIGURE 4-32 Panoramic lens layout and description. An algorithm converts the donut-shaped panoramic image formed on the sensor into a normal linear x,y display. 360° lens features: sees the entire area, including directly below; infinite depth of field, so no focusing is required; no moving parts for pan/tilt/zoom (all electronic); flush ceiling (or wall) mounting.

4.7.2 Fiber-Optic and Bore Scope Optics

Coherent fiber-optic bundle lenses can sometimes solve difficult video security applications. Not to be confused with the single or multiple strands of fiber commonly used to transmit the video signal over long distances, the coherent fiber-optic lens has many thousands of individual glass fibers positioned adjacent to each other. These thousands of fibers transmit a coherent image from an objective lens, over a distance of several inches to several feet, to the point where the image is transferred again, by means of a relay lens, to the camera sensor. A high-resolution (450 TV lines) coherent fiber bundle consists of several hundred thousand glass fibers that transfer a focused image from one end of the fiber bundle to the other. Coherent optics means that each point in the image at the front end of the fiber bundle corresponds to a point at the rear end. Since the picture quality obtained with fiber-optic lenses is not as good as that obtained with all-glass lenses, such lenses should only be used when no other lens–camera system will solve the problem. Fiber-optic lenses are expensive and are available in rigid or flexible configurations (Figure 4-33). In the complete fiber-optic lens, the fiber bundle is preceded by an objective lens, FFL or other, which focuses the scene onto the front end of the bundle, and followed by a relay lens that focuses the image at the rear end of the bundle onto the sensor (Figure 4-34). Fiber-optic lenses are used in security applications for viewing through thick walls (or ceilings), or in any installation where the camera must be a few inches to several feet away from the front lens, for example with the camera on the accessible side of a wall and the front of the lens on the inaccessible scene side. In this situation the lens is a foot away from the camera sensor.

FIGURE 4-33 Rigid and flexible fiber-optic lenses: (A) rigid conduit with an 8 mm or 11 mm FL objective lens; (B) flexible bundle accepting any C or CS mount objective lens; both use a 1:1 relay lens and a C or CS camera mount.

FIGURE 4-34 Fiber-optic lens configuration: an objective lens focuses the scene onto a coherent fiber-optic bundle, 6 to 12 inches long, and a relay lens transfers the image at the bundle output onto the camera sensor.
Chapter 18 shows how coherent fiber-optic lenses are used in covert security applications.

The bore scope lens is another class of viewing optics for video cameras (Figure 4-35). This lens has a rigid tube 6–30 inches long and a diameter of 0.04–0.5 inches. The two generic designs, the single rod lens and the multiple small-lens relay, transmit the image from the front objective lens to the rear lens and onto the camera sensor. The single rod lens uses a unique graded-index (GRIN) glass rod to refocus the image along its length. Bore scope lenses can transmit only a small amount of light because of the small rod or lens diameters. This results in high f-numbers, typically f/11 to f/30. The slow speed limits bore scope applications to well-illuminated environments and sensitive cameras. The image quality of the bore scope lens is better than that of the fiber-optic lens, since it uses all-glass lenses. Figure 4-35 shows the diagram of a GRIN bore scope lens 0.125 inches in diameter and 12 inches long, and an all-lens bore scope with a diameter of 0.187 inches and a length of 18 inches. The latter has a mirror at the tip to allow viewing at right angles to the lens axis.

4.7.3 Bi-Focal, Tri-Focal Image Splitting Optics

A lens for imaging two independent scenes onto a single video camera is called an image-splitting or bi-focal lens. The split-image lens has two female C or CS lens ports for two objective lenses. The lens views two different scenes with two separate lenses and combines the scenes onto one camera sensor (Figure 4-36). Each of the two objective lenses can have the same or different FLs and will correspondingly produce the same or different magnifications. The split-image lens accomplishes this with only one camera. Depending on the orientation of the bi-focal lens system on the camera, the image is split either vertically or horizontally. Any fixed-focus, pinhole, vari-focal, zoom, or other lens that mechanically fits onto the C or CS mount can be used.
The adjustable mirror mounted on the side lens port allows the camera to look in almost any direction. This external mirror can also point at the same scene as the front lens. In this case, if the front lens is a wide-angle lens (4 mm FL) and the side lens is a narrow-angle lens (50 mm FL), a bi-focal lens system results: one camera views a wide field and a narrow field simultaneously (Figure 4-36). Note that the horizontal scene FOV covered by each lens is one-half of the total lens FOV. For example, with the 4 mm and 50 mm FL lenses on a 1/3-inch camera and a vertical split (as shown), the 4 mm lens displays a 30 × 45 foot scene, and the 50 mm lens displays a 2.4 × 3.6 foot scene, at a distance of 50 feet. The horizontal FOV of each lens has been reduced to one-half of what each lens would see if it were mounted directly onto the camera (60 × 45 feet for the 4 mm lens and 4.8 × 3.6 feet for the 50 mm lens). By rotating the split-image lens 90° about the camera optical axis a horizontal split is obtained; in this case the vertical FOV is halved. It should be noted that the bi-focal lens inverts the picture on the monitor, a condition that is simply corrected by inverting the camera.

FIGURE 4-35 Bore scope lens system: a single graded-index (GRIN) rod design with a straight front-looking viewing head, and a multiple discrete-lens design with a right-angle viewing head, each relaying the objective image to the camera sensor.

FIGURE 4-36 Bi-focal split-image optics: vertical or horizontal split, with an adjustable mirror (M) directing the second field of view.

A three-way or tri-split optical image-splitting lens views three scenes (Figure 4-37). The tri-split lens provides the ability to view three different scenes, with the same or different magnifications, with one camera. Each scene occupies one-third of the monitor screen.
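The split-image FOV figures quoted above follow from the standard geometric relation (scene dimension = sensor dimension × distance / FL). A quick check, assuming the 1/3-inch format's nominal 4.8 × 3.6 mm sensor (the function name is illustrative):

```python
def scene_fov_ft(sensor_w_mm, sensor_h_mm, fl_mm, distance_ft):
    # Similar triangles: scene dimension = sensor dimension * distance / FL
    scale = distance_ft / fl_mm
    return sensor_w_mm * scale, sensor_h_mm * scale

# 1/3-inch sensor (nominally 4.8 mm x 3.6 mm) at a 50 ft range
wide = scene_fov_ft(4.8, 3.6, 4.0, 50)    # 4 mm lens: 60 ft x 45 ft
tele = scene_fov_ft(4.8, 3.6, 50.0, 50)   # 50 mm lens: 4.8 ft x 3.6 ft

# With a vertical split, each lens keeps only half the horizontal FOV:
print(wide[0] / 2, wide[1])   # 30 ft x 45 ft
print(tele[0] / 2, tele[1])   # 2.4 ft x 3.6 ft
```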
Adjustable optics on the lens permit changing the pointing elevation angle of the three front lenses so that they can look close in for short hallway applications and nearly horizontal for long hallways. Like the bi-split lens, this lens inverts the monitor image, which is corrected by inverting the camera. Both the bi-split and the tri-split lenses work with 1/4-, 1/3-, or 1/2-inch camera formats. The image splitting is accomplished without electronic splitters and is useful when only one camera is installed but two or three scenes need to be monitored.

4.7.4 Right-Angle Lens

The right-angle lens permits mounting a camera parallel to a wall or ceiling while the lens views a scene at 90° to the camera axis and the wall or ceiling (Figure 4-38). When space is limited behind a wall or ceiling, or in an automatic teller machine (ATM) or an elevator cab, the right-angle lens is a solution. The right-angle optical system permits the use of wide-angle lenses (2.6 mm FL, 110° FOV) looking at right angles to the camera axis. This cannot be accomplished by using a mirror and a wide-angle lens directly on the camera, since the entire scene will not be reflected by the mirror into the lens on the camera. The edges of the scene will not appear on the monitor because of picture vignetting (Figure 4-39). The right-angle adapter permits the use of any FL lens that will mechanically fit into its C or CS mount and works with 1/4-, 1/3-, or 1/2-inch camera formats.
FIGURE 4-37 Tri-split lens views three scenes. The front lens section moves up and down to change the vertical pointing angle; the three scenes appear side by side on the monitor display.

FIGURE 4-38 Right-angle lens: a right-angle relay lens accepts any C or CS mount lens, narrow or wide angle, and presents the full scene to the camera.

FIGURE 4-39 Picture vignetting from a wide-angle lens and mirror: part of the scene is not reflected off the front-surface mirror and part is blocked by the lens.

4.7.5 Relay Lens

The relay lens is used to transfer a scene image focused by any standard lens, fiber-optic lens, or bore scope lens onto the camera sensor (Figure 4-40). The relay lens must always be used with some other objective lens and does not produce an image in and of itself. When used at the output end of a fiber-optic bundle, the relay lens re-images the scene onto the sensor. When incorporated into split-image or right-angle optics, it re-images the "split" scene or right-angle scene onto the sensor. The relay lens can be used with a standard FFL, pinhole, zoom, or other lens as a lens extender with unit magnification (M = 1), for the purpose of optically "moving" the sensor out in front of the camera.

FIGURE 4-40 Relay lens adapter: the objective lens forms a focused scene image that the relay lens (here with an automatic iris and C mount) transfers to the camera sensor.

4.8 COMMENTS, CHECKLIST AND QUESTIONS

• A standard objective lens inverts the picture image, and the video camera electronics re-invert the picture so that it is displayed right-side up on the monitor.
• The 25 mm FL lens is considered the standard or reference lens for the 1-inch (actually 16 mm diagonal) format sensor. This lens–camera combination is defined to have a magnification of M = 1 and is similar to the normal FOV of the human eye.
The standard lens for a 2/3-inch format sensor is 16 mm; for a 1/2-inch sensor, 12.5 mm; for a 1/3-inch sensor, 8 mm; and for a 1/4-inch sensor, 6 mm. All these combinations produce a magnification of M = 1. They all have the same angular FOV and therefore view the same size scene.
• A short-FL lens has a wide FOV (Table 4-3; a 4.8 mm lens on a 1/2-inch sensor sees a 13.4 feet wide × 10.0 feet high scene at 10 feet).
• A long-FL lens has a narrow FOV (Table 4-3; a 75 mm lens on a 1/2-inch sensor sees a 4.3 feet wide × 3.2 feet high scene at 50 feet).
• To determine what FOV is required for an application, consult Tables 4-1, 4-2, 4-3, 4-4 and the Lens Finder Kit.
• If the exact FOV desired cannot be obtained with an FFL lens, use a vari-focal lens.
• Does the application require a manual or motorized zoom lens, or a pan-tilt mount?
• Is the scene lighting constant or widely varying? Is a manual or automatic iris required?
• What is the camera format (1/4-, 1/3-, 1/2-inch)?
• What type of camera lens mount: C, CS, mini (11 mm, 12 mm, 13 mm), bayonet, or other?

4.9 SUMMARY

The task of choosing the right lens for a security application is an important aspect of designing a video security system. The large variety of focal lengths and lens types makes the proper lens choice a challenging one. The lens tables and Lens Finder Kit provide convenient tools for choosing the optimum FL lens and determining the resulting angular FOV obtained with the chosen sensor size. The common FFL lenses used in most video systems have FLs in the range of 2.8–75 mm. Super wide-angle applications may use a 2.1 mm FL. Super telephoto applications may use FLs from 100 to 300 mm. Most lenses in the 2.8–75 mm range are available with a manual or automatic iris. Vari-focal lenses are often chosen when the exact FL desired cannot be obtained with an FFL lens. The vari-focal lens can "fine tune" the focal length to obtain exactly the angular FOV required. Vari-focal lenses are available in manual and auto-iris configurations.
Vari-focal lenses must be re-focused when their FL is changed. Zoom lenses are used when the camera and lens must be scanned over a large scene area and the magnification of the scene must be changed. This is accomplished by mounting the camera-lens on a pan-tilt platform capable of remotely panning and tilting the camera-lens assembly and zooming the zoom lens. Zoom lenses are available with FLs from 8 to 200 mm and zoom ratios from 6:1 to 50:1. The zoom lens is available with a manual or automatic iris. When the video security application requires that the camera and lens be hidden, covert pinhole lenses and mini-pinhole lenses are used. The pinhole lenses are mounted on cameras having C or CS mounts. The mini-pinhole lenses are mounted directly onto a small single-board camera (with or without housing) and hidden behind walls or ceilings, or mounted in common objects: a PIR motion sensor, clock, emergency light, sprinkler head, etc. Chapter 18 describes covert video lenses and systems in more detail. Special lenses like the bi- and tri-split, fiber-optic, or bore scope lenses are used only when other, simpler techniques cannot be used. The newly implemented 360° panoramic lens is used with a computer system and can view a 360° horizontal FOV and up to a 90° vertical FOV. This lens has taken an important place in digital video surveillance systems. The computer transforms the complex donut-shaped image into a useful rectangular image on the monitor.
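The "standard lens" pairings noted in the chapter comments (25 mm for 1-inch, 16 mm for 2/3-inch, 12.5 mm for 1/2-inch, 8 mm for 1/3-inch, 6 mm for 1/4-inch) can be checked against the claim that they all give roughly the same angular FOV. A sketch, assuming nominal 4:3 horizontal sensor widths; since both the FLs and the format dimensions are rounded values, the angles agree only approximately (roughly 29–33°):

```python
import math

# Nominal horizontal sensor widths (mm) for common 4:3 formats
FORMATS = {"1-inch": 12.8, "2/3-inch": 8.8, "1/2-inch": 6.4,
           "1/3-inch": 4.8, "1/4-inch": 3.2}
STANDARD_FL = {"1-inch": 25.0, "2/3-inch": 16.0, "1/2-inch": 12.5,
               "1/3-inch": 8.0, "1/4-inch": 6.0}

def horizontal_fov_deg(sensor_w_mm, fl_mm):
    # Angular FOV from sensor width and focal length
    return 2 * math.degrees(math.atan(sensor_w_mm / (2 * fl_mm)))

for fmt, width in FORMATS.items():
    # Prints each format with its "standard lens" horizontal FOV
    print(fmt, round(horizontal_fov_deg(width, STANDARD_FL[fmt]), 1))
```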
Chapter 5 Cameras—Analog, Digital, and Internet

CONTENTS

5.1 Overview
5.2 Camera Function
    5.2.1 The Scanning Process
    5.2.2 The Video Signal
        5.2.2.1 Monochrome Signal
        5.2.2.2 Color Signal
5.3 Camera Types
    5.3.1 Analog Camera
        5.3.1.1 Monochrome
        5.3.1.2 Color—Single Sensor
        5.3.1.3 Color—Monochrome Switchover
        5.3.1.4 Color—Three Sensor
    5.3.2 Digital Camera
        5.3.2.1 Digital Signal Processing (DSP)
        5.3.2.2 Smart Camera
        5.3.2.3 Legal Considerations
    5.3.3 Internet Camera
        5.3.3.1 The IP Camera ID
        5.3.3.2 Remote Viewing
        5.3.3.3 Compression for Transmission
    5.3.4 Low Light Level ICCD
    5.3.5 Thermal IR
    5.3.6 Universal System Bus (USB)
5.4 Basic Sensor Types
    5.4.1 Solid State—Visible
        5.4.1.1 Charge Coupled Device (CCD)
        5.4.1.2 Complementary Metal Oxide Semiconductor (CMOS)
    5.4.2 ICCD, SIT, ISIT—Visible/Near IR
    5.4.3 Thermal IR
    5.4.4 Sensor Fusion—Visible/IR
5.5 Camera Features—Analog/Digital
    5.5.1 Video Motion Detection (VMD)
    5.5.2 Electronic Zooming
    5.5.3 Electronic Shuttering
    5.5.4 White Balance
    5.5.5 Video Bright Light Compression
    5.5.6 Geometric Accuracy
5.6 Camera Resolution/Sensitivity
    5.6.1 Vertical Resolution
    5.6.2 Horizontal Resolution
    5.6.3 Static vs. Dynamic Resolution
    5.6.4 Sensitivity
5.7 Sensor Formats
    5.7.1 Solid-State
    5.7.2 Image Intensifier
    5.7.3 Thermal IR
5.8 Camera Lens Mounts
    5.8.1 C and CS Mounts
    5.8.2 Mini-Lens Mounts
    5.8.3 Bayonet Mount
    5.8.4 Lens–Mount Interferences
5.9 Zoom Lens–Camera Module
5.10 Panoramic 360° Camera
5.11 High Definition Television (HDTV)
5.12 Summary

5.1 OVERVIEW

The function of a video camera is to convert the focused visual (or IR) light image from the camera lens into a time-varying electrical video signal for later presentation on a monitor display or permanent recording on a video recorder. The lens collects the reflected light from the scene and focuses it onto the camera image sensor. The sensor converts the light image into a time-varying electronic signal.
The camera electronics process the information from the sensor and send it, as the video signal, to a viewing monitor by way of coaxial cable, fiber optics, two-wire unshielded twisted pair (UTP), wireless, or other transmission means. Figure 5-1 shows a simple video camera/lens and monitor system.

FIGURE 5-1 Video system with lens, camera, transmission means, and monitor. Transmission means include coaxial cable, unshielded twisted pair (UTP), fiber optics, and wireless; the CRT monitor uses raster scanning, while the solid-state sensor uses row/column pixel scanning.

Monochrome or color, solid-state or thermal IR cameras analyze the scene by scanning an array of horizontal and vertical pixels in the camera sensor. This process generates an electrical signal representing the light and color information in the scene as a function of time, so that the scene can be reconstructed on the monitor or recorded for later use. Unlike film cameras, human eyes, and LLL image intensifiers, which see a complete picture continuously, a video camera scans an image, point by point, until it has scanned the entire scene, i.e. one frame. In this respect the camera scan is similar to the action of a typewriter: the type element starts at the left corner of the page and moves across to the right corner, completing a single line of type. The typewriter carriage then returns to the left side of the paper, moves down to the next line, and starts again. In most video cameras, interlaced scanning is like a typewriter that adds a second carriage return after each line, skipping every other line until it reaches the bottom of the page. This is how it completes one field, or half the video image. The scanner/typewriter then moves back up the page and begins typing on the skipped second line, just below the first line.
It continues this way, moving down and filling in the lines between the original lines, until the entire page is complete. In this way the scanning completes the second field and produces one full video frame. This electronic process repeats (like putting in a new sheet of paper) for each frame. Some cameras produce the video signal using progressive scanning, in which every line is scanned one after the other rather than skipping a line. Computer monitors use progressive scanning. In the mid-1980s the solid-state CCD video sensor became a commercial reality. This new device replaced the vidicon tube and silicon tube image sensors and represented a significant advance in camera technology. The use of the solid-state "chip" sensor made the camera 100% solid-state, offering significant advantages over all tube cameras: long life, no aging, no image burn-in, geometric accuracy, excellent sensitivity and resolution, low power consumption, and small size. Several different sensor types are available for video security applications, the most prominent and widely used being the CCD, CMOS, ICCD, and thermal IR. The CCD and CMOS are used in daylight and some nighttime applications and respond to visible and near-IR energy. The ICCD is used in low-light-level nighttime applications. The thermal IR camera is used in nighttime applications when there is no visible or near-IR radiation and/or there is a smoke, dust, or fog environment. Solid-state cameras use a silicon array of photo-sensor sites (pixels) to convert the input light image into an electronic video signal that is then amplified and passed on to a monitor for display. Most solid-state sensors are charge-transfer devices (CTDs), which are available in three types depending on the manufacturing technology: (1) the CCD, (2) the charge-priming device (CPD), and (3) the charge-injection device (CID).
A fourth sensor type introduced more recently to the security market is the CMOS. By far the most popular devices used in security camera applications are the CCD and CMOS. The CID is reserved primarily for military and industrial applications. Solid-state and thermal cameras are significantly smaller, weigh less, and consume less power than the earlier tube cameras. A packaged solid-state image sensor is typically 3/4 inch × 3/4 inch × 1/4 inch or smaller, while its tube predecessor was 3/4 inch in diameter and 5 inches long or larger. Solid-state cameras consume from a fraction of a watt to several watts, compared to 8–20 watts for the tube camera. The security field began using color cameras after the technology for solid-state color cameras developed in the consumer camcorder market. These color cameras have a single solid-state sensor with an integral three-color filter and an automatic white-balancing circuit to provide a reliable and sensitive device. To produce a noise-free monochrome or color picture with sufficient resolution to identify objects of interest, the sensor must have sufficient sensitivity to respond to available natural daytime or artificial lighting. As mentioned, security video cameras sensitive to visible and/or near-IR lighting fall into two general categories: (1) CCD and CMOS solid-state, and (2) LLL ICCD. Separate from these visible and near-IR cameras is a third category operating in the thermal (heat energy) IR region, which is responsive to differences in temperature in the scene rather than reflected light from the scene. The LLL and thermal cameras are described in Chapter 19. In subsequent sections each parameter contributing to the function and operation of these security cameras is described. All security cameras have a lens mount in front of the sensor to mechanically couple an objective lens or optical system to the camera.
The original C mount was designed for the larger 1/2-, 2/3-, and 1-inch tube and solid-state sensor formats and still accounts for many camera installations. Currently the most popular mount is the CS mount. It is designed for the 1/4-, 1/3-, and 1/2-inch format sensor cameras and their correspondingly smaller objective lenses. The CS mount configuration evolved from the original C mount as camera sensors became smaller. Small printed circuit board cameras used for covert surveillance use a mini-mount with 10, 12, and 13 mm thread diameters (see Section 5.8).

5.2 CAMERA FUNCTION

This section describes the functioning of the major parts of a solid-state analog and digital video camera and the video signal. Figure 5-2 is a generalized block diagram of the analog and digital video camera electronics. The camera sensor function is to convert a visual or IR light image into a temporary sensor image which the camera scanning mechanism successively reads, point by point or line by line, to produce a time-varying electrical signal representing the scene light intensity. In a color camera this function is accomplished threefold, converting the three primary colors representing the scene (red, green, and blue) into an electrical signal. The analog video camera consists of: (1) image sensor, (2) electronic scanning system with synchronization, (3) timing electronics, (4) video amplifying and processing electronics, and (5) video signal synchronizing and combining electronics. The synchronizing and combining electronics produce a composite video output signal.

FIGURE 5-2 CCTV camera block diagram. Analog circuits (sensor, timing/scanning, amplifiers, combining electronics) and digital circuits (pixel readout, DSP) share a low-voltage DC power converter; input power is 12 VDC or 117 VAC, and the video output drives a 75 ohm load.
To provide meaningful images when the scene varies in real time, scanning must be sufficiently fast (at least 30 fps) to capture and replay moving target scenes. The video camera must have suitable synchronizing signals so that a monitor, recorder, or printer at the receiving location can be synchronized to produce a stable, flicker-free display or recording. The digital video camera (the dotted block in Figure 5-2) consists of: (1) image sensor, (2) row and column pixel readout circuitry, (3) DSP circuits, and (4) video synchronizing and combining electronics. The synchronizing and combining electronics produce a composite video output signal. The following description of the video process applies to all solid-state, LLL, and thermal cameras. The lens forms a focused image on the sensor. The sensor image readout is performed in a process called "linear" (or raster) scanning. The video picture is formed by interrogating and extracting the light level of each pixel in the rows and columns. The brightness and color at each pixel vary as a function of the focused scene image, so that the signal obtained is a representation of the scene intensity and color profile.

5.2.1 The Scanning Process

One video frame is composed of two fields. In the US, the NTSC system is based on the 60 Hz power line frequency and 1/30 second per frame (30 fps), each frame containing 525 horizontal lines. In the European system, based on a 50 Hz power line frequency and 1/25 second per frame, each frame has 625 horizontal lines. The solid-state analog video output signal has the same format as that from its tube camera predecessor. Two methods of scanning have been used: 2:1 interlace and random interlace. Present cameras use the 2:1 interlace scanning technique to reduce the amount of flicker in the picture and improve motion display while maintaining the same video signal bandwidth. In both scanning methods, every other line of pixels is scanned. In the NTSC system, each field contains 262½ television lines.
This scanning mode is called two-field, odd-line scanning (Figure 5-3).

[Figure 5-3 NTSC two-field/odd-line scanning process. Field 1 scans lines 0 to 262½ and field 2 scans lines 262½ to 525; approximately 21 horizontal lines per field (42 lines per frame) are used for vertical retrace, leaving 525 − 2 × 21 = 483 visible lines per frame.]

Cameras—Analog, Digital, and Internet

In the NTSC standard, 60 fields and 30 frames are completed per second. With 525 TV lines per frame and 30 fps, there are 15,750 TV lines per second. In the standard NTSC system, the vertical blanking interval uses 21 lines per field, or a total of 42 lines per frame. Subtracting these 42 lines from the 525-line frame leaves 483 active picture lines per frame representing the scene. By convention, the scanning function of every camera and every receiver monitor starts from the upper left corner of the image and proceeds horizontally across to the right of the sensor. Each time it reaches the right side of the image it quickly returns to a point just below its starting point on the left side. This occurs during what is called the "horizontal blanking interval" of the video signal. This process is repeated until the sensor is completely read out and the scan eventually reaches the bottom of the image, thereby completing one field. At this point the sensor readout stops (or, in the case of the CRT monitor, the beam turns off) and returns to the top of the image: this time is called the "vertical blanking interval." For the second field (a full frame consists of two fields), the scan lines fall in between those of the first field. By this method the scan lines of the two fields are interlaced, which reduces image flicker and allows the signal to occupy the same transmission bandwidth it would occupy if it were performing progressive scanning.
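The NTSC figures above are all arithmetic consequences of the 525-line, 30 fps, two-field standard. A short sketch, using only values quoted in the text:

```python
# NTSC raster timing, derived from the figures quoted in the text:
# 525 lines/frame, 30 frames/s, 2 fields/frame, and 21 vertical-blanking
# lines per field.
LINES_PER_FRAME = 525
FRAMES_PER_SEC = 30
FIELDS_PER_FRAME = 2
VBLANK_LINES_PER_FIELD = 21

fields_per_sec = FRAMES_PER_SEC * FIELDS_PER_FRAME            # 60 fields/s
line_rate = LINES_PER_FRAME * FRAMES_PER_SEC                  # 15,750 lines/s
line_period_us = 1e6 / line_rate                              # ~63.5 us per line
active_lines = LINES_PER_FRAME - FIELDS_PER_FRAME * VBLANK_LINES_PER_FIELD

print(fields_per_sec, line_rate, round(line_period_us, 1), active_lines)
# 60 15750 63.5 483
```

The 63.5-microsecond line period computed here is the "H" interval that reappears in the waveform discussion below.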
When the second field is completed and the scanning spot arrives at the lower right corner, it quickly returns to the upper left corner to repeat the entire process. For the solid-state camera, the light-induced charge in the individual pixels of the sensor must be clocked out of the sensor into the camera electronics (Figure 5-4). The time-varying video signal from the individual pixels, clocked out in horizontal rows and vertical columns, likewise generates the two interlaced fields. In the case of the tube camera, a moving electron beam in the tube does the scanning, similar to the CRT in the tube monitor. By scanning the target in two passes (remember the typewriter analogy), a signal representing the scene image is produced: the sensor is scanned starting at the top left side of the picture. First the odd lines are scanned, until one field of 262½ lines is completed. Then the beam returns to the top left of the sensor and scans the 262½ even-numbered lines, until a total picture frame of 525 lines is completed. Two separate fields of alternate lines are combined to make the complete picture frame every 1/30th of a second.

[Figure 5-4 Solid-state camera scanning process: pixel charge is transferred through vertical and horizontal shift registers, under vertical and horizontal timing control, to produce the sensor output signal one horizontal line at a time, along with sync, as the video output.]

This TV camera signal is then transmitted to the monitor, where it re-creates the picture in an inverse fashion. This base-band video signal has a voltage level from 0 to 1 volt (1 volt peak-to-peak) and is contained in a 4–10 MHz electrical bandwidth, depending on the system resolution. The synchronizing signals are contained in the 0.5 volt timing pulses.
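The re-assembly of the two fields into one frame can be illustrated with a toy sketch (this is an illustration of the interlace principle, not camera firmware):

```python
# Sketch: how two interlaced fields re-assemble into one frame.
# Field 1 carries one set of alternate display lines, field 2 the other;
# the monitor weaves them together line by line.
def weave(field1, field2):
    """Interleave two equal-length fields line-by-line into a full frame."""
    frame = []
    for f1_line, f2_line in zip(field1, field2):
        frame.append(f1_line)  # line from field 1
        frame.append(f2_line)  # line from field 2
    return frame

# Toy 4-line frame built from two 2-line fields.
f1 = ["line0", "line2"]
f2 = ["line1", "line3"]
print(weave(f1, f2))  # ['line0', 'line1', 'line2', 'line3']
```

Because each field refreshes at 60 Hz while full frames arrive at only 30 Hz, the eye sees far less flicker than a 30 Hz progressive display would show in the same bandwidth.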
5.2.2 The Video Signal

The video signal can be better understood by looking at the single horizontal line of the composite signal shown in Figure 5-5. The signal is divided into two basic parts: (1) the scene illumination intensity information and (2) the synchronizing pulses. Synchronization pulses with 0.1-microsecond rise and fall times contain frequency components of up to 2.5 MHz. Other high frequencies are generated in the video signal when the image scene detail contains rapidly changing light-to-dark picture levels of small size; these are represented by frequencies up to about 4.2 MHz and for good fidelity must be reproduced by the electronic circuits. These high-frequency video signal components represent rapid changes in the scene—either moving targets or very small objects. To produce a stable image on the monitor, the synchronizing pulses must be very sharp and the electronic bandwidth wide enough to accurately reproduce them. The color signal, in addition to a luminance (Y) intensity component, includes a chrominance (C) color component in the form of a "color burst" signal. The color signal can also be represented by three primary color components: red, green, and blue (RGB), each having waveforms similar to the monochrome signal (without the color burst signal).

5.2.2.1 Monochrome Signal

The monochrome camera signal contains intensity information representing the illumination on the sensor. For the monochrome camera, all color information from the scene is combined and represented in one video signal. The monochrome signal contains four components:

1. horizontal line synchronization pulses
2. setup (black) level
3. luminance (gray-scale) level
4. field synchronizing pulses.
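The structure of one monochrome line can be sketched numerically. The IRE levels (sync tip −40, blanking 0, setup 7.5, white 100) and the porch/sync durations below are the commonly quoted NTSC values, not figures taken from this text, so treat them as illustrative:

```python
# Illustrative only: one horizontal line of a monochrome composite signal,
# expressed in IRE units. Durations are commonly quoted NTSC values, rounded.
def horizontal_line(luma_samples):
    """Return (label, IRE level, duration_us) segments for one ~63.5 us line."""
    segments = [
        ("front porch", 0.0, 1.5),   # blanking level before sync
        ("sync pulse", -40.0, 4.7),  # horizontal line sync tip
        ("back porch", 0.0, 4.7),    # blanking level after sync
    ]
    active_us = 63.5 - sum(d for _, _, d in segments)
    per_sample = active_us / len(luma_samples)
    # Picture region: setup (7.5 IRE, black) up to white (100 IRE).
    for i, level in enumerate(luma_samples):  # level in 0..1
        ire = 7.5 + (100.0 - 7.5) * level
        segments.append((f"pixel {i}", round(ire, 1), per_sample))
    return segments

for label, ire, dur in horizontal_line([0.0, 0.5, 1.0]):  # black, gray, white
    print(f"{label:12s} {ire:7.1f} IRE  {dur:5.2f} us")
```

Each of the four components listed above appears here: the sync pulse, the setup (black) level at 7.5 IRE, and the gray-scale luminance samples; the field synchronizing pulses occupy the vertical blanking interval rather than individual lines.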
[Figure 5-5 Monochrome NTSC CCTV video signal: (A) one horizontal line, from white level down to black level, H = 63 microseconds; (B) one frame of two fields, showing the equalizing pulse intervals, the vertical sync pulse interval, and the horizontal sync pulses (not to scale).]

5.2.2.2 Color Signal

Any video color signal is made up of three component parts:

1. Luminance—overall brightness (black and white)
2. Hue—tint or color
3. Saturation—color intensity of the hue.

A black and white video signal describes only the luminance. The luminance, hue, and saturation of picture information can be approximated by a unique combination of primary color information. The primary colors in this additive color process consist of red, green, and blue (RGB) color signals. These primary colors can be combined to give the colors frequently seen in a color bar test pattern. These colors are produced by switching the primary color components either full "on" or full "off." To produce the other colors needed, the intensity of the individual primary colors must be continuously variable from full on to full off (like a dimmer on a light switch). The color camera signal contains light intensity and color information. It must separate the spectral distribution of the scene illumination into the RGB color components (Figure 5-6). The color video signal is far more complex than its monochrome counterpart, and the timing accuracy, linearity, and frequency response of the electronic circuits are more critical in order to achieve high-quality color pictures. The color video signal contains seven components necessary to extract the color and intensity information from the picture scene and later reproduce it on a color monitor:
1. horizontal line synchronization pulses
2. color synchronization (color burst signal)
3. setup (black) level
4. luminance (gray-scale) level
5. color hue (tint)
6. color saturation (vividness)
7. field synchronizing pulses.

Figure 5-7 shows the video waveform with some of these components.

Horizontal Line Synchronization Pulses. The first component of the composite video signal, the horizontal line synchronization pulses, has three parts: (1) the front porch, which isolates the synchronization pulses from the active picture information of the previous line, (2) the back porch, which isolates the synchronization pulses from the active picture information of the next scanned line, and (3) the horizontal line sync pulse, which synchronizes the receiver, monitor, or recorder to the camera.

Color Synchronization (Burst Signal). The second component, the color synchronization, is a short burst of color information used as phase synchronization for the color information in the color portion of each horizontal line. The front porch, synchronization pulse, color burst, and back porch make up the horizontal blanking interval. This color burst signal, occurring during the back-porch interval of the video signal, serves as a color synchronization signal for the chrominance signal.

[Figure 5-6 RGB to composite video encoding block diagram: the RGB camera signals feed an RGB-to-composite encoder, which outputs NTSC composite video, Y (luminance), C (chroma), and S-video (Y, C) signals to an analog monitor; sync is external or on green.]

[Figure 5-7 Luminance signal superimposed with sub-carrier color signal: expanded sync portion of the TV signal in IRE units, showing the white level (100 IRE), setup (black) level (7.5 IRE), blanking (0 IRE), sync tip (−40 IRE), front porch, breezeway, back porch, color sync burst, and picture video.]

Setup.
The third component of the color television waveform is the setup or black level, representing the video signal amplitude under zero light conditions.

Luminance. The fourth component is the luminance, the black-and-white picture detail information. Changes and shifts of light as well as the average light level are part of this information.

Color Hue and Saturation. The fifth and sixth components are the color hue and color saturation information. This information is combined with the black-and-white picture detail portion of the waveform to produce the color image.

Field Synchronization Pulse. This component maintains the vertical synchronization and proper interlace. These seven components form the composite waveform for a color video signal.

The chrominance and the luminance make up the analog component parts of any video color signal. By keeping these component parts separated, the interaction between chrominance and luminance that could produce picture distortion in the NTSC encoded signal is minimized, and picture quality can be improved dramatically. The output of high-end video security systems is in the form of three RGB signals. In most cases for video security, these RGB signals are combined (encoded) into a single video signal that is a composite of the primary color information, or into dual video signals: (1) luminance (Y) and (2) chrominance (C), representing the intensity and color information, respectively. For the composite signal the RGB signals go into an encoder and a single encoded color signal comes out: the composite video signal. In the USA, the color encoding standard was established by the National Television System Committee (NTSC). European and other countries use a color encoding standard called Phase Alternating Line (PAL) or Sequential Color with Memory (SECAM). Figure 5-6 shows the block diagram for the RGB to composite video encoding.
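The RGB-to-composite encoding of Figure 5-6 begins with fixed linear combinations of the primaries. A numeric sketch, using the standard NTSC coefficients (the rounded I and Q weights below are the commonly published values, not taken from this text):

```python
# Sketch of the NTSC luminance weighting: Y = 0.30 R + 0.59 G + 0.11 B,
# with R, G, B normalized to the range 0..1.
def luminance(r, g, b):
    return 0.30 * r + 0.59 * g + 0.11 * b

# The I and Q color-difference signals are likewise fixed combinations of
# RGB; these are the standard NTSC coefficients, rounded.
def iq(r, g, b):
    i = 0.60 * r - 0.28 * g - 0.32 * b
    q = 0.21 * r - 0.52 * g + 0.31 * b
    return i, q

print(round(luminance(1, 1, 1), 6))  # white  -> 1.0
print(round(luminance(0, 1, 0), 6))  # green  -> 0.59
print(iq(0.5, 0.5, 0.5))             # gray carries no color difference
```

Note that for any neutral gray (R = G = B) both I and Q vanish, which is exactly what lets the color information ride on the luminance signal without disturbing monochrome receivers.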
In the NTSC system the luminance (Y) or black-and-white component of the video signal is used as a base upon which the color signal is built. The color signal rides on the base signal as a "sub-carrier" signal. Figure 5-7 shows this sub-carrier signal superimposed on the base luminance signal, which then completely describes the color and monochrome video signals. After much experimentation it was found that by combining the three RGB video signals in specific proportions an accurate rendition of the original color signal was obtained. These ratios were: 30% of the red video signal, 59% of the green video signal, and 11% of the blue video signal. To this signal was added the saturation and hue information. This involved the generation of two additional combinations of the RGB video signals. In the NTSC color system the hue and saturation of a color are described as the result of combining the proper proportions of an I modulating level and a Q modulating level, each consisting of specific ratios of the RGB signals. To obtain accurate color rendition, the proper ratio and phase relationships of the signals (expressed in degrees) are required. This analysis is explored in detail in Chapter 25 with the use of vectorscopes.

5.3 CAMERA TYPES

Video security cameras are represented by several generic forms including: (1) analog, (2) digital, (3) Internet, (4) LLL, and (5) thermal IR. For daytime applications, monochrome, color, analog, digital, and IP cameras are used. When remote surveillance is required an IP camera is used. For low light and nighttime applications the LLL ICCD image-intensified camera is used. For very low light level or no light level applications, thermal IR cameras are used.

5.3.1 Analog Camera

Until about the year 2000, all security cameras were CCD and CMOS analog types. With the development of higher-density integrated circuits, digital signal processing (DSP) was added, and the use of digital cameras is now commonplace. This section describes the monochrome and color analog cameras.

5.3.1.1 Monochrome

Most CCD and CMOS image sensors have wide spectral ranges covering the entire visible range of 400–700 nanometers (nm) and the near-IR spectral region of 800–900 nm. Figure 5-8 shows the spectral response of visible and near-IR CCD and CMOS sensors with and without filters. Some monochrome cameras are responsive to near-IR energy from natural light or IR LED illuminators. These cameras are operated without IR cutoff filters. When the CCD or CMOS camera is pointed toward a strong light source or bright object, the sensor is often overloaded due to the high sensitivity of the imager in the near-IR region. This overload produces a bright-light band above and below the object on the monitor display. If the illuminating source contains a bright spot of IR radiation, such as from sunlight or a car headlight, the IR cutoff filter should be used to prevent sensor overload. Monochrome cameras generally operate in most types of scene lighting providing the light level is sufficient. Light sources such as mercury vapor, metal arc, tungsten, and low- and high-pressure sodium are widely used for monochrome camera applications.

[Figure 5-8 Spectral response of visible and near-IR CCD and CMOS sensors, with and without IR filters, from the UV through the visible spectrum into the infrared (400–1100 nm), compared with the photopic eye response.]

5.3.1.2 Color—Single Sensor

There are two generic color video camera types: single-sensor and three-sensor with prism. The single color sensor is by far the most common type used in security applications (Figure 5-9).
This camera has a complex color-imaging sensor that contains an overlay of three integral optical filters to produce signals responding to the three primary colors: red (R), green (G), and blue (B), which are sufficient to reproduce all the colors in the visible spectrum. The three color filters divide the total number of pixels on the sensor by three, so that each filter type covers one third of the pixels. The sensor is followed by video electronics and clocking signals to synchronize the composite video color signal. A higher quality alternative to the composite signal is found in some color cameras having a 3-wire RGB, or a 2-wire Y and C, output signal. Since the single-sensor camera has only one sensor, the light from the lens must be split into thirds, thereby decreasing the overall camera sensitivity by three. Since each resolution element on the display monitor is composed of three colors, the resolution likewise is reduced by this factor of 3. However, because of its relatively low cost the single-sensor camera is still much more widely used than the more expensive three-sensor prism type. Color cameras are supplied with IR blocking filters, since the IR energy does not supply any color information and would only overload the sensor and/or distort the color rendition. The IR filter alters the spectral response of the CCD imager to match the visible color spectrum (Figure 5-10). The two curves represent the sensor with and without the IR filter in place. In order to obtain good color rendition when using color cameras, the light source must have sufficient energy between 400 nm (0.4 micron) and 790 nm (0.79 micron), corresponding to the visible light spectrum. The IR blocking filters restrict the optical bandwidth reaching the color sensor to within this range, so color cameras cannot be used with IR light sources having radiation in the range of 800–1200 nm.
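The factor-of-three penalties described above can be put in rough numbers. The sensor size below is a hypothetical example (768 × 494 is a common NTSC CCD format), used only to illustrate the budget:

```python
# Back-of-envelope view of the single-sensor trade-off: the striped RGB
# filter splits the pixel array three ways, so per-color resolution and
# light gathering each drop by roughly a factor of three.
def single_sensor_budget(total_pixels):
    per_color_pixels = total_pixels // 3  # pixels available to each primary
    relative_sensitivity = 1 / 3          # vs. an unfiltered monochrome sensor
    return per_color_pixels, relative_sensitivity

# Hypothetical 768 x 494 sensor, a common NTSC CCD format.
pixels = 768 * 494
per_color, sens = single_sensor_budget(pixels)
print(pixels, per_color, round(sens, 3))
```

The three-sensor prism camera described later avoids both penalties by giving each primary its own full sensor, at correspondingly higher cost.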
[Figure 5-9 Single-sensor color video camera block diagram: the lens focuses a full-spectrum image onto a single color sensor whose pixels carry red, green, and blue stripe filters in CCD color pixel triplets; processing electronics produce the color video signal output.]

[Figure 5-10 Spectral response of CCD imagers with and without IR filters (400–1100 nm), with a vidicon response shown for reference.]

The color tube camera and early versions of the color CCD camera had external white-balance sensors and circuits to compensate for color changes. Present solid-state color cameras incorporate automatic white-balance compensation as an integral part of the camera (see Section 5.5.4).

5.3.1.3 Color—Monochrome Switchover

Many applications (particularly outdoor) require cameras that operate in both daytime and nighttime. To accomplish this, some cameras incorporate automatic conversion from color to monochrome operation. This automatic switchover significantly increases the effectiveness of the camera in daytime and nighttime operation and reduces the number of cameras required and the overall cost. The conversion (switchover) is accomplished electronically and/or optically. Using the optical technique, to switch from the daytime mode to the nighttime mode an IR blocking filter is mechanically moved out of the optical path so that visible and near-IR radiation falls onto the color sensor. Simultaneously the three-component color signal is combined into one monochrome signal, resulting in a typical tenfold increase in camera sensitivity (Figure 5-11).

5.3.1.4 Color—Three Sensor

The three-sensor color camera uses a beam-splitting prism interposed between the lens and three solid-state sensors to produce the color video signal (Figure 5-12).
The function of the prism is to split the full visible spectrum into the three primary colors, R, G, and B. Each individual sensor has its own video electronics and clocking signals, synchronized together to eventually produce three separate signals proportional to the RGB color content in the original scene. The display from this camera, when compared with the single-sensor camera, has three times the number of pixels and shows a picture having almost three times higher resolution and sensitivity, and a rendition closer to the true colors in the scene. This camera is well suited to the higher resolution analog S-VHS and Hi-8 VCRs, and the digital DVRs and digital versatile disks (DVDs), now available for higher resolution security applications (Chapter 9). S-VHS, Hi-8, and DVR recorders can use the higher resolution Y (luminance) and C (chrominance) signals, or RGB signals, representing the color scene. The camera output signals (Y C or RGB) can be combined to produce a standard composite video output signal. This beam-splitting prism and three-sensor technique is significantly more costly than a single-sensor camera, but results in a signal having significantly superior color fidelity and higher resolution and sensitivity.

[Figure 5-11 Daytime color to nighttime monochrome camera switchover: under control of a video signal level sense circuit, a motor automatically moves the IR blocking filter into the optical path in daytime and out of it at night.]

5.3.2 Digital Camera

Although most electronics have moved into a digital computer world, until recently video security was still operating in analog terms. The camera's analog output signal is typically recorded downstream as an analog signal. There is presently a strong migration toward a digital video world using digital electronics in all the components of the video system.
Digital signal processing (DSP) has been the driving force behind this migration. The first step occurred with the introduction of DSP cameras and has continued with the development of advanced PC-driven switching devices, digital ID cameras, and DVRs. Today's DSP cameras are less expensive than the analog cameras they are replacing and have more features. Likewise DVRs replacing analog VCRs have increased resolution, improved reliability, and provide easy access to and routing of the stored video records. The advancements in digital technology have made color video more practical, effective, and economical. Color cameras now account for 70–80% of all video camera sales, a share directly attributable to the higher performance and lower cost provided by digital technology. Most average-resolution digital video cameras used in security applications have about 512 by 576 active pixels. High-resolution cameras typically have 752 by 582 active pixels. The latter is equivalent to S-VHS-quality analog video recording and has a bandwidth of approximately 6–7 MHz. Since VHS quality is sufficient for many applications, the fractional-screen common intermediate format (CIF)—with 352 by 240 (NTSC) pixels for the luminance signal Y and 176 by 144 pixels for the chrominance signals U and V—was defined alongside the standard full-screen image format. The use of CIF resolution considerably reduces the amount of data being recorded or transmitted while providing adequate image quality. Presently the CCD camera is the camera of choice in digital systems. However, the CCD is being challenged by CMOS technology because of its lower price, smaller size, and lower power requirements. While many customers want to make use of their existing analog components in a digital system upgrade, replacement of analog components with digital components makes the most sense.
This is particularly true if the system will be used to send the video signal over the Internet or other digital networks, since analog video signals sent over the Internet require a higher bandwidth than when digital components are used. Analog signals can be converted to digital signals before sending the signal across the network, but this requires special converters. It is more cost-effective to buy a digital camera and put it directly on the network.

[Figure 5-12 Three-sensor color camera using a prism: the prism splits the scene image from the lens into blue, green, and red images, each falling on its own CCD with its own video electronics and clocking signals; the three sensor outputs are mixed and assembled into the color picture.]

5.3.2.1 Digital Signal Processing (DSP)

The introduction of DSP cameras and advanced digital technology has thrown the entire video security industry into a major tailspin—digital video security is here to stay. The word "digital," when referring to CCTV cameras, only means that the camera incorporates digital enhancement or processing of the video signal, not that the output signal is a digital signal. These cameras offer improved image quality and features such as back-light compensation, iris control, shuttering, electronic zoom, and electronic sensitivity control to improve picture intelligence and overcome large lighting variations and other problems. The output signal from most surveillance cameras is still an analog signal. This is because the required maximum operating distance needed in most systems is longer than most digital signals can be transmitted. A camera with true digital output would have a very limited operating distance (a few hundred feet), which would not be very useful in most video security applications.
The solution for this is the use of network cameras and system networking equipment, leading to the use of Internet cameras transmitting over: (1) a local area network (LAN), (2) a wide area network (WAN) or WLAN, (3) wireless networks (WiFi), (4) intranets, and (5) the Internet, as a means for long-distance monitoring. As mentioned earlier, most DSP camera outputs are analog and use the communication channels listed above. Since the signal-to-noise ratio (SNR) in DSP cameras is better than in analog cameras, manufacturers can increase the amplification using automatic gain control (AGC), resulting in a higher quality video image under poor lighting conditions. The typical SNR for a non-DSP camera is between 46 and 48 dB. Cameras with DSP have an SNR of between 50 and 54 dB. Note that every 3 dB increase in signal strength corresponds to a doubling of the signal power. One new DSP technology employs circuitry that expands the dynamic range of an image sensor up to 64 times over that of a conventional CCD camera and brings camera performance closer to the capabilities of the human eye. The camera simultaneously views bright and dark light levels and digitally processes the bright and dim images independently. In this technique a long exposure is used for the dark portions of the scene, and a short exposure for the bright portions. The signals are later combined using DSP into an enhanced image incorporating the best portions of each exposure, and the composite image is sent as a standard analog signal to the monitor or recorder. In the analog video world, if a video signal is weak or noisy it can be amplified or filtered, but the digital video world is different. The digital video signal is immune to many external signal disturbances, but it can tolerate only so many errors before the signal is gone.
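The SNR figures quoted above convert to linear ratios as follows; a quick reference sketch, using the standard decibel definition for power ratios:

```python
# Decibel <-> linear power ratio conversions.
# Power ratio = 10 ** (dB / 10); +3 dB is (almost exactly) a doubling.
import math

def db_to_power_ratio(db):
    return 10 ** (db / 10)

def power_ratio_to_db(ratio):
    return 10 * math.log10(ratio)

# Representative figures from the text: ~47 dB non-DSP vs. ~52 dB DSP camera.
improvement_db = 52 - 47
print(round(db_to_power_ratio(improvement_db), 2))  # ~3.16x signal power
print(round(power_ratio_to_db(2), 2))               # 3.01 dB for a doubling
```

So the roughly 5 dB advantage of a DSP camera over a non-DSP camera corresponds to more than three times the signal power relative to the noise, which is what gives AGC the headroom mentioned above.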
A sudden signal drop-off is referred to as the cliff effect: the video signal is momentarily lost in a complete video picture break-up or drop-out (see Figure 5-13). Analog signal quality degrades gradually as the signal weakens; digital signal quality drops abruptly, and the picture breaks up or drops out.

5.3.2.2 Smart Camera

The introduction of smart digital cameras has changed the architecture of video surveillance systems so that they can now perform automated video security (AVS). Most analog video systems rely on the security officer to make decisions based on the information seen on the video monitor. With the availability of smart digital video cameras and DSP electronics, decisions are made by the camera rather than by security personnel. The evolution from analog to digital cameras has provided the ability to incorporate intelligence into the camera and make the video camera a smart camera. In the past, if a guard saw a person walking the wrong way in a restricted area, the guard would sound an alarm or alert someone in the area to investigate the activity. Smart cameras now have VMD algorithms to distinguish different types of objects and directions of movement. It is a small task to have software sound an alarm or alert someone automatically and free the guard for other tasks. As another example, software algorithms have been developed that can perform menial tasks. If a store manager wants to know how many people entered the front door and went to a particular aisle or location, today's smart cameras have software that can analyze the video and provide this information automatically. The camera's DSP takes all the incoming video and converts it to a format that it can use to perform the analysis and make decisions. The resulting output then interfaces to other devices to carry out the decisions.
[Figure 5-13 Comparison of analog and digital picture quality versus S/N (analog) or error rate (digital): analog picture quality drops gradually, while the digital picture breaks up or drops out abruptly (the cliff effect), repeating the same image blocks instead of the picture.]

To effectively implement AVS and video intelligence across multiple cameras requires moving the image analysis into the camera. By running all or part of the video analysis software in the camera, reliability is improved in the overall system by eliminating a single point of failure. There is also improved scalability in the system, since additional cameras do not impact the central AVS system. By performing the video analysis in the camera, the analysis software takes advantage of the uncorrupted video and has the ability to instantly adjust to changes in the scene to optimize it for the algorithm. Since transmission bandwidth is limited, the smart camera decides what video, if any, should be sent from the camera and how much compression should be applied to the video signal before transmitting it. This technique reduces signal degradation since the information has already been acted upon in the camera. Cameras can be made smarter through the use of DSP. Not only can a camera make a record of an event, but it can now evaluate the importance and relevance of that event. By processing images at the camera level, the camera electronics can make decisions as to how to capture an image. When an event occurs, such as movement in the image, the camera electronics determine whether the movement is in a field of interest. Likewise it can recognize a person as the main object versus a dog or a piece of paper blowing in the wind. The camera can determine whether a person needs to know about the event and alert security personnel automatically. This feature allows a single person to manage a much greater number of cameras than would otherwise be possible with an analog system and can significantly reduce employee expenses.
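The text does not specify any particular algorithm, but the kind of in-camera decision rule it describes—ignore motion outside a region of interest, log small objects, escalate person-sized events—might be sketched as follows. All names and thresholds here are hypothetical:

```python
# Hypothetical sketch of in-camera smart-video decision logic: only motion
# inside a region of interest (ROI), above a size threshold, triggers an
# alert and a switch to high-resolution capture.
def evaluate_event(motion_box, roi, min_area):
    """motion_box and roi are (x, y, w, h) rectangles; returns an action."""
    mx, my, mw, mh = motion_box
    rx, ry, rw, rh = roi
    # Overlap of the motion box with the region of interest.
    ox = max(0, min(mx + mw, rx + rw) - max(mx, rx))
    oy = max(0, min(my + mh, ry + rh) - max(my, ry))
    overlap_area = ox * oy
    if overlap_area == 0:
        return "ignore"                 # movement entirely outside the ROI
    if overlap_area < min_area:
        return "log_only"               # small object (paper, small animal)
    return "alert_and_record_high_res"  # person-sized event inside the ROI

roi = (100, 100, 200, 200)
print(evaluate_event((150, 150, 80, 120), roi, min_area=2000))  # in ROI, large
print(evaluate_event((0, 0, 40, 40), roi, min_area=2000))       # outside ROI
```

The point of the sketch is the placement of the logic, not its sophistication: because the rule runs in the camera, only events classified as significant need to be transmitted at full resolution.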
If there is no activity, the camera can capture the scene at a lower resolution or frame rate, thereby reducing the bandwidth required, minimizing the impact a digital camera will have on a network, and conserving storage capacity in the recorder.

5.3.2.3 Legal Considerations

There are legal factors to consider when using a digital camera or digital video system for court and prosecution purposes. In the digital process the camera image can be manipulated pixel-by-pixel, with text or other modifications made in the image after the original image has been recorded. There is a bias in the courts that points out that compressed video can be manipulated. Therefore, it is suggested that the scene be captured using a full JPEG image at a high frame rate with a smart camera when an event is potentially important. However, without a smart camera, in the time it takes to alert a person and await a response to capture the event in JPEG, the important moment could already have taken place. A smart camera can make this determination itself and thus be more responsive by increasing the resolution and frame rate automatically. The camera can also intelligently zoom in on the target to get more detail and information, something analog cameras cannot do without human intervention.

5.3.3 Internet Camera

The Internet camera, using the IP and the Internet, has become a critical component in the use of AVS. Prior to the Internet, video security was focused on the use of video to bring the visual scene to the security officer. Using the power of the Internet and digital IP cameras, the camera scenes can now be transmitted directly to a security officer located anywhere on the network (Figure 5-14). To uniquely identify any specific camera on the network, an ID address and a password are assigned to the camera.
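The text specifies no API for this assignment; as a hypothetical sketch of a per-camera identity record, using Python's standard ipaddress module for address validation (camera ID and field names are invented for illustration):

```python
# Sketch of per-camera network identity: each camera gets an IP address
# plus credentials. Names and fields here are illustrative only.
import ipaddress

def register_camera(camera_id, address, password):
    """Validate the address and return a camera configuration record."""
    ip = ipaddress.ip_address(address)  # raises ValueError if malformed
    return {"id": camera_id, "ip": str(ip), "password": password}

cam = register_camera("lobby-01", "192.168.1.64", "s3cret")
print(cam["ip"])  # 192.168.1.64
```

Validating the address at registration time catches configuration typos before the camera is deployed rather than when an operator first tries to reach it.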
When connected to the network, the camera can be interrogated from any Internet port anywhere in the world. During installation the camera is assigned an Internet address so that the user, supplying the appropriate password and camera ID number, can view the camera scene and command the camera to send the picture over the network to the monitoring port. Likewise the security operator can transmit command signals to the camera and platform to perform pan, tilt, zoom, etc.

5.3.3.1 The IP Camera ID

The IP camera is assigned a digital address so that it can be accessed from anywhere on the network, locally or remotely. The network permits direct two-way communications for commanding the camera in pan, tilt, and zoom while simultaneously receiving the image from the remote Internet camera. The IP camera is given an Internet protocol (IP) address having the form shown in Table 5-1.

5.3.3.2 Remote Viewing

Remote viewing, or AVS, is the direction the video security industry is taking. This powerful tool allows viewing from anywhere in the world using the Internet camera and the Internet as the transmission means. With AVS, all security personnel can gain access to camera control, etc., depending on password authorization, significantly increasing the effectiveness of the video security system. The ability to view a scene remotely via the Internet, intranet, or other long-distance communications path, reliably and economically, is what has made AVS practical. The ability to receive a video picture from the camera and command the camera to pan, tilt, zoom, etc., all from the security control room at any remote distance, has significantly increased the functionality and value of video security systems.

5.3.3.3 Compression for Transmission

Long before digital video transmission was envisioned, engineers realized the need to compress the color video signal.
The color systems were designed to be compatible with the monochrome video signals already in use: the color video signal had to fit in the same bandwidth as the monochrome signal. This color compatibility was not an easy engineering task and created many trade-offs that can only be solved with digital video transmission. In an ideal system the color signal would be transmitted as three high-resolution primary channels, red, green, and blue (RGB), each with its own luminance and color information. Even before the analog color video signal is converted to digital data and compressed using data compression algorithms, it has already been compressed using analog matrix coding. It has not been possible to send a high-quality, high-resolution computer digital video signal through a standard real-time video transmission system; it is like trying to pass high-quality stereo sound through a telephone: even with extensive coding it is not possible. In the analog system, video noise manifests itself as grain in the color picture, smearing, or contrast and brightness problems that cause tint (hue) changes and picture rolling or break-up. None of these analog problems should occur in a well-designed digital video system. However, digital video has a whole new set of problems, such as aliasing, compression artifacts, jagged edges, jumpy motion, and just plain poor quality due to a low data bit rate or excessive compression. There is at present no digital video benchmark with which to accurately compare different video systems.
Table 5-1 Internet Protocol (IP) Camera Address: Dissecting the IP Address and Subnet Mask

              DECIMAL NOTATION    BINARY NOTATION
IP ADDRESS    154.140.76.45       10011010 10001100 01001100 00101101
SUBNET MASK   255.255.255.0       11111111 11111111 11111111 00000000

The first three octets are part of the extended network prefix; the last octet represents host information. The camera is assigned its IP address during installation.

To transmit the wide-bandwidth video signal over a narrow-bandwidth communication channel requires that the video signal be compressed at the camera location and decompressed at the monitoring location. The compression algorithms used for video remove redundant signal and picture information both within each video frame (intra-frame) and from frame to frame (inter-frame). The techniques (algorithms) used to remove this redundant information have been developed by several technical groups and manufacturers. Among the most common are M-JPEG, MPEG-4, and H.264. M-JPEG, based on the work of the Joint Photographic Experts Group (JPEG), uses frame-by-frame (intra-frame) compression; MPEG-4 and H.264, from the Moving Picture Experts Group and the ITU-T, also remove inter-frame redundancy. A wavelet compression algorithm called JPEG 2000 was created as the successor to the original JPEG format, developed in the late 1980s for still (single-frame) digital imaging and photography. This algorithm is based on state-of-the-art wavelet techniques, but is designed for static imaging applications (on the Internet for e-commerce, digital photography, image databases, cell phones, and PDAs) rather than for real-time video transmission. There are basically two different types of video compression: (1) lossy and (2) lossless (Chapter 7). Lossy compression, as its name implies, means that the final displayed picture is not an exact replica of the original camera signal. The amount of compression determines how much the final signal departs from the original: as a rule of thumb, the more the compression, the more the departure from the original.
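The inter-frame redundancy removal described above can be illustrated with a toy frame-differencing sketch. Real codecs such as MPEG-4 and H.264 work on motion-compensated blocks rather than individual pixels, so this shows only the underlying idea: encode just the pixels that changed since the previous frame.

```python
# Toy illustration of inter-frame redundancy removal: only pixels that
# differ from the previous frame are encoded.

def interframe_encode(prev, cur):
    """Encode a frame as a sparse list of (index, new_value) changes."""
    return [(i, v) for i, (p, v) in enumerate(zip(prev, cur)) if v != p]

def interframe_decode(prev, changes):
    """Rebuild the current frame from the previous frame plus the changes."""
    out = list(prev)
    for i, v in changes:
        out[i] = v
    return out

frame1 = [12, 12, 12, 12, 90, 90, 12, 12]
frame2 = [12, 12, 12, 12, 12, 90, 90, 12]   # bright object moved one pixel right
delta = interframe_encode(frame1, frame2)
print(delta)                                 # [(4, 12), (6, 90)]
print(interframe_decode(frame1, delta) == frame2)   # True
```

Here an 8-pixel frame is represented by only two change records, which is the bandwidth saving inter-frame compression exploits when most of the scene is static.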
The compression range for a lossy system can vary from a 10-to-1 to a 400-to-1 reduction in signal bandwidth. Digital video compression is simply a system for reducing the redundancy in the data words that describe every pixel on the screen. Compression reduces the data size for a given video frame, and decompression converts the compressed signal back into a form like the original video signal. How closely this reconstructed signal matches the original input video depends on the quality and power of the compression algorithm. Several generic types of compression techniques are available to the digital video engineer. The two basic types are inter-frame compression, which operates between frames, and intra-frame compression, which operates within a frame. Inter-frame compression is based on the fact that for most scenes there is not a great change in data from one frame to the next. It takes advantage of the condition that only part of the scene changes or has motion, and therefore only those portions which differ need be encoded.

5.3.4 Low Light Level ICCD

The most sensitive LLL camera (Chapter 19) is the intensified CCD (ICCD). In special applications the silicon intensified target (SIT) and intensified SIT (ISIT) are used, but these prior-generation tube cameras have all but been replaced by the ICCD camera. These LLL cameras share many of the characteristics of the monochrome CCD and CMOS described earlier but include a light-intensification stage to amplify the light, thereby responding to much lower light levels. The most sensitive solid-state video camera is the ICCD, used to view scenes illuminated under very low-level artificial lighting, moonlight, and starlight conditions. These LLL cameras have an image intensifier coupled to an imaging tube or solid-state sensor and can view scenes hundreds to thousands of feet from the camera under nighttime conditions.
5.3.5 Thermal IR

Thermal IR imaging systems are different from LLL night-vision systems based on ICCD image-intensifying sensors. The ICCD responds to reflected sunlight, artificial lighting, moonlight, and starlight to form a visual image. It also responds to the reflected light from near-IR-emitting LEDs and filtered IR thermal lamp sources. In contrast, thermal imaging systems respond exclusively to the heat radiated by warm or hot objects. The availability of non-cooled (room-temperature) thermal IR detector technology is now driving the IR imaging security market. The primary reasons are significant cost reduction, room-temperature operation, and improved camera operating characteristics.

5.3.6 Universal Serial Bus (USB)

The Universal Serial Bus (USB) is a transmission protocol developed to permit disparate electronic equipment, cameras, etc. to communicate with a computer. The original narrower-bandwidth USB 1.1 protocol has been surpassed by the wideband USB 2.0, which interfaces the real-time video signal with the computer USB port.

5.4 BASIC SENSOR TYPES

Background. Solid-state CCD sensors are a family of image-sensing silicon semiconductor components invented at Bell Telephone Laboratories in 1969. The CCD imagers used in security applications are small, rugged, and low in power consumption. The solid-state CID camera was invented at the General Electric Company in the 1970s. Unlike all other solid-state sensors, this camera can address or scan any pixel in a random sequence, rather than in the row-and-column sequence used by the others. Although this feature has not been used in the security field in the past, some new digital cameras are taking advantage of this capability. When the CID camera is scanned in the normal NTSC pattern, it has attributes similar to those of other solid-state cameras. Most video security installations use visible-light monochrome or color solid-state cameras.
Prior to solid-state cameras, all video cameras used sensors based on vacuum tube technology. The only instances in which this technology is still used in video security practice are the LLL SIT and ISIT cameras. Before the solid-state sensor, video cameras used tube technology for the sensor, with solid-state transistors and integrated circuits for all signal processing. The tube cameras (mostly monochrome) used a scanning electron beam to convert the optical image into an electronic signal. The camera tube consisted of a transparent window, the light-sensitive target, and a scanning electron beam assembly. In operation, the electron beam was scanned across the sensor target area by electromagnetic coils positioned around the exterior of the tube that deflected the beam horizontally and vertically. The video signal was extracted from the tube by means of the electron beam, with a new picture extracted every 1/30th of a second. Tube cameras were available in 1/2-, 2/3-, and 1-inch formats. They were susceptible to image burn-in when exposed to bright light sources and had a maximum life expectancy of only a few years. Functionally, the camera lens focused the scene image onto the target surface after the light passed through the sensor window. The rear surface of the sensitive target area was scanned by the electron beam to produce an electrical signal representative of the scene image. Solid-state electronics then amplified this signal to a level of 1 volt and combined it with the synchronizing pulses, producing the composite video signal: an amplitude-modulated signal representing the instantaneous intensity of the light on the sensor, plus the horizontal and vertical synchronizing pulses. Tube monochrome cameras provided excellent resolution because the target was a homogeneous, continuous surface.
With small electron beam spot sizes, high resolutions of 500–600 TV lines for a 2/3-inch camera and 1000 TV lines for a 1-inch-diameter vidicon tube were obtained. The workhorse of the industry was the monochrome vidicon tube, which was sensitive to visible light. Later the monochrome silicon and Newvicon (a Panasonic trademark) types were developed, sensitive to visible and near-IR energy. These tube cameras operated with light levels from bright sunlight (10,000 FtCd) down to 1 FtCd. The vidicon was the least sensitive type; the silicon or Newvicon tube was a better choice for dusk-to-dawn applications, having sensitivities between 10 and 100 times higher than the vidicon, depending on the spectral color and IR content of the illumination. The silicon diode target had high sensitivity in the red region of the visible spectrum and in the near-IR spectrum and could "see in the dark" when the scene was illuminated with an IR source. The silicon camera was the most sensitive tube-type camera and had the highest resistance to bright-light damage.

5.4.1 Solid State—Visible

The CCD sensor was the new technology that replaced the tube camera. The CCD solid-state sensor camera reduced cost, power consumption, and product size, and was considerably more stable and reliable than the tube type. The CCD and newer CMOS sensor video cameras operate significantly differently from their predecessor tube cameras: no electron beam scans the sensor. Solid-state sensors have hundreds of pixels in the horizontal and vertical directions, equivalent to several hundred thousand pixels over the entire sensor area. A pixel, the smallest sensing element on the sensor, converts light energy into an electrical charge and then into an electrical signal. Sensor pixels are arranged in a checkerboard pattern with a specific number of rows and columns; the total number determines the resolution of the camera.
Solid-state image sensors are available in several types, but all fall into two basic categories: charge transfer devices (CTD) and CMOS. The generic CTD class can be further divided into the CCD, CPD, and CID. Of all these sensor types, the CCD and CMOS are by far the most popular. Charge coupled devices provide quality video performance, exhibiting low noise, wide dynamic range, good sensitivity, and fair anti-blooming and anti-smear capabilities, and they operate at real-time (30 fps) video rates.

5.4.1.1 Charge Coupled Device (CCD)

At approximately the same time the CCD was invented in 1969 at the Bell Telephone Laboratories in New Jersey, the Philips research laboratory in the Netherlands was also working on an imaging transfer device. The Philips device was called a "bucket brigade device" (BBD), essentially a circuit constructed by wiring discrete MOS transistors and capacitors together. The BBD was never seriously considered for use as an imaging device, but the concept of a "bucket brigade" provides a concise functional analogy for the CCD, in which charge is passed from one storage site to the next through a series of MOS capacitors. By placing pixels in a line and stacking multiple lines, an area array detector is created. As the camera lens focuses the light from a single point in the scene onto each pixel, the incident light generates an electron charge "packet" whose size is proportional to the incident light intensity. Each charge packet corresponds to a pixel, and each row of pixels represents one line of horizontal video information. If the pattern of incident radiation is a focused light image from the optical lens system, then the charge packets created in the pixel array are a faithful reproduction of that image.
In the process, called "charge coupling," the electrical charges are collectively transferred from each CCD pixel to an adjacent storage element by means of external synchronizing or clocking voltages. In the CCD sensor, the image is moved out of the silicon sensor via timed clocking pulses that in effect push out the signal, line by line, at precisely determined clock times. The amount of charge in any individual pixel depends on the light intensity at that point in the scene and represents a single point of the intelligence in the picture. To produce the equivalent of scanning, a periodic clock voltage is applied to the CCD sensor, causing the discrete charge packets in each pixel to move out for processing and transmission. The image sensor has both vertical and horizontal transfer clocking signals, as well as storage registers, to deliver an entire field of video information once during each integration period (1/30th of a second in the NTSC system). CCD sensors require additional timing circuits, clocks, and bias voltages, made by standard manufacturing processes, and five or more support chips. All CCD image sensors consume relatively little power and operate at low voltages. They are not damaged by intense light but suffer some saturation and blooming under intense illumination. Most recent devices contain anti-blooming geometry and exposure control (electronic shuttering) to reduce optical overload. Typical device parameters for a 1/3-inch-format CCD available today are: 771 × 492 pixels (horizontal by vertical) for monochrome and 768 × 494 for color cameras; horizontal resolutions of 570 TV lines for monochrome and 480 TV lines for color; and sensitivities of 0.05 lux (F/1.2 lens) for monochrome and 0.5 lux (F/1.0 lens) for color. CCD sensors are available in 1/4-, 1/3-, and 1/2-inch formats, and in some special cameras in 1/5-, 1/6-, or 2/3-inch formats. All have the standard 4 × 3 aspect ratio.
Typical dynamic ranges for monochrome and color are 100 to 1 without shuttering and 3000 to 1 with electronic shuttering; electronic shutter times range from 1/60 to 1/10,000 second.

Interline Transfer. There are several different CCD sensor pixel architectures used by different manufacturers. The two most common types are the interline transfer (ILT) and the frame transfer (FT). Figure 5-15 shows the pixel organization and readout technique for the ILT CCD image sensor: photosensor sites are precisely aligned with interlinearly arrayed vertical shift registers, and a horizontal shift register is linked to the vertical shift registers. The photosensor sites respond to light variations, generating electronic charges proportional to the light intensity.

FIGURE 5-15 Interline transfer CCD sensor layout: photosensor sites (pixels) feed vertical storage/readout shift registers (alternating even and odd lines), which feed the horizontal readout register and the output amplifier.

The charges are passed into the vertical shift registers simultaneously and then transferred through the horizontal shift register successively until they reach the sensor output amplifier. The camera electronics further amplify and process the signal. Each pixel and line of information in the ILT device is transferred out of the sensor array line by line, eventually clocking out all 525 lines and thereby scanning the entire sensor to produce a frame of video. The sequence repeats to produce a continuous video signal.

Frame Transfer. In the FT CCD, all 525 lines are transferred out of the light-sensitive array simultaneously and stored temporarily in an adjacent non-illuminated silicon buffer array (Figure 5-16). The basic FT CCD structure is composed of two major elements: a photo-plane and a companion memory section. First the photo-plane is exposed to light.
After exposure, the charge produced is quickly transferred to the companion memory and then read out of memory, one line at a time, over the frame time. While this memory is being read out, the photo-plane is being exposed for the next image. Although this structure requires full-pixel storage memory, it has the big advantage that all pixels are exposed at the same time. CMOS technology, on the other hand, exposes a line until it is time to read that line out, then transfers it to the output register. Consequently the beginning and end of the exposure time differ for every line; that is, all pixels are not exposed at the same time. The difference between CCD and CMOS shows when there is motion in the scene: the CCD works better whenever the scene contains significant motion relative to a line time. The FT CCD imager has photosites (pixels) arranged in an X-Y matrix of rows and columns. Each site has a light-sensitive photodiode and an adjacent charge-storage site which receives no light. The pixel photodiode converts the incoming light photons into charge (electrons); the number of electrons produced is proportional to the number of photons (light intensity). The light is collected over the entire sensor simultaneously and transferred to the adjacent storage sites, and then each row is read out to a horizontal transfer register. The charge packets for each row are read out serially and sensed by a charge-to-voltage converter and amplifier section.

FIGURE 5-16 Frame transfer CCD sensor pixel organization: a light-sensitive imaging area above a light-insensitive storage area (memory), read out through a horizontal output register and output amplifier.

5.4.1.2 Complementary Metal Oxide Semiconductor (CMOS)

For more than two decades the solid-state CCD has been the technology of choice for security cameras. However, it is now being challenged by the CMOS sensor.
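Before turning to CMOS, the charge-coupled readout just described (Figures 5-15 and 5-16) can be sketched in a few lines. This is only an illustrative simulation under simplifying assumptions (a tiny 2 × 3 "image," no timing or noise), not a device model: charges move into the shift registers at once, then leave the chip serially, line by line.

```python
# Sketch of interline-transfer readout: all photosite charges move into
# vertical shift registers simultaneously, then exit one line at a time
# through the horizontal register to the output amplifier.

def ilt_readout(photosites):
    """photosites: 2D list of charge packets; returns the serial output stream."""
    vertical_regs = [row[:] for row in photosites]  # simultaneous transfer
    serial_out = []
    for row in vertical_regs:          # shift one line down...
        horizontal_reg = row[:]        # ...into the horizontal register
        while horizontal_reg:          # clock the line out pixel by pixel
            serial_out.append(horizontal_reg.pop(0))
    return serial_out

image = [[1, 2, 3],
         [4, 5, 6]]
print(ilt_readout(image))   # [1, 2, 3, 4, 5, 6]
```

The serial stream, plus synchronizing pulses, is what the camera electronics amplify into the composite video signal.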
CMOS research sponsored by NASA has led to many commercial applications of CMOS imagers. In the past CMOS image sensors were relegated to low-resolution applications, but they now have sufficient pixels for serious security applications. CCD sensors will still have a place in high-resolution, high-sensitivity applications, but CMOS has found increasing application in mainstream video security. The holy grail of most CMOS imager ventures has been the "camera-on-a-chip," in which a single CMOS chip includes the imaging sensor, timing and control, and post-processing circuitry. The CMOS-type sensor exhibits high picture quality but lower sensitivity than the CCD. In the CMOS device, the electric signals are read out directly through an array of transistor switches rather than line by line as in the CCD sensor. The CMOS sensor has come into vogue because of the advantage of incorporating on-board analog-to-digital converters, timing circuits, clocks, and synchronization circuits on the chip. The sensor is manufactured using standard silicon processes, the same as those used in computer chip fabrication, resulting in lower fabrication costs. A CMOS sensor uses about 10–20% as much power as a comparable CCD. Digital signals from CMOS sensors are transmitted directly (not stored as in the CCD sensor) and therefore do not need a DSP. Significant improvements have been made in CMOS cameras for low-light-level indoor applications; the typical CMOS camera requires a light level of 0.5–1 FtCd. In general, CCD cameras operate in lower light conditions than CMOS cameras. Using standard semiconductor production lines it is possible to add a microprocessor or DSP, random access memory (RAM), read-only memory (ROM), and a USB controller to the same IC.
Complementary metal oxide semiconductor sensors are lower priced than CCDs and will likely remain so, because they are manufactured using the most common silicon processing techniques and are also easier to integrate with other electronic circuitry. CMOS sensors are inherently better than their CCD counterparts in light-overload situations and exhibit far less blooming. When a CCD is pointed at a bright lamp (a 100-watt incandescent bulb or other light source), a white blob is seen around the bulb which obscures the fixture and the adjacent ceiling; with the CMOS, the fixture and ceiling detail is seen.

Active Pixel Sensor (APS). The CMOS APS digital camera-on-a-chip technology has progressed rapidly since its invention by scientists at the NASA Jet Propulsion Laboratory (California). In the 1990s Stanford University developed a new technology to improve CMOS sensors called the "active pixel sensor" (APS). This digital pixel system (DPS) technology produced higher-quality, sharper images and included an amplifier and analog-to-digital converter (ADC) within each image sensor pixel. The ADCs convert the light signal values into digital values at the point of light capture. Figure 5-17a shows how the DPS works, illustrating that
the charge is removed just before saturation of the pixel occurs, thereby ensuring that each pixel is neither under- nor over-exposed. Because each pixel has its own ADC, each pixel in effect acts as its own camera. The sensor thus contains thousands of "cameras" whose outputs are combined to create high-quality video frames and pictures. One disadvantage of the APS technology is that it reduces the "fill factor" (and with it sensitivity and dynamic range) and produces fixed-pattern noise. A salient advantage is that highlighted areas do not saturate and cause blooming or smearing when illuminated by a street light or automobile headlight, important in nighttime highway surveillance and vehicle license plate identification. CMOS APS devices are immune to smear and have 30–40% fill factors.

FIGURE 5-17 CMOS active pixel sensor (APS) and active column sensor (ACS): (a) the APS places an amplifier inside each pixel, lowering the active sensor area (fill factor) to about 30% of the pixel area, with pixel-to-pixel amplifier gain variation reducing dynamic range and adding fixed-pattern noise; (b) the ACS shares a closed-loop unity-gain amplifier among the pixels of each column, leaving only one transistor in each pixel for an approximately 70% fill factor and 80 dB dynamic range.

To increase sensor sensitivity, modern on-chip microlenses are formed by an inexpensive process. These are not imaging lenses; they act as "funnels" that direct light incident across the entire pixel toward its sensitive portion. Microlenses increase the responsivity of some low-fill-factor sensors by a factor of two to three. The fill factor is the ratio of the optically sensitive silicon area to the total silicon area in a particular pixel.

Active Column Sensor (ACS).
To overcome some of the disadvantages of the APS CMOS sensor (sensitivity, noise), suppliers have developed the active column sensor (ACS) CMOS sensor (Figure 5-17b). CMOS sensors have had limitations for the video security industry, but the ACS process has the potential to overcome them. The ACS CMOS imager technology eliminates gain non-uniformity by using a unity-gain amplifier shared by the pixels in each column. ACS also increases the fill factor from the 30% of APS technology to 70%. These sensors can also operate at much faster clock speeds and therefore produce no smear with fast motion in the image or fast pan/tilt applications. They offer outstanding anti-blooming capability in both rows and columns, which makes them well suited to scenes containing both highly and dimly lighted areas. They rank as high in video quality as CCD imagers. The ACS CMOS imager could do to the CCD sensor what the CCD did to the vidicon. The Internet requires the best image quality at very low cost at video graphics array (VGA) and common intermediate format (CIF) display resolutions. The ACS CMOS sensor sensitivity has improved to the point that CMOS sensors are now comparable to the CCD. Prior to ACS imager technology, most CMOS imagers used the APS technique of placing an amplifier inside each pixel, which reduced the fill factor and therefore the sensitivity and dynamic range of the sensor. The ACS process uses a unity-gain amplifier, which reduces the non-uniformity of the individual pixels and results in a higher fill factor and higher dynamic range. In the coming years CMOS sensors should exhibit no limitation regarding frame speed, resolution, sensitivity, or noise in comparison with CCD sensors. Most available CCD sensors have a signal-to-noise ratio (S/N) of no greater than 58 dB. Some advanced CMOS sensor arrays already have a 66 dB S/N and from 1024 × 1024 to 4096 × 4096 pixels.
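The S/N figures quoted above are decibel expressions of a voltage ratio, S/N(dB) = 20 · log10(Vsignal/Vnoise). A quick sketch of the conversion shows how much the 8 dB difference matters:

```python
# Convert a signal-to-noise ratio in decibels back to a voltage ratio.
# S/N (dB) = 20 * log10(signal / noise), so ratio = 10 ** (dB / 20).

def snr_db_to_ratio(db):
    return 10 ** (db / 20)

print(round(snr_db_to_ratio(58)))   # ~794:1  (typical CCD)
print(round(snr_db_to_ratio(66)))   # ~1995:1 (advanced CMOS array)
```

An 8 dB improvement is thus roughly a 2.5-fold reduction in relative noise amplitude.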
Table 5-2 compares the sensitivity of different types of CCD and CMOS solid-state sensors (see also Section 5.6).

5.4.2 ICCD, SIT, ISIT—Visible/Near IR

For dawn and dusk outdoor illumination, only the best CCD cameras can produce a usable video picture. ICCD cameras can operate under the light of a quarter moon, about 0.001 FtCd. The ISIT camera can produce an image with only 0.0001 FtCd, the light available from stars on a moonless night. These LLL cameras offer a 100–1000-fold improvement in sensitivity over the best monochrome CCD or CMOS cameras: they intensify light, whereas the CCD and CMOS merely detect it. The ICCD uses a light-intensifier tube or micro-channel plate (MCP) intensifier to amplify the available light by up to 50,000 times. The resulting camera approaches the sensitivity of the SIT camera, is much smaller, requires much less power, and eliminates the blurring characteristic of the SIT under very low light levels. The ICCD camera system has sufficient sensitivity and automatic light compensation to be used in surveillance applications from full sunlight to quarter-moonlight conditions. The cameras are provided with automatic light-level compensation mechanisms having a 100-million-to-1 light-level range and built-in protection to prevent sensor degradation or overload when viewing bright scenes. For viewing the lowest light levels, the ISIT camera provides the widest dynamic range, from full sunlight to starlight, with a 4-billion-to-1 automatic light-level range control. Though large, these cameras have been used in critical LLL security applications. The ISIT camera uses an SIT tube with an additional light-amplification stage and is still the lowest-light-level (and most expensive) LLL camera available. A description of these LLL cameras is given in Chapter 19.
5.4.3 Thermal IR

The infrared spectrum is generally defined as follows: the near (short-wave) IR covers 700–3000 nm (0.75–3 microns (µm)), the mid-wave IR 3–5 microns, and the long-wave IR 8–14 microns. Short-wave IR camera systems use the natural reflection and emission from targets and are used in applications making use of available LLL radiation from reflected moonlight, sky glow (near cities or other nighttime-lighted facilities), or artificially generated radiation from IR LEDs or filtered IR lamp sources. Mid-wave IR systems use the energy

Table 5-2 Sensitivity of Representative CCD and CMOS Image Sensors

FORMAT  TYPE  DESCRIPTION         HORIZ. RES.    SENSITIVITY* (lux)  COMMENTS
                                  (TV LINES)     COLOR     B/W
1/6     CCD   COLOR (NTSC), B/W   480            5.0       —
1/6     CCD   COLOR (NTSC), B/W   470            2.5       0.1      ULTRA-FAST IP SPEED DOME
1/6     CCD   COLOR (PAL), B/W    460            2.5       0.1      ULTRA-FAST IP SPEED DOME
1/4     CCD   COLOR (NTSC), B/W   470            0.5       0.01     SPEED DOME, SURVEILLANCE
1/4     CCD   COLOR (PAL), B/W    470            0.5       0.01     SPEED DOME, SURVEILLANCE
1/4     CCD   COLOR (NTSC), B/W   510            1.0       0.06     REMOTE HEAD, 7 mm DIAMETER, SURVEILLANCE
1/4     CMOS  COLOR (NTSC)        380            3.0       —        GENERAL SURVEILLANCE
1/3     CCD   COLOR (NTSC), B/W   480/570 (B/W)  0.8       0.1      SURVEILLANCE-DAY/NIGHT
1/3     CCD   MONOCHROME (NTSC)   380            —         0.5      SURVEILLANCE
1/3     CCD   COLOR (NTSC), B/W   480            1.0       0.05     SURVEILLANCE
1/3     CMOS  COLOR (NTSC)        380            2.0       —        COVERT SURVEILLANCE
1/3     CMOS  MONOCHROME (NTSC)   400            —         0.05     SURVEILLANCE
1/3     CMOS  COLOR (NTSC)        380            1.0**     0.05**   COVERT SURVEILLANCE
1/2     CCD   COLOR (NTSC), B/W   480            0.15      0.015    SURVEILLANCE
1/2     CCD   COLOR (PAL), B/W    480            0.15      0.015    SURVEILLANCE
1/2     CCD   MONOCHROME (NTSC)   570            —         0.07     HIGH RESOLUTION SURVEILLANCE

* SENSITIVITY IS A MEASURE OF THE LIGHT LEVEL AT 3200 KELVIN COLOR TEMPERATURE NECESSARY TO PRODUCE A FULL 1-VOLT PEAK-TO-PEAK VIDEO SIGNAL.
** MINIMUM ILLUMINATION = THE LIGHT LEVEL REQUIRED TO OBTAIN A RECOGNIZABLE VIDEO SCENE.
B/W = BLACK/WHITE (MONOCHROME)
from hot sources (fires, bright lamps, gun-barrel emission, explosives, and other very hot, red-hot, or white-hot objects) that provide good thermal emission. Long-wave IR systems use the differences in radiation between room-temperature emitters, such as humans, animals, vehicles, ships and aircraft (engine areas), and warm buildings, and their cooler surroundings. The thermal IR camera is the only system that can "see" when the visible or near-IR radiation suitable for visible, near-, or mid-IR sensors is too low to detect; these systems see in total darkness and can often "see" through smoke and fog. IR cameras rely on thermal contrast, the heat emitted by the target versus the heat emitted by the background surrounding it, thereby providing images with better contrast than ICCD image intensification. Thermal sensors require very little temperature difference between target and background to detect the target. Thermal IR cameras resemble video cameras in their mechanical and electrical characteristics, but their lenses differ: the glass of standard visible-light or near-IR lenses is replaced by an infrared-transmitting material such as germanium. Thermal systems are readily available for security applications but cost between 10 and 100 times more than standard video cameras. These lower-resolution IR cameras have a comparatively small number of pixels, which results in a pixelated picture, but there is often sufficient intelligence to determine the objects or activity in the scene. Electronic smoothing of the picture is often used to improve the displayed scene. The use of pseudo-colors, i.e. different colors representing different temperatures, is a significant aid in interpreting the scene.
Medium-resolution systems typically have 320 × 256 pixel arrays, and high-resolution systems have 640 × 512 arrays (military, very expensive). See Chapter 19 for examples of thermal IR imagers. The human body glows (radiates energy) like a 100 watt bulb in the IR spectrum, but only if it is viewed in the correct spectrum, i.e. the long-wave IR spectrum. The wavelengths of the radiation emitted by most terrestrial objects lie between about 3 and 12 µm, in the mid- and far-IR region of the spectrum. The peak of the human body radiation (at 98.6 degrees Fahrenheit) is at about 9 µm. Infrared detectors fall into two different categories: photovoltaic and thermal. Photovoltaic detectors generate an electrical current directly proportional to the number of photons incident on the detector. Thermal detectors respond to the change in resistance or some other temperature-dependent parameter in the material. As the absorbed radiation heats the material, the pixels change in resistance or capacitance, producing a change in the electrical circuit. Pyroelectric and bolometric detectors are the two types of detectors that form the basis of most non-cooled thermal IR camera designs.

5.4.4 Sensor Fusion—Visible/IR

A technique called "multi-spectral imaging," in which an image is displayed from two different detectors operating at different wavelengths, is finding increased use in the security field. Displaying the images from two different wavelength regions (sensor fusion) on the same monitor significantly increases the intelligence obtained from the combined scene. In the 3–5 micron region some targets and backgrounds "reverse" their energy levels. This change can be detected when the two signals are subtracted. In normal single-detector systems this signal reversal is averaged out and not detected, thereby reducing detection capability.
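The "about 9 µm" body-radiation peak quoted above can be checked with Wien's displacement law (peak wavelength in µm ≈ 2898 µm·K divided by absolute temperature). A quick illustrative sketch (the function name is ours, not the book's):

```python
# Wien's displacement law: peak emission wavelength (um) = b / T,
# with b ~ 2898 um*K. Human skin at ~98.6 F (~310 K) peaks near
# 9.3 um, squarely inside the 8-14 um long-wave IR band used by
# thermal security cameras.
WIEN_B_UM_K = 2898.0

def peak_wavelength_um(temp_kelvin):
    """Blackbody peak-emission wavelength in microns."""
    return WIEN_B_UM_K / temp_kelvin

body_peak = peak_wavelength_um(310.0)
assert 8.0 < body_peak < 14.0   # falls inside the long-wave IR window
```

The same formula shows why mid-wave (3–5 µm) sensors favor hot sources: a 700 K gun-barrel or fire emitter peaks near 4 µm.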
A powerful sensor fusion technique combines an image-intensified camera and a thermal IR camera to significantly improve seeing under adverse nighttime conditions with smoke, dust, and fog. The fusion of near-IR and far-IR cameras with a combined overlay display results in a significantly improved night vision system. The system combines the strengths of image intensification (a clear, sharp picture) with the advantages of thermal IR (high detection capability). This provides the ability to see in practically any non-illuminated nighttime environmental condition.

5.5 CAMERA FEATURES—ANALOG/DIGITAL

Analog cameras are limited to a few automatic compensating functions: (1) automatic gain control (AGC) and (2) white balance (WB). Digital cameras with DSP, on the other hand, can have many automatic functions. Some are described below.

5.5.1 Video Motion Detection (VMD)

The second most often used intrusion-detection device in the security industry (the first is the pyroelectric infrared (PIR) detector) is the VMD. The digital VMD uses an analog-to-digital converter to convert the analog video signal to a digital signal. The DSP circuits then respond to movement or activity in the image, recognized as a specific type and rate of change within a defined area using a preset minimum sensitivity for size and speed. While PIR intrusion sensors detect the change in the temperature of a particular part of the viewed area, the VMD senses a change in the contrast within the camera scene from the normal quiescent video image. These digital VMD modules are now small enough to be incorporated directly into a video camera housing, while larger, more sophisticated ones are connected between the camera and the video monitor. Digital VMDs are much more immune to the RFI and EMI interference and temperature changes that can cause false alarms in PIR devices.
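The digital VMD principle just described (comparing the current scene against a stored quiescent reference inside a defined detection area, with sensitivity gates for contrast and size) can be sketched with NumPy. The thresholds, window convention, and function name are illustrative assumptions:

```python
import numpy as np

# Minimal digital VMD sketch: alarm when enough pixels inside a defined
# detection window change by more than a contrast threshold relative to
# a stored reference (quiescent) frame.
def motion_detected(reference, frame, window, threshold=25, min_pixels=4):
    """window = (row0, row1, col0, col1); frames are 2-D grayscale arrays."""
    r0, r1, c0, c1 = window
    ref = reference[r0:r1, c0:c1].astype(np.int16)
    cur = frame[r0:r1, c0:c1].astype(np.int16)
    changed = np.abs(cur - ref) > threshold    # per-pixel contrast change
    return int(changed.sum()) >= min_pixels    # size gate rejects pixel noise

ref = np.zeros((8, 8), dtype=np.uint8)
quiet = ref.copy()                 # no change: no alarm
intruder = ref.copy()
intruder[2:5, 2:5] = 200           # bright object enters the window
```

The `min_pixels` gate is one crude way to ignore single-pixel noise; real products add speed and direction filtering to reject birds, debris, and lighting changes.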
Prior analog VMD technology exhibited an array of false-alarm problems related to changes in scene lighting, shadows, cable or wireless transmission noise, etc. With the advancement of CCD cameras and DSP circuitry, reliability and the false-alarm rate have been brought under control: the whole-scene contrast analysis of analog VMDs has been replaced in CCD and CMOS cameras by localized pixel analysis. It is now possible to digitally analyze changes in individual pixels or small groups of pixels, resulting in increased reliability and a reduced false-alarm rate. Recent improvements in the digital VMD have addressed false alarms caused by foreign objects moving through the field of view at speeds too fast or too slow to be of interest. The products available have automatic adjustments (algorithms) to process the video signal data and exclude these false alarms. Other false alarms caused by natural weather changes (e.g. clouds coming into the field of view) or by small animals, birds, or other debris passing through the camera field of view have for the most part been eliminated. These new digital systems have low false-alarm rates and respond only to intruders. Digital VMDs do not require a computer for operation and are usually provided with an RS232 interface for computer integration and remote programming and reporting. This approach provides a user-friendly interface to most users, who are familiar with menu-driven screen and mouse operation. Physically they consist of modular units or are designed as plug-in boards for easy installation into existing camera equipment.

5.5.2 Electronic Zooming

Prior to video cameras incorporating DSP electronics, the only option for zooming the video camera system was through the use of zoom lens optics. Electronic zoom was first perfected in consumer CCD and CMOS camcorders and still cameras and then adopted in the security industry.
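Electronic zoom amounts to selecting a portion of the sensor image and resampling it back to full display size. A minimal illustrative sketch using nearest-neighbor (pixel-replication) resampling; real cameras interpolate more smoothly, and the function name is ours:

```python
import numpy as np

# Electronic zoom sketch: crop a sub-area of the sensor image centered on
# a chosen point, then resample it to full size by pixel replication.
# Magnification is achieved without moving the lens or the camera.
def electronic_zoom(image, zoom, center_row, center_col):
    h, w = image.shape
    ch, cw = max(1, h // zoom), max(1, w // zoom)      # cropped size
    r0 = min(max(center_row - ch // 2, 0), h - ch)     # clamp to sensor
    c0 = min(max(center_col - cw // 2, 0), w - cw)
    crop = image[r0:r0 + ch, c0:c0 + cw]
    return np.repeat(np.repeat(crop, zoom, axis=0), zoom, axis=1)

frame = np.arange(64, dtype=np.uint8).reshape(8, 8)    # toy sensor image
zoomed = electronic_zoom(frame, zoom=2, center_row=4, center_col=4)
```

Moving `center_row`/`center_col` over the sensor while zoomed in is exactly the electronic pan/tilt described below: the camera and lens stay stationary.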
The electronic zooming technique magnifies the image electronically by selecting a portion of the sensor area and presenting its magnified video image on the monitor screen. Zoom ratios from 5:1 to 20:1 are available, depending on the basic resolution of the sensor. Since only a selected portion of the entire sensor is used, electronic zooming can often be combined with electronic pan and tilt by moving the selected area over different parts of the entire sensor area. This results in electronic panning and tilting while the camera and lens are held stationary.

5.5.3 Electronic Shuttering

It is essential to match the camera sensor sensitivity to the lighting in the scene. In general, the more lighting available, the less sensitive the camera has to be. Digital signal processing technology permits the camera to adapt to the scene illumination through electronic shuttering. The camera electronics adjust the sensitivity of the sensor so that it is optimally matched to the scene light level, compensating for varying light levels. This electronic sensitivity control (ESC) allows for the small changes in light levels found in indoor applications such as lobby areas, hallways with external windows, storage areas, or areas where an outside door is occasionally opened. It is not for use in outdoor applications having large light-level changes (due to circuitry limitations), where an automatic-iris lens is usually required. It often permits the use of a manual-iris lens assembly rather than an auto-iris, which reduces the overall cost of the camera–lens combination.

5.5.4 White Balance

Automatic white balance is required so that when the camera is initially turned on, it properly balances its color circuits to a white background, which in turn is determined by the type of illumination at the scene.
The camera constantly checks the white-balance circuitry and makes any minor compensation for variations in the illumination color temperature, i.e. the spectrum of colors in the viewed scene. Color cameras are sensitive to the color temperature of light as characterized by the color rendering index (CRI) of light sources. A common problem for many color camera systems is their inability to reproduce the exact color of an object when using different light sources with different CRIs. Color rendering is the term used to describe how well a light source is able to reproduce the actual color of the viewed object without causing a shift or error in color. The color temperature determines the white component of the light source, composed of the totality of all the colors in the light source spectrum. Different types of lamps produce different ranges of "white" light, and these differences must be compensated for. This compensation is performed by the WB circuits of the camera. Today's DSP cameras have automatic WB electronics that can adjust between color temperatures from 2800 to 7600 K, which encompasses most lighting conditions. Chapter 3 shows the spectral output from common light sources and the spectral sensitivities of video cameras used in security applications.

5.5.5 Video Bright Light Compression

One major improvement resulting from the use of DSP in cameras is back light compensation (BLC). The DSP camera with BLC adjusts to and can simultaneously view dark and bright scene areas, thereby increasing the camera dynamic range by more than thirty times over conventional cameras. This technique is ideal for many applications where there are highly contrasted lighting conditions or where contrast conditions change throughout the course of viewing. The camera accomplishes this by digitizing the image signal at two different rates. Short times (faster speed) register the bright image areas, and long times (slower speed) register the dark image areas.
The two signals are processed together in the camera and combined into a single signal at the output. Before BLC, these conditions did not permit a clear view of the entire image and required the use of high-end cameras with digital back-light masking capabilities. Back light compensation allows cameras to be pointed at brightly lighted building entrances and exits, ATMs, or underground parking facilities. Other applications include casinos, where interior lighting is designed to brighten gaming and cash areas and to soften lounges, seating areas, and aisles. Another application is a loading dock illuminated with different light levels, which poses a similar problem during the course of any given day. Exterior lighting conditions in these areas vary from dark to blinding sunshine. In another interior application, jewelry counters often feature brightly illuminated display areas with subdued lighting in the surrounding areas. Cameras with DSP compensation can now continuously monitor both interior and exterior areas under virtually any lighting condition, applications that were previously not possible with analog camera designs.

5.5.6 Geometric Accuracy

One of the significant advantages solid-state image sensors have over their tube sensor predecessors is the precise geometric location of the pixels with respect to one another. In a CCD, CMOS, or thermal IR sensor, the locations of the individual photo-sensor pixel sites are known exactly, since they are determined during manufacture of the sensor and never move.

5.6 CAMERA RESOLUTION/SENSITIVITY

When classifying a video camera, the two most important specifications are resolution and sensitivity. Unfortunately, many data sheets create confusion surrounding these terms.

Resolution.
Resolution is the quality of definition and clarity of the picture, and is defined in discernible TV lines; the more lines, the higher the resolution and the better the picture quality. Resolution is a function of the number of pixels (picture elements) in the CCD chip: the resolution is directly proportional to the number of pixels in the CCD sensor. In some data sheets, two types of resolution are defined: vertical and horizontal. Vertical resolution is equal to the number of discernible horizontal lines in the picture and is limited by the 525- or 625-line structure defined in the NTSC or CCIR standards. Horizontal resolution is the number of discernible vertical lines reproduced across the picture width, and depends on the bandwidth. Sensitivity. Sensitivity is a measure of how low a light level a camera can respond to and still produce a usable or minimum-quality picture. It is measured in FtCd or lux for CCD, CMOS, and ICCD cameras operating in the visible and near-IR wavelength range, and in delta-temperature (Δt) in the mid- and far-IR. One FtCd equals approximately 10.8 lux. The smaller the number (FtCd, lux, or Δt), the more sensitive the camera. Typical values for state-of-the-art cameras are: (1) monochrome camera, 0.1–0.001 lux; (2) color camera (single sensor), 1–5 FtCd; (3) thermal IR, 0.1° Δt.

5.6.1 Vertical Resolution

Vertical resolution in the analog scanning system is derived from the 483 effective (active) scanning lines in the 525-line NTSC television system. The camera scanning dissects a vertical line appearing in the scene into 483 separate segments. Since each scanning line on the monitor has a discrete width, some of the scene detail between the lines is lost. As a general rule approximately 30% of any scene detail is lost (the "Kell factor" of about 0.7). Therefore, the standard 525-line NTSC television system produces about 340 vertical TV lines of resolution (483 effective lines × 0.7).
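The Kell-factor arithmetic above can be reduced to a one-line helper. This is only an illustrative sketch of the active-lines × 0.7 rule of thumb; the function name is ours:

```python
# Kell-factor rule of thumb from the text: usable vertical TV lines of
# resolution ~ active scanning lines x 0.7.
KELL = 0.7

def vertical_tvl(active_lines):
    """Approximate vertical resolution (TV lines) from active scan lines."""
    return round(active_lines * KELL)

ntsc = vertical_tvl(483)   # 525-line NTSC system: ~340 TV lines in round numbers
```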
In any standard 525-line CCTV system, the maximum achievable vertical resolution is approximately 350 TV lines. In a 625-line system, the maximum achievable vertical resolution is approximately 408 TV lines. Vertical resolution in a digital system is simply the number of vertical camera pixels. However, if a digital camera is displayed on a 525- (or 625-) line analog CRT display, the resolution is limited to the 350 (or 408) TV lines of the analog system.

5.6.2 Horizontal Resolution

The NTSC standard provides a full video frame composed of 525 lines, with 483 lines for the image and two vertical blanking intervals of 21 retrace lines each. The TV industry adopted a viewing format with a width-to-height ratio of 4:3 and specifies horizontal resolution in TV lines per picture height. The horizontal resolution on the monitor tube depends on how fast the video signal changes its intensity as it traces the image on a horizontal line. The traditional method for testing and presenting video resolution test results is to use the Electronic Industries Association (EIA) resolution target (Figure 5-18). If only one resolution is defined in a camera data sheet, the manufacturer is referring to the horizontal resolution. There are several ways of measuring the horizontal resolution. The most common is to use a video resolution chart, which has horizontal and vertical lines as the target scene. The camera resolution is the point where the lines start to merge and can no longer be separated. This chart-measuring technique can be subjective, since different people perceive the point at which the lines merge differently. The resolution of the monitor must be higher than that of the camera. The minimum-spaced discernible black-and-white transition boundaries in the two wedge areas are the vertical limiting (horizontal wedge) and horizontal limiting (vertical wedge) resolution values.
Various industries using electronic imaging devices have specified resolution criteria dependent on the particular discipline involved. In the analog video security industry, TV lines are the defined resolution parameter. A more scientific technique for measuring the horizontal resolution is to measure the bandwidth of the signal. The bandwidth of the video signal from the camera is measured on an oscilloscope (see Chapter 25). Multiplying the bandwidth by 80 TV lines/MHz gives the resolution of the camera. For example, if the bandwidth is 6 MHz, the camera resolution will be 6 × 80, or 480 TV lines. The horizontal resolution is determined by the maximum speed or frequency response (bandwidth) of the video electronics and video signal. While the vertical resolution is determined solely by the number of lines or pixels chosen—and thus is not variable under the US standard of 525 lines—the horizontal resolution depends on the electrical performance of the individual camera, transmission system, and monitor. Most standard cameras with a 6 MHz bandwidth produce a horizontal resolution in excess of 450 TV lines. The horizontal resolution of the system is therefore limited to approximately 80 lines/MHz of bandwidth. The solid-state imaging industry has adopted pixels as its resolution parameter. To obtain the equivalent TV-line resolution when the number of pixels is specified, multiply the number of pixels by 0.75. In photography, line pairs or cycles per millimeter is the resolving-power notation. While all these parameters are useful, they tend to be confusing. For the purposes of CCTV security applications, the TV line notation is used. For reference, the other parameters are defined as follows:
• One cycle is equivalent to one line pair.
• One line pair is equivalent to two TV lines.
• One TV line is equivalent to 1.25 pixels.
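The rules of thumb just listed (80 TV lines per MHz of bandwidth, 0.75 TV line per pixel, 2 TV lines per line pair) can be collected into small conversion helpers. The function names are illustrative, not industry terminology:

```python
# Resolution bookkeeping helpers based on the rules of thumb in the text:
#   horizontal TV lines ~ bandwidth (MHz) x 80
#   TV lines ~ pixels x 0.75
#   1 line pair = 2 TV lines
TVL_PER_MHZ = 80

def tvl_from_bandwidth(mhz):
    return mhz * TVL_PER_MHZ

def tvl_from_pixels(pixels):
    return 0.75 * pixels

def tvl_from_line_pairs(line_pairs):
    return 2 * line_pairs

# A 6 MHz camera works out to 480 TV lines, matching the worked example above.
six_mhz = tvl_from_bandwidth(6)
```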
[FIGURE 5-18 EIA resolution target. Callouts indicate: vertical resolution (200 TV lines); 10 shades of gray scale in the picture; horizontal resolution at the edge, center, and corner of the picture (200 TV lines); and combined horizontal and vertical resolution at the corner of the picture. Note: the minimum-spaced discernible black-and-white transition boundaries in the two wedge areas are the vertical (horizontal wedge) and horizontal (vertical wedge) limiting resolution values.]

One cycle is equivalent to one black-and-white transition and represents the minimum sampling information needed to resolve the elemental areas of the scene image. A figure of merit for solid-state CCTV cameras is the total number of pixels reproduced in a picture area. A typical value is 380,000 pixels for a good 525-line CCTV system. A parameter deserving mention that is used in lens, camera, and image-intensifier literature is the modulation transfer function (MTF). This concept was introduced to assist in predicting the overall system performance when several devices (such as the lens, camera, transmission medium, and monitor or recorder) are cascaded in one system. The MTF provides a figure of merit for a part of the system (such as the camera or monitor) acting alone or combined with the other elements of the system. It is used particularly in the evaluation of LLL devices (Chapter 19). The resolution of a good monochrome security camera is 550–600 TV lines, and of a good color camera 450–480 TV lines. The data sheets from manufacturers of solid-state cameras (and monitors) often quote the number of pixels instead of TV-line resolution. However, unless the number of pixels is converted into equivalent TV lines, it is hard to compare picture resolution.
Table 5-3 summarizes the state of the art in solid-state sensors and gives the horizontal and vertical pixel counts available for representative 1/6-, 1/4-, 1/3-, and 1/2-inch format types. When monochrome solid-state sensor cameras were first introduced, the sensors had a maximum horizontal resolution of approximately 200 TV lines per picture height. These early low-resolution sensors had 288 horizontal by 394 vertical pixels. Present-day sensors have horizontal resolutions of 400–600 TV lines per picture height. Medium-resolution camera sensors have 510 (H) × 492 (V) pixels, and high-resolution cameras have 739 (H) × 484 (V) pixels. Improvements in the resolution of solid-state sensors to match the best tube sensors have come from various approaches, with the most successful increase coming from increased pixel density. These strides in decreasing the pixel size have resulted from the techniques used to manufacture very large scale integration (VLSI) devices for computers; image sensors are VLSI devices.

FORMAT   | TYPE DESCRIPTION   | HORIZONTAL PIXELS | VERTICAL PIXELS | TOTAL PIXELS | RESOLUTION* (TV LINES) | COMMENTS
1/6 CCD  | COLOR (NTSC)       | 811 | 508 | 412,000 | 480           | 7 mm DIAMETER SENSOR HEAD
1/6 CCD  | COLOR (NTSC)       | 736 | 480 | 340,000 | 470           | ULTRA-FAST IP SPEED DOME
1/6 CCD  | COLOR (PAL)        | 736 | 544 | 400,000 | 460           | ULTRA-FAST IP SPEED DOME
1/4 CCD  | COLOR (NTSC)       | 768 | 494 | 380,000 | 480           | SURVEILLANCE
1/4 CCD  | COLOR (NTSC), B/W  | 768 | 494 | 380,000 | 470           | NETWORK IP SPEED DOME
1/4 CCD  | COLOR (PAL), B/W   | 752 | 582 | 438,000 | 470           | NETWORK IP SPEED DOME
1/4 CMOS | COLOR (NTSC)       | 640 | 480 | 307,200 | 480           | NETWORK IP
1/3 CCD  | MONOCHROME (NTSC)  | 510 | 492 | 251,000 | 380           | SURVEILLANCE
1/3 CCD  | MONOCHROME (CCIR)  | 512 | 582 | 297,000 | 380           | SURVEILLANCE
1/3 CCD  | COLOR (NTSC), B/W  | 771 | 492 | 380,000 | 480/570 (B/W) | DAY/NIGHT SURVEILLANCE
1/3 CCD  | COLOR (NTSC)       | 768 | 494 | 380,000 | 480           | SURVEILLANCE
1/3 CCD  | COLOR (PAL)        | 811 | 508 | 412,000 | 480           | SURVEILLANCE
1/3 CMOS | COLOR (NTSC)       | 640 | 480 | 307,200 | 340           | COVERT SURVEILLANCE
1/2 CCD  | COLOR (NTSC), B/W  | 768 | 494 | 380,000 | 480           | DAY/NIGHT SURVEILLANCE
1/2 CCD  | COLOR (PAL), B/W   | 752 | 582 | 440,000 | 480           | DAY/NIGHT SURVEILLANCE
1/2 CCD  | MONOCHROME (NTSC)  | 811 | 508 | 412,000 | 570           | HIGH RESOLUTION B/W

* RESOLUTION IS THE ABILITY TO JUST DISCERN TWO ADJACENT BLACK LINES SEPARATED BY A WHITE SPACE. THE SYSTEM SHOULD HAVE A GRAY SCALE WITH A MINIMUM OF 10 SHADES FROM BLACK TO WHITE. FOR DIGITAL VIDEO SYSTEMS THE HORIZONTAL AND VERTICAL RESOLUTIONS ARE APPROXIMATELY 0.75 × NUMBER OF PIXELS. FOR LEGACY NTSC AND PAL SYSTEMS, VERTICAL RESOLUTION IS LIMITED BY THE 525 AND 625 HORIZONTAL LINE SCAN RATE AND THE HORIZONTAL RESOLUTION BY THE SYSTEM BANDWIDTH.
B/W = BLACK/WHITE (MONOCHROME)

Table 5-3 Resolution of Representative Solid-State CCD and CMOS Cameras

The majority of solid-state sensors in use today have a 1/4-, 1/3-, or 1/2-inch image format. Some are available with 1/5- and 1/6-inch image formats, and larger ones with 2/3-inch formats. Several other techniques are used to improve resolution. In one camera configuration, image-shift enhancement doubles the ILT CCD imager's horizontal resolution by shifting the visual image in front of the CCD sensor by one-half pixel. This technique simultaneously reduces aliasing, which causes a fold-back of the high-frequency signal components, resulting in "herringbone" or jagged edges in the image. This artifact is often seen when viewing plaid patterns on clothing and screens with medium- to low-resolution solid-state cameras. Aliasing reduces resolution and causes considerable loss in picture intelligence. Another technique used to improve the horizontal resolution without increasing the pixel count is offsetting each row of pixels by one-half pixel, generating a zigzag of the pixel rows. This arrangement, in conjunction with corresponding clocking, allows simultaneous readout of two horizontal rows and nearly doubles the horizontal resolution compared with conventional detectors with identical pixel counts.

5.6.3 Static vs.
Dynamic Resolution

The previous section described static resolution: the resolution achieved when a camera views a stationary scene. When a camera views a moving target (a person walking through the scene, a car passing by) or the camera scans a scene, a new parameter called dynamic resolution applies. Under either the moving-target or the scanning condition, extracting intelligence from the scene depends on resolving, detecting, and identifying fine detail. The solid-state camera has the ability to resolve rapid movement without degradation in resolution under almost all suitable lighting conditions. When high resolution is required while viewing very fast moving targets, solid-state cameras with an electronic shutter are used to capture the action. Many solid-state cameras have a variable-shutter-speed function, with common shutter speeds of 1/60, 1/1000, and 1/2000 second. This shuttering technique is equivalent to using a fast shutter speed on a film camera. The ability to shutter solid-state cameras results in advantages similar to those obtained in photography: the moving object or fast-scan panning that would normally produce a blurred image can now produce a sharp one. The only disadvantage of this technique is that, since a decreased amount of light enters the camera, the scene lighting must be adequate for the system to work successfully.

5.6.4 Sensitivity

Sensitivity of a camera is measured in foot-candles (FtCd) or lux (1 FtCd ≈ 10.8 lux) and usually refers to the minimum light level required to get an acceptable video picture. There is a great deal of confusion in the video industry over camera specifications with respect to what an acceptable video picture is. Manufacturers use two definitions for camera sensitivity: (1) sensitivity at the camera sensor faceplate, and (2) minimum scene illumination.
Sensitivity at the faceplate indicates the minimum light required at the sensor chip to get an acceptable video picture. Minimum scene illumination indicates the minimum light required at the scene to get an acceptable video picture. When sensitivity is defined as the minimum scene illumination, parameters such as the scene reflectance, the lens optical speed (f/#), usable video, automatic gain control (on, off), and shutter speed should be defined. With regard to reflectance, most camera manufacturers use an 89% or 75% (white) reflectance surface to define the minimum scene illumination. If the actual scene being viewed has the same reflectance as the data sheet condition, this is a correct measurement. This is usually not the case. Typical light reflectivities of different materials range from snow at 90%, grass at 40%, and brick at 25%, down to blacktop at 5%. It is apparent that if the camera is viewing a black car, only about 5% of the light is reflected back to the camera, and therefore at least fifteen times more light is required at the scene to return the same amount of light that would come from a white surface. One camera technology that increases the sensitivity of the CCD sensor over existing devices by a factor of two uses an on-chip lens (OCL) technique. By manufacturing the sensor with microscopic lenses on each pixel, the incoming light is concentrated on the photo-sensor areas, thereby increasing the sensitivity of the camera. An improvement particularly important in CMOS sensors incorporates microscopic lenses that cover the active area of each pixel as well as the inactive area between pixels, thereby eliminating the ineffective areas between the microlenses. This increases sensitivity by over a factor of two and reduces the smear level significantly compared with the original technology.
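The reflectance argument above is simple arithmetic: to return the same light to the camera from a 5% reflectance target (blacktop, a black car) as from a 75% datasheet test surface, the scene needs 75/5 = 15 times more illumination. An illustrative sketch (reflectance values in percent, taken from the text; the function name is ours):

```python
# How much more scene illumination is needed for a low-reflectance target
# than for the manufacturer's white datasheet surface (default 75%).
REFLECTANCE_PCT = {"snow": 90, "grass": 40, "brick": 25, "blacktop": 5}

def illumination_multiplier(actual_pct, datasheet_pct=75):
    """Required scene-light multiple vs. the datasheet test condition."""
    return datasheet_pct / actual_pct

black_car = illumination_multiplier(REFLECTANCE_PCT["blacktop"])  # 15x more light
```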
5.7 SENSOR FORMATS

The development of the superior solid-state CCD sensor color camera for the VCR home consumer market accelerated the use of color cameras in the security industry. There are three popular image-format sizes for solid-state security cameras: 1/4-, 1/3-, and 1/2-inch. All security sensor formats have a horizontal-by-vertical geometry of 4 × 3, as defined in the EIA and NTSC standards. For a given lens, the 1/4-inch format sensor sees the smallest scene image and the 1/2-inch the largest, with the 1/3-inch format camera seeing proportionally in between. The ISIT tube cameras using the 1-inch tube to provide LLL capabilities have for all intents and purposes been replaced by their solid-state counterpart, the ICCD. As a basis for comparison with other formats, Figure 5-19 shows the solid-state CCD, CMOS, and tube image formats compared with photographic film formats. For reference, the 16 mm semiprofessional film camera and the 35 mm film camera used for bank holdup and forensic applications are shown. Table 5-4 lists the three popular video image-format sizes (1/4-, 1/3-, and 1/2-inch) and four less used sizes (1-, 2/3-, 1/6-, and 1/5-inch). For reference, the physical target area in tube cameras is circular and usually corresponds to the diagonal of the lens image circle. The tube active target is the inscribed 4 × 3 rectangular-aspect-ratio area scanned by the electron beam in the tube. Since each pixel is used in the solid-state camera image, the target area in the solid-state sensor is the full-sensor 4 × 3 format array. The camera sensor format is important since it determines the lens format size with which it must operate and, along with the lens focal length (FL), sets the video system field of view (FOV). As a general rule, the larger the sensor size, the larger the diameter of the lens glass required, which translates into increased lens size, weight, and cost.
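The FOV relationship just mentioned (sensor format plus focal length) follows from simple lens geometry: horizontal FOV = 2·atan(h / (2·FL)), where h is the sensor's horizontal dimension. An illustrative sketch using the horizontal dimensions listed in Table 5-4 (the function name is ours):

```python
import math

# Horizontal field of view for a simple lens:
#   FOV = 2 * atan(h / (2 * FL))
# where h = sensor horizontal dimension (mm), FL = focal length (mm).
SENSOR_H_MM = {"1/4": 3.2, "1/3": 4.8, "1/2": 6.4}   # from Table 5-4

def horizontal_fov_deg(fmt, focal_length_mm):
    h = SENSOR_H_MM[fmt]
    return math.degrees(2 * math.atan(h / (2 * focal_length_mm)))

# With the same 8 mm lens, the 1/2-inch sensor "sees" a wider scene than
# the 1/4-inch sensor, as the text describes.
wide = horizontal_fov_deg("1/2", 8.0)    # ~43.6 degrees
narrow = horizontal_fov_deg("1/4", 8.0)  # ~22.6 degrees
```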
Any lens designed for a larger format can be used on a smaller-format camera. The opposite is not true: for example, a lens designed for a 1/3-inch format will not work properly on a 1/2-inch format camera and will produce vignetting (a dark area surrounding the image).

5.7.1 Solid-State

Most solid-state cameras using CCD or CMOS sensor technology have 1/4-, 1/3-, and 1/2-inch formats. The sensor arrays are rectangular in shape and have the active-area sizes listed in Table 5-4 and shown in Figure 5-19. Significant progress has been made in producing exceptionally high-quality 1/4-, 1/3-, and 1/2-inch format sensors that rival the sensitivity of some of the earlier, larger 2/3- and 1-inch solid-state or tube sensors. Most color cameras used in security applications have single-chip sensors with three-color stripe filters integral with the image sensor. Typical sensitivities for these color cameras range from 0.5 to 2.0 FtCd (about 5.4 to 21.5 lux) for full video, which is less sensitive than their monochrome counterparts by a factor of about 10.

[FIGURE 5-19 Tube, solid-state, and film image formats. Active sensor areas (mm, v × h (diagonal)): tube formats (vidicon, silicon, SIT, ISIT): 1/2" 4.8 × 6.4 (8 diag.), 2/3" 6.6 × 8.8 (11 diag.), 1" 9.6 × 12.8 (16 diag.); solid-state (CCD, CMOS, CID): 1/6" 1.8 × 2.4 (3.0 diag.), 1/4" 2.4 × 3.2 (4.0 diag.), 1/3" 3.6 × 4.8 (6.0 diag.), 1/2" 4.8 × 6.4 (8 diag.), 2/3" 6.6 × 8.8 (11 diag.); film (reference): super 8 mm 4.1 × 5.8, 16 mm 7.4 × 10.3, 35 mm 24 × 36. All dimensions in mm.]

IMAGE SENSOR FORMAT | DIAGONAL d (mm / inches) | HORIZONTAL h (mm / inches) | VERTICAL v (mm / inches)
1" (REFERENCE)      | 16 / 0.63 | 12.8 / 0.50 | 9.6 / 0.38
2/3"                | 11 / 0.43 | 8.8 / 0.35  | 6.6 / 0.26
1/2" *              | 8 / 0.31  | 6.4 / 0.25  | 4.8 / 0.19
1/3" *              | 6 / 0.24  | 4.8 / 0.19  | 3.6 / 0.14
1/4" *              | 4 / 0.16  | 3.2 / 0.13  | 2.4 / 0.10
1/6"                | 3 / 0.12  | 2.4 / 0.09  | 1.8 / 0.07

* MOST COMMON CCTV SENSOR FORMATS

Table 5-4 CCTV Camera Sensor Formats
Low resolution color cameras have a horizontal resolution of about 330 TV lines. High-resolution color cameras have a horizontal resolution of about 480 TV lines. 5.7.2 Image Intensifier The most common image intensifier is the ICCD and uses standard monochrome resolution CCD image formats. Typical values for the format resolution are 500–600 for a 1/2-inch sensor. Cameras—Analog, Digital, and Internet 5.7.3 Thermal IR The thermal IR camera uses a long-wave IR array fabricated using completely different manufacturing techniques as compared with CCD or ICCD manufacture. These sensors are far more difficult to manufacture and have far lower yields than do other solid-state sensors. As a result the number of pixels in the sensor is significantly less. Typical sensor arrays have 280–320 horizontal TV line resolution. Future generations of these thermal IR cameras will have near equivalent resolution to those of CCD and CMOS cameras. 5.8 CAMERA LENS MOUNTS Several lens-to-camera mounts are standard in the CCTV industry. Some are mechanically interchangeable and others are not. Care must be taken so that the lens mount matches the camera mount. The two widely used cameralens mounts are the C and CS mount. Small surveillance cameras use the 10, 12, and 13 mm thread diameter minilens mounts. The 10 and 12 mm diameter mounts have a 0.5 mm pitch and the 13 mm diameter mount has a 1.0 mm pitch. Large bayonet mounts are used with specialized cameras and lenses on some occasions. These lens-tocamera mounts are described in the following sections. 139 of the CS mount system is that the lens can be smaller, lighter, and less expensive than its C mount counterpart. The CS mount camera is completely compatible with the common C mount lens when a 5 mm spacer ring is inserted between the C mount lens and the CS mount camera. The opposite is not true: a CS mount lens will not work on a C mount camera. Table 5-5 summarizes the present lens mount parameters. 
5.8.1 C and CS Mounts

For many years, all 1-, 2/3-, and 1/2-inch cameras used an industry-standard mount called the C mount to mechanically couple the lens to the camera. Figure 5-20 shows the mechanical details of the C and CS mounts. The C mount camera has a 1-inch-diameter hole with 32 threads per inch (TPI), and the C mount lens has a matching thread (1–32 TPI) that screws into the camera thread. The distance between the lens rear mounting surface and the image sensor for the C mount is 0.690 inch (17.526 mm).

With the introduction of the smaller 1/4- and 1/3-inch (and 1/2-inch) format cameras and lenses, it became possible and desirable to reduce the size of the lens and the distance between the lens and the sensor. The mount adopted by the industry for 1/4-, 1/3-, and 1/2-inch-sensor-format cameras is the CS mount. The CS mount matches the C mount in diameter and thread, but the distance between the lens rear mounting surface and the image sensor for the CS mount is 0.492 inch (12.5 mm). The CS mount is therefore 0.2 inch (5 mm) shorter than the C mount. Since the lens is 5 mm closer to the sensor, the lens can be made smaller in diameter. A C mount lens can be used on a CS mount camera if a 5 mm spacer is interposed between the lens and the camera and if the lens format covers the camera format size.

FIGURE 5-20 Mechanical details of the C mount and CS mount. Both use a 1-inch-diameter, 32 TPI thread; the C mount lens seats 0.690 inch (17.526 mm) from the sensor and the CS mount lens 0.492 inch (12.5 mm). Difference between C mount and CS mount: 17.526 mm − 12.5 mm ≈ 5 mm, so C mount lens = CS mount lens + 5 mm spacer.

Table 5-5 Standard Camera/Lens Mount Parameters

  MOUNT TYPE    MOUNTING SURFACE TO SENSOR DISTANCE (d)   THREAD
  C             0.690 in (17.526 mm)                      1-inch dia., 32 TPI
  CS            0.492 in (12.5 mm)                        1-inch dia., 32 TPI
  MINI: 10 mm   varies from 3.5 mm (0.14 in)              10 mm dia., 0.50 mm pitch
                to 9 mm (0.35 in) *
  MINI: 12 mm   varies *                                  12 mm dia., 0.50 mm pitch
  MINI: 13 mm   varies *                                  13 mm dia., 1.0 mm pitch

  * Varies with manufacturer. To convert a C mount lens for a CS mount camera, add a 5 mm spacer.

5.8.2 Mini-Lens Mounts

The proliferation of small mini-lenses (see Chapter 4) and small CCD and CMOS cameras has led to widespread use of smaller lens/camera mounts. Manufacturers supply these mini-lenses and cameras with mounts having metric thread sizes of 10, 12, or 13 mm diameter and thread pitches of 0.5 and 1.0 mm. The two widely used sizes are the 10 and 12 mm diameters with 0.5 mm pitch.

5.8.3 Bayonet Mount

The large 2.25-inch-diameter bayonet mount is used primarily in custom security, industrial, broadcast, and military applications with three-sensor color cameras, LLL cameras, and long-FL large lenses. It is only in limited use in the security field.

5.8.4 Lens–Mount Interferences

Figure 5-21 illustrates a potential problem with some lenses when used with CCD or other solid-state cameras. Some of the shorter-FL lenses (2.2, 2.6, 3.5, and 4.8 mm) have a protrusion that extends behind the C or CS mount or mini-mount and can interfere with the filter or window used with the solid-state sensor. This mechanical interference prevents the lens from fully seating in the mount, thereby causing the image to be out of focus. Most lens and camera manufacturers are aware of the problem and for the most part have designed lenses and cameras that are compatible. However, since lenses are often interchanged, the potential problem exists and the security designer should be aware of the condition.

FIGURE 5-21 Lens-mount interference. The cross-hatched area represents mechanical interference between the rear protrusion of a C mount lens and the infrared filter in front of the camera sensor.

5.9 ZOOM LENS–CAMERA MODULE

The requirement for a compact zoom lens and camera combination has been satisfied with the zoom lens–camera module. This module evolved out of a requirement for a lightweight, low-inertia camera-lens assembly for use in high-speed pan/tilt dome installations in casinos, airports, malls, retail stores, etc. The camera-lens module has a mechanical cube configuration so that it can easily be incorporated into a pan/tilt dome housing and be pointed in any direction at high speed (Figure 5-22).

The module assembly includes the following components and features: (1) a rugged, compact mechanical structure suitable for high-speed pan/tilt platforms; (2) a large optical zoom ratio, typically 16 or 20 to 1; (3) a large electronic zoom ratio, typically 8 or 10 to 1; and (4) a 1/4-inch solid-state color camera with excellent sensitivity and resolution. Options include: (1) automatic focus and (2) image stabilization capability (see Section 4.5.11).

FIGURE 5-22 Zoom lens–camera module

The automatic-focusing option is useful provided the lens is zooming slowly and the module is not panning or tilting rapidly. When a person walks into the lens FOV, the automatic-focus lens changes focus from the surrounding scene to the moving person, keeping the person in focus even as they move toward or away from the lens. Auto-focus is ineffective while the lens is zooming and should not be used if the module is panning and/or tilting rapidly. In these situations the system becomes "confused" and does not know what object to focus on, causing the person to be out of focus in the picture.

The zoom lens in a typical module has an FL range of 3.6–80 mm (f/1.6 at the 3.6 mm FL setting). At the wide-angle setting the lens and camera cover a 54° horizontal angular FOV. At the telephoto setting they cover a 2.5° horizontal angular FOV. The lens–camera module is also available packaged for mounting on standard pan/tilt platforms.

5.10 PANORAMIC 360° CAMERA

There has always been a need to see "all around" an entire room or area, seeing 360° horizontally and 90° vertically with one panoramic camera and lens. Early versions of such 360° FOV camera systems were achieved using multiple cameras and lenses and combining the scenes as a split screen on the monitor. Panoramic lenses have been available for many years but have only recently been combined with high-resolution digital cameras and DSP electronics using sophisticated mathematical transforms to take advantage of their very wide-angle capabilities. The availability of high-resolution solid-state cameras has made it possible to map a 360° by 90° hemispherical FOV onto a rectangular monitor with good resolution.

Figure 4-31 shows a panoramic camera and an operational diagram having a 360° horizontal and a 90° vertical FOV. In operation, the lens collects light from the 360° panoramic scene and focuses it onto the camera sensor as a donut-shaped image (Figures 4-31 and 4-32). The electronics and a mathematical algorithm convert this donut-shaped panoramic image into the rectangular (horizontal and vertical) format for normal monitor viewing. A joystick or computer mouse is used to electronically pan and tilt the camera so that at any given time a segment of the 360° horizontal by 90° vertical image is displayed on the monitor.

FIGURE 5-23 High-definition television (HDTV) formats

  FORMAT       HORIZONTAL   VERTICAL        ASPECT   ARRAY SIZE
               (PIXELS)     (PIXELS)        RATIO    (PIXELS)
  HDTV 720p    1280         720             16:9     921,600
  HDTV 1080i   1920         1080            16:9     2,073,600
  HDTV 1080p   1920         1080            16:9     2,073,600
  NTSC *       -            525 TV lines    4:3      -
  PAL/SECAM *  -            625 TV lines    4:3      -

  * Analog reference: standard-definition television (SDTV)
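The angular FOV figures quoted for zoom and panoramic optics follow from simple lens geometry: the horizontal FOV is 2·arctan(h/2f) for sensor width h and focal length f. The sketch below uses the 1/4-inch sensor width from Table 5-4; it is a thin-lens estimate, so real lens designs will differ somewhat from the figures quoted in the text.

```python
import math

def horizontal_fov_deg(sensor_width_mm: float, focal_length_mm: float) -> float:
    """Horizontal angular field of view (thin-lens estimate)."""
    return math.degrees(2 * math.atan(sensor_width_mm / (2 * focal_length_mm)))

# 1/4-inch sensor (3.2 mm wide, Table 5-4) at the wide and telephoto ends
print(round(horizontal_fov_deg(3.2, 3.6), 1))   # ~48 degrees at the wide end
print(round(horizontal_fov_deg(3.2, 80.0), 1))  # ~2.3 degrees at the telephoto end
```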
5.11 HIGH DEFINITION TELEVISION (HDTV)

High-definition television (HDTV) provides a new video display format with a 16 × 9 (horizontal by vertical) aspect ratio, thereby providing significantly increased resolution over the standard NTSC 4 × 3 format (Figure 5-23). The reasons for defining this new format were to provide: (1) a higher-resolution (higher-definition) video display; (2) a format that better matches the view seen by the human eye (a wider horizontal view); and (3) a format more closely matching many of the scenes the eye sees, i.e. landscapes, parking lots, etc. The format was originally developed for the consumer market; however, it will find its way into the video security market because of the superior monitor display format and resolution it provides.

The HDTV format has many variations and has not yet been standardized in the security industry. Not all HDTV images have the same number of horizontal lines or the same resolution, and the way the different picture formats are painted on the screen also differs. Available HDTV formats include 720p, 1080i, and 1080p/24. The first number in the designation is the vertical resolution: how many scan lines there are from the top to the bottom of the picture. This number is followed by the letter "i" or "p," the abbreviations for interlaced and progressive scanning respectively. Progressive means that the whole picture is painted from the top of the screen to the bottom and then a new frame is painted again. Interlaced means only half the image is painted first (the odd-numbered lines) and then the other half (the even-numbered lines). There is a general consensus that progressive scanning is better than interlaced. All present NTSC video security systems using the 4 × 3 format use 2:1 interlaced scanning, and every computer monitor uses progressive scanning.
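The pixel counts implied by these formats are easy to compute directly from the line and column numbers above; the small dictionary below is illustrative.

```python
# Pixel counts and aspect ratios for the HDTV formats discussed in the text.
HDTV_FORMATS = {
    "720p":  (1280, 720),
    "1080i": (1920, 1080),
    "1080p": (1920, 1080),
}

for name, (h, v) in HDTV_FORMATS.items():
    print(f"{name}: {h}x{v} = {h*v:,} pixels, aspect {h/v:.2f}:1")
# e.g. "720p: 1280x720 = 921,600 pixels, aspect 1.78:1" (1.78 ~ 16:9)
```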
A trailing number in the designation (24, 30, or 60) refers to the frame rate. At present, the best HDTV system is 1080i, an interlaced 30-frame (60-field) per second system similar to NTSC, but with the 16 × 9 picture format of HDTV. HDTV video improves the intelligence provided by many security displays since it presents a wider horizontal aspect ratio, has higher resolution, and can support a larger screen size. The increased resolution produces crisper, sharper images.

5.12 SUMMARY

There have been many important improvements and innovations in the development of the video camera and its use in the security field. The most significant advances in CCTV camera technology have been the development of CCD and CMOS solid-state image sensors, IR thermal cameras, the IP camera, and the use of DSP. These sensors and camera electronics offer a compelling advantage over the original vacuum-tube technology because of solid-state reliability, inherent long life, low cost, low-voltage operation, low power dissipation, geometric reproducibility, absence of image lag, DSP, and visible and/or IR response. These solid-state cameras have provided the increased performance, reliability, and stability needed in monochrome, color, and IR video security systems.

The availability of solid-state color cameras has made a significant impact on the security video industry. Color cameras provide enhanced video surveillance because of their increased ability to display and recognize objects and persons. The lighting available in most security applications is sufficient for most color cameras to have satisfactory sensitivity and resolution. Solid-state color cameras have excellent color rendition, maintain color balance, and need no color rebalancing when the light level or lighting color temperature varies.
Intensified charge-coupled device (ICCD) cameras, which couple a CCD to a tube or micro-channel plate intensifier, provide the low-light sensitivity required in dusk-to-dawn and some nighttime applications. Room-temperature thermal IR cameras have provided the "eyes" when no visible or near-IR light is available and visible-light sensors are inoperable.

Chapter 6
Analog Video, Voice, and Control Signal Transmission

CONTENTS

6.1 Overview
6.2 Base-band Signal Analysis
    6.2.1 Video Picture Signal
    6.2.2 Video Synchronization Signal
    6.2.3 Voice Signal
    6.2.4 Control Data Signals
    6.2.5 Modulation and Demodulation
    6.2.6 Signal Bandwidth
6.3 Wired Video Transmission
    6.3.1 Coaxial Cable
        6.3.1.1 Unbalanced Single-Conductor Cable
        6.3.1.2 Connectors
        6.3.1.3 Amplifiers
    6.3.2 Balanced Two-Conductor Twin-axial Cable Transmission
        6.3.2.1 Indoor Cable
        6.3.2.2 Outdoor Cable
        6.3.2.3 Electrical Interference
        6.3.2.4 Grounding Problems
        6.3.2.5 Aluminum Cable
        6.3.2.6 Plenum Cable
    6.3.3 Two-Wire Cable Unshielded Twisted Pair (UTP) Transmission
        6.3.3.1 Balanced 2-Wire Attributes
        6.3.3.2 The UTP Technology
        6.3.3.3 UTP Implementation with Video, Audio, and Control Signals
        6.3.3.4 Slow-Scan Transmission
    6.3.4 Fiber-Optic Transmission
        6.3.4.1 Background
        6.3.4.2 Simplified Theory
        6.3.4.3 Cable Types
            6.3.4.3.1 Multimode Step-Index Fiber
            6.3.4.3.2 Multimode Graded-Index Fiber
            6.3.4.3.3 Cable Construction and Sizes
            6.3.4.3.4 Indoor and Outdoor Cables
        6.3.4.4 Connectors and Fiber Termination
            6.3.4.4.1 Coupling Efficiency
            6.3.4.4.2 Cylindrical and Cone Ferrule Connector
            6.3.4.4.3 Fiber Termination Kits
            6.3.4.4.4 Splicing Fibers
        6.3.4.5 Fiber-Optic Transmitter
            6.3.4.5.1 Generic Types
            6.3.4.5.2 Modulation Techniques
            6.3.4.5.3 Operational Wavelengths
        6.3.4.6 Fiber-Optic Receiver
            6.3.4.6.1 Demodulation Techniques
        6.3.4.7 Multi-Signal, Single-Fiber Transmission
        6.3.4.8 Fiber Optic—Advantages/Disadvantages
            6.3.4.8.1 Pro
            6.3.4.8.2 Con
        6.3.4.9 Fiber-Optic Transmission: Checklist
6.4 Wired Control Signal Transmission
    6.4.1 Camera/Lens Functions
    6.4.2 Pan/Tilt Functions
    6.4.3 Control Protocols
6.5 Wireless Video Transmission
    6.5.1 Transmission Types
    6.5.2 Frequency and Transmission Path Considerations
    6.5.3 Microwave Transmission
        6.5.3.1 Terrestrial Equipment
        6.5.3.2 Satellite Equipment
        6.5.3.3 Interference Sources
    6.5.4 Radio Frequency Transmission
        6.5.4.1 Transmission Path Considerations
        6.5.4.2 Radio Frequency Equipment
    6.5.5 Infrared Atmospheric Transmission
        6.5.5.1 Transmission Path Considerations
        6.5.5.2 Infrared Equipment
6.6 Wireless Control Signal Transmission
6.7 Signal Multiplexing/De-multiplexing
    6.7.1 Wideband Video Signal
    6.7.2 Audio and Control Signal
6.8 Secure Video Transmission
    6.8.1 Scrambling
    6.8.2 Encryption
6.9 Cable Television
6.10 Analog Transmission Checklist
    6.10.1 Wired Transmission
        6.10.1.1 Coaxial Cable
        6.10.1.2 Two-Wire UTP
        6.10.1.3 Fiber-Optic Cable
    6.10.2 Wireless Transmission
        6.10.2.1 Radio Frequency (RF)
        6.10.2.2 Microwave
        6.10.2.3 Infrared
6.11 Summary

6.1 OVERVIEW

Closed-circuit television (CCTV) and open-circuit television (OCTV) video signals are transmitted from the camera to a variety of remote monitors via some form of wired or wireless transmission channel. Control, communications, and audio signals are also transmitted, depending on the system. This chapter covers most of the analog techniques for transmitting these signals; Chapter 7 describes the techniques for transmission of digital signals. Analog transmission is still critically important because of the immense installed base of analog equipment in the security field. These video systems are in operation and will remain so for many years to come. In its most common form, the video signal is transmitted at base-band frequencies over a coaxial cable.
This chapter identifies techniques and problems associated with transmitting video and other signals from the camera site to the remote monitoring location using wired copper and fiber-optic cable, and through-the-air wireless transmission. Electrical-wire techniques include coaxial cable and two-wire unshielded twisted pair (UTP). Coaxial cable is suitable for all video frequencies with minimum distortion or attenuation. Two-wire UTP systems using standard conductors (intercom wire, etc.) use special transmitters and receivers that preferentially boost the high video frequencies to compensate for their loss over the wire length.

Faithful video signal transmission is one of the most important aspects of a video system. Each color video channel requires approximately a 6 MHz bandwidth; monochrome picture transmission needs only a 4.2 MHz bandwidth. Figure 6-1 shows the single-channel video bandwidth requirements for monochrome and color systems. Using information from other chapters, it is not difficult to specify a good lens, camera, monitor, and video recorder to produce a high-quality picture. However, if the transmission means does not deliver an adequate signal from the camera to the monitor, recorder, or printer, an unsatisfactory picture will result. The final picture is only as good as the weakest link in the system, and that link is often the transmission means. Good signal transmission requires that the system designer and installer choose the best transmission type, use high-quality materials, and practice professional installation techniques. A poor transmission system will degrade the final picture regardless of the specifications of the camera, lens, monitoring, and recording equipment.

Fiber optics offers a technology for transmitting high-bandwidth, high-quality, multiplexed video pictures and audio and control signals over a single fiber. Fiber-optic technology has been an important addition to video signal transmission.
The use of fiber-optic cable has significantly improved the picture quality of the transmitted video signal and provided a more secure, reliable, and cost-effective transmission link. Some advantages of fiber optics over electrical coaxial-cable or two-wire UTP systems include:

• high bandwidth, providing higher resolution or simultaneous transmission of multiple video signals;
• no electrical interference to or from other electrical equipment or sources;
• strong resistance to tapping (eavesdropping), thereby providing a secure link; and
• no environmental degradation: unharmed by corrosion, moisture, and electrical storms.

Wireless transmission techniques use radio frequencies (RF) in the very high frequency (VHF) and ultra high frequency (UHF) bands, frequencies at 900 MHz, 1.2 GHz, 2.4 GHz, and 5.8 GHz, and microwave frequencies in the S and X bands (2–50 GHz). Low-power microwave and RF systems can transmit up to several miles with excellent picture quality, but higher-power systems require an FCC (Federal Communications Commission) license for operation. Wireless systems permit independent placement of the CCTV camera in locations that might be inaccessible for coaxial or other cables.

Cable-less video transmission using IR atmospheric propagation is also discussed. Infrared laser transmission requires no FCC approval but is limited in range depending on visibility. Transmission ranges extend from a few hundred feet in poor visibility to several thousand feet, or even many miles, in good visibility. Infrared is capable of bidirectional communication; control signals are sent in the direction opposite to the video signal, and audio is sent in both directions.

FIGURE 6-1 Single-channel CCTV bandwidth requirements. Within the 6 MHz channel, the amplitude-modulated video information has its picture carrier at 1.25 MHz, the color subcarrier lies at 3.58 MHz, and the frequency-modulated audio information is centered at 4.5 MHz.
The wired and wireless transmission techniques outlined above account for the majority of transmission means from the remote camera site to the monitoring site. There are, however, many instances when the video picture must be transmitted over very long distances: tens, hundreds, or thousands of miles, or across continents. These transmissions are accomplished using digital techniques (Chapter 7). Two-wire, coaxial, or fiber-optic cables for real-time transmission are often not practical in metropolitan areas where a video picture must be transmitted from one building to another through congested city streets, with the buildings not in sight of each other.

A technique developed in the 1980s for transmitting a video picture anywhere in the world over telephone lines, or any two-wire network, is called "slow-scan video transmission." This non-real-time, two-wire technology permits the transmission of a video picture from one location to any other location in the world, provided that a two-wire or wireless voice-grade link (telephone line) is available. This system was the forerunner of present Internet, intranet, and World Wide Web (WWW) video transmission. The slow-scan system took a real-time camera video signal, converted it into a non-real-time signal, and transmitted it at a slower frame rate over any audio communications channel (3000 Hz bandwidth). Unlike conventional video transmission, in which the real-time signal changes every 1/30th of a second, the slow-scan method sent a single snapshot of a scene over a period of 1–72 seconds, depending on the resolution specified. The effect is similar to opening your eyes once every second, or once every minute, or somewhere in between. When triggered by an intrusion alarm or video motion detector (VMD), the slow-scan equipment began sending pictures: once every few seconds at low resolution (200 TV lines), or every 32 seconds at high resolution (500 TV lines).
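A rough model suggests why slow-scan frames took seconds to send: a voice-grade channel of about 3000 Hz can carry on the order of two analog samples per hertz (the Nyquist limit). The sketch below is a back-of-envelope estimate with assumed image sizes, not the design of any actual slow-scan product.

```python
# Back-of-envelope slow-scan frame time over a 3000 Hz voice channel.
# Assumes ~2 analog "pixels" per Hz (Nyquist); the image sizes below are
# assumed for illustration, not taken from any specific product.
CHANNEL_HZ = 3000
PIXELS_PER_SECOND = 2 * CHANNEL_HZ

def frame_time_s(h_pixels: int, v_lines: int) -> float:
    return (h_pixels * v_lines) / PIXELS_PER_SECOND

print(round(frame_time_s(256, 200), 1))  # lower resolution: a few seconds
print(round(frame_time_s(512, 500), 1))  # higher resolution: tens of seconds
```

The estimates land in the same range as the 1–72 second figures quoted in the text.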
A quantum change and advancement has occurred in the video surveillance industry in the past five years. Computer-based systems now use digital techniques and equipment all the way from the camera, through the transmission means and the switching and multiplexing equipment, to the DVRs and the solid-state LCD and plasma monitors. The most dramatic change, however, has been in the use of digital transmission (Chapter 7). Now, with the Internet and WWW and digital signal compression, a similar but much improved transmission function is accomplished over any wired or wireless network.

A basic understanding of the capabilities of the aforementioned techniques, as well as the advantages and disadvantages of the different transmission means, is essential to optimize the final video picture and avoid costly retrofits. Understanding the transmission requirements when choosing the transmission means and hardware is important because transmission constitutes the most labor-intensive portion of the video installation. Specifying, installing, and testing the video signal and communication cables for intra-building and inter-building wiring represents the major labor cost in the video installation. If the incorrect cable is specified and installed, and must be removed and replaced with another type, serious cost increases result. In the worst situation, where cables are routed in underground outdoor conduits, it is imperative to use the correct size and type of cable so as to avoid re-trenching or replacing cables.

6.2 BASE-BAND SIGNAL ANALYSIS

The video signal generated by the analog camera is called the "base-band video signal." It is called base-band because it contains low frequencies, from 30 Hz for NTSC (25 Hz for CCIR) up to 6 MHz. To accomplish fiber-optic transmission and wireless RF, microwave, and IR transmission, the base-band signal is modulated onto a carrier frequency. The monochrome or color video signal is a complex analog waveform consisting of the picture information (intensity and color) and synchronizing timing pulses. The waveform is specified by the SMPTE; the full specifications are contained in standards RS-170, RS-170A, and RS-170RGB.

6.2.1 Video Picture Signal

For a monochrome camera the picture information is contained in a single amplitude-modulated (AM) intensity waveform. The video signal amplitude for full monochrome and color video is 1 volt peak-to-peak. For a color camera the information is contained in three color waveforms carrying the red, green, and blue color content of the scene; together the three colors can faithfully reproduce the color picture. The color signal from the camera sensor can be presented in three different forms: (1) composite video; (2) Y (intensity) and C (color); and (3) red (R), green (G), blue (B). The monochrome and color video signals are described in Chapter 5.

6.2.2 Video Synchronization Signal

The video synchronization signals consist of vertical field and frame timing pulses and horizontal line timing pulses. The NTSC standard field and frame timing pulses occur at 1/60-second and 1/30-second intervals respectively, and the horizontal line timing pulses occur at 63.5-microsecond intervals. The corresponding CCIR/PAL timings are 1/50 second, 1/25 second, and 64 microseconds. The magnitude of these timing pulses is shown in Chapter 5.

6.2.3 Voice Signal

In the NTSC standard, the voice and sound information is contained in a sub-carrier centered at 4.5 MHz; in the CCIR/PAL systems it is centered at 4.5, 5.5, 6.0, or 6.5 MHz. The signal is frequency modulated (FM) for high-fidelity reproduction.

6.2.4 Control Data Signals

While not part of the standard NTSC signal, command and control data can be added to the signal. The bits and bytes of digital information are handled during the vertical retrace times between frames and fields. Camera control (on/off, etc.), lens control (focus, zoom, iris), and camera platform control (pan, tilt, presets, etc.)
signals are digitally controlled to perform these functions.

6.2.5 Modulation and Demodulation

To accomplish fiber-optic transmission, the base-band video signal is converted to an FM signal. For RF transmission the base-band video signal is frequency modulated onto an RF carrier at 928 MHz (also 435, 1200, and 1700 MHz, among others). For microwave transmission the base-band signal is modulated onto a carrier frequency of 2.4 or 5.8 GHz.

6.2.6 Signal Bandwidth

The base-band color video signal extends from 30 Hz to 6 MHz for NTSC (4 MHz for monochrome), and from 25 Hz to 7 MHz for CCIR/PAL.

6.3 WIRED VIDEO TRANSMISSION

6.3.1 Coaxial Cable

Coaxial cable is used widely for short to medium distances (several hundred to several thousand feet) because its electrical characteristics best match those required to transmit the full signal bandwidth from the camera to the monitor. The video signal is composed of slowly varying (low-frequency) and rapidly varying (high-frequency) components. Most wires of any type can transmit the low frequencies (20 Hz to a few thousand hertz); practically any wire can carry a telephone conversation. It takes the special coaxial-cable configuration to transmit the full spectrum of frequencies from 20 Hz to 6 MHz without attenuation, as required for high-quality video pictures and audio.

There are basically two types of coaxial and two types of twin-axial cable for use in video transmission systems:

1. 75-ohm unbalanced indoor coaxial cable
2. 75-ohm unbalanced outdoor coaxial cable
3. 124-ohm balanced indoor twin-axial cable
4. 124-ohm balanced outdoor twin-axial cable

The cable construction for the coaxial and twin-axial types is shown in Figure 6-2. The choice of a particular coaxial cable depends on the environment in which it will be used and the electrical characteristics required. By far the most common coaxial cables are the RG59/U and the RG11/U, both having a 75-ohm impedance.
For short camera-to-monitor distances (a few hundred feet), preassembled or field-terminated lengths of RG59/U coaxial cable with BNC connectors at each end are used. The BNC connector is a rugged video and RF connector in common use for many decades and the connector of choice for all base-band video connections. Short preassembled lengths of 5, 10, 25, 50, and 100 feet, with BNC-type connectors attached, are available. Long cable runs (several hundred feet and longer) are assembled in the field from a single length of coaxial cable with a connector at each end. For most interior video installations, RG59/U (0.25-inch diameter) or RG11/U (0.5-inch diameter) 75-ohm unbalanced coaxial cable is used. When using the larger-diameter RG11/U cable, a larger UHF-type connector is used.

When a long cable run of several thousand feet or more is required, particularly between several buildings, or if electrical interference is present, balanced 124-ohm twin-axial cable or fiber-optic cable is used. When the camera and monitoring equipment are in two different buildings, and likely at different ground potentials, an unwanted signal may be impressed on the video signal; this shows up as interference (wide bars on the video screen) and makes the picture unacceptable. A two-wire balanced or fiber-optic cable eliminates this condition. Television camera manufacturers generally specify the maximum distance between camera and monitor over which their equipment will operate when interconnected with a specific type of cable. Table 6-1 summarizes the transmission properties of coaxial and twin-axial cables when used to transmit the video signal. In applications with cameras and monitors separated by several thousand feet, video amplifiers are required.
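The run-length recommendations of Table 6-1, with and without an in-line amplifier, can be encoded as a simple lookup; the values come from the table, while the helper function is illustrative only.

```python
# Recommended maximum coax run lengths from Table 6-1 (feet),
# without and with an in-line video amplifier. Illustrative lookup only.
MAX_RUN_FT = {
    # cable: (cable only, with amplifier)
    "RG59/U":    (750, 3400),
    "RG59 MINI": (200, 800),
    "RG6/U":     (1500, 4800),
    "RG11/U":    (1800, 6500),
}

def needs_amplifier(cable: str, run_ft: float) -> bool:
    """True if the run exceeds the cable-only limit but fits within the
    amplified limit; raises if even an amplifier is not enough."""
    cable_only, amplified = MAX_RUN_FT[cable]
    if run_ft <= cable_only:
        return False
    if run_ft <= amplified:
        return True
    raise ValueError("run too long for this cable even with an amplifier")

print(needs_amplifier("RG59/U", 500))   # False: within the cable-only limit
print(needs_amplifier("RG59/U", 2000))  # True: amplifier required
```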
Located at the camera output and/or somewhere along the coaxial-cable run, these amplifiers permit increasing the camera-to-monitor distance to 3400 feet for RG59/U cable and to 6500 feet for RG11/U cable.

FIGURE 6-2 Coaxial and twin-axial cable construction: a copper center conductor, foam polypropylene dielectric, copper shielding with ground lead, and a flexible outer jacket. Impedance: 75 ohms; types RG59/U, RG11/U, RG8/U.

Table 6-1 Coaxial Cable Run Capabilities (impedance for all cables = 75 ohms)

  COAXIAL     MAX RECOMMENDED      MAX WITH          CONDUCTOR          NOMINAL DC RESISTANCE
  TYPE        CABLE ONLY (ft/m)    AMPLIFIER (ft/m)  (GAUGE)            (ohms/1000 ft)
  RG59/U      750 / 230            3,400 / 1,035     22 solid copper    10.5
  RG59 MINI   200 / 61             800 / 250         20 solid copper    41.0
  RG6/U       1,500 / 455          4,800 / 1,465     18 solid copper    6.5
  RG11/U      1,800 / 550          6,500 / 1,980     14 solid copper    1.24

The increased use of color television in security applications requires accurate transmission of the video signal with minimum distortion by the transmitting cable. High-quality coaxial-cable, UTP, and fiber-optic installations satisfy these requirements. While coaxial cable is the most suitable hard-wire cable for transmitting the video signal, video information transmitted through coaxial cable over long distances is attenuated by different amounts at different signal frequencies. Figure 6-3 illustrates the attenuation as a function of distance and frequency exhibited by standard coaxial cables. The attenuation of a 10 MHz signal is approximately three times greater than that of a 1 MHz signal when using a high-quality RG11/U cable. In video transmission, a 3000-foot cable run would attenuate the 5 MHz part of the video signal (representing the high-resolution part of the picture) to approximately one-fourth of its original level at the camera; a 1 MHz signal would be attenuated to only half of its original level.
At frequencies below 500 kHz the attenuation is generally negligible for these distances. This variation in attenuation as a function of frequency has an adverse effect on picture resolution and color quality. The signal deterioration appears on monitors in the form of reduced definition and contrast and poor color rendition. For example, severe high-frequency attenuation of a signal depicting a white picket fence against a dark background would cause the pickets to merge into a solid, smeared mass, resulting in less intelligence in the picture. The most commonly used standard coaxial cable is RG59/U, which also has the highest signal attenuation of the common types. For a 6 MHz bandwidth, the attenuation is approximately 1 dB per 100 feet, representing a signal loss of 11%. A 1000-foot run would have a 10 dB loss; that is, only 31.6% of the video signal would reach the monitor end.

In a process called "vidi-plexing," special CCTV cameras transmit both the camera power and the video signal over a single coaxial cable (RG59/U or RG11/U). This single-cable camera arrangement reduces installation costs, eliminates power wiring, and is ideal for hard-to-reach locations, temporary installations, or camera sites where power is unavailable.

6.3.1.1 Unbalanced Single-Conductor Cable

The most widely used coaxial cable for video security transmission and distribution systems is the unbalanced coaxial cable, represented by the RG59/U and RG11/U configurations. This cable has a single conductor with a characteristic impedance of 75 ohms, and the video signal is applied between the center conductor and a coaxial braided or foil shield (Figure 6-2). Single-conductor coaxial cables are manufactured with different impedances, but video transmission uses only the 75-ohm impedance, as specified in EIA standards.
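The 11% and 31.6% figures quoted above follow directly from the standard decibel relation for voltage, fraction remaining = 10^(−dB/20):

```python
# Fraction of a video signal's voltage surviving a given cable loss in dB,
# using the standard voltage-decibel relation.
def fraction_remaining(db_loss: float) -> float:
    return 10 ** (-db_loss / 20)

# RG59/U at ~6 MHz loses about 1 dB per 100 ft (see text):
print(round(1 - fraction_remaining(1.0), 2))  # 0.11 -> ~11% loss per 100 ft
print(round(fraction_remaining(10.0), 3))     # 0.316 -> 31.6% remains after a 10 dB, 1000 ft run
```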
Other cables that may look like the 75-ohm cable have a different electrical impedance and will not produce an acceptable television picture when used at a distance of 25 or 50 feet or more.

FIGURE 6-3 Coaxial cable signal attenuation (dB/100 ft) vs. frequency (MHz) for RG59/U, RG6/U, RG11/U, foam RG11/U, 16-gauge balanced video pair, and fiber-optic cable. Preferred dielectric: cellular (foam) polyethylene indoors, solid polyethylene outdoors

The RG59/U and RG11/U cables are available from many manufacturers in a variety of configurations. The primary difference in construction is the amount and type of shielding and the insulator (dielectric) used to isolate the center conductor from the outer shield. The most common shields are standard single copper braid, double braid, or aluminum foil. The aluminum-foil type should not be used for any CCTV application; it is used only for cable television. Common dielectrics are foam, solid plastic, and air, the latter having a spiral insulator to keep the center conductor from touching the outer braid. The cable is called unbalanced because the signal current travels in the forward direction from the camera to the monitor on the center conductor and back from the monitor to the camera on the shield, which produces a voltage difference (potential) across the outer shield. This current (and voltage) has the effect of unbalancing the electrical circuit. For short cable runs (a few hundred feet), the deleterious effects of the coaxial cable—such as signal attenuation, hum bars on the picture, and deterioration of image resolution and contrast—are not observed. However, as the distance between the camera and monitor increases to 1000–3000 feet, all these effects come into play. In particular, high-frequency attenuation sometimes requires equalizing equipment in order to restore resolution and contrast.
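Because attenuation accumulates linearly with cable length, the total loss of a run can be estimated from a per-100-foot figure. A sketch using approximate 5–10 MHz attenuation values of the kind tabulated for these cables; the dictionary and function names are illustrative:

```python
# Approximate attenuation in dB per 100 ft at 5-10 MHz for common 75-ohm cables
ATTEN_DB_PER_100FT = {
    "RG59/U": 1.0,
    "RG6/U": 0.8,
    "RG11/U": 0.51,
}

def run_loss_db(cable: str, length_ft: float) -> float:
    """Total attenuation of a cable run, assuming linear scaling with length."""
    return ATTEN_DB_PER_100FT[cable] * length_ft / 100.0

print(run_loss_db("RG59/U", 1000))              # 10.0 dB
print(round(run_loss_db("RG11/U", 2000), 2))    # 10.2 dB
```

The much lower per-foot loss of RG11/U is what allows its longer recommended runs in Table 6-1.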
Video coaxial cables are designed to transmit maximum signal power from the camera output impedance (75 ohms) with minimum signal loss. If the cable characteristic impedance is not 75 ohms, excessive signal loss and signal reflection from the receiving end will occur and cause a deteriorated picture. The cable impedance is determined by the dielectric material of the core, the shield construction, the conductor diameter, and the distance between the conductor and the shield. As a guide, the resistance of the center conductor for an RG59/U cable should be approximately 15 ohms per 1000 feet, and for an RG11/U cable, approximately 2.6 ohms per 1000 feet. Table 6-2 summarizes some of the characteristics of the RG59/U and RG11/U coaxial cables.

Table 6-2 Coaxial Cable Attenuation vs. Length (impedance for all cables = 75 ohms; * Mogami)

CABLE TYPE                 ATTENUATION (dB) @ 5-10 MHz
                           100 ft  200 ft  300 ft  400 ft  500 ft  1000 ft  1500 ft  2000 ft
RG59/U                     1.0     2.0     3.0     4.0     5.0     10.0     15.0     20.0
RG59 MINI                  1.3     2.6     3.9     5.2     6.5     13.0     19.5     26.0
RG6/U                      0.8     1.6     2.4     3.2     4.0     8.0      12.0     16.0
RG11/U                     0.51    1.02    1.53    2.04    2.55    5.1      7.66     10.2
2422/UL1384 *              3.96    7.9     11.9    15.8    19.8    39.6     59.4     79.2
2546 *                     1.82    3.6     5.5     7.3     9.1     18.2     27.3     36.4
RG179B/U                   2.0     4.0     6.0     8.0     10.0    20.0     30.0     40.0
SIAMESE: RG59 (2) #22AWG   1.0     2.0     3.0     4.0     5.0     10.0     15.0     20.0

dB LOSS              1    2    3    4.5   6    8    10.5   14   20
% SIGNAL REMAINING   90   80   70   60    50   40   30     20   10

6.3.1.2 Connectors

Coaxial cables are terminated with several types of connectors: the PL-259, used with the RG11/U cable, and the BNC, used with the RG59/U cable. The F-type is an RF connector used in cable television systems. Figure 6-4 illustrates these connectors. The BNC has become the connector of choice in the video industry because it provides a reliable connection with minimum signal loss, has a fast and positive twist-on action, and has a small size, so that many connectors can be installed on a chassis when required.
There are essentially three types of BNC connectors available: (1) solder, (2) crimp-on, and (3) screw-on. The most durable and reliable are the solder and crimp-on types, which are used when the connector is installed at the point of manufacture or in a suitably equipped electrical shop. The crimp-on and screw-on types are the most commonly used in the field during installation and repair of a system; either type can be successfully assembled with few tools in most locations. The crimp-on type uses a sleeve, which is attached to the cable end after the braid and insulation have been properly cut back; it is crimped onto the outer braid and the center conductor with a special crimping plier. When properly installed, this cable termination provides a reliable connection. To assemble the screw-on type, the braid and insulation are cut back and the connector is slid over the end of the cable and then screwed on. This too is a fairly reliable type of connection, but it is not as durable as the crimp-on type, since it can be inadvertently unscrewed from the end of the cable. The screw-on termination is less reliable if the cable must be taken on and off many times.

FIGURE 6-4 RCA, BNC, F, SMA, UHF, and siamese (power/BNC) cable connectors

6.3.1.3 Amplifiers

When the distance between the camera and the monitor exceeds the recommended length for the RG59/U and RG11/U cables, it is necessary to insert a video amplifier to boost the video signal level. The video amplifier is inserted at the camera location or somewhere along the coaxial-cable run between the camera and the monitor location (Figure 6-5). The disadvantage of locating the video amplifier somewhere along the coaxial cable is that the amplifier requires a source of AC (or DC) power, so a power source must be available at its location. Table 6-1 compares the cable-length runs with and without a video amplifier.
Note that the distance transmitted can be increased more than fourfold with one of these amplifiers. When the output from the camera must be distributed to various monitors or to separate buildings and locations, a distribution amplifier is used (see Figure 6-5). This amplifier transmits and distributes monochrome and color video signals to multiple locations. In a quad unit, a single video input to the amplifier results in four identical, isolated video outputs capable of driving four 75-ohm RG59/U or RG11/U cables. The distribution amplifier is in effect a power amplifier, boosting the power from the single camera output so that multiple 75-ohm loads can be driven. A potential problem with an unbalanced coaxial cable is that the video signal is applied across the single inner conductor and the outer shield, thereby impressing a small voltage (hum voltage) on the signal. This hum voltage can be eliminated by using an isolation amplifier, a balanced coaxial cable, or fiber optics.

6.3.2 Balanced Two-Conductor Twin-axial Cable Transmission

Balanced twin-axial cables are less familiar to the CCTV industry than the unbalanced cables. They have a pair of inner conductors surrounded by insulation, a coaxial-type shield, and an outer insulating protective sheath, and they have a characteristic impedance of 124 ohms. They have been used for many years by the telephone industry for transmitting video information and other high-frequency data. These cables have a larger outside diameter (typically 0.5 inch), and their cost, weight, and volume are higher than those of an unbalanced cable. Since the polarity on balanced cables must be maintained, the connector types are usually polarized (keyed). Figure 6-6 shows the construction and configuration of a balanced twin-axial cable system. The primary purposes for using balanced cable are to increase transmission range and to eliminate the picture degradation found in some unbalanced applications.
FIGURE 6-5 Video amplifier to extend range and/or distribute the signal to multiple receiving equipment (monitors, DVRs/VCRs, switchers, and printers, in the same or multiple locations)

FIGURE 6-6 Balanced twin-axial cable construction and interconnection: dual copper conductors, foam insulation, metallic braid, and outer insulated jacket; impedance 124 ohms, with balanced transmitting and receiving transformers at the camera and monitor ends

Unwanted hum bars (dark bars on the television picture) are introduced in unbalanced coaxial transmission systems when there is a difference in voltage between the two ends of the coaxial cable (see Section 6.3.1.1). This can often occur when the two ends of a long cable run are terminated in different buildings, or when electrical power is derived from different power sources—in different buildings or even within the same building. Since the signal path and the hum current path through the shield of an unbalanced cable are common, resulting in the hum problem, a logical solution is to provide a separate path for each. This is accomplished by applying the signal between the center conductors of two parallel unbalanced cables (Figure 6-6). The shields of the two cables carry the ground currents while the two conductors carry the transmitted signal. This technique has been used for many years in the communications industry to reduce or eliminate hum. Since the transmitted video signal travels on the inner conductors, any noise or induced AC hum is applied equally to each conductor. At the termination of the run the disturbances are cancelled while the signal is directed to the load unattenuated. This technique in effect removes the unwanted hum and noise signals.
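The cancellation just described can be demonstrated numerically: the signal is applied differentially across the two conductors, hum couples onto both equally (common mode), and a difference taken at the receiving end keeps the signal while rejecting the hum. A toy numeric sketch, not a circuit model:

```python
def balanced_receive(signal, hum):
    """Conductor A carries +signal, conductor B carries -signal; induced hum
    appears identically on both. The receiver output is (A - B) / 2."""
    a = [+s + n for s, n in zip(signal, hum)]   # conductor A samples
    b = [-s + n for s, n in zip(signal, hum)]   # conductor B samples
    return [(x - y) / 2 for x, y in zip(a, b)]  # differential output

video = [0.0, 0.3, 0.7, 1.0]   # arbitrary video sample values
hum = [0.2, -0.1, 0.4, 0.25]   # AC hum induced equally on both wires
print([round(v, 6) for v in balanced_receive(video, hum)])  # [0.0, 0.3, 0.7, 1.0]
```

However large the common-mode hum samples are, they drop out of the difference; only a mismatch between the two conductors would leave residual noise.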
While the balanced transmission line offers many advantages over the unbalanced line, it has not been in widespread use in the video security industry. The primary reason is the need for transformers at the camera-sending and monitor-receiving ends, as well as the need for two-conductor twin-axial cable. All three hardware items add cost as compared with the unbalanced single-conductor coaxial cable. UTP transmission (Section 6.3.3) has become a popular replacement for the coaxial cable, as has fiber optics, described in Section 6.3.4.

6.3.2.1 Indoor Cable

Indoor coaxial cable is small in diameter (0.25 inch), uses a braided shield, and is much more flexible than outdoor cable. To maintain the correct electrical impedance, this smaller outside diameter cable requires proportionally smaller inner conductors. This decrease in the diameter of the cable conductor causes a corresponding increase in the cable signal attenuation, and therefore the RG59/U indoor cable cannot be used over long distances. The impedance of any coaxial cable is directly related to the spacing between the inner conductor and the shield; any change in this spacing caused by tight bends, kinking, indentations, or other factors will change the cable impedance, resulting in picture degradation. Since indoor cabling and connectors need no protection from water, solder, crimp-on, or screw-on connectors can be used.

6.3.2.2 Outdoor Cable

Outdoor video transmission applications put additional physical requirements on the coaxial cable. Environmental factors such as precipitation, temperature changes, humidity, and corrosion are present for both above-ground and buried installations. Other above-ground considerations include wind loading, rodent damage, and electrical-storm interference. For direct-burial applications, ground shifts, damage due to water, and rodent damage are potential problems.
Outdoor coaxial cabling is 1/2 inch in diameter or larger: the insulation qualities of its outside protective sheathing must be superior to those of indoor cables, and its electrical qualities are better than those of indoor RG59/U cables. Outdoor cables have approximately 16-gauge inner conductors, resulting in much less signal loss than the smaller, approximately 18-gauge center conductors of indoor RG59/U cables. Outdoor cables are designed and constructed to take much more physical abuse than the indoor RG59/U cable. Outdoor cables are not very flexible, and care must be taken to avoid extremely sharp bends. As a rule of thumb, outdoor cabling should always be used for cable runs of more than 1000 feet, regardless of the environment. Outdoor video cable may be buried, run along the ground, or suspended on utility poles. The exact method should be determined by the length of the cable run, the economics of the installation, and the particular environment. Environment is an important consideration. In locations with severe weather, electrical storms, or high winds, it is prudent to locate the coaxial cable underground, either direct-buried or enclosed in a conduit. This method isolates the cable from the severe environment, improving the life of the cable and reducing signal loss. In locations having rodent or ground-shift problems, enclosing the cable in a separate conduit will protect it. For short cable runs between buildings (less than 600–700 feet) and where the conduit is waterproof, indoor RG59/U cable is suitable. There are about 25 different types of RG59/U and about 10 different types of RG11/U cable, but only a few are suitable for video systems. For optimum performance, choose a cable that has a 95% or greater copper shield and a copper or copper-clad center conductor. The copper-clad center conductor has a core of steel with copper cladding, has higher tensile strength, and is more suitable for pulling through conduit over long cable runs.
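The selection guidance above can be collected into a small decision helper. The thresholds (the 1000-foot rule of thumb, and the 600–700-foot limit for indoor cable in waterproof conduit) come from the text; the function itself is only an illustrative sketch:

```python
def recommend_cable(length_ft: float, outdoors: bool,
                    waterproof_conduit: bool = False) -> str:
    """Apply the section's rules of thumb for choosing video coaxial cable."""
    if length_ft > 1000:
        # Outdoor cabling should always be used beyond 1000 ft,
        # regardless of the environment.
        return "outdoor-grade coaxial"
    if outdoors and not (waterproof_conduit and length_ft < 700):
        return "outdoor-grade coaxial"
    return "indoor RG59/U"

print(recommend_cable(600, outdoors=True, waterproof_conduit=True))  # indoor RG59/U
print(recommend_cable(1500, outdoors=False))                         # outdoor-grade coaxial
```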
While cable with a 65% copper shield is available, 95% shielding or more should be used to prevent outside electromagnetic interference (EMI) signals from penetrating the shield and causing spurious noise on the video signal. A coaxial cable with a 95% shield and a copper center conductor will have a loop resistance of approximately 16–57 ohms per 1000 feet.

6.3.2.3 Electrical Interference

For indoor applications, interference and noise can result from the following problems: (1) different ground potentials at the ends of the coaxial cable at different video equipment locations in a building, and (2) coaxial cable located near other electrical power distribution equipment or machinery producing high electromagnetic fields. In outdoor applications, in addition to the above, the adverse environmental conditions caused by lightning storms or other high-voltage noise generators, such as transformers on power lines, electrical substations, automobile/truck electrical noise, or other EMI, must be considered. In the case of EMI, a facility site survey should be made of the electromagnetic radiation present near any electrically noisy power distribution equipment. The cables should then be routed away from such equipment so that there is no interference with the television signal.

When a site survey indicates that the coaxial cable must run through an area containing large electrical interfering signals (EMI) caused by large machinery, high-voltage power lines, refrigeration units, microwaves, truck ignition, radio or television stations, fluorescent lamps, two-way radios, motor-generator sets, or other sources, a better-shielded cable, such as a twin-axial, tri-axial, UTP, or fiber-optic cable, may be the answer. The tri-axial cable has a center conductor, an insulator, a shield, a second insulator, a second shield, and the normal outer polyethylene or other covering to protect it from the environment. The double shielding significantly reduces the amount of outside EMI radiation that reaches the center conductor. The number of horizontal bars on the monitor can indicate the source of the problem. If the monitor shows six dark bars, multiplying 6 by 60 gives 360, which is close to a 400-cycle frequency. This interference could be caused by an auxiliary motor-generator set, often found in large factory machines, operating at this frequency. To correct the problem, the cable could be rerouted away from the noise source, replaced with a balanced twin-axial or tri-axial cable or UTP, or, for 100% elimination of the problem, upgraded to fiber-optic cable. If lightning and electrical storms are anticipated and signal loss is unacceptable, outdoor cables must be buried underground and proper high-voltage suppression circuitry must be installed at each end of the cable run and on the input power to the television equipment. In new installations with long cable runs (several thousand feet to several miles) or where different ground voltages exist, a fiber-optic link is the better solution, although balanced systems and isolation amplifiers can often solve the problem.

6.3.2.4 Grounding Problems

Ground loops are by far the most troublesome and noticeable video cabling problem (Figure 6-7). Ground loops are most easily detected before connecting the cables, by measuring the electrical voltage between the coaxial-cable shield and the chassis to which it is being connected. If the voltage difference is a few volts or more, there is a potential for a hum problem. As a precaution, it is good practice to measure the voltage difference before connecting the cable and chassis for systems with a long run, or between any two electrical supplies, to prevent any damage to the equipment.

FIGURE 6-7 Hum bars and picture tearing caused by ground loops
Many large multiple-camera systems have some distortion in the video picture caused by random or periodic noise or, if more severe, by hum bars. The hum bar appears as a horizontal distortion across the monitor at two locations: one-third and two-thirds of the way down the picture. If the camera is synchronized or power-line-locked, the bar will be stationary on the screen. If the camera is not line-locked, the distortion or bar will continuously roll slowly through the picture. Sometimes the hum bars are accompanied by sharp tearing regions across the monitor or erratic horizontal pulling at the edge of the screen (Figure 6-7). This is caused by the effect of the high voltages on the horizontal synchronization signal. Other symptoms include uncontrolled vertical rolling of the scene on the screen when very high voltages are present in the ground loop. Interference caused by external sources or voltage differences can often be predicted prior to installation. The hum bar and the potential difference between two electrical systems, however, usually cannot be determined until the actual installation. The system designer should try to anticipate the problem and, along with the user, be prepared to devote additional equipment and time to solve it. The problem is not related to equipment at the camera or monitor end or to the cable installed; it is strictly an effect of the particular environment encountered, be it EMI interference or a difference in potential between the main power sources at each location. The grounding problem can occur at any remote location, and it can be eliminated inexpensively with the installation of an isolation amplifier. Another solution, described in Section 6.3.4, is the use of fiber-optic transmission, which eliminates electrical connections entirely. One totally unacceptable solution is the removal of the third wire on a three-pronged electrical plug, which is used to ground the equipment chassis to earth ground.
Not only is such removal a violation of local electrical codes and Underwriters Laboratories (UL) recommendations, it is a hazardous procedure. If the earth ground is removed from the chassis, a voltage can appear on the camera, monitor, or other equipment chassis, producing a "hot" chassis that, if touched, can shock a person with 60–70 volts. Ground loops occur when video cables bridge two power distribution systems. Consider the situation (Figure 6-8) in which the CCTV camera receives AC power from power source A, while some distance away, or in a different building, the CCTV monitor receives power from distribution system B. The camera chassis is at 0 volts (connected to electrical ground) with reference to its AC power input A. The monitor chassis is also at 0 volts with respect to its AC distribution system B. However, the level of the electrical ground in one distribution system may be higher (or lower) than that of the ground in the other system; hence a voltage potential can exist between the two chassis. When a video cable is connected between the two distribution system grounds, the cable shield connects the two chassis and an alternating current flows in the shield between the units. This extraneous voltage (causing a ground-loop current to flow) produces the unwanted hum bars in the video image on the monitor. The second way in which hum bars can be produced on a television monitor is when two equipment chassis are mechanically connected, such as when a camera is mounted on a pan/tilt unit. If the camera receives power from one distribution system and the chassis of the pan/tilt unit is grounded to another system at a different level, a ground loop and hum bars may result. The size and extent of the horizontal bars depend on the severity of the ground potential difference.
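The bar-counting diagnostic mentioned in the interference discussion (Section 6.3.2.3) is simple enough to express directly: the number of dark horizontal bars times the 60 Hz power-line frequency approximates the frequency of the interfering source. A sketch of that rule of thumb:

```python
POWER_LINE_HZ = 60  # 60 Hz AC power (NTSC regions)

def interference_frequency_hz(num_bars: int) -> int:
    """Estimate the interfering source frequency from the hum-bar count."""
    return num_bars * POWER_LINE_HZ

print(interference_frequency_hz(6))  # 360 -> close to a 400 Hz motor-generator set
```

A single stationary bar pair at the classic one-third/two-thirds positions points back to plain 60 Hz ground-loop hum rather than machinery.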
6.3.2.5 Aluminum Cable

Although coaxial cable with aluminum shielding provides 100% shielding, it should be used only for RF cable television (CATV) and master antenna television (MATV) signals used for home video cable reception.

FIGURE 6-8 Two-source AC power distribution system. The camera (with pan/tilt, etc.) at location A and the monitor (or switcher, VCR, printer, etc.) at location B are powered from separate 117 VAC sources; the voltage difference between ground A and ground B can be 5–30 volts, causing current to flow in the cable shield, hum bars, and faulty operation

This aluminum-shield type should never be used for CCTV, for two reasons: (1) it has higher resistance, and (2) it distorts horizontal synchronization pulses. The added resistance (approximately seven times that of a 95% copper or copper-clad shield) increases the video cable loop resistance, causing a reduction in the video signal transmitted along the cable. The higher loop resistance means a smaller video signal reaches the monitoring site, producing less contrast and an inferior picture. Always use a good-grade 95% copper braid RG59/U cable to transmit the video signal up to 1000 feet, and an RG11/U to transmit up to 2000 feet. Distortion of the horizontal synchronization pulse causes picture tearing on the monitor, depicting straight-edged objects with ragged edges.

6.3.2.6 Plenum Cable

Another category of coaxial cable is designed to be used in the plenum spaces of large buildings. This plenum cable has a flame-resistant exterior covering and very low smoke emission. The cable can be used in air-conditioning return air ducts and does not require a metal conduit for added protection.
The cable, designated "plenum rated," meets National Electrical Code requirements and is UL approved.

6.3.3 Two-Wire Cable Unshielded Twisted Pair (UTP) Transmission

It is convenient, inexpensive, and simple to transmit the video signal over an existing two-wire system. A standard twisted-pair, two-wire telephone, intercom, or other electrical system with an appropriate UTP transmitter and receiver has the capability to transmit all of the high-frequency information required for an excellent-resolution monochrome or color picture. The UTP is a CAT-3, CAT-5, or CAT-5e cable; the higher the category, the greater the usable distance. Either a passive (no power required) or an active (12 VDC powered, longer-distance) transmitter/receiver pair can be used. The passive system uses a small transmitter and receiver, one at each end of the pair of wires, and transmits the picture at distances of a few hundred feet to 3000 feet. The active powered system (12 VDC) can transmit the video image 8000 feet for monochrome and 5000 feet for color. Picture resolution can be equivalent to that obtained with a coaxial-cable system. The two-wire pair must have a continuous conductive path from the camera to the monitor location. High-frequency emphasis in the transmitter and receiver compensates for any attenuation of the high frequencies. The balanced UTP configuration makes the cable immune to most external electrical interference, and in many environments the UTP cable can be located in the same conduit with other data cables. The UTP system must have a conductive (resistive copper) path for the two wires. The signal path cannot have electrical switching circuits between the camera and the monitor location; however, mechanical splices and connectors are permissible. The components for the two-wire system can cost more than equivalent coaxial cable, since an additional transmitter and receiver are required.
However, this cost may be small compared with the cost of installing a new coaxial cable from the camera to the monitor location. Figure 6-9 illustrates the block diagram and connections for the UTP active transmitter and receiver pair.

6.3.3.1 Balanced 2-Wire Attributes

The UTP provides a technology that can significantly reduce the video-signal noise caused by external electrical radiation. It also eliminates the ground loops present in unbalanced coaxial transmission, since isolation is designed into the UTP transmitters and receivers.

6.3.3.2 UTP Technology

The UTP technology is based on the concept that any external electrical interference affects each of the two conductors identically, so that the external disturbance is canceled and has no effect on the video signal. The transmitter unit converts the 75-ohm camera signal impedance to match the 100-ohm impedance of the UTP CAT-5e cable and provides the frequency compensation required. The receiver unit amplifies and reconstructs the signal and transmits it over a short distance to the television monitor via 75-ohm coaxial cable. Most active transmitters and receivers have 3- to 5-position DIP switches that are set according to the cable length to optimize the video signal waveform. The transmitter and the receiver are each powered by either 12 VDC or the camera or monitor. The UTP system can be operated with CAT-3, 5, 5e, and 6 cable as defined in the TIA/EIA-568-B.2 standard. CAT-5e is now used for most new video installations and supersedes the extensively installed CAT-5 cable.
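The 75-to-100-ohm conversion performed by the transmitter can be checked with the standard transformer impedance relation Z_out/Z_in = n², where n is the turns ratio. A sketch of the arithmetic; the relation is general transformer theory, not a figure from this text:

```python
import math

def turns_ratio(z_in: float, z_out: float) -> float:
    """Turns ratio n that transforms impedance z_in to z_out (z_out/z_in = n**2)."""
    return math.sqrt(z_out / z_in)

# Matching the 75-ohm camera output to 100-ohm CAT-5e cable:
print(round(turns_ratio(75.0, 100.0), 3))  # 1.155
```

The modest 1.15:1 ratio is why small, inexpensive baluns suffice for passive UTP adapters.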
FIGURE 6-9 Two-wire UTP video transmission system: (A) active transmitter, (B) active receiver. The camera's 75-ohm coaxial output feeds the transmitter, which drives the 100-ohm CAT-3, 5, or 5e unshielded twisted pair; the receiver converts back to 75-ohm coaxial cable at the monitor

6.3.3.3 UTP Implementation with Video, Audio, and Control Signals

The UTP transmitter is located between the camera and the CAT-5e UTP cable input to transmit video, audio, alarms, and control signals. The receiver is located between the monitor end of the CAT-5e UTP cable and the monitor, recorder, or control console. UTP transmitters are small enough to be part of the camera electronics or can be powered by the camera, audio, and/or control electronics (Figure 6-10).

6.3.3.4 Slow-Scan Transmission

The transmission systems described in the previous sections all provide real-time video transmission. A scheme for transmitting the television picture over large distances, even anywhere in the world, uses slow-scan television transmission (Figure 6-11). This non-real-time technique involves storing one television picture frame (snapshot) and sending it slowly over a telephone or other audio-grade network, anywhere within a country or to another country. The received picture is reconstructed at the remote receiver to produce a continuously displayed television snapshot. Each snapshot takes anywhere from several to 72 seconds to transmit, with the resulting picture having low to high resolution, depending on the speed of transmission. A time-lapse (TL) effect is achieved, with successive scene frames transmitted from several to 72 seconds apart. Through this operation, specific frames are serially captured, sent down the telephone line, and reconstructed by the slow-scan system.
Once the receiver has stored the digital picture information, if the transmitter is turned off or the video scene image does not change, the receiver continues to display the last video frame continuously (at 30 fps) as a still image. The image stored in the receiver or transmitter changes when the system is commanded, manually or automatically, to take a new snapshot. Figure 6-12 illustrates the salient difference between real-time and non-real-time television transmission.

Implementation. Figure 6-12 compares real-time and non-real-time (slow-scan) television transmission. At the camera site the first frame starts at time zero (Figure 6-12a), the second frame at 1/30th of a second, and the third frame at 2/30ths of a second (the same as for real-time). Before these frames are transmitted over the audio-grade transmission link, the signal is processed at the camera site in a transmitter processor. The processor captures Frame 1 from the camera; that is, it memorizes (digitizes) the CCTV picture. The processor then slowly (at 2 seconds per frame, as shown in Figure 6-12b) transmits the video frame, element by element, line by line, until the receiver processor located at the monitor site has accepted all 525 lines in that frame. The significant difference between real-time and slow-scan transmission is the time it takes to transmit the picture. In the real-time case, it is 1/30th of a second, the real time of the frame. In the case of slow-scan (Figure 6-12b), it may take 2, 4, 8, 32, or up to 72 seconds to transmit that single frame to the monitor site. Figure 6-13 is a block diagram of a simplex (one-way) slow-scan system.
FIGURE 6-10 Real-time transmission system with video, audio, and controls over a dedicated two-wire system (shielded two-wire, twisted pair (UTP), or telephone), carrying video, audio, alarm inputs/outputs, and command functions between the camera and monitor ends

FIGURE 6-11 Slow-scan video transmission and transmitted pictures over telephone lines (duplex two-way network, 3000 Hz bandwidth): (A) picture resolution 128 × 64 (H × V), full-picture transmit time 2.6 sec; (B) 256 × 128, 8.0 sec; (C) 512 × 256, 31 sec. Monochrome pictures; the picture transmit update time depends on motion in the picture

FIGURE 6-12 Real-time video transmission (30 fps, continuous viewing of a moving vehicle) vs. non-real-time slow-scan (snapshots of a person walking, one frame every 2 seconds)

FIGURE 6-13 Slow-scan system block diagram, simplex (one-way): camera, analog-to-digital converter, video frame grabber, digital video compression, and telephone dialer at the sending end; audio-frequency demodulator, digital video decompression, temporary memory, and digital-to-analog converter driving the monitor at the receiving end, linked by any duplex audio-bandwidth communications channel

To increase transmission speed, complex compression and modulation algorithms are used so that only the changing parts of a scene (i.e., the movements) are transmitted.
Another technique first transmits areas of high scene activity with high resolution and then areas of lower priority with lower resolution. These techniques increase the rate at which the scene intelligence is transmitted. Because the transmission time of a frame increases from 1/30th of a second to several seconds, the choice of cable or transmission path changes significantly. For slow-scan it is possible to send the full video image on a twisted-pair or telephone line or any communications channel having a bandwidth equivalent to audio frequencies, that is, up to only 3000 Hz (instead of a bandwidth up to 4.2 MHz, as needed in real-time transmission). So all existing satellite links, mobile telephones, and other connections can be used. A variation of this equipment for an alarm application can store multiple frames at the transmitting site: if the information to be transmitted is an alarm, video frames can be captured (one every 1/30th of a second) for a few seconds around the alarm and then slowly transmitted to the remote monitoring site frame by frame, thereby transmitting all of the alarm frames. Figure 6-14 shows the interconnecting diagram and controls for a typical slow-scan system. Resolution, Scene Activity vs. Transmit Time. Transmit time per frame is determined by video picture resolution and activity (motion) in the scene. The larger the number of gray-scale levels and number of colors transmitted, the longer the transmit time per frame of video. If only a few gray-scale levels are transmitted (photocopy quality), or a limited number of colors and a small amount of motion in the scene are present, then there is less picture information to transmit and short (1–8 seconds) transmit times result. Many gray-scale levels (256 levels), full color, and motion require more information and longer transmission times. Slow-scan transmission is a compromise between resolution and scene activity on the one hand and required scene update time on the other.
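The trade-off among resolution, gray scale, and transmit time can be sketched with a back-of-the-envelope calculation. The sketch below assumes an uncompressed monochrome picture and a hypothetical 9600 bps link rate; these assumptions are illustrative and not taken from the text.

```python
import math

def slowscan_transmit_time(h_pixels, v_pixels, gray_levels, link_bps):
    """Estimate full-frame transmit time for uncompressed slow-scan video.

    Illustrative assumptions (not from the text): every pixel is sent,
    no compression is applied, and the audio-grade link sustains
    link_bps bits per second.
    """
    bits_per_pixel = math.log2(gray_levels)   # 16 gray levels -> 4 bits
    total_bits = h_pixels * v_pixels * bits_per_pixel
    return total_bits / link_bps

# A 256 x 128 picture with 16 gray levels over an assumed 9600 bps
# voice-grade modem link:
t = slowscan_transmit_time(256, 128, 16, 9600)
print(f"{t:.1f} seconds per frame")  # about 13.7 seconds
```

Real slow-scan units compress the picture and send only the changed areas, which is why the transmit times quoted above for comparable resolutions can be shorter than this uncompressed estimate.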
[Figure 6-14 Slow-scan interconnecting diagram and controls: up to four cameras and a monitor at each end, connected over any audio-grade channel; keypad controls select transmit time (1, 2, 4, 8, 16, 32, or 64 seconds), shades of gray (16, 32, 64, or 128), pixel resolution (32, 64, 128, 256, or 512), quad or full screen, camera (of 4), and external devices (VCR, horn, lights, etc.); a duplex (two-way) system uses a monitor and camera at each end]

6.3.4 Fiber-Optic Transmission

6.3.4.1 Background

One of the most significant advances in communications and signal transmission has been the advent of fiber optics. However, the concept of transmitting video signals over fiber optics is not new. The transmission of optical signals in fibers was investigated in the 1920s and 1930s, but it was not until the 1950s that Kapany invented the practical glass-coated (clad) glass fiber and coined the term "fiber optics." Clad fiber was actively investigated in the 1960s by K.C. Kao and G.A. Hockham, researchers at Standard Telecommunications Laboratories in England, who proposed that this type of waveguide could form the basis of a new transmission system. In 1967 attenuation through the fiber was more than 1000 dB per kilometer, which was impractical for transmission purposes, and researchers focused on reducing these losses. Figure 6-16 shows a comparison of fiber-optic transmission vs. other electrical transmission means. In 1970, investigators Kapron, Keck, and Maurer at Corning Glass Works announced a reduction of losses to less than 20 dB per kilometer in fibers hundreds of meters long. In 1972 Corning announced a reduction to 4 dB per kilometer of cable, and in 1973 Corning broke this record with a 2 dB per kilometer cable.
This low-loss achievement made a revolution in the transmission of wide-bandwidth, long-distance communications inevitable. In the early 1970s, manufacturers began making glass fibers that were sufficiently low-loss to transmit light signals over practical distances of hundreds or a few thousand feet. Broadband fiber-optic components are much more expensive than coaxial cable and should be used when there is a definite need for them. Note also that video signals must be digitized to avoid nonlinear transmitter/receiver effects. Why use fiber-optic transmission when coaxial cables can provide adequate video signal transmission? Today's high-performance video systems require greater reliability and more "throughput," that is, getting more signals from the camera end to the monitor end, over greater distances, and in harsher environments. The fiber-optic transmission system preserves the quality of the video signal and provides a high level of security. The information-carrying capacity of a transmission line, whether electrical or optical, increases as the carrier frequency increases. The carrier for fiber-optic signals is light, whose frequency is several orders of magnitude greater than radio frequencies, and the higher the carrier frequency, the larger the bandwidth that can be modulated onto the carrier. Some transmitters and receivers permit multiplexing multiple television signals, control signals, and duplex audio onto the same fiber because of its wide bandwidth.
The clarity of the picture transmitted using fiber optics is now limited only by the camera, environment, and monitoring equipment.

[Figure 6-15 Fiber optic transmission system: lens and camera feed an electrical-to-optical signal converter (transmitter) over coaxial cable with BNC connectors; the fiber-optic cable, with optical connectors (SMA, SFR, LFR), runs to an optical-to-electrical signal converter (receiver) connected to the monitor by coaxial cable; system components are the transmitter, receiver, fiber-optic cable, and two 117 VAC to 12 VDC power converters]

Fiber-optic systems can transmit signals from a camera to a monitor over great distances—typically several miles—with virtually no distortion or loss in picture resolution or detail. Figure 6-15 shows the block diagram of the hardware required for a fiber-optic system. The system uses an electrical-to-optical signal converter/transmitter, a fiber cable for sending the light signal from the camera to the monitor, and a light-to-electrical signal receiver/converter to transform the signal back to the base-band video signal required by the monitor. At both camera and monitor ends standard RG59/U coaxial cable or UTP wire is used to connect the camera and monitor to the system. A glass fiber optic–based video link offers distinct advantages over copper-wire or coaxial-cable transmission means:

• The system transmits information with greater fidelity and clarity over longer distances.
• The fiber is totally immune to all types of electrical interference—EMI or lightning—and will not conduct electricity. It can touch high-voltage electrical equipment or power lines without a problem.
• The fiber, being nonconductive, does not create any ground loops.
• The fiber can be serviced while the transmitting or receiving equipment is still energized, since no electrical power is involved.
• The fiber can be used where electrical codes and common sense prohibit the use of copper wires.
• The cable will not corrode, and the glass fiber is unaffected by salt and most chemicals. The direct-burial type of cable can be laid in most kinds of soil or exposed to most corrosive atmospheres inside chemical plants or outdoors.
• Since there is no electrical connection of any type, the fiber poses no fire hazard to any equipment or facility in even the most flammable atmosphere.
• The fiber is virtually unaffected by atmospheric conditions, so the cable can be mounted aboveground and on telephone poles. When properly applied, the cable is stronger than standard electrical wire or coaxial cable and will therefore withstand far more stress from wind and ice loading.
• Single or multiple fiber-optic cables are much smaller and lighter than a coaxial cable, easier to handle and install, and use less conduit or duct space. A single optical cable weighs 8 pounds per 3300 feet and has an overall diameter of 0.156 inches. A single coaxial cable weighs 330 pounds per 3300 feet and is approximately 0.25 inches in diameter.
• It transmits the video signal more efficiently (i.e. with lower attenuation), and since over distances of less than 50 miles it needs no repeater (amplifier), it is more reliable and easier to maintain.
• It is a more secure transmission medium: not only is the fiber hard to tap, but an attempted tap is easily detected.

The economics of using a fiber-optic system are complex. Users evaluating fiber optics should consider costs beyond those for the components themselves. The small size, light weight, and flexibility of fiber optics often present offsetting cost advantages, and the prevention of unanticipated problems such as those just listed can easily offset any increased hardware costs of fiber-optic systems.
With such rapid advances, the security system designer should consider fiber optics the optimum means to transmit high-quality television signals from high-resolution monochrome or color cameras to a receiver (monitor, switcher, recorder, printer, and so on) without degradation. This section reviews the attributes of fiber-optic systems, their design requirements, and their applications.

6.3.4.2 Simplified Theory

The fiber-optic system uses a transmitter at the camera, a receiver at the monitor, and the fiber cable in between (Figure 6-15). The following sections describe these three components. By far the most critical is the fiber-optic cable, since it must transmit the video light signal over a long distance without distortion (a change in the signal's shape) or excessive attenuation at high frequencies. As shown in Figure 6-15, the signal from the camera is sent to the transmitter via standard coaxial cable. At the receiver end, the output from the receiver is likewise sent via standard wire cable to the monitor or recording system. The optical transmitter at the camera end converts (modulates) the analog electrical video signal into a corresponding optical signal. The output from the transmitter is an optical signal generated by either a light-emitting diode (LED) or an injection laser diode (ILD), emitting IR light. When more than one video signal is to be transmitted, another option is to transmit multiple signals over one fiber using wavelength multiplexing (Section 6.3.4.7). The multi-fiber-optic cable consists of multiple glass fibers, each acting as a waveguide or conduit for one video optical signal. The glass fibers are enclosed in a protective outer jacket whose construction depends on the application. The fiber-optic receiver collects the light from the end of the fiber-optic cable, converts (demodulates) the optical signal back into an electrical signal having the same waveform and characteristics as the original video signal at the camera, and then sends it to the monitor or recorder.
The only variation in this block diagram for a single camera is the inclusion of a connector, splice, or repeater that may be required if the cable run is very long (many miles). The connector physically joins the output end of one cable to the input end of another cable. The splice joins two fiber ends so as to make them continuous. The repeater amplifies the light signal to provide a good signal at the receiver end. How does the fiber-optic transmission system differ from the electrical cable systems described in the previous sections? From the block diagram (Figure 6-15) it is apparent that two new hardware components are required: a transmitter and a receiver. The transmitter provides an amplitude- or frequency-modulated representation of the video signal at near-IR wavelengths, which the fiber optic transmits, at a level sufficient to produce a high-quality picture at the receiver end. The receiver collects whatever light energy is available at the output of the fiber-optic cable and converts it efficiently, with all the information from the video signal retained, into an electrical signal that is identical in shape and amplitude to the camera output signal. As with any of the transmission means, the fiber-optic cable attenuates the video signal. Figure 6-16 shows attenuation vs. frequency for current fiber-optic cable as compared with telephone cable, special high-frequency cable, coaxial cable, and early fiber-optic cable. The fiber-optic cable efficiently transmits the modulated light signal from the camera end over a long distance to the monitor while maintaining the signal's shape and amplitude. The characteristics of fiber-optic cable are totally different from those of coaxial cable or two-wire transmission systems. Before discussing the construction of the fiber-optic cable, we will briefly describe the transmitted light.
In any optical material, light travels at a velocity (Vm) characteristic of the material, which is lower than the velocity of light (C) in free space or air (Figure 6-17a). The ratio of the velocity of light in free space to its velocity in the material defines the refractive index (n) of the material:

n = C / Vm

When light traveling in a medium of a particular refractive index strikes another material of a lower refractive index, the light is bent toward the interface of the two materials (Figure 6-17b). If the angle of incidence is increased, a point is reached where the bent light travels along the interface of the two materials. This is known as the "critical angle" (θC). Light at any angle greater than the critical angle is totally reflected from the interface and follows a zigzag transmission path (Figure 6-17b,c). This zigzag transmission path is exactly what occurs in a fiber-optic cable: the light entering one end of the cable zigzags through the medium and eventually exits at the far end at approximately the same angle. As shown in Figure 6-17c, some incoming light is reflected from the fiber-optic end face and never enters the fiber.

[Figure 6-16 Attenuation vs. frequency for copper and fiber optic cable: telephone/UTP (Cat 3, 5e) cable, special high-frequency cable, and coaxial cable all rise steeply with frequency (5 MHz maximum video bandwidth marked), while fiber-optic cable is essentially flat; early fiber-optic cable (1970) was 20 dB/km (a 100-to-1 loss), improved cable (1972) 4 dB/km, and current cable 2–4 dB/km. Note: 1 kilometer (km) ≈ 0.62 mile]

In practice, an optical fiber consists of a core, a cladding, and a protective coating. The core material has a higher index of refraction than the cladding material and therefore the light, as just described, is confined to the core.
This core material can be plastic or glass, but glass provides far superior performance (lower attenuation and greater bandwidth) and therefore is more widespread for long-distance applications. One parameter often encountered in the literature is the numerical aperture (NA) of a fiber optic, a parameter that indicates the angle of acceptance of light into a fiber—or simply the ease with which the fiber accepts light. The NA is an important fiber parameter that must be considered when determining the signal-loss budget of a fiber-optic system. To visualize the concept, picture a bottle with a funnel (Figure 6-18). The larger the funnel angle, the easier it is to pour liquid into the bottle. The same concept holds for the fiber: the wider the acceptance angle, the higher the NA, and the larger the amount of light that can be funneled into the fiber from the transmitter. The larger an optical fiber's NA, the easier it is to launch light into the fiber, which correlates to higher coupling efficiency. Since fiber-optic systems are often coupled to LEDs, which are the light generators at the transmitter, and since LEDs have a less-concentrated, more diffuse output beam than ILDs, fiber optics with high NAs allow more of the LED output power to be collected. In order for the light from the transmitter to follow the zigzag path of internally reflected rays, the angles of reflection must exceed the critical angle. These reflection angles are associated with "waveguide modes." Depending on the size (diameter) of the fiber-optic core, one or more modes are transmitted down the fiber. The characteristics and properties of the cables carrying single-mode and multimode fibers are discussed in the next section. Like radio waves, light is electromagnetic energy. The frequencies of light used in fiber-optic video, voice, and data transmission are approximately 3.5 × 10^14 Hz, several orders of magnitude higher than the highest radio frequencies.
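The quantities just introduced can be checked numerically. A minimal sketch, assuming illustrative core and cladding indices of 1.47 and 1.45 (typical textbook values, not figures from this chapter):

```python
import math

C = 3.0e8  # velocity of light in free space (m/s)

# Assumed illustrative indices for a glass step-index fiber
n_core, n_clad = 1.47, 1.45

# Critical angle at the core/cladding interface: sin(theta_c) = n_clad / n_core
theta_c = math.degrees(math.asin(n_clad / n_core))

# Numerical aperture and acceptance half-angle:
# NA = sin(theta_accept) = sqrt(n_core^2 - n_clad^2)
na = math.sqrt(n_core**2 - n_clad**2)
theta_accept = math.degrees(math.asin(na))

# Optical frequency of an 850 nm near-IR carrier: f = C / wavelength
f_850 = C / 850e-9

print(f"critical angle = {theta_c:.1f} deg, NA = {na:.2f}, "
      f"acceptance half-angle = {theta_accept:.1f} deg")
print(f"850 nm corresponds to about {f_850:.2e} Hz")
```

For these assumed indices the critical angle comes out near 80 degrees, the NA near 0.24, and the 850 nm carrier near 3.5 × 10^14 Hz, which is the order of magnitude cited above.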
Wavelength (the velocity of light divided by the frequency) is a more common way of describing light waves. Visible light, with wavelengths from about 400 nm for deep violet to 750 nm for deep red, covers only a small portion of the electromagnetic spectrum (see Chapter 3). Fiber-optic video transmission uses the near-IR region, extending from approximately 750 to 1500 nm, since glass fibers propagate light at these wavelengths most efficiently, and efficient detectors (silicon and germanium) are available to detect such light.

[Figure 6-17 Light reflection/transmission in fiber optics: (a) refraction of light at a glass/free-space interface; (b) the critical angle θC, beyond which light is totally reflected rather than lost to the cladding; (c) light arriving within the numerical aperture (NA = sin θ) is transmitted through the cable core, while a small amount of incoming light is reflected from the fiber end]

6.3.4.3 Cable Types

The most significant part of the fiber-optic signal transmission system is the glass fiber itself, a thin strand of very pure glass approximately the diameter of a human hair. The fiber transmits visible and near-IR wavelengths with extremely high efficiency. Most fiber-optic systems operate at IR wavelengths of 850, 1300, or 1550 nm. Figure 6-19 shows where these near-IR wavelengths are located with respect to the visible light spectrum. Most short (several miles long) fiber-optic security systems operate at a wavelength of 850 nm rather than 1300 or 1550 nm because 850 nm LED emitters are more readily available and less expensive than their 1300 nm or 1550 nm counterparts. Likewise, IR detectors are more sensitive at 850 nm. LED and ILD radiation at the 1300 and 1550 nm wavelengths is transmitted along fiber-optic cables more efficiently than at 850 nm, so these wavelengths are used for much longer cable runs (hundreds of miles). Two types of fibers are used in security systems: (1) multimode step-index (rarely), and (2) graded-index.
These two types are defined by the index of refraction (n) profile of the fiber and the cross section of the fiber core. The two types have different properties and are used in different applications.

6.3.4.3.1 Multimode Step-Index Fiber

Figure 6-20a illustrates the physical characteristics of the multimode step-index fiber. The fiber consists of a center core of index n1 (typically about 1.47) and an outer cladding of lower index n2. Light rays enter the core, are reflected a multiple number of times down the core, and exit at the far end. Since this fiber propagates many modes, it is called "multimode step-index." The multimode step-index core is usually 50, 100, or even 200 microns (0.002, 0.004, or 0.008 inches) in diameter. The fiber core itself is clad with a thin layer of glass having a sharply different index of refraction. Light travels down the fiber, constantly being reflected back and forth from the interface between the two layers of glass. Light that enters the fiber at a sharp angle is reflected at a sharp angle from the interface and is reflected back and forth many more times, thus traveling more slowly through the fiber than light that enters at a shallow angle.
The difference in the arrival time at the end of the fiber limits the bandwidth of the step-index fiber, so that most such fibers provide good signal transmission up to a 20 MHz signal for about 1 kilometer. This limitation is more than adequate for many video applications.

[Figure 6-18 Fiber optic numerical aperture: NA = sin θ, with typical values in glass from 0.1 (θ = 5.7°) to 0.5 (θ = 30.0°); a mismatch between the sending and receiving fiber NAs (ratio NAR/NAS) produces signal loss when the receiving NA is smaller]

[Figure 6-19 Fiber optic transmission wavelengths: relative response of the human eye, vidicon (reference), and CCD/CMOS sensors from 400 to 1600 nm, with the 850, 1300, and 1550 nm near-IR transmission wavelengths marked]

[Figure 6-20 Multimode fiber optic cable: (a) multimode step-index fiber, 50 µm core/125 µm diameter, NA = 0.30, typical attenuation 7–15 dB/km at 850 nm; (b) multimode graded-index fiber, 50 µm core/125 µm diameter, NA = 0.20, typical attenuation 2.5–5.0 dB/km at 850 nm and 0.7–2.5 dB/km at 1300 nm]

6.3.4.3.2 Multimode Graded-Index Fiber

The multimode graded-index fiber is the workhorse of the video security industry (Figure 6-20b). Its low power attenuation—less than 3 dB (50% loss) per kilometer at 850 nm—makes it well suited for short and long cable runs. Most fibers are available with a 50-micron-diameter core and a 125-micron total fiber diameter (exclusive of outside protective sheathing). Graded-index fiber sizes are designated by their core/cladding diameter ratio; thus the 50/125 fiber has a 50-micron-diameter core and a 125-micron cladding.
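To put figures such as "3 dB (50% loss) per kilometer" in perspective, here is a minimal sketch of how cable attenuation in dB translates into received power and maximum run length. The 12 dB power budget is an assumed illustrative number, not one from this chapter:

```python
def received_fraction(db_per_km, km):
    """Fraction of launched optical power remaining after km kilometers."""
    return 10 ** (-(db_per_km * km) / 10)

def max_run_km(power_budget_db, db_per_km):
    """Longest run before a link's loss budget is spent (cable loss only)."""
    return power_budget_db / db_per_km

# 3 dB/km graded-index fiber: 3 dB is a factor-of-two (50%) power loss,
# so roughly half the light remains after 1 km and a quarter after 2 km.
print(received_fraction(3.0, 1.0))
print(received_fraction(3.0, 2.0))

# Assumed 12 dB transmitter-to-receiver budget over 3 dB/km fiber:
print(max_run_km(12.0, 3.0))  # 4.0 km
```

Because dB losses add while power ratios multiply, doubling the cable length squares the remaining power fraction, which is why long runs favor the 1300/1550 nm wavelengths with their lower dB/km figures.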
The typical graded-index fiber has a bandwidth of approximately 1000 MHz and is one of the least expensive fiber types available. The 50/125 fiber provides high efficiency when used with a high-quality LED transmitter or, for very long distances or very wide bandwidths, with an ILD source. Table 6-3 lists some of the common cable sizes available.

[Table 6-3 Standard Fiber Optic Cable Sizes: core, cladding, and buffer diameters in microns, with the outside diameter (mm) and weight (kg/km) of single-, two-, and four-fiber cables, for the 50/125, 62.5/125, and 100/140 fiber types; the 62.5/125 is the most widely used in security applications. 1 mm = 1000 microns; 1 kg/km = 0.671 lb/1000 ft]

For the graded-index fiber, the index of refraction (n) of the core is highest at the center and gradually decreases as the distance from the center increases (Figure 6-20b). Light in this type of core travels by refraction: the light rays are continually bent toward the center of the fiber-optic axis. In this manner the light rays traveling in the center of the core have a lower velocity, due to the higher index of refraction, and the rays at the outer limits travel much faster. This effect causes all the light to traverse the length of the fiber in nearly the same time and greatly reduces the difference in arrival time of light from different modes, thereby increasing the fiber's bandwidth-carrying capability. The graded-index fiber satisfies long-haul, wide-bandwidth security system requirements that cannot be met by the multimode step-index fiber.

6.3.4.3.3 Cable Construction and Sizes

A fiber-optic cable consists of a single optical fiber that is surrounded by a tube of plastic substantially larger than the fiber itself.
Over this tube is a layer of Kevlar reinforcement material. The entire assembly is then covered with an outer jacket, typically made of polyvinyl chloride (PVC). This construction is generally accepted for use indoors or where the cable is easily pulled through a dry conduit. The two approaches to providing primary protection to a fiber are the tight buffer and the loose tube (Figure 6-21). The tight buffer uses a dielectric (insulating) material such as PVC or polyurethane applied tightly to the fiber. For medium- and high-loss fibers (step-index type), such cable-induced attenuation is small compared with overall attenuation. The tight buffer offers the advantages of smaller bend radii and better crush resistance than loose-tube cabling. These advantages make tightly buffered fibers useful in applications with short runs where sharp bends are encountered or where cables may be laid under carpeted walking surfaces. The loose-tube method isolates the fiber from the rest of the cable, allowing the cabling to be twisted, pulled, and otherwise stressed with little effect on the fiber. Microbends caused by tight buffers are eliminated by placing the fiber within a hard plastic tube that has an inside diameter several times larger than the diameter of the fiber. Fibers for long-distance applications typically use a loose tube, since decoupling of the fiber from the cable allows the cable to be pulled in long lengths during installation. The tubes may be filled with jelly to protect against moisture that could condense, freeze, and damage the fiber. Multimode graded-index fiber is available in several primary core sizes: 50/125, 62.5/125, and 100/140. Table 6-3 summarizes the properties of the different fiber-cable types used in security systems, indicating the sizes and weights. The first number in the fiber designation (50 in 50/125) refers to the core outside diameter, the second (125) to the glass fiber outside diameter (the sizes exclude reinforcement or sheathing).
The fiber size is expressed in microns: 1 micron (µm) equals one one-thousandth of a millimeter (1/1000 mm). By comparison, the diameter of a human hair is about 0.002 inches or 50 microns. Each size has advantages for particular applications, and all three are EIA standards. The most popular and least expensive multimode fiber is the 50/125, used extensively in video security. It has the lowest NA of any multimode fiber, which allows the highest bandwidth.

[Figure 6-21 Tight-buffer and loose-tube single fiber optic cable construction: both designs carry a 50/125 fiber (50 µm core, 125 µm cladding) with a Kevlar strength member inside a 3 mm outer protective jacket; the loose-tube design adds a loose jacket buffer (250 µm diameter), while the tight buffer applies a PVC or polyurethane insulator directly to the fiber (940 µm buffer jacket). Typical optical characteristics: minimum bandwidth 200 MHz; attenuation 4–6 dB/km at 850 nm, 3 dB/km at 1300 nm; NA = 0.25]

Table 6-4 Fiber Optic Connector Coupling Losses

CABLE LOSS TYPE                                TYPICAL LOSS dB (%)   COMMENTS
Axial-lateral displacement (10%)               0.55 (12.0)           Most critical factor
Angular misalignment (2 degrees)               0.30 (6.7)            Function of numerical aperture
End separation (air gap)                       0.32 (7.0)            Essentially eliminated using index-matching fluid
End finish: (a) roughness (1 micron)           0.50 (11.0)           Includes Fresnel loss (0.35 dB)
            (b) non-perpendicular              0.25 (5.6)            Loss not commonly found
Core size mismatch: 1% diameter tolerance      0.17 (4.0)            Loss occurs only when a larger core couples
                    ±5% diameter tolerance     0.83 (18.0)           into a smaller core
Numerical aperture (NA) difference of ±0.02    1.66 (31.6)           Critical factor when NAS is larger than NAR

Note: dB = 10 log (PowerS/PowerR); S = sending fiber, R = receiving fiber.
Because 50/125 has been used for many years, established installers are experienced and comfortable working with it. Many connector types are available for terminating the 50/125 cable, as well as for the alternative 62.5/125 fiber. The 50/125 and 62.5/125 were developed for telephone networks and are now used extensively for video. An 85/125 fiber was developed specifically for computer and digital local networks where short distances are required. The slightly larger 85-micron core permits easier connector specifications and relaxed LED source requirements. The 100/140 multimode fiber was developed in response to computer manufacturers, who wanted an LED-compatible, short-wavelength, optical-fiber data link that could handle higher data rates than coaxial cable. While this fiber was developed for the computer market, it is excellent for short-haul CCTV security applications. It is least sensitive to fiber-optic geometry variations and connector tolerances, which generally means lower losses at joint connections. This is particularly important in industrial environments where the cable may be disconnected and reconnected many times. The only disadvantage of the 140-micron outside diameter is that it is nonstandard, so available connectors are fewer and more expensive than those for the 125-micron size.

6.3.4.3.4 Indoor and Outdoor Cables

Indoor and outdoor fiber-optic cables differ in the jacket surrounding the fiber and in the protective sheath that gives the cable sufficient tensile strength to be pulled through a conduit or overhead duct or strung on poles. Single indoor cables (Figure 6-22) consist of the clad fiber-optic cable surrounded by a Kevlar reinforcement sheath, wrapped in a polyurethane jacket for protection from abrasion and the environment. The outdoor cable has additional protective sheathing for further environmental protection.
Plenum fiber-optic cables are available for indoor applications that require specific smoke- and flame-retardant characteristics and do not require the use of a metal conduit. When higher tensile strength is needed, additional strands of Kevlar are added outside the polyethylene jacket and another polyethylene jacket is provided over these Kevlar reinforcement elements. Some indoor cables utilize a stranded-steel central strength member or nonmetallic Kevlar. Kevlar is preferred in installations located in explosive areas or areas of high electromagnetic interference, where nonconducting strength members are desirable.

[Figure 6-22 Indoor and outdoor fiber optic cable construction: (a) indoor single and duplex cables with tight-buffered optical fiber, Kevlar reinforcement, and a low-smoke, low-flame-spread fluoropolymer jacket; (b) outdoor cable with loose-tube buffered optical fiber, braided Kevlar strength member, and PVC outer jacket]

The mechanical properties of cables typically found on data sheets include crush resistance, impact resistance, bend radius, and strength. An outdoor cable, or one that will be subjected to heavy stress—as in long cable-run pulls in a conduit or an aerial application—uses dual Kevlar/polyethylene layers as just described. The polyethylene coating also retards the deleterious effects of sunlight and weather. When two fibers are required, two single-cable structures may be paired in siamese fashion (side by side) with a jacket surrounding them. If additional fiber-optic runs are required, multi-fiber cables (having four, six, eight, or ten fibers) with similar properties are used (Figure 6-23). The fibers are enclosed in a single or multiple buffer tube around a tensile-strength member composed of Kevlar and then surrounded with an outer jacket of Kevlar.
6.3.4.4 Connectors and Fiber Termination

This section describes fiber-optic connectors, techniques for finishing the fiber ends when terminated with connectors, and potential connector problems. For very long cable runs, joining and fusing the actual glass-fiber core and cladding is done by a technique called "splicing." Splicing joins two lengths of cable by fusing the two fibers (locally melting the glass) and physically joining them in a permanent connection (Section 6.3.4.4.4). Fiber-optic cables require connectors to couple the optical signal from the transmitter at the camera into the fiber-optic cable, and at the monitoring end to couple the light output from the fiber into the receiver. If the fiber-optic cable run is very long or must go through various barriers (e.g. walls), the total run is often fabricated from sections of fiber-optic cable with each end joined by connectors. This is equivalent to an in-line coaxial connector. A large variety of optical connectors is available for terminating fiber-optic cables. Most are based on butt coupling of cut and polished fibers to allow direct transmission of optical power from one fiber core to the other. Such a connection is made using two mating connectors, precisely centering the two fibers in the connector ferrules and fixing them in place with epoxy. The ferrule and fiber surfaces at the ends of both cables are ground and polished to produce a clean optical surface. The two most common types are the cylindrical and cone ferrule connectors.

6.3.4.4.1 Coupling Efficiency

The efficiency of light transfer from the end of one fiber-optic cable to the following cable or device is a function of six different parameters:

1. Fiber-core lateral or axial misalignment
2. Angular core misalignment
3. Fiber end separation
4. Fiber distortion
5. Fiber end finish
6. Fresnel reflections.
Of these loss mechanisms, distortion loss and the effects of fiber end finish can be minimized by using proper techniques when the fibers are prepared for termination.

[Figure 6-23: Multi-conductor fiber optic cable — (a) indoor multi-fiber, (b) outdoor aerial self-supporting, (c) indoor, (d) outdoor, and (e) armored constructions, with core tape wrap, tight buffer, Kevlar reinforcement, filling compound, epoxy-glass central strength member, and inner and outer polyethylene jackets]

A chipped or scratched fiber end will scatter much of the light signal power, but proper grinding and polishing minimize these effects in epoxy/polish-type connectors. Lateral misalignment of fiber cores causes the largest amount of light loss, as shown in Figure 6-24a. An evaluation of the overlap area of laterally misaligned step-index fibers indicates that a total misalignment of 10% of a core diameter yields a loss of greater than 0.5 dB. This means that a fiber core of 0.002 inches (50 microns) must be placed within 0.0001 inches of the center of its connector for a worst-case lateral misalignment loss of 0.5 dB. While this dimension is small, the connection is readily accomplished in the field. Present connector designs maintain angular alignment well below one degree (Figure 6-24b), which adds only another 0.1 dB (2.3%) of loss for most fibers. Fiber end-separation loss depends on the NA of the fiber. Since the optical light power emanating from a transmitting fiber is in the form of a cone, the amount of light coupled into the receiving fiber or device decreases as the fibers are moved apart (Figure 6-24c). A separation distance of 10% of the core diameter using a fiber with an NA of 0.2 can add another 0.1 dB of loss. Fresnel losses usually add another 0.3 to 0.4 dB when the connection does not use an index-matching fluid (Figure 6-24d).
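The decibel figures and their percentage equivalents quoted above are related by the standard power-ratio definition of the decibel; a quick sketch (no assumptions beyond that definition — note the book rounds some of its percentages):

```python
import math

def db_to_loss_percent(loss_db: float) -> float:
    """Percent of optical power lost for a given loss in dB: 100 * (1 - 10**(-dB/10))."""
    return 100.0 * (1.0 - 10.0 ** (-loss_db / 10.0))

def loss_percent_to_db(percent: float) -> float:
    """Inverse conversion, useful when a data sheet quotes percent loss instead of dB."""
    return -10.0 * math.log10(1.0 - percent / 100.0)

# 0.1 dB is about a 2.3% power loss, matching the angular-misalignment figure above.
print(round(db_to_loss_percent(0.1), 1))   # 2.3
# 0.5 dB, the worst-case lateral-misalignment budget, is roughly an 11% loss.
print(round(db_to_loss_percent(0.5), 1))
```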
The summation of all of these different losses often adds up to 0.5–1.0 dB for ST-type terminations and connections, and higher for the SMA 1906 (Table 6-4).

[Figure 6-24: Factors affecting fiber optic coupling efficiency — loss vs. (a) axial (lateral) displacement, (b) angular misalignment, (c) end separation, and (d) end finish, for NA values of 0.15–0.5; total connector loss = D + S + θ + E (dB), ranging from about 0.6 dB (good) to 1.4 dB (poor)]

6.3.4.4.2 Cylindrical and Cone Ferrule Connector

In the cylindrical ferrule design, the two connectors are joined and the two ferrules are brought into contact inside precisely guiding cylindrical sleeves. Figure 6-25 shows the geometry of this type of connection.

[Figure 6-25: Cylindrical and conical butt-coupled fiber optic ferrule connectors]

Lateral offset in cylindrical ferrule connectors is usually the largest loss contributor. In a 50-micron graded-index fiber, a 0.5 dB (12%) loss results from a 5-micron offset. A loss of 0.5 dB can also result from a 35-micron gap between the ends of the fibers, or from a 2.5° tilt of the fiber end surface. Commercial connectors of this type reach 0.5–1 dB (12–26%) optical loss for the ST type and higher for the SMA 1906. Optical index-matching fluids in the gap further reduce the loss. The cone ferrule termination technique centers the fiber in one connector and insures concentricity with the mating fiber in the other connector by using a cone-shaped plug instead of a cylindrical ferrule.
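Because the four mechanisms are expressed in decibels, a connector loss budget is a simple sum, as the Figure 6-24 annotation (total loss = D + S + θ + E) indicates. A minimal sketch — the "good" and "poor" totals follow that annotation, while the exact split across mechanisms is illustrative:

```python
def connector_loss_budget(lateral_db: float, separation_db: float,
                          angular_db: float, finish_db: float) -> float:
    """Total butt-coupled connector loss in dB: D + S + theta + E (dB losses add)."""
    return lateral_db + separation_db + angular_db + finish_db

good = connector_loss_budget(0.3, 0.2, 0.1, 0.0)   # well-polished, well-aligned joint
poor = connector_loss_budget(0.7, 0.5, 0.2, 0.0)   # loose alignment, wide gap
print(round(good, 1), round(poor, 1))   # 0.6 1.4
```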
The key to the cone connector design is the placement of the fiber-centering hole (in the cone) in relationship to the true center that exists when the connector is mated to its other half. A fiber (within acceptable tolerances) is inserted into the ferrule, adhesive is added, and the ferrule is compressed to fit the fiber size while the adhesive sets. The fiber face and ferrule are polished to an optical finish and the ferrule (with fiber) is placed into the final alignment housing. Most low-loss fiber-optic connections are made using the cone-shaped plug technique. The two most popular cone-shaped designs are the small-fiber resilient (SFR) bonded connector and the SMA, a redesigned coaxial connector style (Figure 6-26).

[Figure 6-26: SMA and SFR connectors — SMA: threaded, screw-on; ST/SFR (small fiber resilient): polarized, spring-loaded, quarter-turn bayonet lock]

Both use the cone-shaped ferrule, which provides a reliable, low-cost termination easily assembled in the field. Both connectors can terminate fibers with 125-micron cladding diameters. The technique eliminates almost all of the fiber and connector tolerance buildup that normally causes light losses. It makes use of a resilient material for the ferrule, metal for the construction of the retainer assembly, and a rugged metallic connector for termination. The fiber alignment is repeatable after many connects and disconnects because of the tight interference fit of the cone-shaped ferrule into its cone-shaped mating half. This cone design also forms a sealed interface for a fiber-to-fiber or fiber-to-active-device junction, such as fiber cable to transmitter or fiber cable to receiver. Tolerances in the fiber diameter are absorbed by the resiliency of the plastic ferrule. This connector offers a maximum signal loss of 1.0 dB (26%) and provides repeatable coupling and uncoupling with little increase in light loss.
The popular SMA-style connector is compatible with many other manufacturers' SMA-type connectors and terminates 125-micron fibers. Internal ferrules insure axial fiber alignment to within 0.1 mil. The SMA connector has a corrosion-resistant metal body and is available in an environmentally sealed version.

6.3.4.4.3 Fiber Termination Kits

An efficient fiber-optic-cable transmission system relies on a high-quality termination of the cable core and cladding. This step requires techniques that may be unfamiliar but are easy to learn, and with which the installer must become acquainted. Fiber-terminating kits are available from most fiber-cable, connector, and accessory manufacturers. Figure 6-27 shows a complete kit, including all grinding and polishing compounds, alignment jigs, tools, and instructions. Manufacturers can provide descriptions of the various techniques for terminating the ends of fiber-optic cables, including cable preparation, grinding, polishing, and testing.

6.3.4.4.4 Splicing Fibers

Splicing of multimode fibers is sometimes necessary in systems having long fiber-optic-cable runs (longer than 2 km). In these applications it is advantageous to splice cable sections together rather than connect them with connectors. A splice made between two fiber-optic cables can provide a connection with only one-tenth the optical loss obtained when a connector is inserted between fibers. Good fusion splices made with an electric arc produce losses as low as 0.05–0.1 dB (1.2–2.3% loss). Making a splice by the fusing technique is more difficult and requires more equipment and skill than terminating the end of a fiber with a connector. It is worth the effort if it eliminates the use of an in-line amplifier. A splice can also be used to repair a damaged cable, eliminating the need to add connector terminations that would decrease the light signal level.
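The trade-off between splices and connectors on a long run shows up in a simple loss budget. A sketch under illustrative assumptions (a hypothetical 4 km run of multimode fiber at 1300 nm at about 1 dB/km, 0.75 dB per mated connector pair, 0.075 dB per fusion splice — values chosen from the ranges given in this section, not a specific product):

```python
def link_loss_db(length_km: float, fiber_db_per_km: float,
                 n_joints: int, joint_loss_db: float,
                 end_connector_db: float = 0.75) -> float:
    """Total optical loss: fiber attenuation, intermediate joints, plus two end connectors."""
    return (length_km * fiber_db_per_km
            + n_joints * joint_loss_db
            + 2 * end_connector_db)

# Hypothetical 4 km run built from two 2 km reels (one intermediate joint):
print(round(link_loss_db(4, 1.0, 1, 0.75), 2))    # intermediate joint is a connector pair
print(round(link_loss_db(4, 1.0, 1, 0.075), 2))   # intermediate joint is a fusion splice
```

On a single joint the difference is small, but on a run with many joints the splice savings can be what keeps the link within the receiver's sensitivity without an in-line amplifier.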
6.3.4.5 Fiber-Optic Transmitter

The fiber-optic transmitter is the electro-optic transducer between the camera video electrical signal output and the light signal input to the fiber-optic cable (Figure 6-15). The function of the transmitter is to efficiently and accurately convert the electrical video signal into an optical signal and couple it into the fiber. The transmitter electronics convert the video signal, through an LED or ILD, into an AM or FM light signal that faithfully represents the video signal. The transmitter uses an LED for normal security applications or an ILD when long-range transmission is required; the former is used for most CCTV security applications. Figure 6-28 illustrates the block diagram for the transmitter unit.

[Figure 6-27: Fiber optic termination kit — cable-terminating kit with grinding and polishing supplies (600 grit, 3 micron, and 0.3 micron abrasives, used wet in a figure-8 pattern) and SMA polishing tool]

[Figure 6-28: Block diagram of LED fiber optic transmitter — impedance-matching input attenuator stage, linearizing modulator stage, high-linearity driver amplifier, and LED with optics coupled through a low-loss connector into the fiber]

6.3.4.5.1 Generic Types

The LED light source is a semiconductor device made of gallium arsenide (GaAs) or a related semiconductor compound that converts an electrical video signal to an optical signal. The LED is a diode junction that spontaneously emits nearly monochromatic (single wavelength or color) radiation in a narrow light beam when current is passed through it. While the ILD has a very narrow beam width and is more powerful, the LED is more reliable, less expensive, and easier to use. The ILD is used in very long distance, wide-bandwidth fiber-optic applications.
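A quantity that recurs in the coupling discussion that follows is the fiber's numerical aperture (NA), which sets the half-angle of the cone of light the fiber can accept from the source. A sketch using the standard relation θ = arcsin(NA), with NA values matching those in the Figure 6-24 loss curves:

```python
import math

def acceptance_half_angle_deg(numerical_aperture: float) -> float:
    """Half-angle of a fiber's acceptance cone: theta = arcsin(NA)."""
    return math.degrees(math.asin(numerical_aperture))

# Light arriving outside this cone is not launched down the fiber.
for na in (0.15, 0.2, 0.5):
    print(na, round(acceptance_half_angle_deg(na), 1))
# An NA of 0.2 accepts a cone of roughly +/- 11.5 degrees.
```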
The LED's main requirements as a light source are: (1) a fast operating speed to meet the bandwidth requirements of the video signal, (2) enough optical power to provide the receiver with a signal-to-noise (S/N) ratio suitable for a good television picture, and (3) a wavelength that takes advantage of the low-loss propagation characteristics of the fiber. The parameters that constitute a good light source for injecting light into a fiber-optic cable are those that produce as intense a light output into as small a cone diameter as possible. Another factor affecting the light-transmission efficiency is the cone angle of the LED output that can be accepted by and launched down the fiber-optic cable. Figure 6-29 illustrates the LED-coupling problem. The entire output beam from the LED (illustrated by the cone of light) is not intercepted or collected by the fiber-optic core. This unintercepted illumination loss can be a problem when the light-emitting surface is separated from the end of the fiber core. Most LEDs have a lens at the surface of the LED package to collect the light from the emitting source and concentrate it onto the core of the fiber.

[Figure 6-29: LED light beam output cone angle — the LED junction's output beam is collected by a lens and window and launched into the fiber core within the cladding]

6.3.4.5.2 Modulation Techniques

For video security applications, the electrical signal from the camera is amplitude- or frequency-modulated and converted to light-output variations in the LED or ILD. The optical output power varies directly with the electrical input signal for AM and is constant for FM. LEDs with an 850 nm IR wavelength emission are best suited since they can be amplitude modulated: the electrical video signal can be converted to a light output signal that is a near-linear function of the LED drive current.
This produces a very faithful transformation of the electrical video information into the light information that is transmitted along the fiber-optic cable.

6.3.4.5.3 Operational Wavelengths

An important characteristic of the transmitter output is the wavelength of the emitted light. This should be compatible with the fiber's minimum-attenuation wavelength, which is 850 nm (in the IR region) for most CCTV fiber-optic cable. The wavelength of light emitted by an LED depends on the semiconductor material composition. Pure GaAs diodes emit maximum power at a wavelength of 940 nm (near-IR), which is undesirable because most glass fibers have a high attenuation at that wavelength. Adding aluminum to GaAs to produce a GaAlAs diode yields a maximum power output at a wavelength between 800 and 900 nm, with the exact wavelength determined by the percentage of aluminum. In most transmitters today, the emitting wavelength is 850 nm, which matches the maximum transmission capability of the glass fiber. Alternative transmitting wavelengths are 1060, 1300, and 1550 nm, regions where glass fibers exhibit lower attenuation and dispersion than at 850 nm. These wavelengths are produced by combining the element indium with gallium arsenide (to get InGaAs) and are used in some long-distance transmission applications.

6.3.4.6 Fiber-Optic Receiver

The term receiver, at the output end of the fiber-optic cable, refers to a light-detecting transducer and its related electronics, which provide any signal conditioning necessary to restore the signal to its original shape plus additional signal amplification. The most common fiber-optic receiver uses a photodiode to convert the incident light from the fiber into electrical energy. To interface the receiver with the optical fiber, the proper match between light source, fiber-optic cable, and light detector is required.
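The photodiode's light-to-current conversion can be sketched numerically. The 0.5 A/W responsivity and 10 kΩ transimpedance below are assumed typical figures for a silicon PIN diode near 850 nm, not values from this text:

```python
def photocurrent_a(optical_power_w: float, responsivity_a_per_w: float = 0.5) -> float:
    """Photodiode output current, I = R * P; the diode itself has no electrical gain."""
    return optical_power_w * responsivity_a_per_w

def first_stage_output_v(current_a: float, transimpedance_ohms: float = 10_000.0) -> float:
    """The amplifier following the diode converts the tiny photocurrent to a voltage."""
    return current_a * transimpedance_ohms

# 10 microwatts reaching the detector gives 5 microamps, or 50 mV after the first stage.
v = first_stage_output_v(photocurrent_a(10e-6))
print(round(v * 1000, 1))   # millivolts
```

The small numbers explain why the photodiode must be followed by high-gain, low-noise amplification before the signal can drive a coaxial cable.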
In the AM transmission system, the optical power input to the fiber is modulated, so the photodetector, operating in the photocurrent mode, must provide good linearity, speed, and stability. The photodiode produces no electrical gain and is therefore followed by circuits that amplify the electrical voltage and power to drive the coaxial cable. Figure 6-30 illustrates the block diagram for the receiver unit.

[Figure 6-30: Fiber optic receiver block diagram — fiber-optic connector, receiver optics, optical detector (PIN diode), high-gain video amplifier, variable-gain post amplifier, and output driver feeding the video output signal to coaxial cable]

The light exiting from the receiver end of an optical fiber spreads out with a divergence approximately equal to the acceptance cone angle at the transmitter end of the fiber. Photodiodes are packaged with lenses on their housings so that the lens collects this output energy and focuses it down onto the photodiode-sensitive area. After the light energy is converted into an electrical signal by the photodiode, it is linearly amplified and conditioned to be suitable for transmission over standard coaxial cable or two-wire UTP to a monitor or recorder.

6.3.4.6.1 Demodulation Techniques

The receiver demodulates the video light signal to its original base-band video form, using either AM or FM demodulation. Since FM modulation-demodulation is less sensitive to external electrical influences, it is the technique of choice in most systems.

6.3.4.7 Multi-Signal, Single-Fiber Transmission

The primary attribute of fiber-optic transmission is the cable's wide signal bandwidth capability. Transmitting a single video signal on a single fiber easily fits within the bandwidth capability of all fiber-optic cables.
Modulators and demodulators in transmitters, receivers, and transceivers permit transmission of bidirectional video, audio, and control signals over a single optical-fiber cable. Using the full-duplex capabilities of the system, the transceiver at the camera transmits video and audio signals from the camera location to the monitor location while simultaneously receiving audio, control, or camera genlock signals from the transceiver at the monitor location. All transmissions occur over the same single optical-fiber cable. The transmitter and receiver contain all the circuitry for the bidirectional transmission of pan/tilt, zoom, focus, iris, and contact-closure signals, with video in the opposite direction.

When more than one video signal is to be transmitted by optical fiber between two points, either multiple fibers may be used or the signals may be combined using wavelength division multiplexing (WDM MUX), saving the cost of additional fibers and expensive multi-fiber connectors. Using this technique, the outputs from optical transmitters operating at different wavelengths (1060, 1300, and 1550 nm) are modulated by separate video signals and combined onto a single optical fiber by the WDM MUX. The video signals are then separated at the other end of the fiber by a WDM de-multiplexer (WDM DE-MUX) (Figure 6-31). Typically two or more wavelengths are provided by two LED transmitters operating at wavelengths between 850 and 1550 nm. The WDM MUX and DE-MUX may be fabricated as an optical coupler and splitter assembly using lens and grating components. The lens focuses each of the channels onto the grating, which separates the channels according to wavelength and the grating spacing. Data sheets for typical WDM MUX/DE-MUX devices include the following specifications:

1. Number of Channels: The number of video signals that can be multiplexed and de-multiplexed over the optical fiber to which the WDM MUX/DE-MUX are connected.
[Figure 6-31: Wavelength division multiplexing and de-multiplexing of video signals — electrical-to-light modulators drive lasers/LEDs at 1060, 1300, and 1550 nm into a light coupler and a single fiber (up to 32 channels of video, voice, and data); at the receiver, a grating disperses the multi-wavelength light into its constituent components (λ1, λ2, λ3, …) onto light detectors and demodulators]

2. Center Wavelengths: The center wavelength of each channel over which the video signals are multiplexed.

3. Channel Spacing: The minimum distance (in wavelength or frequency) between channels in a WDM MUX/DE-MUX system. In the illustration the channel spacing is approximately 0.8 nm, or approximately 100 GHz.

4. Bandwidth (also referred to as Passband Width): The line-width of a specific wavelength channel. A manufacturer generally specifies the line-width at 1 dB, 3 dB, and 20 dB insertion loss, as shown in Figure 6-32.

5. Maximum Insertion Loss: The loss sustained by the video signals when the WDM MUX/DE-MUX is applied in a system. Typical values range from 1.5 dB for a 4-channel device to 6 dB for a high-channel-count WDM MUX/DE-MUX.

6. Isolation: The loss or attenuation between video signal channels, usually more than 30 dB.

Figure 6-33 illustrates two typical channels, over which video signals at 1540.56 nm and 1541.35 nm are multiplexed.

[Figure 6-32: Line-width vs. insertion loss of a specific wavelength channel — insertion loss, passband, channel spacing, and the signal-overlap (crosstalk) region plotted over the 1542.0–1545.5 nm range]

6.3.4.8 Fiber Optic—Advantages/Disadvantages

Why go through all the complexity and extra expense of converting the electrical video signal to a light signal and then back again? Fiber optics offers several very important features that no electrical cabling system offers, including:

• ultra-wide bandwidth supporting multiple video, audio, control, and communications signals on one fiber
• complete electrical isolation
• complete noise immunity to RFI, EMI, and electromagnetic pulse (EMP)
• transmission security (fiber-optic cable is hard to tap)
• no spark or fire hazard or short-circuit possibility
• absence of crosstalk
• no RFI/EMI radiation.

Table 6-5 compares the features of coaxial and fiber-optic transmission.

6.3.4.8.1 Pro

Widest Bandwidth. In general the bandwidth capacity is directly proportional to the carrier frequency. Light (and near-IR) frequencies are approximately 10^14 Hz. Typical video microwave transmitters operate at about 10 GHz, or 10^10 Hz. Fiber optics therefore has a 10^4, or 10,000 times, higher bandwidth capability than microwave.

Electrical Isolation. The complete electrical isolation of the transmitting section (i.e. the camera, lens controller, pan/tilt, and related equipment) from the receiving section (i.e. the monitor, recorder, printer, switching network, and so on) is very important for inter-building and intra-building locations when a different electrical power source is used at each location. Using fiber-optic transmission prevents all possibility of ground loops and ground-voltage differences that could require the redesign of a coaxial cable-based system.

RFI, EMI, and EMP Immunity. When a transmission path runs through a building or outdoors past other electrical equipment, the site survey usually cannot uncover all possible sources of existing RFI/EMI noise. This is also true of EMP and lightning strikes. Therefore, using fiber optics in the initial design prevents any problems caused by such noise sources.
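The WDM channel-spacing figure quoted earlier (about 0.8 nm, or about 100 GHz) can be checked with the standard wavelength-to-frequency conversion Δf = c·Δλ/λ²:

```python
C_M_PER_S = 299_792_458.0   # speed of light in vacuum

def spacing_ghz(delta_lambda_nm: float, center_lambda_nm: float) -> float:
    """Frequency spacing corresponding to a wavelength spacing at a given center wavelength."""
    return (C_M_PER_S * delta_lambda_nm * 1e-9) / (center_lambda_nm * 1e-9) ** 2 / 1e9

# 0.8 nm near the 1545 nm region of Figure 6-32 is very nearly 100 GHz.
print(round(spacing_ghz(0.8, 1545.0)))   # 100
```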
Transmission Security. Since the fiber-optic cable leaks no electrical signal and emits no visible light, it exhibits excellent inherent transmission security and is hard to intercept. Methods for compromising a fiber-optic cable are difficult, and the intrusion is usually detected. To tap a fiber-optic cable, the bare fiber in the cable must be isolated from its sheath without breaking it; this alone will probably end the tapping attempt. If the bare fiber is successfully isolated, an optical tap must be made, the simplest of which is achieved by bending the fiber into a small radius and extracting some of the light. If a measurable amount of power is tapped (which is necessary for a useful tap), the tap can be detected by monitoring the power at the system receiver. In contrast, tapping a coaxial cable is easy to do and hard to detect.

No Fire Hazards. Since no electricity is involved in any part of the fiber-optic cable, there is no chance of sparks or electrical short circuits, and hence no fire hazard. Short circuits and other hazards encountered in electrical wiring systems can start fires or cause explosions. When a light-carrying fiber is severed, there is no spark, and a fiber cannot short-circuit in the electrical meaning of the term.

Absence of Crosstalk. Because the transmission medium is light, there is no crosstalk between any of the fiber-optic cables. Therefore there is no degradation due to the close proximity of cables in the same bundle, as there can be when multiple channels are encased in the same electrical cable.
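The receiver-side power monitoring mentioned under Transmission Security can be sketched as a simple threshold check; the -15 dBm level, 0.5 dB threshold, and sample values are illustrative, not from this text:

```python
def tap_alarm(readings_dbm, baseline_dbm, threshold_db=0.5):
    """Flag the indices of received-power samples that drop more than
    threshold_db below the commissioned baseline.

    A bend-type tap must divert measurable optical power to be useful,
    so it appears as a step drop in the power at the receiver.
    """
    return [i for i, p in enumerate(readings_dbm) if baseline_dbm - p > threshold_db]

# Steady link near -15 dBm; the last two samples drop ~0.9 dB after a bend tap.
print(tap_alarm([-15.0, -15.1, -14.9, -15.9, -15.8], baseline_dbm=-15.0))  # [3, 4]
```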
[Figure 6-33: Video signals and controls multiplexed over four channels — bidirectional WDM: three camera video transmitters and a pan/tilt/zoom control receiver share a 4-channel MUX/DE-MUX over a single fiber in the 1550 nm band, with matching receivers, display, switcher, and pan/tilt and lens controllers at the monitor end; many WDM MUX/DE-MUX units are bidirectional]

Table 6-5  Comparison of Fiber Optic and Coaxial Cable Transmission

                                        ATTENUATION @ 5-10 MHz            OUTSIDE DIA.  WEIGHT (lb)
  CABLE TYPE                        dB/100 ft  dB/1000 ft  dB/km          (inches)      PER 100 ft
  RG59 coaxial                      1.0        10.0        32.8           .242          3.5-4.0
  RG59 mini coaxial                 1.3        13.0        42.6           .135          1.5
  RG6 coaxial                       .8         8.0         26.2           .272          7.9
  RG11 coaxial                      .51        5.1         16.7           .405          9-11
  2422/UL1384 mini-coax             3.96       39.6        129.9          .079          0.9
  2546 mini-coax (Mogami)           1.82       18.2        59.7           .13           1.4
  2895 mini-coax (Mogami)           2.1        21.0        69.0           .118          1.0
  RG179B/U mini-coax                2.0        20.0        65.6           .089          1.0
  10/125 fiber optic (850/1300 nm)* .01-.02    .1-.2       .4-.8          .036          —
  50/125 fiber optic (850 nm)       .12-.21    1.2-2.1     4-7.0          .12           .50-1.0
  50/125 fiber optic (1300 nm)      .09-.18    .9-1.8      3-6.0          .12           .50-1.0
  140/200 fiber optic (850 nm)      .08-.18    .8-1.8      2.5-6.0        .244          —
  140/200 fiber optic (1300 nm)     .02-.14    .2-1.4      .8-4.5         .244          —
  * Single-mode; used only in very long distance, wide-bandwidth applications.
  Fiber wavelengths are transmission wavelengths in nanometers (nm).
  1 km = 3280 ft; 1 mi = 1.609 km; 1 lb = .454 kg.

No RFI or EMI Radiation. Fibers do not radiate energy, so they generate no interference to other systems. The fiber-optic cable will not emit any measurable EMI/RFI radiation, and other cabling in the vicinity will suffer no noise degradation. There are no FCC requirements for fiber-optic transmission.

6.3.4.8.2 Con

Higher Cost. Coaxial cable and connectors are inexpensive, and no transmitter/receiver pairs are required.
For short distances, if new cable must be run, coaxial is the most cost-effective.

Connector Termination More Difficult. Fiber-optic cable and connectors cost more than coaxial. Terminating fiber cables takes longer than terminating coaxial cables and requires more technical skill.

6.3.4.9 Fiber-Optic Transmission: Checklist

The following are some questions that should be asked when considering the use or design of a new fiber-optic transmission system.

1. What are the lengths of the cable runs? If over 500 feet, fiber optic should be considered. In screen rooms or TEMPEST areas, runs as short as 10 feet sometimes require fiber-optic cable.
2. What size core/cladding-diameter fiber should be used? The most common diameter is 50/125 microns.
3. What wavelength should be used—850, 1060, 1300, or 1550 nm? The most common wavelengths are 850 and 1300 nm.
4. How many fibers are necessary for transmitting video, audio, and controls? Should single- or multi-fiber cable be used?
5. Are the cable runs going to be indoors or outdoors? Separate the indoor and outdoor requirements and determine whether outdoor fiber cables are required. What will the outdoor environment be (lightning, etc.)?
6. If outdoors, will the fiber be strung on poles, surface-mounted on the ground, direct-buried, or run through a conduit? Choose cable according to manufacturers' recommendations.
7. If indoors, will it be in a conduit, cable tray or trough, plenum, or ceiling? Choose cable according to manufacturers' recommendations.
8. What temperature range will the fiber-optic cable experience? Most cable types are suitable for most indoor environments. For outdoor use, the cable chosen must operate over the full range of hot and cold temperatures expected and must withstand ice and wind loading if mounted above ground level.
9. Are there any special considerations such as water, ice, or chemicals? See manufacturers' specifications for extreme environmental hazards.
10. Are there special safety codes?
Fiber-optic cable is available with plenum-grade or special abrasion-resistant construction.

11. Should spare cables be included? Each design is different, but it is prudent to include one or more spare fiber-optic cables to account for cable failure or future system growth. The number of spares also depends on how easy or difficult it is to replace a cable or add to the existing cables.

6.4 WIRED CONTROL SIGNAL TRANSMISSION

Fixed cameras do not require any control functions. Moving cameras require pan, tilt, focus, and sometimes iris control signals for proper operation.

6.4.1 Camera/Lens Functions

Lenses can require zoom, focus, and iris-control signals.

6.4.2 Pan/Tilt Functions

Moving cameras require pan/tilt functions to scan the camera horizontally, tilt it vertically, and set preset pointing directions to specific locations in the scene.

6.4.3 Control Protocols

The simplest controls take the form of on/off or proportional-voltage control wiring for each function. These direct controls require the largest number of wires. The most common two- and four-wire control protocols for cameras, lenses, and pan/tilt platforms are RS-422 and RS-485.

6.5 WIRELESS VIDEO TRANSMISSION

Most video security systems transmit video, audio, and control signals via coaxial cable, two-wire, or fiber-optic transmission means. These techniques are cost-effective and reliable, and provide an excellent transmission solution. However, there are applications and circumstances that require wireless transmission of video and other signals: the video signal is transmitted from the camera to the monitor through the atmosphere, without connecting the two with hard wire or fiber. The most familiar example is the transmission of commercial television signals from a distant transmitter tower to consumer television sets, broadcast through the atmosphere on VHF and UHF radio-frequency channels. Commercial broadcasting is of course rigidly controlled by the FCC, whose regulations dictate its precise usage. Microwave transmission is also controlled by the FCC: rules are set forth and licenses are required for certain frequencies and applications, which limits usage to specific purposes. The US government currently exercises strict control over the transmission of wireless video via RF and microwave. Until recently, RF and microwave atmospheric video transmission links were limited to government agencies (federal, state, and local) that could obtain the necessary licenses. Now some low-power RF and microwave transmitters and receivers suitable for short links (less than a mile) are available for use without an FCC license. High-power RF and microwave links are licensable by private users after a frequency check is made with the FCC.

6.5.1 Transmission Types

Some examples of wireless TV transmission described in the following sections include microwave (ground-to-ground station, satellite), RF over VHF or UHF, and light-wave transmission using IR beams. The hardware cost of RF, microwave, and light-wave systems is considerably higher than that of copper-wire or fiber-optic systems, and such systems should be used only when necessary: when their use avoids expensive cable installations (such as across roadways), or in temporary or covert applications, wireless transmission becomes cost-effective. The results obtainable with hard-wired copper or fiber-optic video transmission are usually predictable, with the exception of interference that can occur when copper cables run near electromagnetically radiating equipment or during electrical storms.
The results obtained with wireless transmission are generally not as predictable, because of the variable nature of the atmospheric path and the materials through which the RF, microwave, or light signals must travel, as well as the specific transmitting and propagating characteristics of the particular wavelength or frequency of transmission. Each of the three wireless transmitting regimes acts differently because of the wide diversity in the frequencies at which they transmit.

6.5.2 Frequency and Transmission Path Considerations

The RF link uses the lowest carrier frequency (Figure 6-34). It penetrates many visually opaque materials, goes around corners, and does not require a line-of-sight path (i.e. a receiver in sight of the transmitter) when transmitting from one location to another. The radio frequencies are, however, susceptible to attenuation and reflection by metallic objects, ground terrain, or large buildings and structures, and therefore they sometimes produce unpredictable results.

[Figure 6-34: Wireless video transmission frequencies — power output vs. frequency across the RF spectrum (VHF, UHF) and microwave spectrum from 100 MHz to beyond 20 GHz, showing bands at 920–930 MHz, 2.4–2.5 GHz, 5–5.8 GHz, 10.525 GHz, and 21–24 GHz]

The microwave link requires an unobstructed line of sight; any metallic or wet objects in the transmission path cause severe attenuation and reflection, often rendering a system useless. However, metallic poles or flat surfaces can sometimes be used to reflect the microwave energy, allowing the beam to turn a corner. Reflection of this type does reduce the energy reaching the receiver and the effective range of the system. Some microwave frequencies penetrate dry nonmetallic structures such as wood or drywall walls and floors, so that non-line-of-sight transmission is possible.
The frequency range most severely attenuated by the atmosphere and blocked completely by any opaque object is a light-wave signal in the near-IR region. The IR beam can be strongly attenuated by heavy fog or precipitation, severely reducing its effective range as compared with clear-line-of-sight, clear-weather conditions. As would be expected, the IR-wavelength system requires a clear line of sight with no opaque obstructions whatsoever between the transmitter and the receiver. The IR beam can be reflected off one or more mirrors to go around corners. The advantages of the IR system over RF and microwave links are: (1) security (since it is hard to tap a narrow light beam), (2) high bandwidth (able to carry multiple channels of information), and (3) bidirectional operation. 6.5.3 Microwave Transmission Microwave systems applicable in television transmission have been allocated frequencies in bands from 1 to 75 GHz (see Table 6-6). Microwave frequencies, which approach light-wave frequencies, are usually transmitted and received by parabolically shaped reflector antennas or metallic horns. Even when a line of sight exists, there can be signal fading, caused primarily by changes in atmospheric conditions between the transmitter and the receiver, a problem that must be taken into account in the design. This fading can result at any frequency, but in general is more severe at the higher microwave frequencies. 6.5.3.1 Terrestrial Equipment For terrestrial use, several manufacturers provide reliable microwave transmission equipment suitable for transmitting video, audio, and control signals over distances of from several hundred feet to 10–20 miles in line-of-sight conditions. One system transmits a single NTSC video channel and two 20 kHz audio channels over a distance of 1 mile. A high-gain directional antenna is available to extend the system operating range to several miles. Figure 6-35a shows the transmitter and receiver units. 
This system operates at a carrier frequency of 2450–2483.5 MHz with a power output of 1 watt. The transmitter and receiver operate from 11 to 16 volts DC derived from batteries, an AC-to-DC power supply, or 12 volts DC vehicle power. The microwave transmitter utilizes an omnidirectional antenna. A high-gain, low-noise receiver collects the microwave transmitter signal with an omnidirectional or directional antenna. The system has a selectable video bandwidth of 4.2 MHz for enhanced sensitivity or 8 MHz for high resolution and has a single or dual audio sub-carrier channel for audio communications between the two sites. It transmits monochrome or color video with excellent quality.

Table 6-6 Microwave Video Transmission Frequencies (* also see Table 6-7; 1 gigahertz (GHz) = 1000 MHz)

BAND/USE* | FREQUENCY BAND (GHz) | SECURITY FREQUENCY RANGE (GHz) | USAGE / RESTRICTIONS
L | 1.2–1.7 | 1.2–1.7 | VIDEO TRANSMITTER; GOVERNMENT SECURITY, LAW ENFORCEMENT ONLY
S | 2.4–2.5 | 2.4–2.5 | VIDEO TRANSMITTER, LOW POWER/HIGH POWER, FCC PART 15 (LOW POWER: NO RESTRICTIONS, NO FCC LICENSE REQUIRED; HIGH POWER: LICENSE REQUIRED); 2.45 GHz IS ALSO THE CONSUMER MICROWAVE OVEN FREQUENCY
S | 2.6–3.95 | — | —
G | 3.95–5.85 | — | —
C | 4.9–7.05 | — | —
C | 5–5.8 | 5–5.8 | VIDEO TRANSMITTER, LOW POWER/HIGH POWER, FCC PART 15 (LOW POWER: NO RESTRICTIONS, NO FCC LICENSE REQUIRED; HIGH POWER: LICENSE REQUIRED)
J | 5.85–8.2 | 8.4–8.6 | VIDEO TRANSMITTER
H | 7.05–10.0 | — | —
X | 8.2–12.4 | 10.35–10.8 | VIDEO TRANSMITTER
M | 10.0–15.0 | 10.4–10.6, 10.525 | VIDEO TRANSMITTER
P | 12.4–18.0 | — | —
N | 15.0–22.0 | 21.2–23.2 | VIDEO, UP TO 3 Mi RANGE
K | 18.0–26.5 | 24.125 | VIDEO, AUDIO, INTRUSION
R | 26.5–40.0 | — | —
V | 40–75 | — | VIDEO, VOICE, INTRUSION

FCC PART 15.249 TRANSMITTER, ANY TYPE OF MODULATION: 902–928 MHz, 50 mV/m maximum at 3 meters; 2.4–2.4835 GHz, 50 mV/m maximum at 3 meters; 5.735–5.875 GHz, 250 mV/m maximum at 3 meters.
FCC PART 15.247 TRANSMITTER USING SPREAD SPECTRUM: 902–928 MHz, 1 watt maximum; 2.4000–2.4835 GHz, 1 watt maximum; 5.725–5.875 GHz, 1 watt maximum.
The 2450–2483.5 MHz band is available for a variety of industrial applications and requires an FCC license for operation. The system operates indoors or outdoors, uses FM, and provides immunity from vehicle-ignition, power-line, and other AM-type noise sources. The microwave frequency utilized has the ability to penetrate dry walls and ceilings and reflect off metal surfaces. Figures 6-35b, c show examples of small short-range microwave transmitters operating at 2.4 GHz and 5.8 GHz designed for outdoor use. These systems use directional patch antennas pointed toward each other to provide the necessary signal at the receiver from the transmitter. The systems are weatherproof, pedestal mounted, and designed for permanent installation. They transmit excellent full-color or monochrome pictures over an FM carrier at 2.4 GHz or 5.8 GHz with a video bandwidth of 10 MHz. In addition to the video channel, the system is capable of providing up to three voice or data (control) channels. The data channels may be used to control pan/tilt, zoom, focus, and iris at the camera location. Low-power systems do not require FCC licensing. FCC licensing is required for high-power systems and can be obtained by government and industrial users, provided an authorized interference survey is made to verify that no interference to other equipment will result.

FIGURE 6-35 Monochrome/color microwave video transmission systems: (a) 2.4 GHz transmitter/receiver; (b) 2.4 GHz outdoor; (c) long-range 5.8 GHz

Other variations and functions the microwave transmitter/receiver systems can perform include:

1. Operation in any frequency band from 8.5 to 12.4 GHz with output powers up to 100 milliwatts.
2. Operation as a command-and-control unit providing a multi-channel system for transmitting control signal information.
The commands are encoded at the transmitter and decoded at the receiver to control power on/off, lens focus, zoom and iris, and camera motion (pan/tilt).

3. An audio channel to provide simplex (one-way) or duplex (two-way) communications (IR system).
4. The ability to sequence through and transmit the video outputs from multiple surveillance cameras.

The receiver and control units are located at the monitor site and the transmitter and sequencer units are located with the CCTV cameras. The camera outputs are fed to the sequencer unit. The operator at the receiver end controls the sequencing of the eight cameras and has the option to: (1) manually advance through the cameras, (2) have the cameras sequence automatically, or (3) change the camera dwell time.

6.5.3.2 Satellite Equipment

Microwave transmission of video signals can be accomplished via satellite. Such systems are in extensive use for earth-to-satellite-to-earth communications, in which one ground-based antenna transmits to an orbiting synchronous satellite repeater, which relays the microwave signal at a shifted frequency to one or more receivers on earth (Figure 6-36). While this type of communication and transmission was not put into widespread use for analog video security systems, it now enjoys wide use for digital video Internet (WWW) systems. The satellites used for transmission are in a synchronous orbit at an altitude of 22,300 miles and appear stationary with respect to the earth. Satellites are placed in a synchronous or stationary orbit to permit communications between any two points in the continental USA by a single “up” and a single “down” transmission link. Consequently, a characteristic of domestic satellite video communications is that the transmission cost is independent of terrestrial distance. It takes 0.15 seconds for a microwave signal traveling at the speed of light to make a one-way journey to or from the satellite.
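The transit-time figure follows directly from the orbit geometry. A quick sketch, using only the speed of light and the 22,300-mile altitude (the slant path to a satellite low on the horizon is longer than the straight-up path computed here, which is why the rounded 0.15-second figure is commonly quoted):

```python
C = 299_792_458.0            # speed of light (m/s)
MILE = 1609.344              # metres per statute mile

altitude_m = 22_300 * MILE   # geosynchronous altitude above the equator

one_way_s = altitude_m / C   # satellite directly overhead (shortest path)
print(f"one-way, zenith: {one_way_s:.3f} s")
print(f"round trip (up + down): {2 * one_way_s:.3f} s")
```

The zenith case works out to about 0.12 s one way, approaching 0.14 s for low-elevation slant paths, consistent with the 0.15 s / 0.3 s round-trip figures in the text.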
Therefore, there is a 0.3-second delay between transmission and reception of the video carrier, independent of ground distance. This delay is not usually a problem for transmission of video security signals; however, it must be kept in mind when synchronization of different incoming video signals is required.

FIGURE 6-36 Satellite video transmission systems (microwave transmitter, orbiting satellite, microwave receiver; 0.15-second transit time on each of the up and down links)

The signal level reaching the feed horn depends on the size and shape of the antenna (Figure 6-37). The quality of an antenna is determined by how well it concentrates the radiation intercepted from a target satellite to a single point and by how well it ignores noise and unwanted signals coming from sources other than the target satellite. Three interrelated concepts—gain, beam width, and noise temperature—describe how well an antenna performs.

FIGURE 6-37 Satellite video receiver system (earth-orbiting satellite; receiving dish with fine-pointing mechanism; feed horn; low-noise amplifier (LNA); low-loss coax cable; down converter; VHF/UHF tuner; video monitor display in the monitoring room)

Antenna gain is a measure of how many thousands of times a satellite signal is concentrated by the time it reaches the focus of the antenna. For example, a typical well-built 10-foot-diameter prime-focus antenna dish can have a gain of 40 dB (a power gain factor of 10,000), which means that the signal is concentrated 10,000 times higher at the focal point than anywhere on the antenna. This gain is primarily dependent on the following three factors.

Dish Size. As the size of a dish increases, more radiation from space is intercepted. Thus if the diameter of an antenna is doubled, the gain is increased fourfold (four times the area).

Frequency.
Gain increases with increasing frequency because higher-frequency microwaves, being closer to the frequency of light, behave a little more like light. Thus they do not spread out like waves in water but can be focused more easily into straight lines like beams of light. Since the gain of a microwave antenna is proportional to the square of the frequency, a signal with twice the frequency is concentrated by an antenna with four times the gain. As an example, if the gain is 10,000 when a signal of 5 GHz is received, then it will have a gain of 40,000 at 10 GHz. Surface Accuracy. Gain is further determined by how accurately the surface of an antenna is machined or formed to exactly a parabolic or other selected shape, and how well the shape is maintained under wind loading, temperature changes, or other environmental conditions. A good antenna will see only a narrow beam width and will be able to pick out a satellite. A poor-quality dish will see too much extraneous noise and will receive less signal energy from the satellite of interest and pick up unwanted energy. Dish antennas focus on one earth-orbiting satellite at a time and concentrate the faint signals into a feed horn (waveguide) that directs the microwave signal into a lownoise amplifier (LNA). The LNA amplifies the weak signal by 10,000 times and eventually transmits it by cable to the monitoring location. Figure 6-37 shows a block diagram of a satellite receiver system. The LNA is the first active electronic component in the receiving system that acts on the video signal. The LNA is analogous to the audio preamplifier in that it provides the first critical preamplification. Its noise characteristics generally determine the quality of the final video image seen on the monitor. The microwave signal from the LNA is fed via coaxial cable to a down converter which converts the satellite microwave signal to a lower frequency. 
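The gain relationships described above (fourfold gain per doubling of diameter or of frequency, and roughly 40 dB for a 10-foot dish) follow from the standard parabolic-antenna gain formula G = η(πD/λ)². The 60% efficiency used below is a typical assumed value for a well-built dish, not a figure from this book:

```python
import math

def dish_gain_db(diameter_m: float, freq_ghz: float, efficiency: float = 0.6) -> float:
    """Parabolic dish gain G = efficiency * (pi * D / wavelength)^2, in dB."""
    wavelength_m = 0.299792458 / freq_ghz
    gain = efficiency * (math.pi * diameter_m / wavelength_m) ** 2
    return 10 * math.log10(gain)

ten_foot_m = 10 * 0.3048                       # 10-foot dish in metres
print(f"{dish_gain_db(ten_foot_m, 4.0):.1f} dB")   # roughly 40 dB at a 4 GHz downlink
```

Doubling either the diameter or the frequency adds about 6 dB (a factor of four in power gain), matching the dish-size and frequency rules stated in the text.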
Since the signal level is still very low, a special low-loss coaxial cable must be used and the signal run must be as short as possible. Increasing the cable run decreases the signal level, thereby decreasing the final S/N. The down-converted microwave signal is eventually converted to VHF or UHF and displayed on a television receiver, or converted to baseband and displayed on a monitor. Today most satellite receivers generate a base-band signal containing the base-band video, audio, and synchronizing information that can be fed directly into a video monitor or recorder. The receiver also outputs a channel 3 or 4 modulated signal for input to the tuner of a standard television set.

6.5.3.3 Interference Sources

Transmission interference occurs when unwanted signals are received along with the desired satellite signal. Of the several types of interference, perhaps the most common and irritating is caused by the reception of nearby microwave signals using the same or an adjacent frequency band. Microwaves reflecting off buildings or even passing cars are responsible for the interference. Very often, moving the microwave antenna several feet can significantly reduce the interfering signal levels. Other interference includes stray signals from adjacent satellites, or uplink or downlink interference. Finally, a predictable form of interference is caused by the sun. Twice a year the sun lines up directly behind each satellite for periods of approximately ten minutes per day for two or three days. Since the sun is a source of massive amounts of radio noise, no transmissions can be received from satellites during these sun outage times. This unavoidable type of interference can be expected during the normal course of operation of an earth satellite station.

6.5.4 Radio Frequency Transmission

Radio frequency (RF) is a wireless video transmission means originally used primarily by government agencies and amateur radio operators.
Government frequencies include the 1200 MHz (1.2 GHz) and 1700 MHz (1.7 GHz) bands. Radio frequency wireless has now found widespread use in commercial security, in temporary, covert, and permanent surveillance applications. Video transmitters and receivers transmit monochrome or color video signals over distances of several hundred feet to several miles using small, portable, battery-operated equipment. Operating frequencies cover the 150 to 980 MHz, 2.4 GHz, and 5.8 GHz bands. While RF transmission provides significant advantages when a wired system is not possible, there are FCC restrictions limiting the use of many such transmitters to government applications. Only low-power transmitters are available for commercial applications. Any RF system used outside the United States requires the approval of the foreign government. Tables 6-7 and 7-5 summarize the channel frequencies available.

6.5.4.1 Transmission Path Considerations

An RF video signal transmission means can follow either commercial broadcasting standards, in which the visual signal is amplitude modulated (AM), or noncommercial standards that use an FM signal. In the commercial standard the audio signal is frequency modulated on the carrier. In both systems the video input signal ranges from a few hertz to 4.5 MHz. For the low-powered transmitter/receiver systems used in security applications, FM has provided far superior performance (increased range and lack of interference) and is the preferred method. The range obtained with an FM RF transmitter is three to four times that of the AM type. Transmitting at standard commercial broadcast video standards using AM signals and operating on one of the designated VHF or UHF channels is prohibited by the FCC, since any consumer-type receiver could receive and display the video picture. This potential is obviously a disadvantage for covert security surveillance.
In the case of FM video transmission, many consumer receivers, though not designed to receive such signals, do display FM signals with some degree of picture quality because of nonlinear and sporadic operation of various receiver circuits. Likewise, the FCC does not permit the commercial use of FM or other modulation techniques in the commercial VHF and UHF channels. Low-power RF transmission in the 902–928 MHz, 2.4 GHz, and 5.8 GHz ranges has been approved for general security applications without an FCC license. The 1.2 and 1.7 GHz bands have not been approved for commercial use.

6.5.4.2 Radio Frequency Equipment

Many manufacturers produce wireless video RF and microwave links operating in the 900 MHz, 2.4 GHz, and 5.8 GHz frequency bands. This equipment operates on FCC-assigned frequencies with specific maximum transmitter output power levels (a few hundred milliwatts). These general-purpose RF links operate at output field strengths of 50–250 millivolts per meter (mV/m) measured at 3 meters (Part 15 of the FCC specification). Figure 6-38 illustrates typical RF and microwave video transmission equipment. Figure 6-38a shows two very small, four-channel, low-power 1.2 GHz (government use only) and 2.4 GHz transmitters. Any one of four channels can be selected at a time. Figure 6-38b is a small four-channel 2.4 GHz transmitter and receiver pair using high-gain Yagi antennas for increased range and directionality. Figure 6-38c shows a long-range 2.4 GHz receiver.
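A Part 15 field-strength limit can be translated into an equivalent radiated power using the standard free-space far-field relation E = sqrt(30·P)/d. This conversion is general RF practice rather than a formula from this book, and it shows how little power such unlicensed links actually radiate:

```python
def eirp_w(field_v_per_m: float, distance_m: float = 3.0) -> float:
    """Equivalent isotropically radiated power implied by a far-field
    strength E measured at distance d:  E = sqrt(30 * P) / d  =>  P = E^2 * d^2 / 30."""
    return field_v_per_m ** 2 * distance_m ** 2 / 30.0

print(f"{eirp_w(0.050) * 1e3:.2f} mW")   # 50 mV/m at 3 m  -> 0.75 mW EIRP
print(f"{eirp_w(0.250) * 1e3:.2f} mW")   # 250 mV/m at 3 m -> 18.75 mW EIRP
```

A 50 mV/m limit at 3 meters thus corresponds to under a milliwatt of effective radiated power, which is why such links are confined to short ranges.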
Table 6-7 RF and Microwave Video Transmission Frequencies

BAND | COMMERCIAL TELEVISION CHANNELS | FREQUENCY RANGE (MHz) | USAGE | RESTRICTIONS
VHF, LOW BAND | 2–6 | 54–88 | LOW-MEDIUM POWER, SEVERAL MILES RANGE | GOVERNMENT, LAW ENFORCEMENT ONLY
FM RADIO | — | 88–108 | COMMERCIAL RADIO | FCC REGULATED
VHF, HIGH BAND | 7–13 | 174–216 | LOW-MEDIUM POWER, RANGE UP TO SEVERAL MILES | GOVERNMENT, LAW ENFORCEMENT ONLY
SECURITY | — | 350–950 | SINGLE-CHANNEL TRANSMITTER/RECEIVER | GOVERNMENT, LAW ENFORCEMENT ONLY
UHF | 14–83 | 470–890 | LOW-MEDIUM POWER, RANGE UP TO SEVERAL MILES | GOVERNMENT, LAW ENFORCEMENT ONLY
SECURITY | — | 902–928 | LOW POWER, FCC PART 15 | NO RESTRICTIONS, NO FCC LICENSE REQUIRED
SECURITY | — | 1.2–1.7 GHz | LOW-MEDIUM POWER | GOVERNMENT, LAW ENFORCEMENT ONLY
SECURITY | — | 2.4 GHz | LOW POWER, FCC PART 15 * | NO RESTRICTIONS, NO FCC LICENSE REQUIRED
SECURITY | — | 2.4 GHz | HIGH POWER, FCC PART 90 ** | GOVERNMENT, LAW ENFORCEMENT ONLY
SECURITY | — | 5.8 GHz | LOW POWER, SEVERAL MILES RANGE | NO RESTRICTIONS, NO FCC LICENSE REQUIRED

ALL SECURITY FREQUENCY BANDS ARE OUTSIDE THE COMMERCIAL TELEVISION BANDS. * INDUSTRIAL, SECURITY, MEDICAL (ISM). ** FCC PART 90, 5 WATT MAXIMUM.

For indoor applications, most RF and microwave transmitter/receiver systems use omnidirectional dipole antennas for ease of operation. For outdoor operation, dipoles or whip antennas are used. High-gain Yagi antennas are used to increase range and minimize interference from other radiation sources. The RF and microwave transmitters and receivers have a standard 75-ohm video input impedance; however, they require 50-ohm coaxial cable at the transmitter RF output and the receiver RF input. Using 75-ohm coaxial cable between the antenna and the transmitter output or the receiver input will seriously degrade the performance of the system even if it is short (1–2 ft). Miniature 50-ohm RG58/U or RG8 coaxial cables terminated in a small SMA or BNC connector are used.
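The penalty for putting 75-ohm cable on a 50-ohm antenna port can be estimated with standard transmission-line relations (general RF theory, not taken from this book). The reflected energy sets up standing waves and detunes the port, which is why even a short mismatched jumper degrades the picture:

```python
import math

def mismatch(z_load: float, z_line: float):
    """Reflection coefficient, VSWR, and return loss for a resistive mismatch."""
    gamma = abs(z_load - z_line) / (z_load + z_line)
    vswr = (1 + gamma) / (1 - gamma)
    return_loss_db = -20 * math.log10(gamma)
    return gamma, vswr, return_loss_db

gamma, vswr, rl = mismatch(75.0, 50.0)
print(f"reflection coefficient {gamma:.2f}, VSWR {vswr:.2f}, return loss {rl:.1f} dB")
```

A 75-ohm cable on a 50-ohm port reflects 20% of the voltage (VSWR 1.5); for a transmitter output stage tuned for a 50-ohm load, that reflected power and the resulting detuning are what degrade performance, not simple cable loss.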
Figure 6-39 shows the approximate distance between transmitter and receiver antennas (range) versus transmitted power, for video transmission. The range values are for smooth and obstacle-free terrain using a dipole antenna at the transmitter and receiver. The antennas should be located as high above the ground as possible. The numbers obtained should be used as a guide only. Actual installation and experience with specific equipment on-site will determine the actual quality of the video image received.

6.5.5 Infrared Atmospheric Transmission

A technique for transmitting a video signal by wireless means uses propagation of an IR beam of light through the atmosphere (Figure 6-40). The light beam is generated by either an LED or an ILD in the transmitter. The receiver in the optical communication link uses a silicon-diode IR detector, amplifier, and output circuitry to drive the 75-ohm coaxial cable and monitor. The transmitter-to-receiver distance and security requirements of the link determine the type of diode used. Short-range transmissions of up to several hundred feet are accomplished using an LED. To obtain good results for longer ranges, up to several miles under clear atmospheric conditions, an ILD must be used. The LED system costs less and has a wider beam, 10–20° wide, making it relatively simple to align the transmitter and receiver. The beam width of a typical ILD transmitter is 0.1° or 0.2°, making it more difficult to align and requiring that the mounting structure for both the transmitter and the receiver be very stable in order to maintain alignment. To ensure a good, stable signal strength at the receiver, the ILD transmitter and receiver must be securely mounted on the building structure. Additionally, the building structure must not sway, creep, vibrate, or produce appreciable twist due to uneven thermal heating (sun loading).
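The alignment difference between LED and ILD links follows directly from the beam geometry. A minimal sketch using simple small-angle trigonometry; the 300 m (roughly 1000 ft) path length is chosen here only for illustration:

```python
import math

def beam_footprint_m(distance_m: float, full_divergence_deg: float) -> float:
    """Diameter of the beam at the receiver for a given full divergence angle."""
    return 2 * distance_m * math.tan(math.radians(full_divergence_deg) / 2)

d = 300.0                                              # ~1000 ft between buildings
print(f"LED, 15 deg:   {beam_footprint_m(d, 15.0):6.1f} m")   # tens of metres wide
print(f"ILD, 0.15 deg: {beam_footprint_m(d, 0.15):6.2f} m")   # under a metre wide
```

The LED footprint is tens of meters across, so rough pointing suffices, while the ILD spot is well under a meter, which is why the mountings must not sway, creep, or twist.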
FIGURE 6-38 RF and microwave video transmitters: (a) miniature four-channel transmitters with dipole antennas, 1.2 GHz (100 mW output) and 2.4 GHz (0.5 W output); (b) small 2.4 GHz transmitter and receiver with Yagi antennas; (c) receiver with dipole antenna

FIGURE 6-39 Transmitter RF power out vs. transmission range in miles (conditions: clear air, outdoor, no obstructions, dipole antenna)

FIGURE 6-40 IR atmospheric video transmission system (LED transmitter with 10–20° beam divergence, or ILD transmitter with 0.1–0.2° divergence, aimed between buildings at an IR receiver within the detector field of view)

Both LED and ILD systems can transmit the IR beam through most transparent window glazing; however, glazing with high tin content severely decreases signal transmission, thereby producing poor video quality. The suitability of the window can be determined only by testing the system. Since many applications require the IR beam to pass through window panes across a city street or between two buildings, window IR transmission tests should be performed prior to designing and installing such a system. The primary advantages of the ILD system are long range (under clear atmospheric conditions) and secure video, audio, and control signal transmission. ILD atmospheric links are hard to tap because the tapping device—a laser receiver—must be positioned into the laser beam, which is hard to accomplish undetected.

6.5.5.1 Transmission Path Considerations

Several transmission parameters must be considered in any atmospheric transmission link. Both LED and ILD atmospheric transmission systems suffer video signal transmission losses caused by atmospheric path absorption. Molecular absorption is always present when a light beam travels through a gas (air).
At certain wavelengths of light, the absorption in the air is so great as to make that wavelength useless for communications purposes. Wavelength ranges in which the attenuation by absorption is tolerable are called atmospheric windows. These windows have been extensively tabulated in the literature. All LED and ILD systems operate in these atmospheric windows. Another cause of light signal absorption is particles such as dust and aerosols, which are always present in the atmosphere to some degree. These particles may reach very high concentrations in a geographical area near a body of water. In these locations, improved performance can be achieved by locating the link as high above the ground as possible. Fog is a third factor causing severe absorption of the IR signal. In fog-prone areas, local weather conditions must be considered when specifying an atmospheric link, since the presence of fog will greatly influence link downtime. Figure 6-41 shows the predicted communication range vs. visibility for a practical LED or ILD atmospheric communications system.

FIGURE 6-41 Atmospheric absorption factors and visibility (loss factors: absorption, scattering, scintillation, aerosols; signal loss in dB vs. distance in miles)

In addition to signal loss, the atmosphere contributes signal noise, since it exhibits some degree of turbulence. Turbulence causes a refractive-index variation in the signal path, similar to the heat waves seen when solar heating of the air combines with wind-aided turbulent mixing (the mirage effect). The net effect of this turbulence is to move or bend the IR beam in an unpredictable direction, so that the transmitter radiation does not reach the remote receiver. To compensate for this turbulence, the transmitter beam is made wide enough that it is highly unlikely that the beam will miss the receiver.
This wider beam, however, results in lower beam intensity, so the received signal on average will be less than from a narrower beam.

6.5.5.2 Infrared Equipment

The transmitter and receiver used in atmospheric IR transmission systems are very similar to those used in the fiber-optic-cable transmission system (Section 6.3.4). The primary differences are in the type of LED (or ILD) in the transmitter and the optics in both the transmitter and the receiver (Figure 6-42). The optics in the transmitter must couple the maximum amount of light from the emitter into the lens and atmosphere, that is, produce the specified beam divergence for LED or ILD usage. The receiver optics are made as large as practically possible (several inches in diameter) to maximize transmitter beam collection, thereby achieving the highest possible S/N. An example of an atmospheric IR link is shown in Figure 6-43. The system has a range of approximately 3000 feet and operates at 12 volts DC. For outdoor applications, the transmitter is mounted in an environmental housing with a thermostatically controlled heater and fan, as well as a window washer and wiper.

6.6 WIRELESS CONTROL SIGNAL TRANSMISSION

Signal multiplexing has been used to combine audio and control functions in time-division or frequency-division multiplexing. One system uses the telephone Touch-Tone system, which is standard throughout the world. With this system, an encoder generates a number code corresponding to the given switch (digit) closure. Each switch closure produces a dual Touch-Tone signal which is uniquely defined and recognized by the remote receiver station. All that is needed for transmitting the signal is a twisted-pair or telephone-grade line. With such a system, audio and all of the conceivable camera functions (pan, tilt, zoom, focus, on/off, sequencing, and others) can be controlled with a single cable pair or single transmission channel.
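The Touch-Tone scheme can be sketched as a lookup from switch closures to tone pairs. The row/column frequencies below are the standard DTMF assignments; the command-to-digit mapping is hypothetical, for illustration only:

```python
# Standard DTMF keypad tone pairs: (row frequency, column frequency) in Hz.
DTMF_HZ = {
    "1": (697, 1209), "2": (697, 1336), "3": (697, 1477),
    "4": (770, 1209), "5": (770, 1336), "6": (770, 1477),
    "7": (852, 1209), "8": (852, 1336), "9": (852, 1477),
    "*": (941, 1209), "0": (941, 1336), "#": (941, 1477),
}

# Hypothetical assignment of camera functions to keypad digits (illustrative only).
COMMAND_DIGIT = {
    "PAN_LEFT": "1", "PAN_RIGHT": "2", "TILT_UP": "3", "TILT_DOWN": "4",
    "ZOOM_IN": "5", "ZOOM_OUT": "6", "FOCUS_NEAR": "7", "FOCUS_FAR": "8",
}
DIGIT_COMMAND = {d: c for c, d in COMMAND_DIGIT.items()}

def tones_for(command: str) -> tuple:
    """Tone pair the encoder keys onto the line for a given switch closure."""
    return DTMF_HZ[COMMAND_DIGIT[command]]

print(tones_for("ZOOM_IN"))   # (770, 1336)
```

Because each closure maps to a unique two-tone pair, the remote receiver can decode commands unambiguously over any telephone-grade audio path.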
FIGURE 6-42 Block diagram of IR video transmitter and receiver (video amplifier and LED driver feeding an LED or ILD through a lens, producing a diverging or collimated beam over the atmospheric path; silicon detector, lens, low-noise amplifier, and video amplifier/driver at the receiver)

FIGURE 6-43 IR video transmitter and receiver hardware. Representative specifications: video standard NTSC, PAL, SECAM (525 TV lines/60 Hz or 625 TV lines/50 Hz), monochrome or color; range 1 mile and beyond; no license required; type simplex (one direction); signal bandwidth 5.5 MHz ±1 dB, 7 MHz ±3 dB; transmitter LED (860–900 nm) with 30 mW peak power output and 3-milliradian beam divergence; receiver silicon avalanche detector with 3.75-milliradian field of view; power 115/220 VAC, 50/60 Hz, 25 VA, or 12 VDC, 12 watts; video S/N of approximately 60 dB at 1000 ft, falling to about 46 dB at 7000 ft

This concept offers a powerful means for controlling remote equipment over an existing transmission path. It is sometimes advantageous to combine several video and/or audio and control signals onto one transmission channel. This is true when a limited number of cables are available or when transmission is wireless. If cables are already in place or a wireless system is required, the hardware to multiplex the various functions onto one channel is cost-effective. Multiplexing of video signals is used in many CATV installations whereby several VHF and/or UHF video channels are simultaneously transmitted over a single coaxial cable or microwave link. In CCTV systems, modulators and demodulators are available to transmit the video control signals on the same coaxial cable used to transmit the video signal.
6.7 SIGNAL MULTIPLEXING/DE-MULTIPLEXING

It is sometimes desirable or necessary to combine several video signals onto one communications channel and transmit them from the camera location to the monitor location. This technique is called multiplexing. Some systems allow multiplexing video, control, and audio signals onto one channel.

6.7.1 Wideband Video Signal

The camera video signal is an analog base-band signal with frequencies of up to 6 MHz. When more than one video signal must be transmitted over a single wire or wireless channel, the signals are multiplexed. This is accomplished by modulating each base-band camera signal onto its own RF (VHF or UHF) or microwave frequency carrier and combining the multiple video signals onto the channel.

6.7.2 Audio and Control Signal

The audio and control signals can be multiplexed with the video signals as sub-carriers on each of the video signals. In the RF band at 928 MHz no more than two channels are practical. In the microwave band at 2.4 GHz up to four channels can be used. At 5.8 GHz up to eight channels can be used.

6.8 SECURE VIDEO TRANSMISSION

To protect the integrity of the information on a signal, high-level security applications sometimes require the scrambling of video signals. The video scrambler is a privacy device that alters a television camera output signal to reduce the ability to recognize the transmitted signal when displayed on a standard monitor/receiver. The descrambler device restores the signal to permit retrieval of the original video information.

6.8.1 Scrambling

Video scrambling refers to an analog technique to hide, or make covert, the picture intelligence in the picture signal. Basic types include: (1) negative video, (2) moving the horizontal lines, (3) cutting and moving sections of the horizontal lines, and (4) altering or removing the synchronization pulses.
All but negative video require that the signal modulator at the camera have some synchronization with the demodulator at the monitoring site. The key to any analog video scrambling system is to modify one or more basic video signal parameters to prevent an ordinary television receiver or monitor from being able to receive a recognizable picture. The challenges in scrambling-system design are to make the signal secure without degrading the picture quality when it is reconstructed, to minimize the increase in bandwidth or storage requirements for the scrambled signal, and to make the system cost-effective. There are basically two classes of scrambling techniques. The first modifies the signal with a fixed algorithm, that is, some periodic change in the signal. These systems are comparatively simple and inexpensive to build and are common in CATV pay television, as well as in some security applications. The signals can easily be descrambled once the scrambling code or technique has been discovered. It is relatively straightforward to devise and manufacture a descrambling unit to recover the video signal. One of the earliest techniques for modifying the standard video signal is called video inversion, in which the polarity of the video signal is inverted so that a black-on-white picture appears white-on-black (Figure 6-44). While this technique destroys some of the intelligence in the picture, the content is still recognizable. Some scrambling systems employ a dynamic video-inversion technique: a parameter such as the polarity is inverted every few lines or fields in a pseudo-random fashion to make the image even more unintelligible. Another early technique was to suppress the vertical and/or horizontal synchronization pulses to cause the picture to roll or tear on the television monitor.
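Video inversion is simple enough to sketch directly. Working on a normalized luminance line (0.0 = black, 1.0 = white, with sync carried separately), the same operation both scrambles and descrambles:

```python
def invert_line(samples):
    """Invert active-video luminance: black-on-white becomes white-on-black."""
    return [1.0 - s for s in samples]

line = [0.0, 0.25, 0.5, 1.0]       # one illustrative line of luminance samples
scrambled = invert_line(line)      # negative video sent over the link
restored = invert_line(scrambled)  # inversion is its own inverse
print(scrambled, restored)
```

Because the transformation is fixed and self-inverse, no key or synchronization channel is needed, which is also why this technique offers little real security.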
Likewise, this technique produced some intelligence loss, but some television receivers could still lock on to the picture, or a descrambler could be built to re-insert the missing pulses and synchronize the picture, making it intelligible again. A second class of scrambling systems uses much more sophisticated techniques, modifying the signal with an algorithm that continually changes in some unpredictable or pseudo-random fashion. These more complex dynamic scrambler systems require some communication channel between the transmitter and the receiver in order to provide the descrambling information to the receiver unit, which reconstructs the missing signal. This descrambling information is communicated either by some signal transmitted along with the television image or by some separate means, such as a different channel in the link. The decoding signal can be sent by telephone or other means.

FIGURE 6-44 Video scrambling techniques: signal inversion (standard video inverted to negative video and then reassembled) and line dicing (horizontal-line segments transposed in randomly coded order and then reassembled).

In a much more secure technique known as "line dicing," each horizontal line of the video image is cut into segments that are then transmitted in random order, thereby displacing the different segments horizontally into new locations (Figure 6-44). A picture so constructed on a standard receiver has no intelligence whatsoever. Related to line dicing is a technique known as "line shuffling," in which the scan lines of the video signal are sent not in the normal top-to-bottom image format but in a pseudo-random or unpredictable sequence. It is often necessary to scramble the audio signal in addition to the video signal, using techniques such as frequency hopping adapted from military technology.
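Line shuffling depends on the scrambler and descrambler sharing the same pseudo-random sequence. A minimal sketch, using a hypothetical field of eight scan lines and a shared seed standing in for the descrambling information sent over the separate channel:

```python
import random

def shuffle_lines(lines, seed):
    """Transmit scan lines in a pseudo-random order derived from a
    seed shared by scrambler and descrambler."""
    order = list(range(len(lines)))
    random.Random(seed).shuffle(order)
    return [lines[i] for i in order], order

def unshuffle_lines(scrambled, seed):
    """Rebuild the original top-to-bottom order using the same seed."""
    order = list(range(len(scrambled)))
    random.Random(seed).shuffle(order)
    restored = [None] * len(scrambled)
    for pos, original_index in enumerate(order):
        restored[original_index] = scrambled[pos]
    return restored

field = [f"line {n}" for n in range(8)]
sent, _ = shuffle_lines(field, seed=42)  # unintelligible on a standard monitor
assert unshuffle_lines(sent, seed=42) == field
```

Without the seed, a standard receiver displays the lines in the transmitted (shuffled) order, which is why the picture carries no intelligence.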
Similar to line dicing, this technique breaks the audio signal up into many different bits distributed over four or five different audio channels and, by jumping from one channel to another in a pseudo-random fashion, scrambles the audio signal. The descrambler is equipped to tune to the different audio channels in synchronism with the transmitting signal, thereby recovering the audio information. In the most sophisticated dynamic scrambling systems, utilized for direct-broadcast satellites and multi-channel applications, the video and audio signals are scrambled in a way that cannot be decoded even by the equipment manufacturer without the information from the signal operator. For example, the audio signal can be digitized and then transmitted in the vertical blanking interval, the horizontal blanking interval, or on a separate sub-carrier of the television signal.

6.8.2 Encryption

Video encryption refers to digitizing and coding the video signal at the camera using a computer and then decoding the digitized signal at the receiver location with the corresponding digital decoder. Digital encryption results in a much higher level of security than analog scrambling. Section 7.7.4 analyzes digital encryption techniques in more detail.

6.9 CABLE TELEVISION

Cable television (CATV) systems distribute multiple channels of video in the VHF or UHF bands using coaxial cable, fiber-optic cable, and RF and microwave links. Consumer-based CATV employs this modulation–demodulation scheme over a coaxial or fiber-optic cable. The multiplexing technique is often used when video information from a large number of cameras must be transmitted to a large number of receivers in a network. Table 6-8 summarizes the VHF and UHF television frequencies used in these CATV RF transmission systems.
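The audio frequency-hopping scheme described above amounts to transmitter and descrambler stepping through the same pseudo-random channel sequence. A sketch, assuming a hypothetical five audio channels and a shared seed:

```python
import random

CHANNELS = 5  # hypothetical number of audio channels

def hop_sequence(seed, hops):
    """Pseudo-random channel sequence derived from a seed shared by
    the transmitter and the descrambler."""
    rng = random.Random(seed)
    return [rng.randrange(CHANNELS) for _ in range(hops)]

# Both ends generate identical sequences from the shared seed, so the
# descrambler tunes to each new channel in synchronism with the sender.
tx_hops = hop_sequence(seed=7, hops=10)
rx_hops = hop_sequence(seed=7, hops=10)
assert tx_hops == rx_hops
```

An eavesdropper hearing only one channel receives disjoint fragments of the audio, while the synchronized receiver reassembles the full signal.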
In CATV distribution systems, the equipment accepts base-band (composite video) and audio channels and linearly modulates them onto any user-selected RF carrier in the UHF (470–770 MHz) spectrum. Each modulated signal is then passed through an isolating combiner, where it is multiplexed with the other signals. The combined signal is transmitted over a communications channel and separated at the receiver end into individual video and audio information channels. At the receiver end the signal is demodulated and the multiple camera signals are separated and presented on multiple monitors or switched one at a time (Figure 6-45). Cable costs are significantly reduced by modulating multiple channels onto a single cable. Since the transmission is done at radio frequencies, design and installation are far more critical than for base-band CCTV. High-quality CATV systems are now installed with fiber-optic cable for medium to long distances or for distribution within a building.

6.10 ANALOG TRANSMISSION CHECKLIST

Transmitting the video, audio, and control signals faithfully is all-important in any security system. This section itemizes some of the factors that should be considered in any design and analysis.

6.10.1 Wired Transmission

The following checklists for coaxial, two-wire UTP, and fiber-optic cable transmission systems show some items that should be considered when designing and installing a video security project.

6.10.1.1 Coaxial Cable

1. When using coaxial cable, always terminate all unused inputs and unused outputs in their respective impedances.
2. When calculating coaxial-cable attenuation, always figure the attenuation at the highest frequency to be used; that is, when working with a 6 MHz bandwidth, refer to the cable losses at 6 MHz.
3. In long cable runs do not use an excessive number of connectors, since each connector causes additional attenuation.
Avoid splicing coaxial cables without the use of proper connectors, since incorrect splices cause higher attenuation and can cause severe reflection of the signal and thus distortion.

Table 6-8 Allocated CATV RF Transmission Frequencies

BAND               CHANNEL DESIGNATION        PICTURE CARRIER (MHz)
CATV LOW-BAND      2 (CH2) to 6 (CH6)         55.25 to 83.25
CATV MID-BAND      A-5 (CH95) to A-1 (CH99)   91.25 to 115.25
CATV MID-BAND      A (CH14) to I (CH22)       121.25 to 169.25
CATV HIGH-BAND     7 (CH7) to 13 (CH13)       175.25 to 211.25
CATV SUPER-BAND    J (CH23) to W (CH36)       217.25 to 295.25
CATV HYPER-BAND    AA (CH37) to PPP (CH78)    301.25 to 547.25

NOTE: AIRWAVE VHF TV CHANNELS 2–6 OPERATE FROM 55.25–83.25 MHz; AIRWAVE VHF TV CHANNELS 7–13 OPERATE FROM 175.25–211.25 MHz; AIRWAVE FM STATIONS OPERATE FROM 88.1–107.9 MHz.

FIGURE 6-45 Multiplexed video transmission system: the audio/video signals from multiple cameras (C1–CN) are combined by a multiplexer, transmitted over one wideband link (microwave, CATV RF, or fiber optic), then de-multiplexed into individual signals and routed through a switcher to output devices (monitors M1–MN, DVR/VCR, printer).

4. For outdoor applications, be sure that all connectors are waterproof and weatherproof; many are not, so consult the manufacturer.
5. Try to anticipate ground-loop problems if unbalanced coaxial-cable video runs between two power sources are used. Use fiber optics to avoid the problem.
6. Using a balanced coaxial cable (or fiber-optic cable) is usually worth the increased cost in long transmission systems. When connecting long cable runs between several buildings or power sources, measure the voltage before attempting to mate the cable connectors. Be careful, since the voltage between the cable and the connected equipment may be of sufficient potential to harm you.
7.
Do not run cable lines adjacent to high-power RF sources such as power lines, heavy electrical equipment, other RFI sources, or electromagnetic sources. A good earth ground is essential when working with long transmission lines. Be sure that there is adequate grounding and that the ground wire is eventually connected to a water-pipe ground.

6.10.1.2 Two-Wire UTP

1. Choose a two-wire twisted pair having approximately 1 twist per 1–2 inches of wire.
2. Choose a wire gauge between 24 AWG (smallest) and 16 AWG.
3. Choose a reputable UTP transmitter/receiver manufacturer. Either have the manufacturer supply technical specifications showing performance over the distance required or test the product first.
4. Will the UTP transmitter/receiver be powered from the camera or from a separate 12 VDC power supply?

6.10.1.3 Fiber-Optic Cable

1. Consider the use of fiber optics when the distance between camera and monitor is more than a few hundred feet (depending on the environment), if it is a color system, and if the camera and monitor are in different buildings or powered by different AC power sources.
2. If the cable is outdoors and above ground, use fiber optics to avoid atmospheric disturbance from lightning.
3. If the cable run is through a hazardous chemical or electrical area, use fiber optics.
4. Use fiber optics when a high-security link is required.

6.10.2 Wireless Transmission

The following checklists for RF, microwave, and IR transmission systems should be considered when designing and installing a video security system. The wireless video transmission techniques require more careful scrutiny because of the many variables that can influence the overall performance and success of the system.

6.10.2.1 Radio Frequency (RF)

1. How many channels does the system require? At 928 MHz the maximum number of channels is two if crosstalk between channels is to be avoided.
2. Are all cameras approximately in the same location? If they are in different locations, crosstalk is minimized.
3. RF transmission is susceptible to external electrical interference. Are there probable interferences in the area?
4. Range is a function of the transmitter power and the intervening atmosphere and objects (buildings, trees, etc.). Is there a line of sight between the transmitter and the receiver?
5. Is the transmission path indoors or outdoors?
6. For indoor transmission, reflection and absorption by all objects must be taken into consideration. RF transmission does not penetrate metal objects and is partially absorbed by other materials.
7. For outdoor transmission, obstructions such as trees, buildings, etc. must be considered.

6.10.2.2 Microwave

1. How many channels does the system require? At 2.4 GHz the maximum number of channels is four, and at 5.8 GHz it is eight, if crosstalk between channels is to be avoided.
2. Are all cameras approximately in the same location? If they are in different locations, crosstalk is minimized.
3. Microwave transmission is susceptible to external interference from other microwave or noise sources. Are there probable interferences in the transmission path?
4. Range is a function of the transmitter power and the intervening atmosphere and objects (buildings, trees, etc.). Is there a line of sight between the transmitter and receiver? If not, metal panels can be used to redirect the microwave transmission.
5. Is the transmission path indoors or outdoors?
6. For indoor transmission, reflection and absorption by metal objects must be taken into consideration, as well as other building materials. Microwave energy does not transmit through metal objects and only partially through nonmetal.
7. For outdoor transmission, obstructions such as trees, buildings, etc. must be considered.

6.10.2.3 Infrared

1. Infrared transmission is very sensitive to the intervening atmosphere. Dust, fog, and humidity play an important role in the transmission and cause absorption and scattering of the IR signal.
2.
Is there a line of sight between the IR transmitter and receiver, with no obstructions?
3. Can a mirror be used to "see around a corner"?
4. Are the transmitter (most important) and receiver units mounted on sturdy, non-vibrating mountings? Are the buildings stable and motionless under high-wind and full sun-loading conditions?
5. Is secure transmission needed?

6.11 SUMMARY

Video signal transmission is a key component in any CCTV installation. Success requires a good understanding of transmission systems. Most systems use coaxial cable, but fiber-optic cable is gaining acceptance because of its better picture quality (particularly with color) and lower risk with respect to ground loops and electrical interference. In special situations where coaxial or fiber-optic cable is inappropriate, other wired or wireless means are used, such as RF, microwave, or light-wave transmission. For very long range applications, non-real-time slow-scan systems are appropriate. Many security system designers consider cabling to be less important than choosing the camera, lens, and other monitoring equipment in a CCTV application. They often attempt to cut costs on cabling equipment and installation time, since these make up a large fraction of the total system cost. Such equipment is not visible and can seem like an unimportant accessory. However, such cost-cutting can drastically weaken the overall system performance and picture quality.
Chapter 7
Digital Transmission—Video, Communications, Control

CONTENTS

7.1 Overview
    7.1.1 Migration from Analog to Digital
    7.1.2 Local Area Network (LAN), Wide Area Network (WAN), Wireless LAN (WiFi)
    7.1.3 Internet
    7.1.4 Wireless 802.11, Spread Spectrum Modulation (SSM)
    7.1.5 Digital Video Recorder (DVR), Network DVR (NDVR)
    7.1.6 Network Security, Hackers, Viruses, Reliability
7.2 Communication Channels
    7.2.1 Wired Channels
        7.2.1.1 Local Area Network (LAN)
        7.2.1.2 Power over Ethernet (PoE)
        7.2.1.3 Wide Area Network (WAN)
        7.2.1.4 Internet, World Wide Web (WWW)
        7.2.1.5 Leased Land Lines, DSL, Cable
            7.2.1.5.1 PSTN-ISDN Link
            7.2.1.5.2 DSL Link
            7.2.1.5.3 T1 and T3 Links
            7.2.1.5.4 Cable
        7.2.1.6 Fiber Optic
    7.2.2 Wireless Channels
        7.2.2.1 Wireless LAN (WLAN, WiFi)
        7.2.2.2 Mesh Network
        7.2.2.3 Multiple Input/Multiple Output (MIMO)
        7.2.2.4 Environmental Factors: Indoor–Outdoor
        7.2.2.5 Broadband Microwave
        7.2.2.6 Infrared (IR)
7.3 Video Image Quality
    7.3.1 Quality of Service (QoS)
    7.3.2 Resolution vs. Frame Rate
    7.3.3 Picture Integrity, Dropout
7.4 Video Signal Compression
    7.4.1 Lossless Compression
    7.4.2 Lossy Compression
        7.4.2.1 Discrete Cosine Transform (DCT)
        7.4.2.2 Discrete Wavelet Transform (DWT)
    7.4.3 Video Compression Algorithms
        7.4.3.1 Joint Picture Experts Group: JPEG
        7.4.3.2 Moving Joint Picture Experts Group: M-JPEG
        7.4.3.3 Moving Picture Experts Group: MPEG-2, MPEG-4, MPEG-4 Visual
            7.4.3.3.1 MPEG-2 Standard
            7.4.3.3.2 MPEG-4 Standard
            7.4.3.3.3 MPEG-4 Visual Standard
        7.4.3.4 MPEG-4 Advanced Video Coding (AVC)/H.264
        7.4.3.5 JPEG 2000, Wavelet
        7.4.3.6 Other Compression Methods: H.263, SMICT
            7.4.3.6.1 H.263 Standard
            7.4.3.6.2 SMICT Standard
7.5 Internet-Based Remote Video Monitoring—Network Configurations
    7.5.1 Point to Multi-Point
    7.5.2 Point to Point
    7.5.3 Multi-Point to Point
    7.5.4 Video Unicast and Multicast
7.6 Transmission Technology Protocols: WiFi, Spread Spectrum Modulation (SSM)
    7.6.1 Spread Spectrum Modulation (SSM)
        7.6.1.1 Background
        7.6.1.2 Frequency Hopping Spread Spectrum Technology (FHSS)
        7.6.1.3 Slow Hoppers
        7.6.1.4 Fast Hoppers
        7.6.1.5 Direct Sequence Spread Spectrum (DSSS)
    7.6.2 WiFi Protocol: 802.11 Standards
        7.6.2.1 802.11b Standard
        7.6.2.2 802.11a Standard
        7.6.2.3 802.11g Standard
        7.6.2.4 802.11n Standard
        7.6.2.5 802.11i Standard
    7.6.3 Asynchronous Transfer Mode (ATM) Transmission
7.7 Network Security
    7.7.1 Wired Equivalent Privacy (WEP)
    7.7.2 Virtual Private Network (VPN)
    7.7.3 WiFi Protected Access (WPA)
    7.7.4 Advanced Encryption Standard (AES), Digital Encryption Standard (DES)
    7.7.5 Firewalls, Viruses, Hackers
7.8 Internet Protocol Network Camera, Address
    7.8.1 Internet Protocol Network Camera
    7.8.2 Internet Protocol Camera Protocols
    7.8.3 Internet Protocol Camera Address
7.9 Video Server, Router, Switch
    7.9.1 Video Server
    7.9.2 Video Router/Access Point
    7.9.3 Video Switch
7.10 Personal Computer, Laptop, PDA, Cell Phone
    7.10.1 Personal Computer, Laptop
    7.10.2 Personal Digital Assistant (PDA)
    7.10.3 Cell Phone
7.11 Internet Protocol Surveillance Systems: Features, Checklist, Pros, Cons
    7.11.1 Features
    7.11.2 Checklist
    7.11.3 Pros
    7.11.4 Cons
7.12 Summary

7.1 OVERVIEW

7.1.1 Migration from Analog to Digital

The video security industry is migrating from a technology of CCTV to open circuit television (OCTV) and Automated Video Surveillance (AVS). The OCTV and AVS technologies make use of networked digital video surveillance systems. There is little doubt that connecting all video cameras directly to a digital video network is becoming commonplace and cost-effective in new and existing systems. Classes of video applications using these networking technologies to advantage are: (1) remote video surveillance, (2) remote video alarm verification, (3) rapid-deployment video and alarm systems, and (4) remote access to stored digital video images. The OCTV permits multiple security operators to manage many remote facilities and allows almost instantaneous monitoring of remote sites via these digital networks. Systems using existing analog video cameras can connect to the Internet via digital servers, thereby providing remote site surveillance and camera control. The AVS is achieved through the use of smart cameras that can "learn" and of other "intelligent" algorithms and electronics that make decisions based on past experience. This "artificial intelligence" significantly reduces the number of decisions the guard must make. The fastest-growing market segment in the video security field is digital video surveillance. The security industry is rapidly moving toward AVS, in which smart cameras and sensors "learn," make decisions, and provide the security officer with enough information to act. Prior to the year 2001, camera systems were primarily used to catch the bad guys after a crime had been committed. If a large, competent, well-trained security team was available, the thief or criminal could be caught in the act.
The primary video surveillance functions were to:

• Catch perpetrators
• Watch workers
• Protect from litigation
• Watch a perimeter of the facility
• Monitor traffic
• Protect assets.

With more sophisticated analog video systems and the migration to wired and wireless digital local area networks (LANs), intranets, and Internet networks, the security system provided additional functions to:

• Monitor suspicious activities to prevent illegal activity
• Identify and apprehend perpetrators of a crime
• Perform all the other activities listed above.

Historically, CCTV systems were closed, proprietary networks controlled by the security manager. Now analog video systems, access control, intrusion detection, fire, safety, environmental sensors, and control and communication systems are often open, and video images and information are sent over digital networks to multiple managers and multiple sites. From an economic point of view it makes sense to have all these sensors distributed throughout a facility or enterprise and monitored by multiple managers and facilitators. The video security requirements are now often added to the backbone of the information technology (IT) structure. This is in contrast to the analog CCTV methodology, which requires individual video feeds connected to a security console with dedicated monitors, recorders, and printers that do not operate on a local digital network, an intranet, or the Internet. The full impact of video surveillance using wireless cameras, monitors, and servers has yet to be realized. Wireless video surveillance is rapidly growing in popularity for monitoring remote locations, whether from a personal computer (PC), laptop, personal digital assistant (PDA), or cell phone.
Remote video surveillance systems have three main functions: (1) recording the surveillance camera image, (2) playback of the surveillance image and search of stored video for specific events, and (3) remote control of security equipment. The first step in the transmission process for remote video surveillance occurs when the cameras capture visual images from the surveillance area. The cameras (the input terminal) view the target areas, compress the video signals, and transmit them via a transmission means. The monitoring location(s) or control terminal receives the signals and de-compresses them back into visual images, usually achieving near real-time transmission and viewing. In an analog system this process involves converting the input signals from analog to digital form and then back to analog form for display on a video monitor and/or recording on an analog VCR. The video signal is left in digital form when it is recorded on a DVR and displayed on an LCD, plasma, or other digital monitor. Networked transmission allows the user to remotely adjust the P/T/Z, focus, and aperture (iris diaphragm) settings of the camera at any time from the remote monitoring location. Video monitoring is simplified through the use of digital video motion detectors (DVMDs) and smart cameras. Simultaneous monitoring and control from multiple geographical locations is often required. The video security industry is experiencing revolutionary changes brought about by digital information technology (IT). This shift in video security from analog to digital began when the analog VCR was replaced by the DVR. The recent phase of this technology has advanced to the utilization of wired and wireless IT systems and networks. Video systems are expected to operate full-time, 24/7/365, for video surveillance, voice communications, and control.
7.1.2 Local Area Network (LAN), Wide Area Network (WAN), Wireless LAN (WiFi)

The digital signal transmission channels now available include local area network (LAN), wide area network (WAN), wireless LAN (WLAN, WiFi), intranet, Internet, and World Wide Web (WWW).

7.1.3 Internet

At the core of remote monitoring is a basic network infrastructure exemplified by network cameras, video servers, and computers. All of this equipment communicates via a standard called the Internet protocol (IP). The IP is the ideal solution for remote monitoring since it allows users to connect and manage video, audio, data, PTZ control, and other communications over a single network that is accessible to users anywhere in the world. This data is available in most cases through a standard Web browser and Internet access, which can be found on any desktop PC or laptop and on many PDAs and cell phones. Video servers include an analog camera video input, an image digitizer, an image compressor, a Web server, and a network connection. The servers digitize the video from the analog cameras and transmit it over the computer network, essentially turning an analog camera into a network camera.

7.1.4 Wireless 802.11, Spread Spectrum Modulation (SSM)

A key component of the digital transmission means is a technology called spread spectrum modulation (SSM). In this type of modulation a transmission code is combined with the information-carrying base-band video signal and transmitted over the wireless network. The effect of "spreading" the signal over a wide spectrum of bandwidth provides the ability to transmit many different signals in the same allotted bandwidth with high security. This SSM communication has long been a favorite technology of the military because of its resistance to interception and jamming, and it was adopted in the Institute of Electrical and Electronics Engineers (IEEE) 802.11 series of transmission standards for digital transmission applications, including digital video.
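The direct-sequence form of SSM can be sketched in a few lines: each data bit is spread by XOR with a fast chipping sequence, and the receiver despreads with the same sequence. The 11-chip Barker sequence shown is the one used by the original 802.11 DSSS physical layer at 1–2 Mbps (this binary representation and the majority-vote despreader are simplifications for illustration):

```python
# 11-chip Barker sequence (one binary representation) used by the
# original 802.11 DSSS physical layer.
BARKER = [1, 0, 1, 1, 0, 1, 1, 1, 0, 0, 0]

def spread(bits):
    """Spread each data bit to 11 chips by XOR with the Barker code."""
    return [b ^ c for b in bits for c in BARKER]

def despread(chips):
    """Recover data bits: XOR each 11-chip group with the Barker code
    and take a majority vote, which tolerates a few corrupted chips."""
    bits = []
    for i in range(0, len(chips), 11):
        votes = [chips[i + j] ^ BARKER[j] for j in range(11)]
        bits.append(1 if sum(votes) > 5 else 0)
    return bits

data = [1, 0, 1, 1]
assert despread(spread(data)) == data
```

The redundancy is what gives SSM its robustness: flipping a few chips in transit still leaves the majority vote, and hence the data bit, intact.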
The subsets of 802.11 applicable to video transmission are 802.11a, b, c, g, i, and the new n. The SSM technology is used in digital cellular phones, some advanced alarm systems, and radar, to name a few common applications. The advantages of the technology include cost, bandwidth efficiency, and security. The SSM signals are difficult to detect, and therefore difficult to intercept or jam, and they produce little or no interference with other signals. Products utilizing this technology operate in a license-exempt category: there are no charges to the user from any company or government agency.

7.1.5 Digital Video Recorder (DVR), Network DVR (NDVR)

The digital video recorder (DVR) has been a significant innovation in the video security market. It has rapidly replaced the analog VCR as a means for storing video images. The DVR, using lossy or lossless digital compression, provides the ability to store video images with little or no degradation. The DVR provides a highly advanced search capability for looking back at recorded images. The DVR also incorporates features such as video motion detection, the ability to have multiple users view the recorded video, and the ability to perform PTZ control functions from the monitoring and recording site. The DVR provides a significant upgrade in image quality and flexibility and serves as an excellent replacement for the analog VCR. An alternative to the DVR is the network DVR (NDVR). This digital Internet solution takes the streaming (real-time) and non-streaming video from cameras and records it on computers on the network. This makes the video available to anyone having access to the network and makes use of the storage capability of the network. The advantages of the NDVR on an IP surveillance system over DVR technology make a strong case for it to be the system of choice for today's enterprise-level surveillance solutions.
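DVR and NDVR capacity planning follows directly from the compressed stream bit rate. A sketch, with a hypothetical 1 Mbps stream standing in for a compressed camera feed:

```python
def storage_gb(bitrate_mbps: float, hours: float) -> float:
    """Disk space in gigabytes (decimal GB) for one camera stream.
    bitrate_mbps: compressed stream rate in megabits per second."""
    megabits = bitrate_mbps * 3600 * hours
    return megabits / 8 / 1000  # megabits -> megabytes -> GB

# Hypothetical example: a 1 Mbps compressed stream recorded 24/7
# for one week occupies about 75.6 GB per camera.
week_gb = storage_gb(1.0, 24 * 7)
```

Multiplying by the camera count shows quickly why compression choice dominates the storage budget of an enterprise recorder.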
The wide bandwidth and high information content of the video signal require that it be compressed by some means when transmitted over the network. At present there are several compression technologies that operate with wired and wireless digital networks. They each have their own application areas, with advantages and disadvantages. Three formats that are very efficient for video transmission are MPEG-2, MPEG-4, and H.264, developed by the Moving Picture Experts Group (MPEG), an industry standards committee. These compression standards permit near real-time transmission of video images with sufficient resolution and quality for surveillance applications and make the camera scenes available for remote observation via Internet browsers.

7.1.6 Network Security, Hackers, Viruses, Reliability

An important aspect of the digital revolution is security from hackers, viruses, and other adversaries. The digital system must be safeguarded against these intruders via password protection, virtual private networks (VPNs), encryption, and firewalls. Viruses are abundant on the Internet and must be guarded against when using a remote digital monitoring system. The VPN is a private data network that makes use of the public telecommunication infrastructure, maintaining privacy through the use of firewall protocols and security procedures. Today many companies are using a VPN for both extranets and wide-area intranets. Higher levels of security are obtained through the use of WiFi protected access (WPA), the digital encryption standard (DES), and the advanced encryption standard (AES). A firewall is typically located at the boundary between the Internet and the corporate network and controls access into the network. It also defines who has access outside of the network. The firewall is, in effect, the access control for the network. As with any form of video networking, keeping the information safe and error-free is imperative.
Errors and contamination may be man-made or due to equipment failure, external interference, hackers, or viruses. The security industry must put forth every effort to ensure that the information is accurate. State-of-the-art image authentication software has increased the reliability of digital video monitoring by preventing signal tampering. These methods can be incorporated in special compression codes, using date/time stamping or the summation of pixel changes. Demonstrating that the video signal and image have not been tampered with helps ensure the acceptance of this information in a court of law.

7.2 COMMUNICATION CHANNELS

This chapter treats all of the digital video transmission networks, including the Internet transmission media with its unique protocols, standards, signal compression, and security requirements. It addresses the specific algorithms required to compress the video frame image sizes to make them compatible with the bandwidths available in existing wired and wireless transmission channels. It describes the powerful SSM technology used to transmit the digital signal and the industry-standard 802.11 SSM protocols relating to video, voice, command, and control transmission. Digital transmission channels include LAN, WAN, MAN, WiFi, intranet, Internet, and WWW, transmitted via IP and viewed through a Web browser. The most common form of digital transmission suitable for video is the LAN, which is traditionally interconnected via two-wire unshielded twisted-pair (UTP), coaxial cable, or fiber-optic cable. When connecting to multiple sites and remote locations, the WAN, MAN, and WiFi are the transmission means. When cables are difficult or impossible to install, WiFi is used to transmit to all the different communication devices and locations. The WiFi serves the same purpose as a wired or optical LAN: it communicates information among the different devices attached to the LAN, but without the use of cables.
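The date/time-stamped image-authentication idea mentioned above can be sketched with a keyed hash over the frame data plus its timestamp. This is illustrative only (the key name and the use of HMAC-SHA-256 are assumptions, not a description of any product's method); fielded systems typically embed such tags in the compression stream:

```python
import hashlib
import hmac

SECRET_KEY = b"recorder-secret"  # hypothetical key held by the recorder

def sign_frame(frame_bytes: bytes, timestamp: str) -> str:
    """Authentication tag over the frame data and its date/time stamp."""
    return hmac.new(SECRET_KEY, frame_bytes + timestamp.encode(),
                    hashlib.sha256).hexdigest()

def verify_frame(frame_bytes: bytes, timestamp: str, tag: str) -> bool:
    """Detects any tampering with the pixels or the timestamp."""
    return hmac.compare_digest(sign_frame(frame_bytes, timestamp), tag)

frame = b"\x10\x20\x30"  # stand-in for compressed frame data
tag = sign_frame(frame, "2006-01-02 03:04:05")
assert verify_frame(frame, "2006-01-02 03:04:05", tag)
assert not verify_frame(b"\x10\x20\x31", "2006-01-02 03:04:05", tag)
```

Because any change to either the frame or its timestamp invalidates the tag, such evidence is far easier to defend in a court of law.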
When implementing WiFi transmission there is no physical cabling connecting the different devices from the monitoring site to the camera locations. These digital channels use 802.11, with all the different variations of the standard using SSM technology. The primary factors dictating the choice of a network type for interconnecting different surveillance sites are: (1) the integrity and guaranteed availability of the network connection, (2) the availability of a backup signal path, (3) the data-carrying capacity (bandwidth) of the network, and (4) the operating costs for using the network. Wireless can bring a significant reduction in the installation labor required when running or moving cabling within a building or from building to building.

7.2.1 Wired Channels

Where video monitoring already exists, wired digital video transmission is accomplished by converting the analog video signal into a digital signal and then transmitting the digitized camera video signal over a suitable network via modem. At the remote monitoring location a modem converts the digital video signal back into an analog signal. Customers can use their existing telephone service to transmit the video signal. The systems used in the 1980s and early 1990s were generally referred to as slow-scan video transmission (Chapter 6). The video equipment often interfaces with alarm intrusion sensors to produce an alarm signal, and the video images serve as an assessment of the alarm intrusion. Wired digital video transmission works especially well in panic-alarm situations where a remote location is connected to a central alarm station. If an alarm at a remote location is activated, or if a person initiates an alarm with a panic button, a video clip from the camera prior to, during, and after the alarm at the remote location is sent to the monitoring station.
The operator at the central station is able to forward the video clip to the police, who are then prepared, knowing what the situation is, how many people are involved, and whether there are any weapons. The police can use the video clip to identify and apprehend the perpetrators. These systems use the dial-up or public switched telephone network (PSTN), sometimes referred to as the plain old telephone service (POTS), which is still a common transmission means. Since the telephone service was designed for the human voice it is not very suitable for high-speed, wide-bandwidth video transmission. The wired phone system has a maximum bandwidth of 3000 Hz and a maximum modem bit rate of 56 Kbps; however, only about 40 Kbps is normally realized. A slightly improved version of the PSTN is the integrated services digital network (ISDN), which gives direct access to digital data transmission at data rates of 64 Kbps. Since many corporations have already set up LAN/WAN networking systems for IT business applications, the next logical expansion is to use these networks for complete integration of video surveillance. A major advantage of IP-addressed, network-capable video devices is the ability to receive a signal anywhere using equipment ranging from a simple network Internet browser to special client-based application software products. The high bandwidth requirements for full-frame, high-quality video without compression exceed the capability of most WAN network connections. On average, a low-quality image transmitted via networks requires 256 Kbps, and this can reach 1 Mbps if image quality and refresh rates are increased. Even LANs would be strained if large numbers of cameras attempted to simultaneously pass video signals back to a central video server, DVR, or both.

7.2.1.1 Local Area Network (LAN)

The most common and most extensively installed LAN is the Ethernet.
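The bandwidth figures above translate directly into image-delivery time. A sketch comparing the links mentioned in the text; the 256-kilobit image size is a hypothetical stand-in for one low-quality compressed frame, and the link-rate table is built only from the numbers cited above:

```python
# Effective link rates in kilobits per second, as cited in the text.
LINK_KBPS = {
    "PSTN modem (realized)": 40,
    "ISDN": 64,
    "10BASE-T LAN": 10_000,
}

def seconds_per_image(image_kbits: float, link: str) -> float:
    """Time to move one compressed image over the given link,
    ignoring protocol overhead."""
    return image_kbits / LINK_KBPS[link]

# A hypothetical 256-kilobit image: about 6.4 s over a PSTN modem,
# 4 s over ISDN, and under 0.03 s over a 10 Mbps LAN.
```

The three-orders-of-magnitude spread is why dial-up links were limited to slow-scan operation while LANs support near real-time video.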
This network is specified in the IEEE 802.3 standard; it was originally developed by the Xerox Corporation and further developed by Xerox, DEC, and Intel. The typical Ethernet LAN uses a coaxial cable or a special grade of twisted-pair wires for transmission (Figure 7-1). For long ranges, or for transmission through areas subject to electrical interference, the Ethernet can use fiber-optic transmission technology. The most common, lowest-bandwidth Ethernet system is called 10BASE-T and can provide transmission speeds up to 10 Mbps. For fast Ethernet connections, 100BASE-T is used and provides speeds up to 100 Mbps. Gigabit Ethernet systems provide an even higher transmission speed of 1000 Mbps (1 gigabit = one billion bits per second). The latter two are used as the backbone for digital transmission systems. Video systems generally use the 10BASE-T or 100BASE-T networks. WANs connect LANs to form a large structured network sometimes called an intranet. These networks can be connected inside buildings and from building to building, and connected to the Internet.

7.2.1.2 Power over Ethernet (PoE)

PoE, also referred to as power over LAN (PoL), is a technology that integrates data and power over standard cabled LAN infrastructure (Figure 7-2). PoE is a means to supply reliable, uninterrupted power to network cameras, wireless LAN access points, and other Ethernet devices over existing, commonly used category (CAT) cable with four twisted-pair conductors, such as the CAT-5 cable infrastructure (Figure 7-3). PoE is a technology for wired Ethernet LANs that allows the electrical power (current and voltage) necessary for the operation of each device to be carried by the data cables rather than by power cords. This minimizes the number of wires that must be strung in order to install the network. The result is lower cost, less downtime, easier maintenance, and greater installation flexibility than with traditional wiring.
Unlike a traditional telephone infrastructure, local power is not always accessible for wireless access points, IP video cameras, phones, or other network devices deployed in ceilings, lobbies, stairwells, or other out-of-the-way areas. Adding new wiring for power may be a difficult and costly option. In cases like this, an option is to combine the provision of power with the network connection using PoE technology over any existing or new data communications cabling. The standard was developed by the IEEE as 802.3af. Standard Ethernet cable uses only two of its four pairs for 10BASE-T or 100BASE-T transmission. Because the Ethernet data pairs are transformer-coupled at each end of the cable, either the spare pairs or the data pairs can be used to power powered-device (PD) equipment. At the power source end of the cable, the power sourcing equipment may apply power to either the spare pairs or the data pairs of that cable, but not to both simultaneously. Also, the power sourcing equipment may not apply power to non-PoE devices connected to the cable. PoE uses 48 VDC, designated as safety extra-low voltage (SELV), providing an additional safety factor.

FIGURE 7-1 Ethernet local area network (LAN)

FIGURE 7-2 Digital video network using Power over Ethernet (PoE)

FIGURE 7-3 Power over Ethernet (PoE) connections
PoE has the capability of powering up to a 13-watt load. Table 7-1 summarizes the characteristics of UTP CAT cables. PoE avoids the need for a separate power cable infrastructure and costly AC outlets near cameras. It reduces installation time, a significant saving in cost, and it allows network cameras to be installed where they are most effective rather than where the AC power outlets happen to be, which can reduce the number of cameras required and further reduce the surveillance implementation costs.

Table 7-1 Category (CAT) Cable Specifications

CAT CABLE TYPE | BANDWIDTH (MHz) | IMPEDANCE (ohms) | CROSS TALK* NEXT (dB) | APPLICATION
CAT-3          | 16              | 100              | 29                    | 10BaseT LAN, standard Ethernet
CAT-4          | 20              | 100              | 30                    | 10/100BaseT LAN, fast Ethernet
CAT-5          | 100             | 100              | 32.3                  | 10/100BaseT, fast Ethernet
CAT-5e         | 100             | 100              | 35.3                  | 1 Gb/s gigabit Ethernet
CAT-6          | 250             | 100              | 44.3                  | 1 Gb/s gigabit Ethernet
CAT-7          | 600             | 100              | 62.1                  | 1 Gb/s gigabit Ethernet

* NEAR-END CROSS TALK. NOTE: cable specifications are typical for unshielded twisted pair (UTP), AWG (American wire gauge) 22 and 24. CAT-3 and CAT-5e are the most common for video.

Power delivered over the LAN infrastructure is automatically activated when a compatible terminal is identified, and blocked to legacy analog devices that are not compatible. This allows the mixture of analog and power-over-LAN-compatible devices on the same network. Two system types are available: (1) power is supplied directly from the data ports; (2) power is supplied by a device placed between an ordinary Ethernet switch and the terminals, often referred to as a "power hub." By backing up the power over LAN in the communication room with an uninterruptible power supply (UPS), the entire camera network can continue operation during a power outage. This is a real must for high-end surveillance systems.
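The 13 W figure is the power available at the powered device; part of the 48 VDC supply is dropped in the cable itself. A minimal budget sketch, assuming the 350 mA current limit and 20 ohm worst-case loop resistance commonly cited for 802.3af (values not taken from the text):

```python
def pd_power_watts(v_source: float = 48.0, i_amps: float = 0.35,
                   loop_resistance_ohms: float = 20.0) -> float:
    """Power reaching the powered device after the I*R voltage drop in the cable loop."""
    v_drop = i_amps * loop_resistance_ohms   # volts lost in the twisted-pair loop
    return (v_source - v_drop) * i_amps      # power remaining at the PD terminals

# 48 VDC source, 350 mA, ~20 ohm loop for a full-length CAT-5 run:
print(f"{pd_power_watts():.1f} W available at the camera")
```

With the assumed worst-case loop, about 7 V is dropped in the cable, which is why the roughly 15 W a source can inject shrinks toward the 13 W guaranteed at the device.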
The inclusion of line-detection technology, which enables safe equipment installation without concern of high-voltage damage to laptops, desktops, and other equipment due to a misplaced connection, is one of the reasons power over LAN is much more than an intelligent power source. To take advantage of PoE, the power sourcing equipment must be able to detect the presence of a PD at the end of any Ethernet cable connected to it. The PD appliances must assert their PoE compatibility and their maximum power requirements. When the system is powered up, the PoE-enabled LAN appliances identify themselves by means of a nominal 25 kilohm resistance across their power input.

7.2.1.3 Wide Area Network (WAN)

WANs, in the past, suffered from limited bandwidth. The most common WAN link was a T1 telephone land line supplied by AT&T with a maximum data rate of 1.5 Mbps. Advanced-technology WAN systems now incorporate optical OC3 (155 Mbps) and OC12 (622 Mbps) communication links. Figure 7-4 shows a diagram of the WAN as applied to digital video surveillance.

7.2.1.4 Internet, World Wide Web (WWW)

During the 1990s an open-systems revolution swept through the IT industry, converting thousands of computers connected via proprietary networks to the Internet, a network of networks based on common standards. These standards are transmission control protocol/Internet protocol (TCP/IP) for communications, simple mail transfer protocol (SMTP) for email, hypertext transfer protocol (HTTP, http://) for displaying web pages, and file transfer protocol (FTP) for exchanging files between computers on the Internet. The Internet has made long-range video security monitoring a reality for many security applications (Figure 7-5). The availability of high-speed computers, large solid-state memory, the Internet, and the WWW has brought surveillance from a legacy analog CCTV technology to a digital open circuit television (OCTV) technology.
The WWW, also known as the Web, is a salient contributor to the success of OCTV and AVS. The WWW was developed at CERN, the European laboratory for particle physics in Geneva, Switzerland, by Tim Berners-Lee. The Web is a multi-platform system that supports multimedia communications on the basis of a graphical user interface (GUI). The GUI provides hypertext that enables the user to click a highlighted text word to search related files, across web servers, and through hot links: in other words, the Web is hyperlinked. In addition to video, the Web supports graphics and audio, with levels of quality and speed depending on the bandwidth available in the network. Since the initial conception of the Web at CERN, its home has moved to the W3 Consortium (W3C), a cooperative venture of CERN, the Massachusetts Institute of Technology (MIT), and the European research organization INRIA. Since its organization in 1994, W3C has published numerous technical specifications to improve and expand the use of the WWW. Security monitoring is no longer limited to local security rooms and security officers, but extends out to remote sites and personnel located anywhere in the world. Monitoring equipment includes LCD and plasma display monitors, PCs and laptops, PDAs, and cell phones. The requirement for individual personnel to monitor multiple display monitors has given way to a technology incorporating smart cameras and VMDs to establish an automated video surveillance (AVS) system. The Internet comprises LANs and a large array of interconnected computers through which video and other communication information is sent over wired and wireless transmission channels. The sender and receiver can be located anywhere on the network, viewing scenes from anywhere in the world (Figure 7-6). IP is the method by which digital data is sent from one computer to another over the Internet in the form of packets.
Any message on the Internet is divided into sub-messages called packets, each containing both the sender's and receiver's addresses. Because the video message is divided into many packets, each packet may take a different route through a different gateway computer across the Internet, and the packets can arrive in a different order than the order in which they were sent. IP's only function is to deliver them to the receiver's address; it is up to another protocol, TCP, to put them back together in the right order. Each computer on the network is known as a host on the Internet and has at least one address that uniquely identifies it from all the other computers on the network. The digital message can consist of an email, a web page, video, or other digital data. When the video or other data stream is sent out over the Internet, a router (in the form of software or hardware) determines the next network to which a packet in the message should be forwarded toward its final destination. The packet does not go directly from the sender (the transmitting location, i.e. camera, etc.) to the receiver, but generally goes through a gateway computer that forwards the packet on to the next computer toward its final destination.

FIGURE 7-4 Wide area network (WAN) diagram, showing corporate, client, and mobile sites linked by T1 cable (1.5 Mbps maximum) and fiber optic (OC3, 155 Mbps; OC12, 622 Mbps). Legend: ROUTER: device that moves data between different network segments; looks at the packet header to determine the best path for the packet to travel; can connect network segments that use different protocols. BRIDGE: device that passes data packets between multiple network segments using the same communications protocol; if the packet is bound for a segment using a different protocol, the bridge passes it onto the network backbone. CLIENT: networked PC or terminal that shares services with other PCs. ACCESS POINT: wireless-based device for connecting roaming wireless PC cards directly to the Internet; provides roaming and mobility from a stationary Internet connection.

FIGURE 7-5 Block diagram for remote video surveillance via the Internet

FIGURE 7-6 Worldwide video monitoring using the Internet. Video data packets take different routes from sender to receiver; each video component and host computer has a unique address; TCP puts the packets of data (video, controls, etc.) back together in the correct order.

The Internet allows for complete remote video surveillance, audio communication, and remote control from any one location to any other location on the network. As soon as a network is connected to the Internet, any authorized computer with a browser can receive security services. For that matter, any security system, even a system that is not networked, can potentially be made Internet based, fully or partially, the moment Internet access is provided.
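The division of a video message into independently routed packets, and TCP's job of restoring their order, can be sketched in a few lines (a simplification that models only sequence numbering, none of TCP's acknowledgment or retransmission machinery):

```python
import random

def packetize(message: bytes, size: int = 8) -> list[tuple[int, bytes]]:
    """Split a message into (sequence_number, payload) packets for independent routing."""
    return [(i, message[i:i + size]) for i in range(0, len(message), size)]

def reassemble(packets: list[tuple[int, bytes]]) -> bytes:
    """TCP's role: restore the original order by sequence number, however packets arrive."""
    return b"".join(payload for _, payload in sorted(packets))

frame = b"one compressed video frame from an IP camera"
packets = packetize(frame)
random.shuffle(packets)            # packets take different routes and arrive out of order
assert reassemble(packets) == frame
```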
Traditional central stations are connected to the security systems being monitored by means of a network connection (Ethernet), a telephone dial-up, a direct hard-wired connection, a satellite uplink, or a radio signal. Product literature that cites either "IP addressable" or "TCP/IP" reveals that the product (IP camera, etc.) has some potential for network or Internet-based applications. An important movement in the Internet industry is the development of application service providers (ASPs). A commercial central station could operate as a security ASP, just as an ASP could monitor security alarms that are reported across the Internet. This could be carried a step further, for example by connecting police departments, enabling the police not only to view and hear what is happening at a crime scene, but to follow events as they occur before a police response arrives.

7.2.1.5 Leased Land Lines, DSL, Cable

There are several wired transmission means for transmitting the digitally encoded video and other data signals. The most common options for gaining connection to the Internet are: (1) leased land lines using a PSTN modem, (2) ISDN telephone, (3) asymmetric digital subscriber line (ADSL), and (4) cable. The PSTN and ISDN do not offer the capacity (bandwidth) to provide multiple channels of high-quality live video, but they are perfectly usable channels for non-real-time video alarm verification or event query searching from DVRs. ISDN is a logical choice for many video alarm verification applications, as it has an excellent reliability specification, is almost universally available, and is competitively priced for the data carrying capacity it provides. Table 7-2 summarizes the bandwidth carrying capacity of these transmission channels.

7.2.1.5.1 PSTN-ISDN Link

The dial-up PSTN is the most common of the available transmitting methods for digital video transmission over long-distance wired networks.
The service was designed for human voice, not high-speed video transmission. The data-carrying capacity accessed is at best that of the PSTN modem or ISDN link, and often much less depending on network availability and traffic. On paper, ADSL offers a much faster connection to the Internet. This is based on the assumption that not all users will require all of the bandwidth they have paid for, all of the time. Typically, up to 20–50 users share the ADSL bandwidth, depending on the service selected. For occasional access to stored video this may be quite acceptable, but for multi-channel live surveillance it is unlikely to be satisfactory. If the Internet is used for security applications, it is wise to have a backup communications path over a more reliable network and to select equipment that can automatically revert to this backup network.

Table 7-2 PSTN, ISDN, ADSL, Ethernet, and Other Cable Speeds

TRANSMISSION TYPE | TYPICAL DOWNLOAD SPEED | TRANSMISSION TIME FOR 25 KB IMAGE (SEC) | MAX. FRAME RATE FOR 25 KB IMAGE | CONNECTION MODE
PSTN              | 45 Kbps   | 6      | 10 frames/min   | DIAL-UP
ISDN              | 120 Kbps  | 2      | 0.5 frames/sec  | DIAL-UP
IDSL              | 150 Kbps  | 2      | 0.6 frames/sec  |
ADSL (LOW END)    | 640 Kbps  | 0.3    | 3 frames/sec    |
ADSL (HIGH END)   | 5 Mbps    | 0.05   | 20 frames/sec   |
HDSL              | 1.5 Mbps  | 0.2    | 6 frames/sec    |
VDSL              | 20 Mbps   | 0.01   | 80 frames/sec   |
CABLE MODEM       | 750 Kbps  | 0.3    | 3 frames/sec    |
T1                | 1.5 Mbps  | 0.2    | 6 frames/sec    | DIRECT CONNECTION
10BaseT           | 5 Mbps    | 0.05   | 20 frames/sec   | DIRECT CONNECTION
100BaseT          | 50 Mbps   | 0.005  | 200 frames/sec  |
1000BaseT         | 500 Mbps  | 0.0005 | 2000 frames/sec |

IDSL: ISDN DSL. ADSL: asymmetric DSL. HDSL: high-bit-rate DSL. VDSL: very-high-data-rate DSL.

7.2.1.5.2 DSL Link

DSL technology supplies the necessary bandwidth for numerous applications, including high-speed Internet access, dedicated Internet connectivity, and live video monitoring. This digital broadband data line directly connects the client computer to the Internet via existing cables. The speed of DSL varies depending on the connection speed and, in some cases, the number of people on the network.
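The figures in Table 7-2 follow from two simple relations: the Shannon capacity C = B·log2(1 + S/N) bounds what a dial-up line can carry, and dividing an image's size in bits by the link speed gives the transmission time and maximum frame rate. A minimal sketch (the 40 dB S/N figure for a voice line is an illustrative assumption; the table's own entries are slightly more conservative, presumably allowing for protocol overhead):

```python
import math

def shannon_capacity_kbps(bandwidth_hz: float, snr_db: float) -> float:
    """Shannon limit C = B * log2(1 + S/N); explains the ~40 Kbps realized on a PSTN line."""
    return bandwidth_hz * math.log2(1 + 10 ** (snr_db / 10.0)) / 1000

def transmission_time_sec(image_kbytes: float, link_kbps: float) -> float:
    """Seconds to move an image: size in kilobits divided by link speed in Kbps."""
    return image_kbytes * 8 / link_kbps

# 3000 Hz PSTN voice channel with an assumed 40 dB S/N:
print(f"PSTN capacity: about {shannon_capacity_kbps(3000, 40):.0f} Kbps")

# A 25 KB image over the 45 Kbps modem link and over ISDN at 120 Kbps:
for name, kbps in [("PSTN", 45), ("ISDN", 120)]:
    t = transmission_time_sec(25, kbps)
    print(f"{name}: {t:.1f} sec per image ({1 / t:.2f} frames/sec)")
```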
7.2.1.5.3 T1 and T3 Links

The T1 and T3 networks have much higher speeds than those previously described. "T1" is a term coined by American Telephone and Telegraph (AT&T) for a system that transfers digital signals at 1.544 Mbps. T3 is the premium transmission method and has almost 30 times the capacity of T1: T3 lines can handle 44.736 Mbps. Fiber optics, with its much higher bandwidth and many superior characteristics, is replacing T1 and T3 transmission cables.

7.2.1.5.4 Cable

Community antenna television (CATV) networks have developed in parallel with DSL, and now compete for Internet access and even voice communication, in addition to the entertainment TV for which they were developed. Cable provides yet another means for transmitting the analog and digital video signal. Access to the Internet is offered by a number of CATV providers. Since the mid-1990s a number of these CATV providers have upgraded much of their traditional coax-based networks with optical fiber, thereby increasing overall network performance considerably. Both the coax and fiber-optic networks can support video and two-way Internet access. With the appropriate electronic upgrades, high-speed Internet access can be provided at end-user costs comparable with DSL networks.

7.2.1.6 Fiber Optic

Fiber optics is the transmission medium of choice for digital signals transmitted over long distances or where severe electrical disturbances (lightning storms, electrical equipment) are present. The attributes of fiber optics are: (1) long-distance transmission over many miles without degradation of the signal, (2) ultra-wide bandwidth resulting from the use of optical frequencies, and (3) secure transmission, because of the difficulty of tapping the optical signal. In analog systems the output signals, whether video or audio, are analogs of the input signals. Analog signals are susceptible to rapid degradation, electrical noise interference, and distortion along the transmission channel.
Analog signals are also degraded when multiple generations or reproductions of the signal are required. Digital signals, on the other hand, are largely immune to such problems: theoretically, any number of signal regenerations is possible with zero loss of quality. However, once the digital signal becomes too small, or the interference too large, the signal "breaks up" or totally drops out. Amplitude modulation (AM), frequency modulation (FM), and pulsed-frequency modulation (PFM) are used in analog video fiber-optic transmission systems. In digitally encoded fiber-optic video transmission, the video signals are sampled at very high rates and converted into digital signal formats. In both cases these signals are applied to light-emitting diodes (LEDs) or injection laser diodes (ILDs) inside the optical transmitter units. The optical signals are transmitted through the fibers and then converted back to analog, baseband electrical video signals inside the optical receiver units. Figure 7-7 compares AM, FM, and PFM transmission. AM video transmission is limited to short distances using multi-mode optical fiber and is only available at the 850 nm operating wavelength. FM transmission, on the other hand, provides very high video transmission performance over long distances and is available for use at 850 and 1300 nm. The 1300 nm wavelength suffers lower attenuation in the fiber and is more eye-safe. The latest generation of fiber-optic video transmission equipment digitizes the analog baseband video signal to provide a digital signal. This is accomplished via analog-to-digital (A/D) converters or coder-decoders inside the optical transmitters. The digitized signals modulate the LEDs or ILDs, which inject them optically into and through the fibers to the optical receivers, where they are converted back into analog baseband signals by internal digital-to-analog (D/A) converters.
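Three numbers govern the digitizing step just described: the sample rate must exceed twice the video bandwidth (the Nyquist criterion), each bit in the A/D converter buys roughly 6 dB of signal-to-noise ratio, and the resulting raw bit rate shows why only a wideband channel can carry the stream uncompressed. A minimal sketch, where the 4.2 MHz NTSC bandwidth, the 6.02n + 1.76 dB quantization formula, and the 720 x 480 at 30 fps, 16-bits-per-pixel sampling figures are standard values assumed for illustration rather than taken from the text:

```python
def min_sample_rate_mhz(bandwidth_mhz: float) -> float:
    """Nyquist criterion: sample above twice the highest signal frequency."""
    return 2 * bandwidth_mhz

def quantization_snr_db(bits: int) -> float:
    """Ideal S/N of an n-bit A/D converter: 6.02*n + 1.76 dB."""
    return 6.02 * bits + 1.76

def raw_video_mbps(width: int, height: int, fps: int, bits_per_pixel: int) -> float:
    """Uncompressed digital video bit rate in Mbps."""
    return width * height * fps * bits_per_pixel / 1e6

print(f"NTSC video (4.2 MHz): sample above {min_sample_rate_mhz(4.2):.1f} MHz")
print(f"8-bit A/D: about {quantization_snr_db(8):.0f} dB S/N")
print(f"Raw SD stream: {raw_video_mbps(720, 480, 30, 16):.0f} Mbps")  # far beyond T1/T3 copper rates
```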
The factors affecting the image quality in digitally encoded video transmission, and the electrical dynamic range and signal-to-noise ratio (S/N) of the output video signal, are the number of bits employed in the A/D and D/A conversion and the compression employed. No video compression is needed in fiber-optic transmission because of the very wide bandwidth capabilities of the fiber. This means that the video is transmitted in real time, with zero latency (no delay), at the standard 30 fps. A summary of the available channels, transmission speeds, and other parameters is given in Table 7-3.

FIGURE 7-7 Comparison of S/N ratio vs. optical path loss for typical fiber-optic AM, FM, and digitally encoded video links

Table 7-3 Comparison of Wired UTP and Optical Transmission Channels

TRANSMISSION TYPE | THEORETICAL DOWNLOAD SPEED* | TRANSMISSION MEDIA
PSTN              | 45 Kbps   | CAT-3
ISDN              | 120 Kbps  | CAT-3
HDSL              | 1.5 Mbps  | CAT-3, 5, 5e
CABLE MODEM       | 750 Kbps  | CAT-3, 5, 5e
10BASE-T          | 5 Mbps    | UTP CAT-3
100BASE-T         | 50 Mbps   | UTP CAT-5, 5e
1000BASE-T        | 500 Mbps  | CAT-5e, CAT-6
T1                | 1.5 Mbps  | CAT-3, 5e
T3                | 45 Mbps   |
OC3               | 155 Mbps  | FIBER OPTIC
OC12              | 622 Mbps  | FIBER OPTIC

* REALISTIC SPEED APPROXIMATELY 1/2 OF THEORETICAL

7.2.2 Wireless Channels

The WiFi network can be connected to the Internet through a variety of high-speed connections including cable modems, DSL, ISDN, satellite, broadband, etc. The broadband Internet connection connects to a video gateway or access point, and its Internet connection is distributed to all the computers on the network. The access points or gateways function as the "base stations" for the network.
They send and receive signals from the WiFi radios to connect the various components of the security system to each other as well as to the Internet. All computers in the WiFi network can then share resources, exchange files, and use a single Internet connection. This is the central connection among all wireless client devices (PCs, laptops, printers, etc.) and enables the sharing of the Internet connection with other users on the network. Access points and gateways have a wide range of features and performance capabilities and provide this basic network connection service.

7.2.2.1 Wireless LAN (WLAN, WiFi)

WiFi (wireless fidelity) devices "connect" to each other by transmitting and receiving signals on a specific frequency in the radio frequency (RF) and microwave bands. The components can connect to each other directly, called peer-to-peer, or through a gateway or access point. WiFi networks consist of two basic components: (1) WiFi radios and (2) access points or gateways. The WiFi radios are attached to the desktop computer, laptop, or other mobile devices on the network. The access points or gateways act as "base stations," i.e. they send and receive signals from the WiFi radios to connect the various components to each other as well as to the Internet. All the computers in the WiFi network then share resources and exchange files over a single Internet connection. The IEEE developed the series of 802.11 protocols to meet the requirements of disparate applications, and continues to formulate new ones. The 802.11a, b, g, i, and n standards are the most useful for wireless digital video transmission applications. Table 7-4 summarizes some of the parameters of these standards. A peer-to-peer network is composed of several WiFi-equipped computers talking to each other without using a base station (access point or gateway).
All WiFi Certified™ equipment supports this type of wireless setup, which is a good solution for transferring data between computers or for sharing an Internet connection among a few computers. Many laptop computers and mobile computing devices come with a WiFi radio built in and are ready to operate wirelessly. For laptops without such a device, a WiFi radio embedded in a Personal Computer Memory Card International Association (PCMCIA) card can be inserted into the expansion slot of the laptop. There are other ways to include a desktop PC in the network; since many PCs do not have slots for PC cards, the simplest method is to use a universal serial bus (USB) WiFi radio that plugs into an available USB port on the computer.

Table 7-4 Comparison of IEEE 802.11 Standards

IEEE STANDARD     | BAND (GHz) | DATA RATES (Mbps)†                    | MODULATION     | APPLICATIONS/COMMENTS
802.11 (legacy)*  | 2.4        | 1, 2                                  | DSSS, FHSS, IR | original 802.11 standard for wireless LAN; 2.4–2.4835 GHz
802.11a**         | 5          | 6, 9, 12, 18, 24, 36, 48, 54 maximum  | COFDM, FDMA    | 300 MHz in three 100 MHz bands: 5.150–5.250 (UNII lower, 40 mW), 5.250–5.350 (UNII middle, 200 mW), 5.725–5.825 (UNII upper, 800 mW)
802.11b           | 2.4        | 1, 2, 5.5, 11 maximum                 | DSSS           | 83.5 MHz from 2.40 to 2.4835 GHz (ISM band); maximum EIRP 1 W, typical 30 mW indoor
802.11g           | 2.4        | 6, 9, 12, 18, 24, 36, 48, 54 maximum, plus the 802.11b rates | DSSS, COFDM | 2.4–2.4835 GHz
802.11i           |            |                                       |                | adds high-level AES encryption‡
802.11n           | 2.4/5 dual band | up to 108 using 20–40 MHz channels | COFDM         | very high data rate

* IEEE established the standard in 1997 to define MAC (media access control) and PHY (physical) layer requirements for wireless LANs. ** IEEE established 802.11a in 1999. † Theoretical maximum rates; realistic maximum approximately one-half. ‡ Advanced encryption standard.
ISM: industrial, scientific, medical. UNII: unlicensed national information infrastructure. COFDM: coded orthogonal frequency division multiplexing. FDMA: frequency division multiple access. DSSS: direct-sequence spread spectrum. FHSS: frequency-hopping spread spectrum. EIRP: equivalent isotropically radiated power. IR: infrared.

7.2.2.2 Mesh Network

The mesh network is a topology that provides multiple paths between network nodes. Wired networks have used the mesh topology to get redundancy and reliability. Mesh networks make the most sense with wireless transmission, because wireless nodes can be set up to form ad hoc networks connecting many nodes. In the wireless application, if interference or excess distance between nodes causes a dropped video link, the mesh system automatically finds an alternate path through the mesh. The nodes themselves may generate messages to be sent elsewhere, be available to receive data, or both, and they act as repeaters to move video and other data from point to point when they are not transmitting or receiving their own data. What results is a very robust network at low cost. A mesh network using many closely spaced repeater transceivers (nodes) is shown in Figure 7-8. Each node can communicate with its nearby neighbors that are within range. The nodes can exchange data between themselves, store it, or forward data meant for a more distant node that is out of range of a nearby node. One of the nodes can also serve as a wired or wireless connection to an Internet node or access point. A particular attribute of the wireless mesh network using multiple nodes is that it allows the signal to be transmitted over a longer range than would be possible with a normal line-of-sight (LOS) link.
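The self-healing behavior just described, finding an alternate path when a node or link drops out, is at bottom a shortest-path search over the surviving nodes. A minimal sketch (the node names and link lists are illustrative, not from the text):

```python
from collections import deque

def find_path(links: dict[str, list[str]], src: str, dst: str,
              failed: frozenset[str] = frozenset()):
    """Breadth-first search for a route through the mesh, skipping failed nodes."""
    queue, seen = deque([[src]]), {src}
    while queue:
        path = queue.popleft()
        if path[-1] == dst:
            return path
        for nxt in links.get(path[-1], []):
            if nxt not in seen and nxt not in failed:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # no surviving route

# A small mesh: camera -> two possible repeater nodes -> monitoring station.
mesh = {"camera": ["node_a", "node_b"], "node_a": ["monitor"], "node_b": ["monitor"]}
print(find_path(mesh, "camera", "monitor"))                             # route via node_a
print(find_path(mesh, "camera", "monitor", failed=frozenset({"node_a"})))  # reroutes via node_b
```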
In mesh networks, multiple paths exist through the network system, increasing the probability that the video signal from the camera will reach the monitoring location. The mesh configuration is also more reliable, since if one of the nodes fails due to a power loss, jamming, or another defect, communication is still maintained, i.e. the video, voice, communication, or control signals can be routed through another path. In addition to the reliability aspect, the mesh configuration offers the benefit of requiring very low transmitted power at any given node, because the distance between nodes is usually short. Mesh networks are especially useful in monitoring a large network of image and/or alarm sensors. In portable and rapid-deployment applications, low transmit power means low device power consumption and longer battery life. The military has already adopted mesh networks in battlefield systems, and many forms of video security are ideal applications for this growing technology.

FIGURE 7-8 Wireless mesh transmitting network: outdoor mesh nodes link analog and IP cameras, domes, and servers to central monitoring

7.2.2.3 Multiple Input/Multiple Output (MIMO)

Most wideband WiFi networks operate with data rates between 11 and 54 Mbps. There is, however, a need for greater network bandwidth capacity in wireless LANs. The wireless radio channel for moving video and other digital information over the air waves has a highly variable nature. Unlike the relatively stable environment that exists on wire, cable, or fiber-optic networks, the ability of the air to carry information can and does change over time, often from moment to moment. With this fundamental variability and the overhead inherent in any networking protocol, the actual throughput available from a 54 Mbps connection is often much less than this peak number.
As a consequence, it is necessary to improve the performance of wireless LANs at the physical layer if higher throughputs are to be achieved. One popular approach is to gang together multiple radio channels and to use compression and related techniques to gain some additional advantage in information throughput. The ideal solution is a technology that simply packs more information per unit of bandwidth and time. Applied to wireless transmission, this measure is known as modulation efficiency: the number of bits per unit of bandwidth and time that can be transmitted through the air at any given moment. Radio signals are subject to serious degradation as they move through space, primarily due to the distance between the transmitter and receiver, interaction with objects in the environment, and interference both from other radio signals and from reflections of the signal in question itself (known as multi-path). All these artifacts result in various forms of fading, the loss in power of the radio signal as it moves from the transmitter to the receiver. The technique available today that has been put into practice in wireless LANs is called multiple input, multiple output (MIMO). This technology adds an additional, spatial dimension to the radio channel, allowing a more complex but inherently more reliable radio signal to be communicated (Figure 7-9). Whereas conventional radio transmission uses a single input and single output, a true MIMO system uses at least two transmit antennas, working simultaneously in a single channel, and at least two receive antennas at the other end of the connection working in the same channel. The number of receive antennas in a MIMO system is usually greater than the number of transmit antennas, and transmission performance improves with the addition of more receive antennas.
Going from a single antenna to two antennas can result in as much as a 10 dB improvement in the S/N, a key indicator of reliability and signal quality. Adding a third antenna adds an additional 4–5 dB improvement. Figure 7-10 illustrates a six-antenna wireless LAN MIMO receiver. The MIMO technology relies upon the interactions of the signal with the environment, in the form of multi-path, for its benefits, a counterintuitive element of the technology. The phenomenon is attributed to reflections and multi-path transmissions from walls, ceilings, floors, and other objects. By improving the performance of the antennas and increasing the number of them used in the WLAN, the overall performance is significantly improved. (Figure 7-9, Multiple input, multiple output (MIMO) receiver, shows a dual-transmit-antenna MIMO transmitter and a quad-receive-antenna MIMO receiver, each with RF and DSP signal processing, linked by direct and multi-path signals reflected from objects between them.) The MIMO technology introduces a third, spatial dimension beyond the frequency and time domains that would otherwise define the radio channel. The major difference between MIMO and traditional wireless systems is the utilization of the physical multi-path phenomenon. Unlike traditional modems, which are typically impaired by multi-path, MIMO takes advantage of multi-path. The typical radio signal from a point source (single antenna) bounces off different objects during transmission, particularly indoors, as it interacts with objects in the environment. The result of these interactions is multi-path fading, as the signal interferes, often destructively, with itself.
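The destructive self-interference just described can be illustrated numerically with a toy two-ray model (a sketch for illustration only, not part of any MIMO algorithm): a direct ray and a single equal-strength reflection are summed as phasors, and the received amplitude collapses whenever the reflected path is longer by close to half a wavelength.

```python
import math

def two_path_amplitude(freq_hz, d_direct_m, d_reflect_m, c=3.0e8):
    """Relative received amplitude when a direct ray and one
    equal-strength reflected ray superpose at the receiver."""
    # Phase lag of the reflected ray caused by its extra path length.
    phase = 2 * math.pi * freq_hz * (d_reflect_m - d_direct_m) / c
    return abs(complex(1.0, 0.0) + complex(math.cos(phase), math.sin(phase)))

f = 2.4e9  # 2.4 GHz band: wavelength is about 12.5 cm

# Equal path lengths: the rays add in phase and the amplitude doubles.
print(two_path_amplitude(f, 30.0, 30.0))

# An extra half wavelength (~6.25 cm) on the reflected path:
# the rays cancel almost completely, a deep multi-path fade.
print(two_path_amplitude(f, 30.0, 30.0625))
```

Moving either antenna a few centimeters shifts this phase and the fade disappears, which is exactly the spatial variation that multiple MIMO antennas exploit.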
The MIMO takes advantage of multiple paths, using signal processing implemented on digital signal processor (DSP) chips and clever algorithms at the transmitter and receiver. Somewhat counterintuitively, MIMO actually depends upon multi-path to function correctly and produce improvements, making it even better suited to in-building applications. The MIMO can offer a dramatic improvement in signal throughput over competing WLAN technologies. The new 802.11n standard, which includes MIMO processing in its specification, should produce performance of 144–200 Mbps. 7.2.2.4 Environmental Factors: Indoor–Outdoor Indoor and outdoor environmental effects must always be considered when implementing a wireless analog or digital video system. Atmospheric conditions, objects in the signal’s path, and incorrect antenna pointing angle can all cause fading and dropouts in the digital video signal. All of these factors affect the quality of service (QoS) in the resulting video image or other communication data. Most analog and digital video transmission takes place using the FCC-allocated 902 MHz, 2.4 GHz, and 5.8 GHz bands, each of which exhibits signal degradation under different conditions. The 902 MHz and 2.4 GHz bands provide the best transmission through most non-metal, dry solid objects, but the 5.8 GHz band exhibits severe attenuation when objects are placed in the path between the transmitter and receiver. The 5.8 GHz band should only be used for short-range indoor applications and clear LOS outdoor applications, or where specific metal reflectors can be placed to re-direct the microwave beam to the receiver. The 802.11b technology operates at 2.4 GHz with a data rate of 11 Mbps and can handle up to three video data streams at a time. The 802.11g technology operating at 2.4 GHz with a data rate of 54 Mbps, and the 802.11a technology operating at 5.8 GHz with a data rate of 54 Mbps, can manage multiple standard video streams.
They all require innovative techniques to provide high QoS and quality video images. One system using a diversity antenna array (not MIMO) provides wireless connections at data rates of up to 54 Mbps over a proprietary time division multiple access (TDMA) link that uses the 802.11a, 5.8 GHz frequency band. The system permits multiple streams of DVD, cable, and satellite digital video, audio, and data to be delivered over the wireless links without degrading quality. The key to the improved QoS is in the front end of the receiver. The RF transceiver employs a spatial wave-front receiver that uses five antennas and two full receiver channels to eliminate multi-path (ghost) signals. It does this by using the five-antenna array to capture the RF signals and then selecting the best two of the five signals. This approach takes advantage of the multi-path signals, as opposed to other techniques that try to eliminate them. After the two signals are selected they are fed into separate, independent receive channels that amplify, filter, frequency-convert, and eventually feed them to the base-band processor. The base-band chip converts the two analog signals into digital streams and then, using DSP techniques, combines them into one high-quality data stream. When a system is set up, it scans the available channels for one that is not in use by any nearby 802.11 WiFi network. The chip then continuously monitors all channels for possible interference and, if a potential interference is detected, looks for another unused channel. The signals that can be processed can come from any source, since the chip can process video in any standard format from MPEG-1 to MPEG-4 and H.264. It should be pointed out that most systems in use do not use diversity antenna arrays and are therefore limited to transmitting fewer channels of video. 7.2.2.5 Broadband Microwave Microwave transmission uses ultra-high frequencies to transmit video signals over long distances.
There are several frequency ranges assigned to microwave systems, all in the gigahertz range. Table 7-5 lists the broadband microwave frequency bands available for transmission.

Table 7-5 Broadband Microwave Frequencies for Video Transmission

Unlicensed low-power bands (50–500 mW output power, short/medium range, 300–2000 ft; no FCC license required):
  900 MHz band: 0.902–0.928 GHz, 4 channels (2 simultaneously)
  1.2 GHz band: 1.2–1.7 GHz, 4 channels (2 simultaneously)
  2.4 GHz band: 2.4–2.5 GHz, 4 channels simultaneously
  5.8 GHz band: 5.6–5.8 GHz, 11 channels simultaneously

Licensed bands (FCC license required or government use only; number of channels system dependent; 0.25–5 W output power, long range, 1–20 miles):
  L band: 1.7–1.9 GHz
  S band: 2.2–2.5 GHz
  C bands (C1, C2, C3): 3.1–3.5 and 4.4–5.0 GHz
  X band: 6.2–6.4 and 8.2–8.6 GHz
  K band: 21.2–23.6 GHz

(An inset diagram shows a long-range point-to-point link: a camera with a narrow-beam dish/horn transmitter at site 1 and a dish/horn with a low-noise receiver and monitor at site 2.)

The wavelength of these frequencies is very short, which gives rise to the term “microwave.” These high-frequency signals are especially susceptible to attenuation and must therefore be amplified frequently if long distances (20–50 miles) separate the transmitter and receiver. Repeaters at intermediate locations between the transmitter and receiver are used when distances exceed 20–30 miles. In order to maximize the strength of the high-frequency signal, focused antennas are used at both ends. Since microwave frequencies have characteristics similar to light waves, these antennas can take the form of concave metal dishes that collect the maximum amount of incoming signal and reflect it to the receiver detector. The requirement for these tightly focused antennas limits the microwave application: it is clearly a point-to-point rather than a broadcast transmission system. These microwave signals will not pass through buildings, uneven terrain, or any other solid objects.
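The rapid weakening of microwave signals with distance, and the resulting need for high-gain dishes and repeaters, can be estimated with the standard free-space path loss relation (a textbook radio formula, not taken from this chapter; the distances and frequency below are illustrative):

```python
import math

def free_space_path_loss_db(distance_km, freq_mhz):
    """Free-space path loss in dB: 32.45 + 20*log10(d_km) + 20*log10(f_MHz)."""
    return 32.45 + 20 * math.log10(distance_km) + 20 * math.log10(freq_mhz)

# A 10-mile (about 16.1 km) point-to-point link in the 2.4 GHz band:
loss = free_space_path_loss_db(16.1, 2400.0)
print(round(loss, 1))  # roughly 124 dB of loss with nothing in the path

# Doubling the distance always costs about 6 dB more:
print(round(free_space_path_loss_db(32.2, 2400.0) - loss, 1))
```

Even over a clear line of sight the loss grows steadily with distance, which is why links beyond 20–30 miles need repeaters and tightly focused antennas at both ends.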
Broadband microwave technology is used as a video transmission medium to interconnect LANs between buildings and over long distances. The microwave dishes must be line-of-sight from transmitter to receiver to collect the microwave signals reliably. Using microwave technology requires FCC licensing; however, once the license is granted for any particular location, that frequency band cannot be licensed to anyone else for any purpose within a 17.5-mile radius. 7.2.2.6 Infrared (IR) Infrared (IR) links use IR signals to transmit video, data, and control signals. These IR transmission paths must be set up in a line-of-sight configuration, or the IR signal can be reflected off an infrared-reflecting surface (mirror). The major advantage of infrared transmission is its ability to carry a high-bandwidth signal and its immunity to tapping. Its major disadvantage is that the IR beam can be obstructed and cannot pass through most solid objects. The IR emitter is in the form of an LED or ILD. 7.3 VIDEO IMAGE QUALITY In both legacy analog and digital video surveillance systems, the criteria for image quality include resolution, frame rate, and color rendition. In digital video monitoring and surveillance applications each camera generates a stream of sequential digital images, typically at a rate of 2–20 per second, or 30 per second for real-time. In the video application the data network must be capable of sustaining the throughput required to deliver the packets comprising the video streams being generated by all the cameras. This is one measure of QoS, but QoS also encompasses latency (the delay between transmitting and receiving packets) and jitter (the variation in that delay from packet to packet). The QoS criterion is generally applied to the forward video signal direction, since the vast majority of traffic results from the video streams flowing from the camera to the monitor and recorder.
The QoS does apply in some cases where the cameras offer centralized in-band control, whether to simply adjust settings from time to time or to PTZ the cameras in real-time. The Internet and other IP-based networks are increasingly being used to support real-time video, voice, and audio applications, all of which are extremely demanding in terms of latency, jitter, and signal loss. The Internet and its original underlying protocols were never intended to support QoS, which is exactly what each of these traffic types requires. The real-time streaming protocol (RTSP) is an application layer (Layer 7, Section 7.8.3) protocol for control over the delivery of data that has real-time properties, including live video feeds, stored video clips, and audio. 7.3.1 Quality of Service (QoS) The QoS describes the video image quality and intelligence in the digital video image as determined by the video frame rate and resolution (number of pixels). The QoS is defined as the control of four network categories: (1) bandwidth, (2) latency, (3) jitter, and (4) traffic loss. Bandwidth is defined as the total network capacity. Latency is the total time it takes for a frame to travel from a sender to a receiver. Latency can be crucial for receivers having QoS requirements. Packets arriving too early require buffering, or worse, they may be dropped. Packets arriving too late are not useful and must be discarded. Jitter is the variation in the latency among a group of packets between two nodes. Jitter requires a receiver to perform complex buffering operations so that packets are presented to higher levels with a uniform latency. Traffic loss refers to the packets that never arrive at the receiver.
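The four categories above can be measured directly from per-packet timestamps. A minimal sketch (the timestamps are hypothetical, and jitter is taken here as the mean absolute deviation of latency; deployed tools use similar but more elaborate estimators):

```python
def qos_stats(sent_ms, received_ms):
    """Average latency, jitter (mean deviation of latency), and
    percent traffic loss; a lost packet is marked None on receive."""
    latencies = [r - s for s, r in zip(sent_ms, received_ms) if r is not None]
    avg = sum(latencies) / len(latencies)
    jitter = sum(abs(l - avg) for l in latencies) / len(latencies)
    loss_pct = 100.0 * received_ms.count(None) / len(sent_ms)
    return avg, jitter, loss_pct

# Five packets of a video stream; the fourth never arrives.
sent = [0, 33, 66, 99, 132]      # send times in ms, one packet per frame
rcvd = [40, 71, 110, None, 180]  # receive times in ms
print(qos_stats(sent, rcvd))     # (42.5, 3.5, 20.0)
```

A receiver must buffer enough packets to absorb the 3.5 ms of jitter in this example so that frames can be presented with a uniform latency.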
The video signal requires compression to fit into the bandwidth available in the communication channel, and for the practical compression techniques used, this compression always results in some signal degradation (except for lossless compression). Data transmission is generally considered to move in one direction in a video monitoring or surveillance application: the vast majority of traffic results from the video streams flowing from the camera to the monitor or video recorder. Some traffic flows in the other direction, including controls for the camera functions. 7.3.2 Resolution vs. Frame Rate Resolution is a measure of how clear and crisp an image appears on the monitor. Each of the individual video components included within a system contributes to the overall image quality, whether recorded or displayed on the monitor. The resultant image quality is only as good as the equipment component having the lowest resolution. When a high-resolution monitor is combined with a low-resolution camera, the result is a low-resolution image display. This fact becomes increasingly important when using the system for recording, as the playback image quality from the recorder is generally lower than that obtained when the image is displayed directly on the monitor. The image quality of the video signal is dependent on: (1) the video frame rate required to reproduce motion in the scene, (2) the resolution required to convey the intelligence required in the scene, and (3) the bandwidth available for the transmission. For practical transmission over existing communication channels the video signal must first be digitally compressed to fit into the available bandwidth. To achieve the necessary intelligence in the image, the resolution required for the application must be specified and the network must have sufficient bandwidth. When more than one video image (or additional information) is to be displayed on a video monitor, a format called Common Intermediate Format (CIF) is used.
Most digital video systems with standard 4 × 3 formats display three different resolutions: (1) full screen, 704 × 480 pixels (4 × 3), giving the highest resolution; (2) 1/4 screen, 352 × 240 pixels, having a proportionally lower but often adequate resolution; and (3) full screen with 704 × 240 pixels. The 352 × 240 format requires 1/4 the bandwidth and has a 4× faster image transfer rate. The 704 × 240 format requires 1/2 the bandwidth of the 704 × 480 system. The 1/4 CIF format has a resolution of 352 × 240 pixels in the NTSC system and 352 × 288 pixels in the PAL system. The three formats described above, referred to as CIF, are summarized in Table 7-6. Their relative sizes are shown in the inset drawing. It is often desirable to display the digital video image on only part of the display screen when the screen is being shared with other system functions (alarms, access control, etc.). In this case the 1/4 CIF is most appropriate. Since the 1/4 CIF requires only 1/4 the bandwidth, it can display the image at 4× the CIF rate. 7.3.3 Picture Integrity, Dropout It is very important during digital video signal transmission that the video image have integrity throughout the transmission. The various compression and transmission technologies used for transmitting the video signal have different vulnerabilities to noise and external interference and cause the video image to be degraded in different ways. A temporary loss of the digital signal causes image pixelation or picture breakup, which results in the loss of “blocks” of pixels, causing parts of the image to be absent and displaying an incomplete picture. In the worst case, when the video signal strength (S/N) is sufficiently low and synchronization is lost, video frame “lock-up” occurs and the last full frame transmitted may be displayed as a full frame, a partial frame, or not at all. For general video surveillance applications, degradation or temporary loss of a few frames of video signal can be tolerated.
However, in most security applications, and especially in strategic surveillance applications, this is unacceptable. 7.4 VIDEO SIGNAL COMPRESSION Video signal compression is the process of converting analog video images into smaller digital files for efficient transfer across a network. Compression provides reduced bandwidth, quicker file transfers, and reduced storage requirements. Compression and decompression are accomplished through the use of special software or hardware, or in some cases both. From the earliest days, video (consumer television) has been a bandwidth hog. Standard broadcast channels require from 4 to 6 MHz of bandwidth to produce a complete picture and sound at full frame rates of 30 fps. In digitized form the signal requires data rates on the order of 2 Mbps.

Table 7-6 Common Intermediate Format (CIF) Parameters

  FORMAT   PIXEL FORMAT (NTSC / PAL)   PIXEL COUNT (NTSC / PAL)   SCREEN AREA DISPLAYED
  QCIF     176 × 120 / 176 × 144       21,120 / 25,344            1/16 screen
  CIF      352 × 240 / 352 × 288       84,480 / 101,376           1/4 screen
  2CIF     704 × 240 / 704 × 288       168,960 / 202,752          full screen, 1/2 vertical resolution
  4CIF*    704 × 480 / 704 × 576       337,920 / 405,504          full screen, twice the vertical and horizontal resolution of CIF

  Aspect ratio: 1.222. * 4CIF resolution is slightly higher than that of VGA (640 × 480).

In the 1980s this bandwidth limitation for transmitting video signals was addressed by the US government Defense Advanced Research Projects Agency (DARPA) to compress NTSC and HDTV type video streams to fit within available bands of the radio frequency spectrum. One result of the initial work done by DARPA and MPEG was the evolution of a family of video compression standards that apply directly to real-time video applications. The MPEG group was founded under the International Organization for Standardization (ISO) and created the first compression standard, MPEG-1, in 1992.
This standard was directed toward single-speed applications like CD-ROM and is still used in today’s camcorders and video CD movie rentals. Two years later MPEG-2 followed, which added frame-interlace support and was directed toward applications such as digital TV (DTV) and the digital video disk (DVD). A video stream consists of a series of still images, or frames, displayed in rapid succession. Each digital image is in the form of a rectangle consisting of an array of picture elements known as pixels. Each pixel represents the light intensity that the camera sees, in either black and white (monochrome) or color, at that pixel location. The NTSC display contains 720 × 480 pixels, displayed with a 4 × 3 aspect ratio. High-definition television (HDTV) has a higher pixel count of 1920 × 1080. Table 7-7 summarizes the Advanced Television Systems Committee (ATSC) digital television standards. In monochrome cameras the intensity is represented by a single value per pixel. In color cameras the sensor elements are grouped in threes: one for red, one for green, and one for blue (RGB). The combination of these three colors in different proportions produces every other color. To convert to digital form, the output from each pixel in the camera sensor is converted to digital values by use of an A/D converter. For the monochrome camera each pixel is converted into an 8-bit value representing the intensity of the image on the pixel. For the color camera an additional 16 bits are used so that all three colors (red, green, and blue) are digitized, resulting in 24 bits (eight bits for each color). Why is digital video signal compression required? Without video compression an enormous amount of bandwidth is required to transfer video across a network. A 24-bit color video stream at 640 × 480 resolution transferring 30 frames in one second creates almost 30 MB (megabytes) of data.
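The data volume quoted above follows directly from the pixel count, bit depth, and frame rate. A quick check of the arithmetic, using the figures in the text:

```python
def uncompressed_rate(width, height, bits_per_pixel, fps):
    """Bits per frame and bytes per second for raw digitized video."""
    bits_per_frame = width * height * bits_per_pixel
    bytes_per_second = bits_per_frame * fps // 8
    return bits_per_frame, bytes_per_second

bits, per_sec = uncompressed_rate(640, 480, 24, 30)
print(bits)            # 7,372,800 bits in each 24-bit color frame
print(per_sec / 1e6)   # 27.648, the "almost 30 MB" of data per second
```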
Compression schemes for sending data over a restricted bandwidth have existed for years, the “zip” file of lossless compressed data being a popular example. This lossless compression, however, is not sufficient or suitable for video transmission because it does not take advantage of the unique features of video. In particular, individual frames of video often contain repetitious material, and often only small portions of the image or frame change from frame to frame. The zip compression program does not take advantage of this feature. There are two generic types of digital video compression: lossless and lossy. Lossless, as the name implies, means that all the information needed to reproduce every pixel present in the camera output is transmitted to the monitoring site and reconstructed without any loss in picture quality.

Table 7-7 ATSC Digital Television Standard Scanning Formats

  INDEX   FORMAT (H × V, fps, i or p*)   ASPECT RATIO   SCAN          FORMAT TYPE
  1       640 × 480, 30i                 4 × 3          interlaced    SDTV
  2       640 × 480, 24p                 4 × 3          progressive   SDTV
  3       640 × 480, 30p                 4 × 3          progressive   SDTV
  4       640 × 480, 60p                 4 × 3          progressive   EDTV
  5       704 × 480, 30i                 4 × 3          interlaced    SDTV
  6       704 × 480, 24p                 4 × 3          progressive   SDTV
  7       704 × 480, 30p                 4 × 3          progressive   SDTV
  8       704 × 480, 60p                 4 × 3          progressive   EDTV
  9       704 × 480, 30i                 16 × 9         interlaced    SDTV
  10      704 × 480, 24p                 16 × 9         progressive   SDTV
  11      704 × 480, 30p                 16 × 9         progressive   SDTV
  12      704 × 480, 60p                 16 × 9         progressive   EDTV
  13      1280 × 720, 24p                16 × 9         progressive   HDTV
  14      1280 × 720, 30p                16 × 9         progressive   HDTV
  15      1280 × 720, 60p                16 × 9         progressive   HDTV
  16      1920 × 1080, 30i               16 × 9         interlaced    HDTV
  17      1920 × 1080, 24p               16 × 9         progressive   HDTV
  18      1920 × 1080, 30p               16 × 9         progressive   HDTV

  * i, interlaced scan; p, progressive scan; fps, frames per second. DTV, digital television; ATSC, Advanced Television Systems Committee; SDTV, standard definition television; EDTV, enhanced digital television; HDTV, high definition television.
This means that the compression algorithms must be able to accurately reconstruct the uncompressed video signal. Lossy compression means that the reconstructed (decompressed) signal cannot exactly re-create the original video signal. The following is a calculation of the number of uncompressed RGB signal bits that must be transmitted for a single frame of NTSC video if no compression were to take place:

  To transmit 1 frame = 720 pixels × 480 pixels × 24 bits/pixel = 8,294,400 bits
  To transmit 1 second of video = 8,294,400 bits × 30 fps = 248,832,000 bits

From the above it can be seen that it takes over 248 Mb to transmit 1 second of uncompressed full-color video. Clearly few transmission channels can afford to provide this much bandwidth for transmitting any video signals. For this reason some scheme of compression of video signals is required to make a practical remote video security system. Video compression takes advantage of the enormous spatial and temporal redundancies in natural moving imagery. Spatial redundancy means that neighboring pixels within a video frame are more likely to be close to the same value (in both brightness and color) than pixels far apart. Temporal redundancy means that neighboring frames in time tend to have a great deal of similar content, such as background information, that is either stationary or moving in predictable ways. Any compression system will perform better if the video signal is preconditioned properly. In practice this means removal of the noise that would otherwise consume precious bits. Figure 7-11 illustrates some examples of spatial and temporal redundancies in a typical video image. 7.4.1 Lossless Compression Lossless compression is the process of compressing 100% of the video data with zero loss. This type of compression does not compress as much as lossy compression, since every piece of data is retained. The benefit of this compression is that video data can be compressed and decompressed over and over without any video data degradation.
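This repeatable, bit-exact reconstruction can be demonstrated with a general-purpose lossless codec; here zlib from the Python standard library stands in for the video-specific algorithms (a sketch, with synthetic frame data chosen only for illustration):

```python
import zlib

# A synthetic frame with plenty of spatial redundancy: flat runs
# of "background" interrupted by a narrow "object".
frame = (b"\x80" * 300 + b"\x20" * 100 + b"\x80" * 300) * 8

packed = zlib.compress(frame)
restored = zlib.decompress(packed)

print(len(frame), len(packed))   # the redundant data shrinks substantially
print(restored == frame)         # True: reconstruction is bit-exact

# Repeated compression/decompression cycles never degrade the data:
again = zlib.decompress(zlib.compress(restored))
print(again == frame)            # True
```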
Lossless compression algorithms compress the video data into the smallest package possible without losing any information in the scene. The zip file for standard data (not video) is an example of a lossless compression algorithm, since the data that is compressed can be decompressed and an exact duplicate of the original re-created at the receiver end. (Figure 7-11, Spatial and temporal redundancies in video images, shows three examples: (A) spatial only, movement of water in a pool and of swimmers, with the swimmers entering the pool as the only area of interest; (B) temporal only, a person traversing a fence line; and (C) spectral, a person dressed in red detected. M-JPEG operates on spatial redundancy, not temporal; MPEG operates on spatial, spectral (color), and temporal redundancy.) Lossless compression generates an exact duplicate of the input data scene after many compression/decompression cycles: no information is lost. This method, however, can only achieve a modest amount of compression. Typical compression ratios for lossless transmission are from 2:1 to 5:1. 7.4.2 Lossy Compression In the case of the video signal it is often not necessary that each bit of data be re-created exactly as in the original camera image. Depending on the video quality required at the monitoring location, much of the video information can often be discarded without noticeably changing the video image that the user sees. The exclusion of this extraneous video information results in the ability to achieve high compression rates. Lossy compression achieves lower bit counts than lossless compression by discarding some of the original video data before compression. Video data degradation does occur with lossy compression when material is compressed and decompressed over and over: every time video data is compressed and decompressed, less of the original video image is retained. Two common methods for compression are the discrete cosine transform (DCT) and the discrete wavelet transform (DWT).
7.4.2.1 Discrete Cosine Transform (DCT) The DCT is a lossy compression algorithm that samples the image at regular intervals. This transform divides the video image into 8 × 8 blocks and analyzes each block individually. It analyzes the components of the image and discards those that do not affect the image as perceived by the human eye. JPEG, MPEG, M-JPEG, H.261, H.263, and H.264 incorporate DCT compression. Lossy compression can eliminate some of the data in the image at some sacrifice in the quality of the image produced. The reduction in bits transmitted, however, provides greater compression ratios than lossless compression and therefore requires less bandwidth. The choice of lossless or lossy compression results in a trade-off of file size vs. image quality. Lossy compression discards redundant information and achieves much higher compression at the sacrifice of not being able to exactly reproduce the original video scene. Typical compression ratios for lossy transmission are from 20:1 to 200:1. 7.4.2.2 Discrete Wavelet Transform (DWT) Wavelet video compression, rather than operating on pieces of the image, operates on the entire image. The transformation uses a series of filters that determine the content of every pixel in the image. Because the technology works on the entire image there is no mosaic effect when the image is viewed, as is sometimes experienced with DCT. While wavelet technology is a lossy compression technique, the lossy effects are not apparent until very high compression ratios of 350:1 are reached. Wavelet compression uses multiple single recorded frames to create a video sequence. It differs from the others in that it compresses files more tightly, with average file sizes for a wavelet image of about 12 Kb, or 360 Kbps at 30 fps. Wavelet compression is based on full-frame information and on frequency, not on 8 × 8 pixel blocks as in DCT.
Wavelet compression compresses the entire image with multiple filtering at both the high and low frequencies and repeats the procedure several times. This compression method offers compression ratios up to 350:1. 7.4.3 Video Compression Algorithms Many compression algorithms have evolved over the years to address specific digital data transmission requirements. The International Telecommunications Union (ITU) and the International Organization for Standardization (ISO) have developed video compression technology and standards that meet and exceed the requirements for most of today’s video security applications as well as anticipated future requirements. The compression standards that are specifically directed toward transmitting single-frame and streaming video signals include: (1) MPEG-2, (2) MPEG-4, (3) JPEG, (4) M-JPEG, (5) JPEG-2000, (6) wavelet, (7) H.263, (8) H.264, and (9) super motion image compression technology (SMICT). The required video frame rates for a security application are primarily determined by the motion in the scene (activity) and the number of pixels required for the specified resolution. When there is little motion in the scene, or if the motion is slow, fewer than 30 fps are very often sufficient to obtain the necessary intelligence in the scene. This reduces the required bandwidth for the transmission of the digital video signal. Frame rates as low as 5 fps can be useful. 7.4.3.1 Joint Picture Experts Group: JPEG The JPEG is the oldest and most established compression technique and is generally applicable to still images or single frames of video. This compression technique divides the image into 8 × 8 blocks of pixels, representing each block by signed coefficients and codes (Figure 7-12). The DCT compression software examines the blocks and their size and determines which blocks are redundant and not essential in creating the image.
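The transform step that such software applies to each 8 × 8 tile can be sketched in a few lines. The following is a naive, unoptimized DCT-II (real coders use fast factorizations): a flat tile puts all of its energy into the single DC coefficient, so every other coefficient can be discarded without visible loss.

```python
import math

def dct2(block):
    """Naive two-dimensional 8 x 8 DCT-II of one image tile."""
    n = 8
    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            s = 0.0
            for x in range(n):
                for y in range(n):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            cu = math.sqrt(1.0 / n) if u == 0 else math.sqrt(2.0 / n)
            cv = math.sqrt(1.0 / n) if v == 0 else math.sqrt(2.0 / n)
            out[u][v] = cu * cv * s
    return out

flat = [[128] * 8 for _ in range(8)]   # a uniform gray tile
coeffs = dct2(flat)
print(round(coeffs[0][0]))   # 1024: the DC term holds the tile brightness
print(round(coeffs[3][5]))   # 0: every higher-frequency term vanishes
```

Tiles with detail spread energy into the higher-frequency coefficients, and the encoder keeps only as many of them as the quality setting allows.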
The program transmits the blocks that are essential, a reduced number based on the level of compression determined by the system settings. The compression ratio is limited to approximately 10:1. New compression algorithms are evolving that have built upon JPEG, provide higher compression ratios, and deliver higher signal quality with smaller bandwidth requirements. The JPEG uses still images to create a video stream and has an average image file size of about 25 Kb per frame, or 750 Kbps at 30 fps. 7.4.3.2 Moving Joint Picture Experts Group: M-JPEG The M-JPEG compression technology creates a video sequence (stream) from a series of still-frame JPEG images. The average file size of an M-JPEG image is about 16 Kb per frame, or 480 Kbps at 30 fps. The M-JPEG is a lossy compression method designed to exploit known limitations of the human eye, notably the fact that small color changes are perceived less readily than small changes in brightness. With a compression ratio of 20:1, compression can be achieved with only a small amount of image degradation. 7.4.3.3 Moving Picture Experts Group: MPEG-2, MPEG-4, MPEG-4 Visual 7.4.3.3.1 MPEG-2 Standard The MPEG-2 is the successor to MPEG-1 and has the primary goal of transmitting broadcast video at bit rates between 4 and 9 Mbps. It produces high-quality live camera images using a relatively small amount of bandwidth per camera. It is capable of handling high-definition television (HDTV), has been adopted as the digital television standard by the FCC, and is the compression standard for DVDs. The MPEG-2 NTSC standard has a resolution of 720 × 480 pixels and incorporates both progressive and interlaced scanning, although progressive scanning is rarely used in video security applications. Interlaced scanning is the method used in the video security industry to produce images on surveillance monitors. The MPEG-2 and MPEG-4 are based on the group of images (GOI) concept, as defined by an I-frame, P-frame, and B-frame (Figure 7-13).
The technology’s basic principle is to compare successive compressed images, organized in groups, for transmission over the network. The first frame of a group is called the I-frame (intra-frame) and uses the first compressed image as a reference frame. This image serves as the reference point for all frames following it in the same group. Following the I-frame come the P-frames (predictive), which are coded with reference to the previous frame; that previous frame can be either an I-frame or another P-frame. The P-frames include the changes, i.e. movement and activity, relative to the leading I-frame. B-frames (bi-directional) are compressed at a low bit rate using both the previous and future references (I and P). B-frames are not used as references. Typical GOI lengths are usually 12 or 16 frames. The network viewing stations reconstruct all images based on the reference I images and the difference data in the B- and P-frames. The detailed relationship between the three frame types is described in the MPEG standard. The MPEG-2 and MPEG-4 can achieve compression ratios up to approximately 60–100 to 1. (Figure 7-12, JPEG compression technique, illustrates the earlier process: the video image is cut into 8 × 8 tiles for each color (R, G, B); each tile is processed by computer using the DCT algorithm and zigzag-scanned at 64 frequencies, the DC term giving the overall tile brightness and the high-frequency terms the details in the tile; the DCT achieves compression by discarding intra-frame spatial and spectral (color) redundancies.) 7.4.3.3.2 MPEG-4 Standard The MPEG-4 standard was introduced in 1998 and has evolved into the first true multimedia and Web compression standard because of its low bit-rate transmission and incorporation of audio and video with point-and-click interaction capabilities.
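The bit-rate advantage of the GOI structure can be sketched with illustrative frame sizes (the sizes below are hypothetical, chosen only to show the proportions; real sizes depend on scene content and encoder settings):

```python
def goi_bits(pattern, i_bits=100_000, p_bits=33_000, b_bits=12_000):
    """Total bits for one group of images given its frame pattern."""
    size = {"I": i_bits, "P": p_bits, "B": b_bits}
    return sum(size[f] for f in pattern)

goi = "IBBPBBPBBPBB"              # a typical 12-frame group
mixed = goi_bits(goi)
all_i = goi_bits("I" * len(goi))  # the same group sent as all I-frames
print(mixed, all_i)               # 295000 vs 1200000
print(round(all_i / mixed, 1))    # about 4x fewer bits with P and B frames
```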
MPEG-4 uses the GOI concept and I-, P-, and B-frames, but in addition uses object-based compression, in which individual objects within a scene are tracked and compressed separately. This method offers a very efficient compression ratio that is scalable from 20:1 to 300:1. The primary uses for the MPEG-4 standard are Web streaming media, CD distribution, videophone, and broadcast television. MPEG-4 consists of several standards, termed layers:

• Layer 1 describes synchronization and multiplexing of video and audio.
• Layer 2 is a compression codec for video signals.
• Layer 3 is a compression codec for perceptual coding of audio signals.
• Layer 4 describes procedures for testing compliance.
• Layer 5 describes systems for software simulation.
• Layer 6 describes the delivery multimedia integration framework.
• Layer 10 is an advanced codec for video signals, also called H.264.

7.4.3.3.3 MPEG-4 Visual Standard

MPEG-4 Visual became an international standard in 1999, its main feature being support for object-based compression. Objects in the scene, after appropriate identification (segmentation), can be coded as separate bit streams and manipulated independently. This is an important attribute for video security applications. If the target can be automatically recognized, tracked, and segmented from the scene, it can be coded separately from, and where appropriate with higher quality (resolution) than, the other areas of the scene.

FIGURE 7-13 MPEG-2 and MPEG-4 compressed image frames (reference I, difference B, and predictive P) and motion prediction. A typical group runs I B B P B B P B B P. The I-frame is encoded as a single image with no reference to past or future frames; a P-frame is encoded relative to the past reference frame (which can be a P- or an I-frame); a B-frame is encoded relative to the past reference frame, the future reference frame, or both, the future reference frame being the closest following reference frame (I or P).

MPEG-4 Visual has enhanced functionality compared to MPEG-2. Spatial prediction within I-frames and enhanced error resiliency are two such features. Improved prediction and coding improve compression by 15–20% compared to MPEG-2. An advanced feature of MPEG-4 Visual is global motion compensation (GMC). This is especially useful for PTZ applications and for mobile applications involving moving ground vehicles, aircraft, and ships, in which camera movement induces most of the image motion. The GMC mode reduces the motion information to a few parameters per frame, as opposed to a separate motion vector for each block of the image. GMC can lead to significant bit-rate savings in these PTZ and motion applications. MPEG-4 Visual compressors and decompressors (codecs), available as both chips and software, are most often used for Internet and cell phone applications.

7.4.3.4 MPEG-4 Advanced Video Coding (AVC)/H.264

An improvement over MPEG-4 Visual, MPEG-4 Advanced Video Coding (AVC), also referred to as H.264, offers greater flexibility and greater precision in motion vectors (activity in the scene). The intent was also to create a standard capable of providing good video quality at bit rates half or less of those required by earlier standards such as MPEG-2, H.263, or MPEG-4 Visual. MPEG-4 AVC/H.264 is the most recent video compression standard, introduced in 2003. The AVC was jointly developed by MPEG and the ITU, a developer of video conferencing standards, which calls it H.264. MPEG-4 AVC achieves better performance than MPEG-2 by about a factor of two, producing similar quality at half the bit rate.
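The factor-of-two gain translates directly into storage and bandwidth savings. A sketch of the arithmetic, using the HD rates discussed in the text (19 Mbps for MPEG-2 HDTV, roughly 8 Mbps for comparable AVC quality); the 1 GB = 8000 megabit convention is a round-number simplification:

```python
# Storage required to record one continuous stream at a constant bit rate.
# Rates are the text's illustrative figures; 1 GB taken as 8000 megabits.

def storage_gb(bit_rate_mbps, hours):
    """Gigabytes consumed by a constant-rate stream over the given duration."""
    return bit_rate_mbps * 3600 * hours / 8000.0

mpeg2_hd_day = storage_gb(19, 24)   # HDTV at 19 Mbps (MPEG-2): ~205 GB per day
avc_hd_day = storage_gb(8, 24)      # comparable quality at 8 Mbps (AVC): ~86 GB per day
```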
The improved performance is mainly due to increased prediction efficiency, both within and between frames. MPEG-4, MPEG-4 Visual (with or without GMC), and MPEG-4 AVC are superior to MPEG-2 in terms of raw efficiency (quality per bit) and are also more network friendly than MPEG-2. The H.264 compression system dramatically lowers the bandwidth (by a factor of two) required to deliver digital TV (DTV) channels and enables new security business models at a significantly lower cost. Current standard-definition (SD) and high-definition (HD) digital video are based almost entirely on MPEG-2, a 10-year-old standard that has nearly reached the limit of its video compression efficiency. MPEG-4 AVC compression was developed specifically for television broadcasting, whether via terrestrial, cable, satellite, or Internet delivery. It uses the same protocol and modulation techniques as MPEG-2, so MPEG-4 AVC is immediately deployable. Using the same protocol and modulation techniques, MPEG-4 AVC compression reduces the bandwidth by a factor of two, requiring 50% less bandwidth or storage capacity than MPEG-2 to deliver the same video quality. This means that instead of having to transmit HDTV at 19 Mbps and SD at 4 Mbps, equivalent HD picture quality is obtained at about 8 Mbps, SD at 2 Mbps, and DVD-quality video at less than 1 Mbps. The technology offers greater efficiency and better reception with cell phones, PDAs, and specialized pagers. MPEG-4 AVC permits both progressive and interlaced scanning. For low-motion images MPEG-4 AVC reaches compression ratios of 800:1 to 1000:1; with images containing a high level of motion, it reaches compression ratios of 80:1 to 100:1.

7.4.3.5 JPEG 2000, Wavelet

A newer standard for JPEG compression is JPEG 2000, based on wavelet compression algorithms. It has the potential to provide higher resolution at compression ratios of 200:1.
JPEG 2000 was created as the successor to the original JPEG format developed in the late 1980s and is based on state-of-the-art wavelet techniques that provide better compression and advanced system-level functionality. Wavelet video compression operates on the entire image at once rather than on pieces of the image (Figure 7-14). Wavelet compression, in contrast to the JPEG and MPEG algorithms, is based on full-frame information and on signal frequency components: it does not divide the image into 8 × 8 pixel blocks but analyzes the entire image as a single block. JPEG 2000 improves still-image download times by compressing images to roughly half the size of JPEG. In addition, JPEG 2000 permits viewing "something" (a low-resolution picture) while waiting for the full high-resolution picture to develop on the screen: JPEG 2000's progressive display initially presents a low-quality image and then updates the display with increasingly higher quality images. Wavelet compression is similar to JPEG in that it uses multiple single recorded frames to create a video sequence. The average file size for a wavelet image is about 12 Kb per frame, or 360 Kbps at 30 fps. Wavelet compression filters the entire image, both high and low frequencies, and repeats this procedure several times.

FIGURE 7-14 Wavelet compression technology: the video image is analyzed over a scanned frequency plane beginning with the DC term and progressing to the highest frequency. The algorithm consists of pairs of high-pass and low-pass filters, producing subbands that ascend in frequency in the order LL3, HL3, LH3, HH3, HL2, LH2, HH2, HL1, LH1, HH1 (LL = lowest frequency components, HH = highest frequency components); only the highest frequency components are analyzed in the HH1 band.
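The repeated high-pass/low-pass filtering can be illustrated with one level of a Haar wavelet decomposition, the simplest wavelet. This is a sketch only, not the JPEG 2000 filter bank (which uses longer biorthogonal filters), and the subband naming follows one common convention; repeating the split on the LL band yields the multi-level pyramid (LL3, HL3, ...) described above.

```python
# One level of a 2D Haar wavelet decomposition on an even-sized image,
# splitting it into an approximation band (LL) and three detail bands.

def haar_1d(values):
    """Pairwise averages (low-pass) and differences (high-pass)."""
    avg = [(values[i] + values[i + 1]) / 2 for i in range(0, len(values), 2)]
    det = [(values[i] - values[i + 1]) / 2 for i in range(0, len(values), 2)]
    return avg, det

def haar_2d(image):
    """Return LL, HL, LH, HH bands for a 2D list with even dimensions."""
    # Filter each row into low-pass and high-pass halves.
    lo_rows, hi_rows = [], []
    for row in image:
        avg, det = haar_1d(row)
        lo_rows.append(avg)
        hi_rows.append(det)

    def split_cols(block):
        # Apply the same 1D filter down each column, then transpose back.
        lo = [haar_1d(list(col))[0] for col in zip(*block)]
        hi = [haar_1d(list(col))[1] for col in zip(*block)]
        return [list(r) for r in zip(*lo)], [list(r) for r in zip(*hi)]

    ll, lh = split_cols(lo_rows)   # approximation and vertical detail
    hl, hh = split_cols(hi_rows)   # horizontal and diagonal detail
    return ll, hl, lh, hh
```

For a smooth image the detail bands are near zero and compress to almost nothing, which is where the wavelet coding gain comes from.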
There is no mosaic effect when the images are viewed because the technology works on the entire image at once.

7.4.3.6 Other Compression Methods: H.263, SMICT

7.4.3.6.1 H.263 Standard

The H.263 standard was developed for video conferencing using transmission networks capable of rates below 64 Kbps. It works much the same way MPEG-1 and MPEG-2 work, but with reduced functionality to allow very low transmission rates. H.263 is similar to JPEG except that it transmits only the pixels in each image that have changed from the last image, rather than full images. Often two consecutive images (frames) from a camera are essentially the same, and the H.263 standard takes advantage of this characteristic, using a frame-differencing technique that sends only the difference from one frame to the next.

7.4.3.6.2 SMICT Standard

The super motion image compression technology (SMICT) standard has almost the same characteristics as H.264. Based on redundancy in motion, it combines digital signal processing (DSP) hardware compression with CPU software compression. Utilizing an intelligent nonlinear super motion CODEC, SMICT analyzes the motion changes that occurred within the frame, eliminates the redundant portion of the image that need not be stored, and compresses the delta (or change) based on motion. Table 7-8 compares the significant parameters of some of the video compression techniques. MPEG-7 and MPEG-21 are new standards being considered.

Table 7-8 Comparison of Most Common Compression Standards

TYPE | COMPRESSION | TRANSFORM | BIT RATE | RESOLUTION | FRAME RATE (fps) | LATENCY (TIME LAG) | APPLICATIONS/COMMENTS
JPEG | frame-based | DCT* | 8 Mbps | — | 0–5 | — | storing still video frames; not suitable for motion video
M-JPEG | frame-based | DCT | 10 Kbps to 3 Mbps | — | 0–30 | low | IP networks, broadcast; JPEG frames played in rapid succession
MPEG-1 | stream-based | DCT | 1.5 Mbps | 352 × 288 (PAL), 352 × 240 (NTSC) | up to 30 | medium | video CD, some DVRs; CIF size, VHS tape quality
MPEG-2 | stream-based | DCT | 2 Mbps to 15 Mbps | 720 × 576 (PAL), 720 × 480 (NTSC) | 24–30 | medium | HDTV; broadcast quality
MPEG-4 Part 2 | stream-based | DCT and wavelet | 10 Kbps to 10 Mbps | 640 × 480 to 4096 × 2048 | 1–60 | medium | CCTV streaming video, Internet (Web); when high frame rates are required or scene activity is low to medium
H.263 | stream-based | DCT | 30 Kbps to 64 Kbps | 128 × 96 to 704 × 480 | 10–15 | low | teleconference, video streaming
H.264/AVC (MPEG-4 Part 10) | stream-based | DCT | 64 Kbps to 240 Mbps | up to 4096 × 2048 | 0–30 | low | high-speed video; near broadcast quality, compresses video far more efficiently than MPEG-4 Part 2
JPEG 2000 (wavelet) | frame-based | wavelet | 30 Kbps to 7.5 Mbps | 160 × 120, 320 × 240 | 8–30 | high | some CCTV recording; lag, limited use in security
MPEG-7 | — | — | broad range | any size | — | — | smart card, multimedia content; not yet in security

* Discrete cosine transform. Uses intra frames (I), predicted frames (P), and bi-directional frames (B); I, P, and B together are called a group of pictures (GOP). AVC—advanced video coding.

7.5 INTERNET-BASED REMOTE VIDEO MONITORING—NETWORK CONFIGURATIONS

Wired and wireless digital video networks using LANs, WANs, WiFi, and the Internet have made AVS possible. The digital video signal must be transmitted from the camera location to the monitoring location. For wireless networks there are four basic configurations: (1) point to point, also known as peer to peer, (2) multi-point to point, (3) point to multi-point, and (4) mesh. This section describes the four configurations used.

7.5.1 Point to Multi-Point

Point-to-multi-point wireless systems use IP packet radio transmitters and standard Ethernet interfaces to enable high-speed network connections to multiple Ethernet switches, routers, or PCs from one single location (Figure 7-15). The network cameras can be connected and conveniently located wherever necessary. Transmission capacities vary from 10 to 60 Mbps, and systems operate at distances up to 10 miles.
Point to multi-point (multi-casting) is like a radio or television station in which one signal (station or channel) is broadcast and can be heard (or viewed) by many different users in the same or different locations. With IP multicast, the video server needs to transmit only a single video stream for each multicast group, regardless of the number of clients that will view the information.

7.5.2 Point to Point

Point-to-point wireless video transmission is used in simpler systems to provide connectivity between two locations where only a single camera or sensor and a single monitoring location are used and only one-to-one camera control functions are required (Figure 7-16). These systems offer higher capacities and greater distances than the point-to-multi-point systems. They are ideal for transmitting video signals from a local central site, where a base station is located, to a central command and control center that is located much farther away. Point-to-point systems can connect to remote sites up to 40 miles away from the monitoring site and have transmission bandwidth capacities ranging from ten to several hundred megabits per second.

7.5.3 Multi-Point to Point

Multi-point to point is most commonly used when multiple video cameras are multiplexed into a central control point. Multi-point-to-point systems transmit the video signal from multiple cameras to the remote system's monitoring location (Figure 7-17).

7.5.4 Video Unicast and Multicast

A video broadcast sends out a video data packet intended for transmission to one or multiple nodes on the network.
A unicast signal is sent from source to viewer as a standalone stream and requires that each viewer have his own video viewer. A multicast stream allows multiple viewers on a network to share the same feed. The benefit is in bandwidth consumption: for 20 people to view a 1 Mbps video stream as unicast feeds, they would consume a total of 20 Mbps of bandwidth (20 × 1 Mbps). If those same 20 viewers connected to the same feed as a multicast stream, assuming they are all on the same network, they would consume a total of 1 Mbps of bandwidth (Figure 7-18).

FIGURE 7-15 Point to multi-point wireless network: a base station site (access point, servers, analog and IP cameras on a LAN tower) linked over 802.11 wireless bridges to multiple remote sites with IP cameras, IP domes, analog PTZ cameras, a PDA with WiFi card, and a laptop.

FIGURE 7-16 Point to point wireless network: a single 802.11 wireless link between a base station site and one remote site.

FIGURE 7-17 Multi-point to point wireless network: IP and analog cameras at several sites connected over 10Base-T Ethernet/IP networks and 100Base-T fiber optic to a central security control point with CPU control and video storage.

7.6 TRANSMISSION TECHNOLOGY PROTOCOLS: WiFi, SPREAD SPECTRUM MODULATION (SSM)

Most wireless LAN systems use spread spectrum technology, a wideband radio frequency technique developed by the military for use in reliable, secure, mission-critical communications systems. Spread spectrum modulation (SSM) is designed to trade off bandwidth efficiency for reliability, integrity, and security.
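The unicast vs. multicast comparison above reduces to simple arithmetic; a minimal sketch, using the 1 Mbps stream and 20 viewers from the text's example:

```python
# Source-side bandwidth for delivering one stream to N viewers.

def unicast_mbps(stream_mbps, viewers):
    """Unicast: one copy of the stream per viewer."""
    return stream_mbps * viewers

def multicast_mbps(stream_mbps, viewers):
    """Multicast: a single shared copy, regardless of viewer count."""
    return stream_mbps

twenty_unicast = unicast_mbps(1, 20)      # 20 Mbps total for 20 unicast viewers
twenty_multicast = multicast_mbps(1, 20)  # 1 Mbps shared by all 20 viewers
```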
In other words, more bandwidth is consumed than in narrowband transmission, but the trade-off produces a signal that is, in effect, louder and easier to detect, provided that the receiver knows the parameters of the spread spectrum signal being broadcast. If a receiver is not tuned to the right frequency, a spread spectrum signal looks like background noise (Figure 7-19). In contrast to SSM, a narrowband radio system transmits and receives information at a specific radio frequency. Narrowband radio keeps the radio signal frequency as narrow as possible, just wide enough to pass the information. A private telephone line is much like a narrowband radio frequency: when each home in a neighborhood has its own private telephone line, people in one home cannot listen to calls made to other homes. In narrowband radio, privacy and non-interference are accomplished by the use of separate radio frequencies, and the radio receiver filters out all radio signals except the one to which it is tuned. The first publicly available patent on SSM came from the inventors Hedy Lamarr, the Hollywood movie actress, and George Antheil, an avant-garde composer. The patent was granted in 1942, but the engineering details were a closely held military secret for many years. The inventors never profited from their invention; they simply turned the patent over to the US government for use in the World War II effort, and commercial use was delayed until 1985. SSM was initially developed by the military to avoid jamming and eavesdropping of communication signals.

FIGURE 7-18 Video unicast and video multicast configuration: a multicast source shares a single 1 Mbps stream among all viewers, while a unicast source sends 1 Mbps for each viewer.

FIGURE 7-19 Spread spectrum modulation (SSM) compared to narrowband transmission: a continuous wave (CW) signal concentrates its power at one frequency, while the spread spectrum signal spreads lower power across the frequency spectrum.
The global positioning system (GPS), cellular phones, and wireless Internet transmission systems now represent the largest commercial applications of SSM technology. SSM technology provides reliable and secure communications in environments prone to jamming and/or signal interception by third parties. Most SSM systems operate in the 900 MHz, 2.4 GHz, and 5.8 GHz bands and require no licensing application or ongoing fees, provided the strict rules on signal specifications (bandwidth and power output) are adhered to. SSM technology is currently the most widely used transmission technique for wireless LANs. The technique spreads the digital signal power over a wide range of frequencies within the band of transmission. The bands for commercial security video transmission range from 902 to 928 MHz, 2.4 to 2.484 GHz, and 5.1 to 5.8 GHz, none of which requires an FCC license. There are two types of spread spectrum radio: frequency hopping (FH) and direct sequence (DS). In the 1960s Aerojet General first used the FH concept, the predecessor to SSM, for military applications in which the signal frequencies were rapidly switched. SSM is a similar concept to FH, only performed at a much faster rate. The radio signal required very little transmitter power and was immune to noise and interference from other similar systems employing the exact same carrier frequency. The radio signal was secure and completely undetectable by the signal spectrum analyzers then available.

7.6.1 Spread Spectrum Modulation (SSM)

7.6.1.1 Background

The purpose of SSM is to improve (reduce) the bit error rate of the signal in the presence of noise or interference. This is achieved by spreading a transmitted signal over a frequency range greater than the minimum bandwidth required for information transmission.
By spreading the data transmission over a large bandwidth, the average power level at any one frequency is reduced and less interference is caused to others in the band. Implemented appropriately, others will also interfere less with the signal, even if they do not employ SSM techniques. While the channel data may be analog or digital, for simplicity a basic digital system is considered. In frequency hopping, the transmitter repeatedly changes the carrier frequency from one value to another, referred to as hopping. The hopping pattern is usually controlled by a pseudo-noise (PN) code generator. Any narrowband interference can jam the FH signal for only a short period of time in every PN code period. Direct sequence spread spectrum (DSSS) is the technology in most use today; it spreads the spectrum by modulating the original signal with PN noise. The PN is a wideband sequence of digital bits, called chips to minimize confusion with data bits. The DSSS receiver converts this wideband signal back into its original narrow base-band signal by an operation known as de-spreading. While de-spreading its own signal, the receiver spreads any narrowband interfering signals, thereby reducing the interference power in the narrowband detection system. A typical spread spectrum radio transmitter transmits a sequence of coding bits, referred to as the PN code, and spreads the signal over a radio spectrum 20 MHz wide per channel. At the receiver end both the desired and foreign signals are de-spread, which effectively regenerates the desired signal and suppresses the foreign signals. In a typical wireless LAN configuration, a transmitter/receiver (transceiver) device, called an access point, connects upstream to the wired network from a fixed location using standard cabling.
7.6.1.2 Frequency Hopping Spread Spectrum Technology (FHSS)

Frequency hopping spread spectrum (FHSS) uses a narrowband carrier that changes frequency in a pattern known to both the transmitter and the receiver. Properly synchronized, the net effect is to maintain a single logical channel. To an unintended receiver, FHSS appears to be short-duration impulse noise. Figure 7-20 illustrates how FHSS works. The FHSS technique broadcasts the signal over a seemingly random series of radio frequencies, and the receiver hops along these frequencies in synchronization while receiving the signal message. The message can be fully received only if the series of frequencies is known. Since only the intended receiver knows the transmitter's hopping sequence, only that receiver can successfully receive all the signals.

7.6.1.3 Slow Hoppers

With this technique the data signal is transmitted as a narrowband signal with a bandwidth only wide enough to carry the required data rate. At specific intervals this narrowband signal is moved, or hopped, to a different frequency within the allowed band. The sequence of frequencies follows a pseudo-random sequence known to both the transmitter and the receiver. Once the receiver has acquired the hopping sequence of the transmitter, one or more packets are transmitted before the frequency is hopped to the next channel. Many data bits are transmitted between hops. This technique is useful for narrowband data radios but not for wideband video signals.

7.6.1.4 Fast Hoppers

Similar in manner to slow hoppers, fast hoppers make many hops for each bit of data that is transmitted. In this way each data bit is redundantly transmitted on several different frequencies. At the receiving end, the receiver need only receive a majority of the redundant bits correctly in order to recover the data without error.
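The shared hopping sequence described above can be sketched with a pseudo-random generator seeded identically at both ends, standing in for the PN code generator. The seed and the 75-channel count below are illustrative values, not parameters of any particular radio:

```python
import random

# FHSS sketch: transmitter and receiver derive the same pseudo-random
# channel schedule from a shared seed, so only a receiver that knows the
# seed can follow the signal across the band.

def hop_sequence(shared_seed, channels, hops):
    """Deterministic list of channel indices derived from the shared seed."""
    rng = random.Random(shared_seed)
    return [rng.randrange(channels) for _ in range(hops)]

tx_hops = hop_sequence(0x5EED, channels=75, hops=8)
rx_hops = hop_sequence(0x5EED, channels=75, hops=8)   # same seed -> same schedule
```

To an eavesdropper without the seed, each hop lands on an apparently random channel; the synchronized receiver lands on the same one every time.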
The real benefit of the fast hopper is that true process gain is provided by the system due to this real-time redundancy of data transmission. This allows interference to exist in the band that would effectively block one or more narrowband channels, without causing loss of data.

FIGURE 7-20 Frequency hopping spread spectrum (FHSS) technology: the transmitted frequency hops as a function of time (f1, f2, ..., f7, ...) among frequency slots between 902 and 928 MHz, with a dwell (transmit) time on each channel and a blank-off time between hops.

7.6.1.5 Direct Sequence Spread Spectrum (DSSS)

The DSSS method is the most widely used SSM technique and is currently used in most WiFi systems. DSSS increases the rate of hopping so that each data bit can be even more redundantly encoded (more process gain) or so that a higher bit rate can be transmitted, as required for video signals. DSSS generates a redundant bit pattern for each bit to be transmitted. This bit pattern is called a chip (or chipping code). It follows that the longer the chip, the greater the probability that the original data can be recovered and, of course, the more bandwidth required. Even if one or more bits in the chip are damaged during transmission, statistical techniques embedded in the radio can recover the original data without the need for retransmission. To an unintended receiver, DSSS appears as low-power wideband noise and is rejected (ignored) by most narrowband receivers. Figure 7-21 illustrates how this technology works. The FCC rules on signal specifications limit the practical data throughput for the DSSS protocol to 2 Mbps in the 902 MHz band, 8 Mbps in the 2.4 GHz band, and 100 Mbps in the 5.8 GHz band. The FCC also requires that transmitters hop through at least 50 channels in the 902 MHz band and 75 channels in the 2.4 GHz band. DSSS transmitters spread their transmissions by adding redundant data bits called "chips" to them.
DSSS adds at least 10 chips to each data bit. Once a receiver has received all of the signal and chip bits, it uses a correlator to remove the chips and collapse the signal to its original length. The IEEE 802.11 standard requires 11 chips for DSSS transmission. A DSSS system can operate when other systems such as microwave radio, two-way communications devices, alarm systems, and/or other DSSS devices are transmitting in close proximity. It also has the ability to select different channels to provide workarounds on the rare occasions when interference occurs. The magical and non-intuitive element of the DSSS breakthrough is that by multiplying the PN DSSS spread signal with a copy of the same pseudo noise, the original data signal is recovered. This process is called correlation and occurs only if the codes are identical and perfectly aligned in time to within a small fraction of the code clock. By concurrently using different pseudo-random codes, multiple independent communications links can operate simultaneously within the same frequency band. To recover a specific encoded data channel, the inverse function is applied to the received signal. A major breakthrough in DSSS came when it was realized that a pseudo-random digital code, or pseudo-random noise, contains frequencies from DC up to that of the code clock rate. When the narrowband data signal is multiplied by the pseudo-random code sequence, the spectrum of the signal is spread to a bandwidth twice that of the code (Figure 7-22).

FIGURE 7-21 Direct sequence spread spectrum (DSSS) technology: each "one" data bit is transmitted as a 10-chip code word and each "zero" data bit as the same chip code word inverted. Maximum rates: 2 Mbps in the 902 MHz band, 8 Mbps in the 2.4 GHz band, and 100 Mbps in the 5.8 GHz band.
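The spread/correlate cycle can be sketched in a few lines. The 11-chip length matches the IEEE 802.11 requirement noted above, but the particular ±1 word below is an arbitrary example, not the actual 802.11 spreading code:

```python
# DSSS sketch: each data bit becomes an 11-chip +/-1 word (inverted for a
# 0 bit); the receiver correlates each group against the same word.

PN = [1, 1, 1, -1, -1, 1, -1, 1, 1, -1, -1]   # example 11-chip word

def spread(bits):
    """Replace each data bit with the PN word (inverted for a 0 bit)."""
    chips = []
    for b in bits:
        chips.extend(PN if b else [-c for c in PN])
    return chips

def despread(chips):
    """Correlate each 11-chip group against the PN word to recover the bits."""
    n = len(PN)
    bits = []
    for i in range(0, len(chips), n):
        corr = sum(c * p for c, p in zip(chips[i:i + n], PN))
        bits.append(1 if corr > 0 else 0)
    return bits
```

Because the decision rests on the sign of the correlation sum, a few chips can be corrupted in transit and the data bit is still recovered, which is the redundancy the text describes.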
The amount of performance improvement achieved against interference is known as the processing gain of the system. An ideal estimate of the processing gain is the ratio of the spread spectrum bandwidth to the signal information rate:

Processing Gain = SSM Bandwidth / Signal Bandwidth

It is important to note that data rate (signal bandwidth) and process gain are inversely proportional. In a digital data system, the process gain can be determined directly from the ratio of the pseudo-random code bits, called chips, to the data or symbol rate of the desired data. For example, a system that spreads each symbol by 256 chips per symbol has a ratio of 256:1. The process gain is generally expressed in dB, the value of which is determined by the expression:

P gain in dB = 10 log10 (Chips/Symbol)

This corresponds to 24 dB for the example of 256 chips/symbol.

7.6.2 WiFi Protocol: 802.11 Standards

Using a wireless LAN (WLAN, WiFi) dramatically reduces the time and cost of adding PCs and laptops to an established network. For a small or medium company, a complete wireless network can be set up within hours, with minimal disruption to the business. A laptop or PDA with WLAN allows mobile employees to be more productive by working from public "hotspots" at airports, hotels, etc. Among the most fundamental steps to take when planning a WLAN is to learn about the various IEEE 802.11 standards, decide which one is appropriate for the application requirements, and apply it accordingly. The WiFi Alliance is responsible for awarding the WiFi Certified logo that ensures 802.11 compatibility and multivendor interoperability. The original 802.11 PHY (physical) standard, established in June 1997, defined a 2.4 GHz system with a maximum data rate of 2 Mbps. This technology still exists but should not be considered for new deployment. In 1999 the IEEE defined two additions to the 802.11 PHY, namely 802.11b and 802.11a. There are two basic categories of IEEE 802.11 standards.
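The dB expression above is a one-liner to evaluate; a quick check of the 256 chips/symbol example:

```python
import math

# Process gain in dB from the chips-per-symbol ratio, per the expression above.

def processing_gain_db(chips_per_symbol):
    """P gain (dB) = 10 * log10(chips/symbol)."""
    return 10 * math.log10(chips_per_symbol)

gain = processing_gain_db(256)   # ~24.1 dB, matching the 24 dB example in the text
```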
1. The first are those that specify the fundamental protocols for the complete WiFi system. These are the 802.11a, 802.11b, and 802.11g standards and the new 802.11n standard.

2. The second are extensions that address weaknesses in, or provide additional functionality to, these standards. These are 802.11d, e, f, h, i, and j. Only the 802.11i standard and the 802.11e standard, relating to security and quality of service (QoS), are considered here.

In any case, the SSM technique results in a system that is extremely difficult to detect by observers outside the system, does not interfere with other services, and has the capability of carrying a large bandwidth of data, specifically video image transmissions for surveillance applications.

FIGURE 7-22 Direct sequence spread spectrum (DSSS) modulation signal: the narrowband signal power at the carrier frequency FRF is spread across the band from FRF − RC to FRF + RC, appearing beneath the individual FHSS hop channels.

Table 7-9 shows the parameters of these fundamental 802.11 standards. Each of these standards has unique advantages and disadvantages, and their specific attributes must be considered before choosing one.

7.6.2.1 802.11b Standard

The 802.11b technology uses the 2.4 GHz radio spectrum to deliver data at a rate of 11 Mbps and allows three non-overlapping channels to be used simultaneously. The 802.11b standard occupies 83.5 MHz (for North America), from 2.4000 to 2.4835 GHz. The 802.11b standard should be considered if there is no high bandwidth requirement, i.e. near real-time video is not required but there is a need for a wide coverage area. If price is a primary consideration, an 802.11b system costs roughly one quarter as much as an 802.11a network covering the same area at the same data rate. Its main disadvantage is its lower maximum link rate. Also, since it occupies the 2.4 GHz band used by other technologies, this rate may be reduced further due to interference issues.

7.6.2.2 802.11a Standard

The 802.11a technology uses the 5 GHz radio spectrum to deliver data at a rate of 54 Mbps and allows 12 channels to be used simultaneously. The 802.11a standard occupies 300 MHz in three different bandwidths of 100 MHz each:

1. 5.150–5.250 GHz, lower band
2. 5.250–5.350 GHz, middle band
3. 5.725–5.825 GHz, upper band.

Table 7-10 lists the nine (4 non-overlapping) 20 MHz bandwidth channels available in the 5.8 GHz band. The 802.11a standard should be considered if the application requires high bandwidth, as in high frame rate video transmission. It should also be considered when there is a small, densely packed concentration of users: the greater number of non-overlapping channels allows access points to be placed closer together without interference. Two disadvantages of the 802.11a standard are that it is not backward compatible with the older 802.11b standard and that it costs roughly four times as much to cover the same area.

Table 7-9 IEEE 802.11 a, b, g, i, and n WiFi Standard Characteristics

IEEE STANDARD | OPERATING FREQUENCY (GHz) | DOWNLOAD SPEED* (Mbps) | BANDWIDTH (MHz) | CHANNELS | APPLICATIONS/COMMENTS
802.11a | 5.8 | 54 | total: 300; each channel: 20 | 12 (12 non-overlapping) | high bandwidth, high frame rate; many non-overlapping channels
802.11b | 2.4 | 11 | total: 83.5; each channel: 22 | 11 (3 non-overlapping) | low interference in area; real-time video not required; low cost; wide-area coverage
802.11e | — | — | — | — | defines quality of service (QoS): (a) bandwidth, (b) latency, (c) jitter, (d) signal loss
802.11g | 2.4 | 11, 54 | each channel: 22 | 3 non-overlapping | high bandwidth; dual band; backward compatible with 802.11b
802.11i | — | — | — | — | enhanced security: authentication protocol, improved security key, adds high-level AES encryption
802.11n | — | 108, 200 | — | 12 non-overlapping | newest standard: high data rate and bandwidth; high throughput up to 600 Mbps; supports MIMO deployment

* Theoretical maximum rates; realistic maximum approximately 1/2. IEEE—Institute of Electrical and Electronics Engineers. MIMO—multiple-in multiple-out.
Table 7-10 Wireless Transmission Channels in 5.8 GHz Band

CHANNEL NUMBER:   1      1A     2      2A     3      3A     4      4A     5
FREQUENCY (GHz):  5.735  5.745  5.755  5.765  5.775  5.785  5.795  5.805  5.815

Band: UNII upper band*. Maximum power out: 800 mW. Modulation method: COFDM. Channels: 9 maximum, 4 non-overlapping.

* 802.11a occupies 300 MHz in three different bandwidths of 100 MHz each; a total of 9 channels are available, 4 non-overlapping.
COFDM—coded orthogonal frequency division multiplexing. UNII—Unlicensed National Information Infrastructure.

7.6.2.3 802.11g Standard

The 802.11g technology uses the 2.4 GHz radio spectrum to deliver data at a rate of 54 Mbps, and allows for three channels to be used simultaneously. The 802.11g standard is applicable to high-bandwidth video applications that require wide-area coverage. It should also be considered if backward compatibility with 802.11b is required. The main disadvantage of 802.11g is that maximum data throughput is reduced when 802.11g and 802.11b equipment share the same network. Since it shares the 2.4 GHz frequency spectrum used by microwave ovens, cordless phones, garage door openers, and other wireless gadgets, it faces the same interference issues as 802.11b.

Manufacturers such as Intel are supplying chipsets that include the IEEE 802.11a, b, and g technologies so that PCs and laptops can continue to connect to corporate wireless LANs without a hardware upgrade, even if the enterprise upgrades to a new infrastructure.

7.6.2.4 802.11n Standard

The new 802.11n WiFi standard was created to provide over 100 Mbps of effective throughput, complementing all broadband access technologies including fiber optic, DSL, cable, and satellite. The goal of the 802.11n protocol is to raise the 54 Mbps maximum of the earlier standards to over 100 Mbps: it more than triples the real throughput of WiFi, pushing the roughly 30 Mbps achieved in practice to at least 108 Mbps.
The new 802.11n standard, which includes MIMO processing in its specification, should produce performance of 144–200 Mbps. Figure 7-23 compares the throughput and distance improvements of the MIMO-based wireless LAN.

7.6.2.5 802.11i Standard

The 802.11i standard provides enhanced security for wireless transmissions. It includes the use of an authentication protocol, an improved key distribution framework, and stronger encryption via AES.

7.6.3 Asynchronous Transfer Mode (ATM)

Two common protocols adopted to transmit video, voice, data, and controls over the Internet are IP and asynchronous transfer mode (ATM). ATM is a broadband network technology that allows very large amounts of data to be transmitted at a high rate (wide bandwidth). It does this by connecting many links into a single network. This feature has an important implication for transmitting high-quality video with a guaranteed QoS.

ATM was developed in concept in the early 1980s. Since the early 1990s ATM has been highly touted as the ultimate network switching solution because of its high speed, its ability to serve video and all other information types, and its ability to guarantee each type an appropriate QoS. ATM is a fast-packet, connection-oriented, cell-switching technology for broadband signals. It has been designed from the concept up to accommodate any form of information—video images, voice, facsimile, and data, whether compressed or uncompressed—at broadband speeds and on an unbiased basis. Further, all such data can be supported with a very small set of network protocols, regardless of whether the network is local, metropolitan, or wide area in nature.

FIGURE 7-23 Rate/range comparison of 802.11a vs. 802.11n MIMO indoors (maximum reliable rate in Mbps vs. range in feet, typical business indoor environment; MIMO = multiple-in/multiple-out)
ATM generally operates at access speeds from a minimum of 50 Mbps up to 155 Mbps. ATM has, however, been slow to be accepted; its use is clearly on the rise, but it will be a long time before it can ultimately replace all of the circuit-, packet-, and frame-switching technologies currently in place.

7.7 TRANSMISSION NETWORK SECURITY

WLANs transmit video and data over the air using radio waves. Any WLAN client in the area served by the data transmitter can receive or intercept the information signal. Radio waves travel through ceilings, floors, and walls and can reach unintended recipients on different floors and outside buildings. Given the nature of the technology, there is no way to assuredly direct a WLAN transmission to only one recipient. Users must be conscious of security concerns when planning wireless 802.11 networks.

The first step of WLAN security is to perform a network audit to locate rogue access points within the network. The second step involves the basics of configuring and implementing the best security practices at all access points of the WLAN. In 2001, researchers and hackers demonstrated their ability to crack wired equivalent privacy (WEP), a standard encryption for 802.11 wireless LANs. Because these encryption and authentication standards were vulnerable, stronger methods were developed and should be deployed to more completely secure a WLAN. The 802.11i standard has accounted for weaknesses in previous protocols but is still subject to some vulnerability if improperly implemented or bypassed by rogue devices.

Every enterprise network needs a policy to ensure security on the network, and WLANs are no different. While policies will vary based on the individual security and management requirements of each WLAN, a thorough policy, and enforcement of that policy, can protect an enterprise from unnecessary security breaches and performance degradation.
7.7.1 Wired Equivalent Privacy (WEP)

The IEEE 802.11 WLAN standards include a security component called wired equivalent privacy (WEP). WEP defines how clients and access points identify each other and communicate securely using secret keys and encryption algorithms. Although the algorithms used are well understood and not considered vulnerable, the particular way in which the keys are managed has resulted in a number of easily exploitable weaknesses. WEP security relies on the user name/password method. Many WLAN access points are shipped with WEP security disabled by default. This allows any WLAN-enabled device to connect to the network unchallenged. However, even when WEP is enabled there are still ways to breach the security; it just takes a little longer. As a first basic layer of security it is imperative that network administrators turn WEP "ON" prior to deploying access points in the corporate network.

Most enterprises using wireless LANs do not enable WEP, and consequently users should presume that any data sent over such a wireless link can be intercepted. Furthermore, with WEP now cracked by malicious hackers, organizations must explore additional measures, including virtual private networks (VPNs) and vendor-specific authentication schemes, to provide more robust protection of the data passed over the wireless link.

Wireless LAN signals do not necessarily stop at the outer walls of a building, a corporate campus border, or a physical plant perimeter. Physical security is ineffective in protecting against wireless LAN intrusions. In some metropolitan areas, hackers armed with portable computers or even PDAs with LAN cards make a game of drive-by invasions of corporate networks. As a first step, existing wireless LANs should be checked to ensure that WEP security protection is enabled.
7.7.2 Virtual Private Network (VPN)

Network architects considering WLAN deployments must look beyond current WEP technology to ensure that security is not compromised. Currently the "best practices" recommendation is to overlay a VPN on top of the WLAN to establish an encrypted tunnel for users and devices to exchange sensitive information securely. Many current out-of-the-box VPN products support alternate methods for authenticating users and devices, such as the use of digital IDs. It is extremely important to take advantage of enhanced identification methods for the VPN, as a high level of trust is needed to grant users full access to security information. Companies must invest in products that provide secure identification and authentication capabilities with the VPN.

Having a VPN overlay and basic security with a WLAN is comparable to having a security guard in the lobby of a building. The guard calls to let you know that John Doe is there to see you. If you are expecting him you let him through. But is he really who he says he is, and how would you know until you saw him walk through the door? The security guard alone still leaves a hole in the system. But if the security guard must check John Doe's passport (or a credential he knows to be authentic), there is no way he is coming in without authenticated documentation to prove his identity. Likewise, to achieve mutual authentication, the security guard must present his or her own passport to Mr. Doe, so he knows he is at the correct building and not about to meet with an impostor.

To deploy a VPN, the WLAN access point is placed outside the firewall and a VPN gateway is placed between the two. Since the WLAN access point is outside the firewall, it is effectively treated as an untrustworthy network resource, since it blurs the security perimeter. Even if WEP security is compromised, no access to corporate resources is possible without subsequent VPN authentication.
Most enterprises deploying wireless LANs will be forced to embrace a vendor-specific security architecture or use VPNs. A VPN cannot be used everywhere in the wireless LAN architecture due to lack of VPN client support from manufacturers on certain handheld devices and proprietary operating systems.

7.7.3 WiFi Protected Access (WPA)

WiFi protected access (WPA) is an interim standard developed by the WiFi Alliance. It combines several technologies that address known 802.11 security vulnerabilities. It provides an affordable, scalable solution for protecting existing corporate WLANs without the additional expense of VPN/firewall technology. It includes the use of the 802.1x standard and the extensible authentication protocol. For encryption it uses the temporal key integrity protocol and WEP 128-bit encryption keys. WPA is a subset of the 802.11i standard.

The WPA interim standard upgrades legacy systems and is an improvement over the WEP system. After upgrading to the WPA standard, firewalls and VPNs are no longer necessary. The National Institute of Standards and Technology (NIST), however, will not certify WPA under the FIPS 140-2 security standard. The federal government is mandated to procure systems that conform to the FIPS 140-2 security standard, and WPA will not be certified to it.

7.7.4 Advanced Encryption Standard (AES), Digital Encryption Standard (DES)

The data encryption standard (DES) is probably the most popular secret-key system in use on wired networks today. The much trickier triple DES is a special mode of DES that is used primarily for highly sensitive information. Triple DES uses three software keys. Data is encrypted with the first key, decrypted with the second key, and then encrypted again by the third key. The security chips used in equipment contain a triple-DES encryption/decryption engine that secures the content, avoiding troublesome theft-of-service issues for content providers.
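The encrypt-decrypt-encrypt (EDE) key sequence just described can be illustrated with a minimal sketch. A single-byte XOR stands in for the DES block cipher here purely for illustration — real triple DES uses the DES algorithm with 56-bit keys on 64-bit blocks — but the key-chaining structure is the same:

```python
# Toy illustration of the triple-DES encrypt-decrypt-encrypt (EDE) sequence.
# XOR is a stand-in for the DES block cipher (an assumption for brevity);
# only the three-key chaining structure is the point of this sketch.

def toy_encrypt(data: bytes, key: int) -> bytes:
    """Stand-in for DES encryption: XOR every byte with the key."""
    return bytes(b ^ key for b in data)

def toy_decrypt(data: bytes, key: int) -> bytes:
    """XOR is its own inverse, so decryption is the same operation."""
    return bytes(b ^ key for b in data)

def triple_ede_encrypt(data: bytes, k1: int, k2: int, k3: int) -> bytes:
    # Encrypt with the first key, decrypt with the second, encrypt with the third.
    return toy_encrypt(toy_decrypt(toy_encrypt(data, k1), k2), k3)

def triple_ede_decrypt(data: bytes, k1: int, k2: int, k3: int) -> bytes:
    # Reverse the sequence with the keys applied in the opposite order.
    return toy_decrypt(toy_encrypt(toy_decrypt(data, k3), k2), k1)

ciphertext = triple_ede_encrypt(b"video frame", 0x5A, 0x3C, 0xA7)
assert triple_ede_decrypt(ciphertext, 0x5A, 0x3C, 0xA7) == b"video frame"
```

A side effect of the EDE arrangement is that setting the first two keys equal makes the encrypt and decrypt steps cancel, reducing the operation to a single encryption — which is how triple-DES hardware remains backward compatible with single DES.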
Moreover, it prevents accidental viewing by another receiver, since it locks the data stream to a particular receiver. It provides capability for additional network entry, authentication, and authorization.

The advanced encryption standard (AES) was selected by NIST in October 2000 as an upgrade from the previous DES standard. AES uses a 128-bit block cipher algorithm and encryption technique for protecting digital information. With the ability to use even larger 192-bit and 256-bit keys, if necessary, it offers higher security against brute-force attack than 56-bit DES keys. The 128-bit key size of the AES standard makes hacking of data nearly impossible. AES is replacing both triple DES on wired networks and WEP on wireless LANs. For wireless networks, AES is being built into equipment complying with the new 802.11i protocol.

7.7.5 Firewalls, Viruses, Hackers

A firewall can be a software program, a hardware device, or a combination of both. Basically a firewall is a system or group of systems that enforces an access control policy between two networks. The term "firewall" has become commonplace in discussions of network security. While firewalls certainly play an important role in securing the network, there are certain misconceptions regarding them that lead people to falsely believe that their systems are totally secure once they have a firewall. Firewalls are effective against attacks that attempt to go through the firewall, but they cannot protect against attacks that don't go through the firewall. Nor can a firewall prevent individual employees with modems from dialing into or out of the network, bypassing the firewall entirely. The purpose of the firewall is to protect networked computers from intentional hostile intrusion from outside the network. Any private network that is connected to a public network needs firewall protection.
Any enterprise that connects even a single computer to the Internet via a modem should have personal firewall software. What can the firewall protect against? Generally, firewalls are configured to protect against unauthenticated interactive logins from outside the network. Firewalls help prevent pranksters and vandals from logging into the network computers. A firewall examines all traffic routed between two networks to see if the traffic meets certain criteria. There are two distinct types of firewalls in common use: (1) the packet filtering router and (2) the proxy server. The first type, the packet filtering router, is a machine that forwards packets between two or more networks; it decides whether to forward or block each packet based on a set of rules and codes. The second type, the proxy server, has had the normal protocols such as FTP (file transfer protocol) and Telnet replaced with special servers; it relies on special protocols to provide authentication and to forward packets. In some instances the two types of firewalls are combined so that a selected machine is allowed to send packets through a packet filtering router onto an internal network.

7.8 INTERNET PROTOCOL NETWORK CAMERA, ADDRESS

The fastest-growing technology segment in the video security industry is that of networked or IP addressable cameras and associated equipment. As the video industry shifts from traditional legacy analog CCTV monitoring to an OCTV networking system, IP cameras with internal servers are going to completely change the way surveillance is configured (Figure 7-24a,b). Camera configuration, setup, and viewing of video images will be done via a LAN, WAN, MAN, or WLAN backbone and a standard Web browser. Some security equipment manufacturers are referring to the next generation of video as Internet protocol television (IPTV).
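The forward-or-block decision of the packet filtering router described in Section 7.7.5 can be sketched as a first-match rule table. The rule fields, addresses, and first-match policy below are illustrative assumptions for the sketch, not a real firewall configuration:

```python
# Minimal first-match packet filter: each rule names a source network, a
# destination port (None = any), and an action; the first rule that matches
# a packet decides whether it is forwarded or blocked.
import ipaddress

RULES = [
    # (source network,                        dest port, action)
    (ipaddress.ip_network("192.168.1.0/24"), 80,        "forward"),  # web traffic from the LAN
    (ipaddress.ip_network("0.0.0.0/0"),      23,        "block"),    # no Telnet from anywhere
    (ipaddress.ip_network("0.0.0.0/0"),      None,      "block"),    # default deny
]

def filter_packet(src_ip: str, dst_port: int) -> str:
    src = ipaddress.ip_address(src_ip)
    for network, port, action in RULES:
        if src in network and (port is None or port == dst_port):
            return action
    return "block"  # fail closed if no rule matches

assert filter_packet("192.168.1.50", 80) == "forward"
assert filter_packet("192.168.1.50", 23) == "block"
assert filter_packet("10.0.0.9", 80) == "block"
```

Real packet filters match on more fields (protocol, destination address, TCP flags), but the rule-scan structure is the same.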
The devices making up a digital video surveillance system comprise an IP network camera, a video server, and a PC or laptop computer. In portable surveillance applications the laptop, PDA, and cell phone are the monitoring devices. The following sections describe each of these devices and the functional part they play in the overall camera surveillance and control functions.

The industry offers two different methods for networking cameras. The first method is that of incorporating an IP addressable camera into an existing LAN, WAN, or MAN configuration (Figure 7-25). In this method each camera is assigned a static IP address. With proper security codes or passwords this video information can be viewed on a standard Web browser on the network. These IP cameras with their built-in servers generally have capability for four video inputs. At the receiving and monitoring location there are two choices: (1) the system converts the video back into an analog format so that it can be displayed and/or recorded on an analog display and recorder, or (2) the video remains in digital form and is directly displayed on an LCD, PC, or laptop, and recorded on a digital video recorder (DVR).

The second method for implementing remote or networked cameras is adapting the existing or standard legacy cameras and configured systems into a local network (Figure 7-26). The video outputs from the cameras, matrix switchers, and digital recorders are sent via interface adapters onto the input of the LAN, WAN, WLAN, or Internet network. The system starts as a standard security system before the video outputs and system control lines are connected to a standalone or plug-in Ethernet network interface unit. The security industry is transitioning from an analog to a digital system by transporting the digital video images over an IP-based network using IP cameras as the video image source.
Networked cameras can connect directly into the existing network via an Ethernet port, eliminating the coaxial or UTP cabling that is required for analog cameras (Figure 7-27).

FIGURE 7-24 (a) Analog CCTV with coaxial, UTP, or other cabling; (b) digital IP cameras and digital video server on wired LAN network

When computers are already in place, no additional equipment is needed for viewing the video image from the network camera. The camera output can be viewed in its simplest form on a Web browser and the computer monitor. If analog cameras are already present at a site, the addition of a video server will make those camera images available in any location. To connect to the Internet many different kinds of transmission types are available. These include standard and ISDN modems, DSL modems, cable TV modems, T1 connections, and 10BaseT and 100BaseT Ethernet connections. In addition, cellular-phone modems and various 802.11 wireless network options are also available.

7.8.1 Internet Protocol Network Camera

The network camera has its own IP address and built-in computing functions to handle any network communication (Figure 7-28). Everything needed for viewing images over the network is built into the camera unit. The network camera can be described as a camera and a computer combined. It is connected directly to the network as any other network device and has built-in software for a web server. It can also include alarm input and relay output. More advanced network cameras can be equipped with functions such as motion detection and analog video output.
An IP-compliant network camera contains a lens, a video imaging chip, a compression chip, and a computer. The network camera lens focuses the image onto a CCD or CMOS sensor that captures the image scene, and digital electronics transforms the scene into electrical signals. The video signals are then transferred into the computer function, where the images are compressed and sent out over the network (Figure 7-29). For storing and transmitting images over the network, the video data must be compressed or it will consume too much disk space or bandwidth. If bandwidth is limited, the amount of information being sent must be reduced by lowering the frame rate and accepting a lower image quality.

7.8.2 Internet Protocol Camera Protocols

To facilitate communications between devices on a network they must be properly and uniquely addressed. Just as the telephone companies must issue phone numbers that are not duplicated, the computers and devices on the network must be carefully programmed so that data transmissions can be transmitted and received from one to the other.

FIGURE 7-25 Incorporating IP cameras in an existing LAN, WAN, or MAN network

Each network device has two addresses: (1) the media access control (MAC) physical address and (2) the IP logical address. MAC addresses are hard-coded into a device or product at the factory (manufacturer) and typically are never changed. IP addresses are settable and changeable, allowing networks to be configured and changed. The IP address uniquely identifies a node or device just as a name identifies a particular person. No two devices on the same network should ever have the same address. There are two versions of IP in use today.
Most networks now use IP version 4 (IPv4), but new systems will begin to use the next-generation IP version 6 (IPv6), a protocol designed to accommodate a much larger number of computer and device address assignments. The Internet Corporation for Assigned Names and Numbers (ICANN) is a non-profit organization formed in 1998 to assume responsibilities from the federally funded Internet Assigned Numbers Authority (IANA) for assigning parameters for IPs, managing the IP address space, assigning domain names, and managing root server functions. ICANN assigns IP addresses to organizations desiring to place computers on the Internet. The IP class, and the resulting number of available host addresses an organization receives, depends on the size of the organization.

The organization assigns the numbers and can reassign them on the basis of either static or dynamic addressing. Static addressing involves the permanent association of an IP address with a specific device or machine. Dynamic addressing assigns an available IP address to the machine each time a connection is established. As an example, an Internet Service Provider (ISP) may hold one or more Class C address blocks. Given the limited number of IP addresses available, the ISP assigns an IP address to a user machine each time the dial-up user accesses the ISP to seek connection to the Internet. Once the connection is terminated, that IP address becomes available to other users.

7.8.3 Internet Protocol Camera Address

Unlike traditional analog CCTV systems, network video is based on sets of transmission standards and protocols. These rules are necessary because the video system is no longer a closed system but an open system interconnecting with many clients and users. There are two primary sets of standards that control networking: (1) the 802 standards created by the IEEE and (2) the Open Systems Interconnect (OSI) seven-layer model, created by the International Organization for Standardization (ISO).
The following sections summarize the standards.

FIGURE 7-26 Diagram to connect legacy analog cameras to the digital network

The OSI seven-layer model is the standard cited in almost all network documents and is the central part of any network foundation. Although all the OSI layers are necessary for communication, the four considered in this analysis are: (1) Physical, (2) Data Link, (3) Network, and (4) Transport. Figure 7-30 summarizes the seven layers of the OSI networking model.

The Physical Layer 1 deals with the hardware of the system. This includes items like servers, routers, hubs, network interface cards, etc. The physical layer has the function of converting digital bits into electronic signals and connecting the devices to the network.

The Data Link Layer 2 provides the interface, or link, between the higher layers and the network hardware. The Data Link Layer has three functions: (1) make sure a connection is available between two network nodes, (2) encapsulate the data into frames for transmission, and (3) ensure that incoming data is received correctly by performing error checking routines. Layer 2 is divided into two sublayers: logical link control and media access control. The media access control sublayer is better known as MAC, which is a hard-coded address assigned to every network interface on any device made to attach to a network. This address is assigned by the manufacturer of the device. MAC addresses are unique throughout the entire world. The address itself is a 48-bit address, consisting of six octets (eight binary digits each).
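The six-octet structure just described can be pulled apart programmatically; a minimal sketch, in which the example address is made up for illustration:

```python
# A MAC address is 48 bits: six octets, conventionally written in hex and
# separated by colons. The first three octets carry the manufacturer's
# identifier; the last three identify the individual device.

def parse_mac(mac: str) -> tuple[str, str]:
    octets = mac.lower().split(":")
    assert len(octets) == 6 and all(0 <= int(o, 16) <= 255 for o in octets)
    vendor_part = ":".join(octets[:3])   # assigned to the manufacturer
    device_part = ":".join(octets[3:])   # unique per device
    return vendor_part, device_part

# Hypothetical address used only for illustration:
vendor_part, device_part = parse_mac("00:1A:2B:3C:4D:5E")
assert vendor_part == "00:1a:2b"
assert device_part == "3c:4d:5e"
```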
Connections between devices on a network are ultimately made by MAC address, not IP addresses or domain names; those methods simply assist a device in finding the MAC of another device. The first part of the MAC address, the first three octets, is unique to the manufacturer of the device. It is called the organizationally unique identifier. Every company manufacturing network devices has one or several.

FIGURE 7-27 Diagram to connect networked cameras, Ethernet and Internet (connection options include ISDN, DSL, and cable TV modems, T1, 10BaseT and 100BaseT Ethernet, fiber optic, 802.11 wireless, and cellular wireless)

FIGURE 7-28 IP network camera: (a) fixed, (b) pan/tilt/zoom

The second part of the MAC address, the last three octets, is unique to each device. No two devices in the world should have the same MAC address.

The second sublayer in Layer 2 is the logical link control (LLC). The LLC takes the raw data bits from the upper layers and encapsulates them in preparation for transmission. It organizes the data into frames, giving information such as addressing, error checking, etc. After framing and addressing are complete, the frames are sent to Layer 1 to be converted into electrical pulses and sent across the network.

Layer 3 is the Network Layer and is primarily responsible for two functions: addressing and routing. This layer contains the IP protocol, part of the TCP/IP protocol. The "IP address" familiar to all of us is the Layer 3 responsibility and is unique throughout the entire world. The IP address is a 32-bit address that must be assigned by a user or administrator; it is not set at the factory. Since it is user assignable there is great flexibility in how the address is assigned.
The IP address consists of four sets of numbers separated by periods or dots; however, computers actually see the IP address in binary form. The current IP address format is called IP version four, or IPv4, in which there are over 4.3 billion possible addresses. The last OSI level considered here is the Transport Layer 4. This layer is responsible for reliably getting the packets from point A to point B.

FIGURE 7-29 IP network camera block diagram

FIGURE 7-30 Seven-layer open systems interconnect (OSI) model: Layer 7 Application, Layer 6 Presentation, Layer 5 Session (upper layers, software); Layer 4 Transport, Layer 3 Network (exchange unit: packet), Layer 2 Data Link (exchange unit: frame), Layer 1 Physical (exchange unit: bit) (lower layers, hardware)

The Transport Layer supports two different transmission methods: connection-oriented and connectionless. Connection-oriented transmissions are handled by TCP. These are point-to-point connections for guaranteed reception of data. An email message, accessing a Web page, or downloading a file are all examples of connection-based exchanges. Error checking is performed on these exchanges because there is a guarantee of data reception. This transmission method does not work well for video, since video is near real time and requires large amounts of data to be transmitted; it would fail to produce an acceptable stream of video images for viewing or recording. If an error occurred and the sending device retransmitted parts of the video clip, the video stream would not be viewable.
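The connectionless alternative can be sketched with Python's standard socket module. This minimal loopback example sends one datagram with no handshake and no acknowledgment; the payload and port choice are arbitrary illustrations:

```python
# Connectionless (datagram) transfer of the kind used for video streaming:
# datagrams are sent without a connection handshake and without delivery
# guarantees, so a lost frame is simply skipped rather than retransmitted.
import socket

receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))        # let the OS pick a free port
addr = receiver.getsockname()

sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"frame-0001", addr)     # fire-and-forget: no ACK, no retransmit

data, _ = receiver.recvfrom(2048)
assert data == b"frame-0001"

sender.close()
receiver.close()
```

A TCP sender, by contrast, would first `connect()` and would retransmit lost segments — exactly the behavior the text explains is unsuitable for a live video stream.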
For video transmission, the connectionless user datagram protocol (UDP), which does not guarantee delivery of error-free data, is used. UDP is the foundation of video multicasting, which is a one-to-many method of video streaming. It is a crucial element of networked video systems.

In spite of the large number of addresses possible in the IPv4 standard, the popularity of the TCP/IP protocol, especially the IP-based Internet, has placed a good deal of strain on the IPv4-based numbering scheme. To alleviate this problem, at least partially, in 1993 the concept of supernetting (subnetting) was devised. This technique used the number of 1 bits in the network address to specify the subnet mask. It reduced the number of routes and therefore the size and complexity of the routing tables that the Internet switches and routers had to support. This subnet technique goes a long way toward easing the pressure on the IPv4 addressing scheme but does not solve the basic problem of the lack of addresses in the future. The new IPv6 protocol resolves this issue through the expansion of the address field to 128 bits, thereby yielding virtually unlimited potential addresses.

A proper IP address consists of four sets of numbers, separated by periods or dots. Each of the four sets of numbers is called an octet. The addressing architecture defines five address formats, each of which begins with one, two, three, or four bits that identify the class of the network. The host portion of an IP address is unique for each device on a network, while the network portion is the same on all devices that share a network. The way to distinguish which part of an address is which is called the subnet mask. The subnet mask is another 32-bit number that looks similar to an IP address, but does something entirely different. The five address formats are: Class A, B, C, D, and E.
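Because each class claims a distinct range of first-octet values, the class of an address can be read off from its first octet alone. A sketch, using the A, B, and C ranges given in the text (the D and E ranges, 224-239 and 240-255, are the standard multicast and experimental blocks, added here for completeness):

```python
# Determine the class of an IPv4 address from its first octet.
# A, B, C ranges follow the text; D (multicast) and E (experimental)
# use the standard 224-239 and 240-255 ranges.

def ip_class(address: str) -> str:
    first = int(address.split(".")[0])
    if 1 <= first <= 126:
        return "A"          # 126 networks, 16,777,214 hosts each
    if first == 127:
        return "loopback"   # reserved, not a usable network class
    if 128 <= first <= 191:
        return "B"          # >16,000 networks, 65,534 hosts each
    if 192 <= first <= 223:
        return "C"          # >2,000,000 networks, 254 hosts each
    if 224 <= first <= 239:
        return "D"
    return "E"

assert ip_class("10.0.0.1") == "A"
assert ip_class("153.99.12.227") == "B"
assert ip_class("192.168.1.5") == "C"
```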
Figure 7-31 shows a breakdown of the three classes of network addresses of interest: Class A, B, and C. Each line in each class represents an IP address in binary, from bit 0 to bit 31. Under Class A the first eight bits are the network information. This identifies the network itself and is shared by all devices on that network segment. To the right of the vertical divider line, the host information part of the address uniquely identifies each device. A host is any device with an assigned address. When the classes are compared, it can be seen that the dividing line between network and host moves from class to class. Class B addresses are divided in the middle, with two octets for the network ID and two for the host. Class C addresses have the first three octets for the network and the last one for the host device. Moving the dividing line and changing classes determines how many different networks can be created and how many hosts can be on each.

Figure 7-32 shows a dissection of an IPv4 address with its subnet mask. The network and host portions are uncovered by comparing the IP address and the subnet mask in binary: wherever a one appears in the mask, that bit belongs to the network portion of the IP address; wherever there is a zero, the bit belongs to the host portion. If two addresses are not on the same subnet they will not be able to talk to each other.

The IP addresses are used to identify the camera equipment in a network, whether local or on the Internet. These addresses are configured by software: they are not hardware-specific. An IP address can be either static or dynamic. Static addresses do not change and are usually found on LAN and WAN networks.
FIGURE 7-31 Class A, B, C network addresses. The figure shows how IP version 4 assigns IP addresses, breaking down the three classes of network addresses into binary. Each gray line represents an IP address in binary form from bit 0 to bit 31. Under Class A, the first 8 bits are titled network information; these bits identify the network itself and are shared by all devices on that network segment. After the vertical divider line the host information part uniquely identifies each hardware device. Class B divides at bit 15 (two octets each) and Class C at bit 23 (three network octets, one host octet). The accompanying table summarizes the classes:

Class   Network beginning octet   Number of networks   Host addresses per network
A       1–126                     126                  16,777,214
B       128–191                   >16,000              65,534
C       192–223                   >2,000,000           254

FIGURE 7-32 IP address and subnet mask. Dissecting the IP address and subnet mask in decimal and binary notation: IP address 154.140.76.45 = 10011010 10001100 01001100 00101101; subnet mask 255.255.255.0 = 11111111 11111111 11111111 00000000. The first three octets are part of an extended network prefix; the last octet represents host information.

However, if the network interfaces via dial-up modem, high-speed cable modem, or DSL, the IP address is usually dynamic, which means it changes each time the Internet connection is made. The dynamic host configuration protocol (DHCP) is an Internet protocol for automating the configuration of equipment that uses the TCP/IP protocol. It is the IP-addressing method in which the network router supplies a temporary IP address to the computer connected to it. If a device is programmed to use DHCP it is likely the device will function on the LAN, but not be accessible from outside the LAN using the Internet. DHCP lets network administrators automate and centrally manage the assignment of IP addresses in an organization’s network.
DHCP lets the network administrator supervise and distribute IP addresses from a central point and automatically send a new IP address when a computer is plugged into a different location in the network. The IP address consists of four groups, or octets, of decimal digits separated by periods. An example is 153.99.12.227. In binary form the IP address is a string of zeros and ones. Part of the IP address represents the network number or address and part represents the local machine address, also known as the host number or address. The most common class used by large organizations is Class B, which allows 16 bits for the network number and 16 for the host number. Therefore, in the example, 153 and 99 represent the network address and 12 and 227 represent the host address. The decimal and binary equivalent IP address would be divided as shown in Figure 7-33.

FIGURE 7-33 Converting the decimal IP address to binary. Network address (octets I, II) and host address (octets III, IV):

Decimal form: 153.99.12.227
Binary form: 10011001.01100011.00001100.11100011

To calculate the first octet: 1 × 2^7 + 0 × 2^6 + 0 × 2^5 + 1 × 2^4 + 1 × 2^3 + 0 × 2^2 + 0 × 2^1 + 1 × 2^0 = 153
The second octet: 0 × 2^7 + 1 × 2^6 + 1 × 2^5 + 0 × 2^4 + 0 × 2^3 + 0 × 2^2 + 1 × 2^1 + 1 × 2^0 = 99
The third octet: 0 × 2^7 + 0 × 2^6 + 0 × 2^5 + 0 × 2^4 + 1 × 2^3 + 1 × 2^2 + 0 × 2^1 + 0 × 2^0 = 12
The fourth octet: 1 × 2^7 + 1 × 2^6 + 1 × 2^5 + 0 × 2^4 + 0 × 2^3 + 0 × 2^2 + 1 × 2^1 + 1 × 2^0 = 227

For LAN and WAN systems a special networking board/card must be incorporated into the user’s computers. This networking card uses the TCP/IP protocol and is capable of interconnecting all of the PCs in the system. By adding a network interface at a camera site, which serves as a bridge between analog-based CCTV systems and a digital network, one can view the video image over a computer network as well as control PTZ functions.
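The octet arithmetic of Figure 7-33 can be checked with a few lines of Python. This is an illustrative sketch (the helper `octet_to_decimal` is not from the book): each dotted-decimal octet is the weighted sum of its eight binary digits, with weights descending from 2^7 to 2^0.

```python
# Sketch: converting each binary octet of 153.99.12.227 back to decimal
# by summing powers of two, as in Figure 7-33.
def octet_to_decimal(bits: str) -> int:
    # bit i (left to right) carries weight 2^(7 - i)
    return sum(int(b) * 2 ** (7 - i) for i, b in enumerate(bits))

binary = "10011001.01100011.00001100.11100011"
print([octet_to_decimal(o) for o in binary.split(".")])  # [153, 99, 12, 227]
```

The first two results (153, 99) form the Class B network address and the last two (12, 227) the host address, matching the worked example above.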
Network computers have one IP address for the LAN and a second for the WAN connection to the Internet. The following three methods describe step-by-step procedures to obtain the LAN address of the computer using Windows XP.

Method 1
1. Click on START in Windows XP.
2. Open the Control Panel within the START window and click on Network and Internet Connections.
3. Click on Network Connections to open a window displaying icons for Network Connections.
4. Right-click the Network Connection that is currently “enabled” and click Properties.
5. Scroll down the center of the Properties window and highlight Internet Protocol.
6. Click the Properties button. A window will display the following information: “Obtain IP Address Automatically.” If this button is selected, the computer network is using DHCP. If the button “Use The Following IP Address” is selected, the network is using “static” IP addresses that do not change periodically.
7. The information boxes below will have values such as:
• IP—192.168.1.105
• Subnet Mask—255.255.255.0
• Default Gateway—192.168.1.4
The Subnet Mask indicates the class of network (A, B, or C) being used. The Default Gateway is the LAN IP address of the network router.
8. Click OK twice to close the IP address window without changing the settings.

Method 2
Another way to access the LAN IP of a specific computer is to:
1. Click START and RUN in Windows XP. Type Command and press Enter.
2. Type ipconfig /all in the Command window. The same LAN IP information detailed above will be displayed on the screen.

Method 3
To obtain the WAN (Internet) IP of the network:
1. Open a Web browser such as Internet Explorer.
2. Type http://www.whatismyip.com in the address line.
3. Click Go. The IP address of the network will be displayed on the computer screen.
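As an alternative to the GUI steps above, the LAN IP can be read programmatically. This is a hedged sketch, not from the book: a common trick is to open a UDP socket toward a public address and read back the local endpoint the operating system selects (no packets are actually transmitted by a UDP connect; 198.51.100.1 is a placeholder from the reserved documentation range).

```python
# Sketch: discover the LAN IP this machine would use to reach the Internet.
import socket

s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
s.connect(("198.51.100.1", 80))   # documentation address; nothing is sent
print(s.getsockname()[0])         # e.g. 192.168.1.105
s.close()
```

This returns the address of whichever interface routes outward, which is usually the one a network camera or video server on the same LAN must be configured to reach.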
7.9 VIDEO SERVER, ROUTER, SWITCH

A server is a computer or software program that provides services to clients—such as file storage (file server), programs (application server), printer sharing (printer server), or modem sharing (modem server). A router is a device that moves data between different digital network segments and can look into a packet header to determine the best path for the packet to travel. Routers can connect network segments that use different protocols and allow all users in a network to share a single connection to the Internet or a WAN. A switch is a device that improves network performance by segmenting the network and reducing competition for bandwidth.

7.9.1 Video Server

A server is a computer or program that provides services to other computer programs in the same or other computers. A computer running a server program is also frequently referred to as a server. Specific to the Web, a web server is the computer program that serves requested HTML pages or files. Video servers transform analog video into high-quality digital images for live access over an intranet or the Internet. A video server enables the user to migrate from an existing analog CCTV system into the digital world. Most single video servers can network up to four analog cameras, a cost-effective solution for transmitting high-quality digital video over computer networks. By bridging the analog-to-digital technology gap, video servers complement previous investments in analog cameras. A video server digitizes analog video signals and distributes digital images directly over an IP-based computer network, i.e. LAN, intranet, or Internet. The video server converts analog cameras into network cameras and enables users to view live images from a Web browser on any network computer, anywhere and at any time. The video server can deliver up to 30 fps in NTSC format (25 fps PAL) over a standard Ethernet.
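A back-of-the-envelope check helps decide whether a four-camera server fits the available Ethernet. The figures below are assumptions for illustration only (the book does not specify a per-frame size; 12 kB is a plausible compressed frame):

```python
# Sketch with assumed numbers: estimated network load for a
# four-camera video server streaming compressed video.
FRAME_KBYTES = 12      # assumed size of one compressed frame, kilobytes
FPS = 30               # NTSC frame rate quoted above
CAMERAS = 4

kbits_per_sec = FRAME_KBYTES * 8 * FPS * CAMERAS
print(f"{kbits_per_sec / 1000:.1f} Mb/s")   # 11.5 Mb/s
```

Under these assumptions the four streams together need roughly 11.5 Mb/s, comfortably inside a switched 100 Mb/s Ethernet but too much for most dial-up or early DSL links, which is why frame rate and compression settings matter when remote viewing is planned.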
It includes one or more analog video inputs, an image digitizer, an image compressor, a web server, and network/phone modem and serial interfaces (Figure 7-34). The video server receives analog video input from the analog camera, which is directed to the image digitizer. The image digitizer converts the analog video into a digital format. The digitized video is transferred to the compression chip, where the video images are compressed to M-JPEG, MPEG-2, MPEG-4, H.264, or another format.

FIGURE 7-34 (a) Video server block diagram, (b) Typical four-camera server equipment, front and rear. The block diagram comprises the image digitizer, compression engine, central processing unit (CPU), flash and DRAM memory, Ethernet driver to the Ethernet/IP network, pan/tilt/zoom driver to the camera platform, and a modem to the phone line.

The CPU, the Ethernet connection and serial ports, and the alarm input and relay output represent the brain or computing functions of the video server. They handle the communication with the network. The CPU processes the actions of the web server and all of the driver software for controlling different PTZ cameras. The serial ports (RS-232 and RS-485) enable control of the camera’s PTZ functions and other surveillance equipment. There is a modem for connections to telephone or other transmission channels. The alarm input can be used to trigger the video server to start transmitting images. The relay output can start actions such as opening a door. The video server is equipped with image buffers and can send pre-alarm images of an alarm event. The flash memory is the equivalent of the hard disk of the video server and contains all software for the operating system and all applications.

7.9.2 Video Router/Access Point

The video router on the Internet is a device, or in some cases software in a computer, that determines the next network to which a packet of digital information should be forwarded toward its final destination. It connects at least two networks and determines which way to send each information packet. The router can be located at any juncture of a network or gateway, including each Internet point-of-presence. The router is often included as part of a network switch (Figure 7-35).

FIGURE 7-35 Typical router/access point

7.9.3 Video Switch

A switch port receives data packets and only forwards those packets to the appropriate port for the intended recipient. This further reduces competition for bandwidth between the clients, servers, or workgroups connected to each switch port.

7.10 PERSONAL COMPUTER, LAPTOP, PDA, CELL PHONE

Personal computers (PCs) and laptops are the most widely used appliances for monitoring video surveillance images on the digital network. The personal digital assistant (PDA) and cell phone are the choice when the absolute minimum in size is required and image quality is not the primary factor.

7.10.1 Personal Computer, Laptop

Personal computers and laptops have the computing capacity, digital storage, and network interfaces to monitor digital video and other surveillance functions through wired or wireless connections. They contain the displays, operating systems, application software, and communications devices to receive and communicate with all of the cameras and other devices on the security network. Laptops have the added functionality of being mobile, transportable, and battery-operated. This is a very useful attribute for rapid deployment video systems.

7.10.2 Personal Digital Assistant (PDA)

The full impact of video surveillance using wireless cameras, monitors, and servers has yet to be realized. Wireless video surveillance is rapidly growing in popularity for monitoring remote locations, whether from a laptop or a PDA.
WiFi video digital transmission provides the ability to deliver near real-time, full-motion video surveillance at 20 fps to PDAs and cell phones at any location having access to the Internet via the WiFi connection. A video server at the surveillance site compresses images and sends them wirelessly to the PDA or cell phone. The systems can provide secure access to validated mobile phones without any eavesdropping. The IP security cameras connected to the network transmit digital video via MPEG-4 video compression wirelessly to PDAs and cell phones. Remote video and alarm surveillance is only a phone call away: any time of day, anywhere in the world. Software is available that allows PDA users running Microsoft Pocket PC 2002 to receive video, thereby remotely monitoring security areas while mobile. The Axis Camera Explorer (ACE) lets you watch live network video from anywhere on a PDA (Figure 7-36).

FIGURE 7-36 Personal digital assistant (PDA) and cell phone used as video receiver: (a) PDA, (b) wireless link, (c) cell phone

Giving personnel the ability to remotely monitor secure areas greatly increases security functionality. Access to the system via the Internet is accomplished by assigning an IP address to every surveillance device and entering that address in a Web browser to connect with the system. Just about any PDA or laptop using Windows CE or Linux with a wireless card and a wireless Web modem can obtain a wireless remote video transmission. PDAs and Pocket PCs have a slot for a compact-flash-format WiFi radio. There are also small-format WiFi radios for PDAs and mobile data devices offering additional options for wireless connections. A PDA is a very useful monitoring device for a rapid deployment video system.

7.10.3 Cell Phone

The cellular phone network has a sub-carrier that can be used to transmit and receive control data for video cameras and other components.
This sub-carrier channel, called cellular digital packet data (CDPD), transmits digital data over the cellular telephone network using the idle time between cellular voice calls. A mobile data base station (MDBS) resides at each cellular phone cell site and uses a scanning receiver to detect the presence of any voice traffic, based on signal strength. Provided that two channels are idle (one for transmitting and one for receiving), the MDBS will establish an air link. The type of sub-carrier available depends on the security service provider.

7.11 INTERNET PROTOCOL SURVEILLANCE SYSTEMS: FEATURES, CHECKLIST, PROS, CONS

The following is a summary of features and key questions that should be considered in selecting a video surveillance transmission technology. Most comments apply to both wired and wireless networks; some apply to wireless networks only. Lists of pros and cons follow.

7.11.1 Features

• The IP surveillance provides worldwide remote accessibility. Any video stream, live or recorded, can be accessed and controlled from any location in the world over the wired or wireless network.
• Video images from any number of cameras can be stored in digital format in a host service. This enables the viewing of images from multiple cameras and playback of an entire sequence of events.
• The cost of developing the infrastructure for the Internet and security system services has been and will be borne primarily outside of the security industry.
• As long as there is access to the Internet, any location in the world that has a PC and a browser can be provided with security system services.
• The IP surveillance uses a more cost-effective infrastructure than analog technologies. Most facilities are already wired with a UTP IT infrastructure. The installation of future-directed hybrid systems will be capable of accommodating new analog as well as digital systems and thereby ensure compatibility.
• The IP surveillance technology provides an open, easily integrated platform to connect and manage the enterprise data, video, voice, and control, making management more effective and cost-efficient.
• The IP digital surveillance brings intelligence to the camera level. The VMD, event handling, sensor input, relay output, time and date, and other built-in capabilities allow the camera to make intelligent decisions on when to send alarms, and when and at what frame rate to send video.
• The cost savings for commercial companies and governmental agencies implementing IP technology could be massive. Multinational corporations and government agencies with plants and offices around the world already have worldwide communications networks onto which the security function could be added.

7.11.2 Checklist

• How much bandwidth is available for network transmission?
• How much total storage space is available to store the video images?
• Will video be viewed and recorded remotely?
• Must the video be of high enough quality to be used for personnel identification purposes?
• Does the application require real-time video?
• Are different frame rates needed during certain events or specific times?
• Is a peer-to-peer network or one with a base station (access point) required?
• How many base stations (access points or gateways) are needed?
• How will the WiFi network be connected to the Internet?
• What are the WiFi radio options for PCs, laptops, PDAs, and cell phones?
• How many users will use a single access point?
• What is the total number of users and computers?
• Will each computer use a WiFi connection?
• Is the video to be interfaced with existing networks?
• What is the available bandwidth that can be reserved for the video signal?
• What image quality and resolution are needed for the application?
• What resolution is needed to identify the person or activity in the scene?
• What frame rate is needed to be activity specific and sufficient to capture motion in the scene?
• Is a wired or wireless network more suitable? Is a wired network preferable to minimize security problems?
• What are the security requirements: standard or strategic? In strategic applications some form of encryption is needed.

7.11.3 Pros

There are many advantages to the implementation of IP surveillance technologies using either wired or wireless networks in small or large surveillance applications:

• The IP surveillance scales from one to thousands of cameras in increments of a single camera. There are no 16-channel jumps as in analog systems.
• Automatic transmittal of images over the Internet to a remote location to provide video images of events that just happened.
• Embedding the video images as HTML pages in a web server built right into the camera.
• Transmitting video images over wireless media to PDAs, laptops, and cell phones at local or remote monitoring locations.
• Remote guard tours to provide increased efficiency of guards and services at a greatly reduced cost.
• Intelligent monitoring and control, including the transmission of images triggered by alarm conditions with pre-alarm images.
• Remote surveillance from anywhere to anywhere, online any time—24/7/365.
• Wireless for convenience and cost considerations.
• Wireless a must when no wired installed network is available.
• Can now integrate video, alarm intrusion, access control, fire, etc. into a seamless security system.

7.11.4 Cons

Security personnel can question the security of Internet-based security systems. Section 7.7 described several important video surveillance security concerns when using digital IP networks. These included viruses, hackers, and eavesdropping. Another factor of concern is the reliability of the IP network, i.e. temporary loss of service.

• Some locations do not have high-speed Internet access.
• Some Internet service providers (ISPs) may not provide reliable service.

7.12 SUMMARY

Video imaging and storage is going through more technological changes and structural redefinition than any other part of the physical security market. The Internet and WWW have made long-range video security monitoring a reality for many security applications. Likewise the availability of high-speed computers, large solid-state memory, and compression technologies has made the sending of real-time video over these networks practical and effective. New methods of wireless transmission, including MIMO mesh, have improved the range, reliability, and QoS of wireless transmission. This chapter has described the digital video security and Internet transmission media with their unique modulation and demodulation requirements. The specific compression algorithms required to compress the video frame image file sizes, to make them compatible with the existing wired and wireless transmission channels available, are described. A powerful technology used to transmit the digital signal, called SSM, has made wireless video transmission a reality. The 802.11 spread spectrum protocols are described as they relate to video, voice, and command and control transmission. Security monitoring is no longer limited to local security rooms and security officers, but rather extends out to remote sites and personnel located anywhere around the world. Monitoring equipment includes flat panel displays, PCs, laptops, PDAs, and cell phones. The requirement for individual personnel to monitor multiple display monitors has changed to a technology of incorporating smart cameras with VMDs to establish an AVS system from local and remote sites. A key factor to be considered in any wired or wireless digital video network system is protecting the data from unfriendly intruders and viruses. Using WEP, VPN, firewalls, and anti-virus and encryption techniques is paramount.

Chapter 8
Analog Monitors and Digital Displays

CONTENTS
8.1 Overview
8.2 Analog Monitor
  8.2.1 Cathode Ray Tube Technology
    8.2.1.1 Beam Deflection
    8.2.1.2 Spot Size, Resolution
    8.2.1.3 Phosphors
    8.2.1.4 Interlacing and Flicker
    8.2.1.5 Brightness
    8.2.1.6 Audio/Video
    8.2.1.7 Standards
  8.2.2 Monochrome Monitor
  8.2.3 Color Monitor
  8.2.4 Color Model
8.3 Flat-Screen Digital Monitor
  8.3.1 Digital Technology
    8.3.1.1 Pixels, Resolution
  8.3.2 Liquid Crystal Display (LCD)
    8.3.2.1 Brightness
    8.3.2.2 Liquid Crystal Display Modes of Operation
  8.3.3 Plasma
  8.3.4 Organic LED (OLED)
8.4 Monitor Display Formats
  8.4.1 Standard 4:3
  8.4.2 High Definition 16:9
  8.4.3 Split-Screen Presentation
  8.4.4 Screen Size, Resolution
  8.4.5 Multistandard, Multi-Sync
  8.4.6 Monitor Magnification
8.5 Interfacing Analog Signal to Digital Monitor
8.6 Merging Video with PCs
8.7 Special Features
  8.7.1 Interactive Touch-Screen
    8.7.1.1 Infrared
    8.7.1.2 Resistive
    8.7.1.3 Capacitive
    8.7.1.4 Projected Capacitance Technology (PCT)
  8.7.2 Anti-Glare Screen
  8.7.3 Sunlight-Readable Display
8.8 Receiver/Monitor, Viewfinder, Mobile Display
8.9 Projection Display
8.10 Summary

8.1 OVERVIEW

In the late 1990s digital flat-screen devices began to be used in video surveillance systems. Dropping prices, improved performance, and the obvious space advantages of flat-panel displays have caused a rapid shift away from traditional CRT monitors for video security use. Digital video monitors and projectors are reaching new heights of performance and are replacing the longtime workhorse in the industry, the CRT monitor. This chapter analyzes the monitoring hardware used for video security systems. This hardware consists of a variety of monochrome and color monitors: CRTs, LCDs, plasma screens, and organic LEDs. These monitors vary in size from 5 to 42 inches diagonal.
The monitor size depends in part on how many cameras are to be monitored, how many security personnel will be monitoring, and how much room is available in the security room. The question of how many cameras will be viewed sequentially or simultaneously on a single monitor will be analyzed. There is discussion of the consequences of displaying 1, 2, 4, 9, 16, and 32 individual camera pictures on a single monitor. Special features and accessories for these analog and digital displays will be described. These include the touch screen, which allows an operator to input a command to the security system by touching defined locations on the monitor. Chapter 20 describes the integration of the analog CRT and flat-panel LCD monitor displays into the security console room. Several different hardware technologies exist for displaying the video image, computer data, and graphics on video monitors: CRT, LCD, plasma, and OLED. The video projector is used to display images on a large screen for viewing by multiple personnel. Monitors can receive an analog video signal and digital information in formats such as SVGA, NTSC, PAL, and SECAM. Color CRT monitors are versatile and often have resolutions from the standard 640 × 480 pixels to a high of 2048 × 1536 pixels with a 32-bit color depth (24-bit common), and a variety of refresh rates from 60 to 75 Hz. The sharpness of the analog display is described by the number of TV lines it can display and that of the digital display by the number of pixels. In general, the more pixels, the sharper the picture. Resolution and image quality on analog and digital monitors are described with different parameters: for the analog monitor TV lines, for the digital monitor pixels. The horizontal resolution of a 9-inch monochrome analog CRT monitor is approximately 800 TV lines, and for a 9-inch color monitor approximately 500 TV lines.
The horizontal resolution of a typical 17-inch monochrome monitor is 800 TV lines and for a 17-inch color monitor approximately 450 TV lines. Vertical resolution is about 350 TV lines on both types, as limited by the NTSC standard of 525 horizontal lines. The horizontal and vertical resolution for a 15-inch digital LCD with a 4:3 format is 1024 × 768 pixels, XGA (extended graphics array). Most monitors are available for 117-volt, 60 Hz (or 230-volt, 50 Hz) AC operation, and many for 12 VDC operation with a 117 VAC to 12 VDC wall converter. Video signal connections are made via RCA plug, BNC, or 9- or 25-pin connectors. The two-position switch on the rear of some monitors permits terminating the input video cable in either 75 ohms or high impedance (100,000 ohms). If only one monitor is used, the switch on the rear of the monitor is set to the 75-ohm or low-impedance position, matching the cable impedance for best results. If multiple monitors are used, all but the last monitor in the series are set to the high-impedance position. The last monitor is set to the low-impedance 75-ohm position. If a VCR or DVR recorder or a video printer is connected, all the monitors are set to high impedance and the recorder or printer devices set to low impedance. Recorder and printer manufacturers set the impedance to 75 ohms at the factory. Only one 75-ohm terminated device can be used at a time. Most cameras and monitors have a 4:3 geometric display format, that is, the horizontal-to-vertical size is 4 units by 3 units. The high definition television (HDTV) format has a 16:9 aspect ratio. For any application, the security director, security systems provider, and consultant must decide:

• Should each camera be displayed on an individual monitor?
• Should several camera scenes be displayed on one monitor?
• Should the picture from each individual camera be switched to a single monitor via an electronic switcher or multiplexer?

If there is high scene activity, i.e.
many people passing into or out of an area, all cameras should be displayed on separate monitors. For installations with infrequent activity or casual surveillance, a manual, automatic, or other switcher or split screen should be used. Since each installation requires a different number of monitors and has different monitoring criteria depending on the application, each installation becomes a custom design. The final layout and installation of a system should be a collaboration between the security department, management, outside consultants, and security equipment providers (dealer, installer, system integrator).

8.2 ANALOG MONITOR

Until the late 1990s the CRT monitor had been the technology used in virtually all security applications, including video surveillance, access control, alarm, and fire monitoring. Despite the widespread use of computer displays in many security departments and the availability of flat-panel technologies for displaying data and video images, the CRT display is still used in most security monitoring applications. The continuing success of the CRT monitor is based on an extremely simple concept, with a relatively simple structure, using solid-state semiconductor circuitry for all other electronic functions in the monitor. While the CRT still utilizes vacuum-tube technology, its combination with semiconductor technology provides the most cost-effective solution for displaying a video image, be it monochrome or color. The CRT monitor has become less expensive while improving in quality and lifetime. These monitors cost less than flat-panel digital displays because of their simple construction and long successful history of high-volume production. While the CRT has enjoyed many years of use, it is likely that plasma displays, LCDs, OLED displays, and other new technologies will eventually make CRT-based displays obsolete in video security applications.
These new designs are less bulky, consume less power, and are digitally based. As of mid-2003 some LCDs became directly comparable in price to CRTs.

8.2.1 Cathode Ray Tube Technology

The CRT, invented by Karl Braun, is the most common display device used in video surveillance, computer displays, television sets, and oscilloscopes. The CRT was developed using Philo Farnsworth’s work and had been used in virtually all television sets until the late 20th century. Components of the monochrome CRT monitor include the video amplifying and deflection circuitry, the video-processing circuits that remove the synchronizing signals from the video signal, and the CRT itself (Figure 8-1).

FIGURE 8-1 Block diagram of cathode ray tube (CRT) monitor. The video input passes through a video processing amplifier and sync stripper/separator to the vertical and horizontal deflection supplies and drives. The CRT comprises (1) an electron source (cathode), (2) electron beam control (grid), and (3) an electron beam focusing coil, with vertical and horizontal deflection coils directing the electron beam to a focused spot on the screen. A rear switch selects 75-ohm (low) or 100,000-ohm (high) input impedance.

The CRT is composed of four basic components: (1) heated cathode, (2) electron gun, (3) glass envelope, and (4) phosphor screen. The color CRT requires three electron guns of similar construction to display the three primary colors: red, green, and blue (RGB). The monochrome CRT is relatively easy to manufacture since the screen consists of a uniform coating of a single phosphor material. The yield during manufacture of CRTs is high (compared with LCD or plasma displays) since the human eye is far less sensitive to variations in phosphor flaws than it is to the defective pixel or cell failures in flat-panel digital displays.
The homogeneous and continuous phosphor layer has very high resolution, as contrasted with the discrete cells (pixels) of flat panels. The resolution of a CRT is limited by the electron beam diameter and by the electronic video bandwidth, which determines how fast the electron beam can turn on and off. The lifetime of standard CRTs is legendary, especially considering the adverse operating conditions under which they are used: standard consumer TVs receive considerable abuse yet continue to operate. CCTV monitors may be cycled on and off and adjusted over a wide range of brightness and contrast, sometimes beyond their design limits, and still operate satisfactorily for many years.

8.2.1.1 Beam Deflection

Cathode rays are streams of high-speed electrons emitted from the heated cathode of the vacuum tube. In a CRT the electrons are carefully directed into a beam, and this beam is deflected by a magnetic field to scan the surface at the viewing end (anode), which is lined with a phosphorescent material that produces the visible image on the face of the tube. In the video monitor, as in television and computer monitors, the entire front area of the tube is scanned in a fixed pattern called a raster. The picture is created by modulating the intensity of the electron beam according to the scene light intensity represented in the video signal. The magnetic field is applied to the neck of the tube by an electromagnetic coil, and the process is referred to as magnetic deflection. The CRT bends the electron beam at extremely high speed with exact timing and gating to produce a complex picture. Electron beams can be deflected so quickly that pictures on the screen can be refreshed without noticeable flicker. The CRT’s electron beam strikes and excites the phosphor screen, which has a high luminous efficiency.
Most CRT monitors use magnetic deflection to deflect the electron beam in the horizontal and vertical directions to produce the scene on the monitor face. Figure 8-1 illustrates the placement of the vertical and horizontal deflection coils at the neck of the CRT. When current flows through the horizontal coils, a horizontal magnetic field is produced across the neck. The amount of horizontal deflection of the electron beam depends on the strength of the magnetic field and therefore the current through the coil. The direction of the beam deflection (left to right) while passing through the horizontal coil depends on the polarity of the field. Likewise for the vertical deflection coil the electron beam is deflected up or down depending on the strength of the magnetic field which in turn depends on the vertical deflection current. The energizing of both the horizontal and the vertical coils causes the raster scan and picture on the CRT monitor. As with the scanning in a tube or solid-state video camera, the video monitor has an aspect ratio of 4:3 with a diagonal of 5 units. The size of the tube is measured from one corner of the screen to the opposite corner and referred to as the diagonal. A disadvantage of the CRT is its relative size, particularly its depth compared to digital displays that have a short depth. However, if there is sufficient space behind the monitor there is no disadvantage to the CRT monitor size. 8.2.1.2 Spot Size, Resolution The term “image resolution” describes how much image detail the image can display on an analog or digital monitor. Higher resolution means more image detail. In analog monitors the resolution is generally defined in TV lines and defined as the number of black-and-white line pairs that are distinguishable in the horizontal and vertical direction. 
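The 4:3 aspect ratio with its 5-unit diagonal, described above, forms a 3-4-5 right triangle, so screen width and height follow directly from the quoted diagonal size. A minimal sketch in Python (the function name is illustrative, not from the text):

```python
import math

def screen_dimensions(diagonal, aspect_w=4, aspect_h=3):
    """Width and height of a screen from its diagonal measurement.

    For the standard 4:3 video aspect ratio the diagonal spans 5 units
    (a 3-4-5 right triangle), so a 20-inch diagonal tube is 16 x 12 inches.
    """
    unit = diagonal / math.hypot(aspect_w, aspect_h)  # length of one aspect-ratio unit
    return aspect_w * unit, aspect_h * unit

print(screen_dimensions(20))  # a 20-inch diagonal 4:3 tube is 16 x 12 inches
```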
There is sometimes confusion in defining the horizontal resolution, since it is sometimes defined as the number of horizontal TV lines in a width equivalent to the vertical height, and at other times as the total number of TV lines along the horizontal axis. The correct definition is the number of TV lines in a width equivalent to the vertical height. The spot size is the diameter of the focused electron beam on the screen, which ultimately determines the resolution and quality of the picture. The spot size should be as small as possible to achieve high resolution. Typical spot sizes range from about 0.1 to 1.0 mm. The spot size is smallest at the center of the CRT and largest at the corners (5–10% larger). Deflection along the edges elongates the spot and decreases resolution. The convention for describing the image resolution of digital raster displays is a pair of positive integers, where the first number is the number of pixel columns (horizontal width) and the second is the number of pixel rows (height). The second most popular convention is to state the total number of pixels in the image, calculated by multiplying the number of pixel columns by the number of pixel rows.

8.2.1.3 Phosphors

Cathode ray tube phosphors glow for a time determined by the phosphor material and must be matched to the refresh rate. The white P4 phosphor has long been the standard monochrome television monitor phosphor. It is capable of achieving good focus and small spot size. Its low cost and ready availability contribute to its continued popularity in monitors. P4 is a medium- to medium-short-persistence phosphor. The phosphor glow activated by the electron beam fades away fairly rapidly, leaving no cursor trail or temporary "ghost" scene when the monitor is turned off. The P4 phosphor is moderately resistant to phosphor "burn," a term used to describe the permanent dark pattern caused by fixed bright scenes in the video image on the CRT face.
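The two digital-resolution conventions just described, columns × rows versus total pixel count, differ only by a multiplication. A small sketch (names are illustrative):

```python
def total_pixels(columns, rows):
    """Convert the 'columns x rows' convention to the total-pixel-count convention."""
    return columns * rows

# A VGA-sized 640 x 480 image contains 307,200 pixels.
print(total_pixels(640, 480))  # -> 307200
```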
The susceptibility to burn is somewhat proportional to persistence, with longer-persistence phosphors more liable to burn-in. Color tubes use three different phosphorescent materials that emit red, green, and blue light. These colors are emitted from closely packed patterns of dot clusters or stripes (Sony Trinitron) as determined by a shadow mask. There are three electron guns (one for each color). The shadow mask ensures that the electrons from each color gun reach only the phosphor dots of its corresponding color, absorbing electrons that would otherwise hit the wrong phosphor.

8.2.1.4 Interlacing and Flicker

Flicker is the visible fading between image frames displayed on any monitor. In the CRT monitor it occurs when the CRT is driven at too low a refresh rate (frame rate), allowing the screen phosphors to lose their excitation between sweeps of the electron gun. On computer monitors using progressive scan (no interlace), if the vertical refresh rate is set at 60 Hz, most monitors will produce a visible flickering effect. Refresh rates of 75–85 Hz result in flicker-free viewing on progressively scanned CRTs. Above these rates no further flicker reduction is noticeable, and therefore higher rates are uncommon in video surveillance applications. Although it has become acceptable to call 60 Hz non-interlaced displays flicker-free, a large percentage of the population can see flicker at 60 Hz in peripheral vision when common P4-type phosphors are used. A 19-inch CRT viewed at 27 inches covers more than the central cone vision, and therefore most people see some flicker. While this situation is not ideal, it cannot be overcome because of the inherent 60 Hz power-line frequency. On LCDs, refresh rates of around 75 Hz or lower are often acceptable. Interlacing is one of the most common and cost-effective methods used to achieve increased resolution at conventional 60 Hz scan rates.
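The flicker figures above come down to how often any given scan line is rewritten: with 2:1 interlace, each line is refreshed at only half the vertical sweep (field) rate. A sketch of that arithmetic, with illustrative names:

```python
def line_refresh_interval_ms(field_rate_hz, interlaced):
    """Time between refreshes of a given scan line, in milliseconds.

    With 2:1 interlace each line is rewritten only every other vertical
    sweep, so a 60 Hz field rate refreshes each line at the 30 Hz frame rate.
    """
    line_rate = field_rate_hz / 2 if interlaced else field_rate_hz
    return 1000.0 / line_rate

print(line_refresh_interval_ms(60, False))  # progressive: each line redrawn every ~16.7 ms
print(line_refresh_interval_ms(60, True))   # interlaced: every ~33.3 ms, hence the need
                                            # for a longer-persistence phosphor
```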
One critical design consideration in interlaced operation is that a long-persistence P39 phosphor must be used; P4 phosphor is not suitable for interlaced operation (the European equivalent of P4 is W). The glow of short- to medium-persistence P4 phosphor begins to fade before it can be refreshed. At the standard US non-interlaced 60 Hz refresh rate this presents no problem: the viewer's eye retains the image long enough to make any fading imperceptible. In an interlaced monitor, the beam skips every other row of phosphor as it moves down the CRT face in successive horizontal scans. Only half the image is refreshed in a vertical sweep cycle, so the frame-refresh rate is effectively 30 Hz (two 1/60-second fields equal one 1/30-second frame). The eye cannot retain the image long enough to prevent pronounced flicker in the display if a short- to medium-persistence phosphor is used. The phosphor glow must persist long enough to compensate for the slower refresh rate. The "flicker threshold" of the human eye is about 50 Hz with a short- to medium-persistence phosphor. Monitor manufacturers designing for European-standard 50 Hz operation therefore pay particular attention to the phosphor used. Interlaced scanning is the method used in most video systems to reduce flicker, and since video scene content consists of large white areas, no objectionable flicker is apparent. In computer alphanumeric/graphic displays, most display data consists of small bright or dark elements. Consequently, an annoying flicker results when alphanumeric/graphic data are displayed using interlaced scanning unless a longer-persistence phosphor is used. Therefore the phosphor type used in the video monitor is different from that used for computer terminal monitors.

8.2.1.5 Brightness

The luminance (brightness) of the CRT monitor picture is proportional to the electron beam power, while the resolution depends on the beam diameter. Both of these properties are determined by the electron gun. Very high resolution monitors are available having a resolution of 3000 lines—close to the ergonomic limit of the human eye. Present security systems do not take advantage of this high resolution, but some systems display 1000-line horizontal resolution.

8.2.1.6 Audio/Video

As with home television cameras and receivers, some monitors are equipped with audio amplifiers and speakers so that video and audio from the camera location are displayed on and heard from the monitor. The video input impedance is 75 ohms and the audio input impedance is 600 ohms.

8.2.1.7 Standards

The NTSC television format uses 525 lines per frame with about 495 horizontal lines (any number of lines between 482 and 495 may be transmitted at the discretion of the TV station) for the picture content. To produce satisfactory horizontal picture definition—that is, a gray scale and a sufficient number of gradations from dark to light per line—a bandwidth of at least 4.2 MHz is required. CCTV monitors generally conform to EIA specifications EIA-170, RS-330, RS-375, and RS-420, and most often to UL specification 1410 for signal specifications and safety. The analog circuitry is usually capable of reproducing a minimum of ten discernible shades of gray, as described in the RS-375 and RS-420 specifications. The outer glass on the front of the CRT allows the light generated by the phosphors to get out of the monitor. However, for color tubes the glass must block dangerous X-rays generated by the impact of the high-energy electron beam, so leaded glass is used. Modern CRTs are safe and well within safety limits for humans because of this and other shielding and protective circuits designed to prevent the anode voltage from rising to levels that produce X-ray emission. CRTs operate at very high voltages that can persist long after the monitor has been switched off. The CRT monitor, and especially the tube, should not be tampered with unless the technician has had proper engineering training and appropriate precautions have been taken. Since the CRT contains a vacuum, care should be taken to prevent tube implosion caused by improper handling.

8.2.2 Monochrome Monitor

Figure 8-1 shows the block diagram including the video-processing circuits to remove the synchronizing signals from the video signal, the video amplifying and deflection circuitry, and the CRT. In its simplest form, the analog CRT monochrome monitor consists of:

• input video terminating circuit
• video amplifier and driver
• sync stripper
• vertical-deflection circuitry
• horizontal-deflection circuitry
• focusing electronics
• CRT: cathode electron generator, electron gun, faceplate.

The video input signal to the monitor is a negative sync type, with the scene signal amplitude modulated as the positive portion of the signal (see Figure 5-4) and the synchronizing pulses as the negative portion. Via frequency-selective circuits, the horizontal and vertical synchronization pulses are separated and passed on to the horizontal and vertical drive circuits. The sync-stripper circuit separates the analog video signal from the horizontal and vertical synchronizing pulses. These synchronizing pulses produce the scanning signals for the horizontal and vertical deflection of the electron beam and are similar to those used in the camera to produce scanning of the image sensor. The vertical- and horizontal-deflection electronics drive the vertical and horizontal coils on the neck of the CRT to produce a raster scan. The CRT consists of: (1) a cathode source that emits electrons to "paint" the picture, (2) a grid (valve) that controls the flow of the electrons as they pass through it, (3) a set of electrodes that focus the electron beam down to a spot, and (4) a phosphor-coated screen that produces the visible picture (Figure 8-2).
When the focused beam passes through the field of the tube's deflection yoke (coils), it is deflected to strike the appropriate spot on the tube's phosphor screen. By varying the voltage on the horizontal and vertical coils, the electron beam and spot are made to move across the CRT in the familiar raster pattern. The screen then emits light with intensity proportional to the beam intensity, resulting in the video image on the monitor. The CRT monitor accomplishes all this using relatively simple and inexpensive components. In this way the scene received by the camera is reconstructed at the monitor. The block diagram applies to any monochrome analog monitor; the color monitor has electronics for the three primary colors: red, green, and blue (RGB). Analog video monitors accept the standard video baseband signal (20 Hz to 6.0 MHz) and display the image on the CRT phosphor. The monitor circuitry is essentially the same as a television receiver but lacks the electronic tuner and associated RF amplifiers and demodulators needed to receive VHF or UHF broadcast, cable, and satellite signals. All monochrome and color security monitors accept the standard 525-line NTSC input signal. The video signal enters the monitor via a BNC connector and is terminated by one of two input impedances: 75 ohms to match the coaxial-cable impedance, or high impedance (10,000–100,000 ohms) (Figure 8-3). The high-impedance termination does not match the coaxial-cable impedance and is used when the line will be terminated by some other equipment such as a looping monitor, a VCR or DVR, or some other device with a 75-ohm impedance.
If two or more monitors receive the same video signal from the same source, only one of the monitors—the last one in line—should be set to the 75-ohm position.

FIGURE 8-2 Cathode ray tube (CRT) components. In the monochrome CRT the electron beam focuses to a single spot on the screen; in the color CRT a gun assembly produces three electron beams (R, G, B) that focus to form one color pixel cluster.

FIGURE 8-3 CCTV monitor terminations and connections. The monitor video input impedance is switched LOW (75 ohms, internal termination) or HIGH (10,000 to 100,000 ohms) when looping through to a second monitor (CRT/LCD), recorder (DVR/VCR), or printer.

If a recorder rather than a second monitor is used, the recorder automatically terminates the coaxial cable with a 75-ohm resistor. As shown in the block diagram in Figure 8-3, the monitor has two BNC input connectors in parallel. When only one monitor is used, the impedance switch is moved to the 75-ohm, low-impedance position, terminating the coaxial cable. If more than one monitor or auxiliary equipment is used, the terminating switch is left in the high-impedance position, opening the connection to the 75-ohm resistor so that the final termination is determined by a second monitor, recorder, or printer. Some monitors contain an external synchronization input so that the monitor may be synchronized from a central or external source. The operator controls available on most monochrome monitors are power on/off, contrast, brightness, horizontal hold, and vertical hold. Three other controls sometimes available via screwdriver adjustment (front or rear of the monitor) are horizontal size, vertical size, and focus.
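The termination rule described above—every looped-through device set to high impedance, only the last device on the cable switched to 75 ohms—can be sketched as a small helper (hypothetical names, not a real product API):

```python
def termination_settings(devices):
    """Impedance-switch position for each device on one looped coaxial run.

    Only the final device terminates the cable in 75 ohms; all earlier
    devices bridge the line in the high-impedance (10k-100k ohm) position.
    """
    return ["75-OHM" if i == len(devices) - 1 else "HIGH-Z"
            for i in range(len(devices))]

print(termination_settings(["monitor 1", "monitor 2", "DVR"]))
# -> ['HIGH-Z', 'HIGH-Z', '75-OHM']
```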
8.2.3 Color Monitor

Until recently the major CRT technology used in color monitors employed three electron guns (one for each primary color) arranged in a triangle, called the delta-delta system (Figure 8-4a). A device called a shadow mask aligns each electron gun output so that the beam falls on the proper phosphor dot. The shadow mask is a thin steel screen in the CRT containing fine holes that concentrate the electron beam. This technique provides the highest resolution possible but requires the guns to be aligned manually by a technician, as well as expensive convergence-control circuitry. The composite video input signal in the color monitor contains the information for the correct proportions of the R, G, B signals to produce the desired color. It also contains the vertical and horizontal synchronization timing signals needed to steer the three video signals to the correct color guns. Composite video color monitors decode the signal and provide the proper level to generate the desired output from the three electron guns. Today the most widely used CRT color technique is the precision-in-line (PIL) tube, which eliminates most of these difficulties (Figure 8-4b). The PIL tube uses the shadow mask found in its predecessors, but the electron guns are in a single line. The spacing between the holes is termed the dot pitch or dot-trio spacing and ultimately determines the tube's resolution. The highest-resolution production PIL tube has approximately a 0.31 mm pitch and is preconverged by the manufacturer, so that no adjustment is necessary in the field. There is a slight decrease in resolution for the PIL compared with the original delta-delta, but this is a small sacrifice considering that no field adjustment is required. A third CRT color tube, called the Trinitron (trademark of Sony Corporation), has a phosphor layer consisting of alternating RGB vertical stripes (Figure 8-4c). Analog CRT monitors are available in many sizes.
The 5- and 9-inch diagonal sizes are suitable for side-by-side mounting in the standard EIA 19-inch rack.

FIGURE 8-4 Color monitor technology: (A) delta, (B) precision in-line (PIL) with 0.31 mm pitch in-line electron gun, (C) Trinitron. The red (R), green (G), and blue (B) dots together form a single white dot.

Figure 8-5 shows a few examples of these monitors.

FIGURE 8-5 Typical CRT monitors: (A) 9-inch monochrome, (B) 14-inch color.

8.2.4 Color Model

Video monitors and computer displays use the RGB color model, an additive model in which red, green, and blue light are combined in different proportions to create all the other colors. The idea of the RGB model itself came from the additive light model. Primary colors are related to biological rather than physical concepts, and are based on the physiological response of the human eye to light. The human eye contains receptor cells called cones which respond most to yellow, green, and blue light (wavelengths of 564, 534, and 420 nm, respectively). The color red is perceived when the yellow–green receptor is stimulated significantly more than the green receptor. The RGB model is used to display colors on CRT, LCD, plasma, and OLED monitors. Each pixel on the screen is represented in the video signal or computer's memory as an independent value for red, green, and blue. These values are then converted into intensities and sent to the CRT or flat-panel display. Using an appropriate combination of red, green, and blue light intensities, the screen can reproduce the colors between its black and white levels. Most computer models use a total of 24 bits of information for each pixel, commonly known as bits per pixel or bpp. This corresponds to eight bits each for red, green, and blue, giving a range of 256 possible values or intensities for each color. With this system approximately 16.7 million discrete colors can be reproduced.
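The 24 bpp encoding described above (8 bits per primary) maps directly to the familiar integer triplets and to the hex notation used for Web colors. A brief sketch:

```python
RGB_EXAMPLES = {
    "black": (0, 0, 0), "white": (255, 255, 255),
    "red": (255, 0, 0), "green": (0, 255, 0), "blue": (0, 0, 255),
    "yellow": (255, 255, 0), "cyan": (0, 255, 255), "magenta": (255, 0, 255),
}

def to_hex(rgb):
    """Render a 24 bpp RGB triple as Web-style hex, e.g. (255, 0, 0) -> '#FF0000'."""
    return "#{:02X}{:02X}{:02X}".format(*rgb)

print(to_hex(RGB_EXAMPLES["yellow"]))  # -> #FFFF00
print(256 ** 3)                        # -> 16777216, the ~16.7 million discrete colors
```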
In 24 bpp, RGB values are commonly specified using three integers between 0 and 255, representing the red, green, and blue intensities in that order. For example:

• (0, 0, 0) is black
• (255, 255, 255) is white
• (255, 0, 0) is red
• (0, 255, 0) is green
• (0, 0, 255) is blue
• (255, 255, 0) is yellow
• (0, 255, 255) is cyan
• (255, 0, 255) is magenta.

The colors used in Internet Web design are commonly specified using the RGB model. They are used in HTML and related languages, originally with a limited color palette of 216 RGB colors as defined by the Netscape Color Cube. However, with the predominance of 24-bit displays, the full range of 16.7 million colors is now commonly used. The RGB color model for HTML was formally adopted as an Internet standard in HTML 3.2.

8.3 FLAT-SCREEN DIGITAL MONITOR

There are several technologies used to manufacture flat-panel digital displays, and more are in development. They all have one feature in common: a much smaller depth than the traditional CRT monitor display. Typical flat-panel displays are from 1/2 to 4 inches in depth, compared with 10–20 inches for CRT monitors. The most common flat-panel displays are:

• liquid crystal display (LCD)
• plasma
• organic LED (OLED).

The LCD and plasma displays have been in commercial production and in widespread use in the video surveillance industry for several years. The OLED monitors are beginning deployment in small sizes, but are expected to be introduced in large sizes tailored for the security and computer markets. Flat-panel displays offer a small footprint and trendy modern look but have higher costs and, in many cases, inferior images compared with traditional CRTs. In some applications, specifically modern portable devices such as laptops, cell phones, and PDAs, these negatives are being overcome.

8.3.1 Digital Technology

A raster graphics image, digital image, or bitmap is the display format used by most digital video flat-screen monitors.
In general the technology represents a rectangular grid of pixels or points of color on a computer monitor. The color of each pixel is individually defined (RGB) and generally consists of three bytes, one byte each for red, green, and blue. For a monochrome (black-and-white) image, only a single byte per pixel is required. This raster presentation is distinguished from vector graphics, in which an image is generated through the use of geometric objects such as lines, curves, arcs, and polygons. The bitmap on the monitor corresponds to the format of the image on the camera, and is stored identically to it in the video display's computer memory. Each pixel in the map has a specific width and height, and the bitmap representing the image has an overall width and height consisting of a specific number of rows and columns of pixels. The quality of the raster image is determined by the total number of pixels (resolution) and the amount of information in each pixel (often called color depth). The standard for most high-quality displays in 2004 was an image storing 24 bits (3 bytes) of color information per pixel. Such an image in a typical surveillance application is sampled at 640 × 480 pixels (307,200 pixels total); it looks good, although not as good as an image sampled at 1280 × 1024 (1,310,720 pixels). High-quality, high-resolution pictures such as these generally require compression techniques to reduce the size of the image file stored in the computer's memory and to fit the signal into the limited-bandwidth communication channels available. Raster graphics cannot be scaled to a higher resolution (i.e. a larger screen size) without a loss of resolution and image quality, in contrast to vector graphics, which can easily scale to the size of the device on which they are displayed while retaining quality.
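The storage arithmetic behind the compression remark above is simple: three bytes per pixel times the pixel count. A sketch:

```python
def raster_bytes(columns, rows, bytes_per_pixel=3):
    """Uncompressed size of a raster image (one byte each for R, G, B per pixel)."""
    return columns * rows * bytes_per_pixel

# 640 x 480 at 24 bpp is 921,600 bytes; 1280 x 1024 is roughly four times
# larger - hence the need for compression in storage and transmission.
print(raster_bytes(640, 480), raster_bytes(1280, 1024))
```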
8.3.1.1 Pixels, Resolution

A pixel (contraction of "picture element") is the smallest resolution element making up the monitor picture, the image in computer memory, or the camera sensor image. Usually the pixels (dots) are so small and so numerous that they cannot be distinguished individually and appear to merge into a smooth image when viewed at a normal distance from the monitor. The pixel dots in the flat-panel display are analogous to the dots used to produce a printed image in hard-copy printed matter. The color and intensity of each pixel represents the scene image at that location. The more pixels used to represent the image, the higher the resolution and the closer the image resembles the original scene. The number of pixels in the image determines the image resolution. The normal VGA display has 640 × 480 pixels. In a monochrome image each pixel has its own brightness in the range from zero to one, where zero represents black and one represents white. For example, an eight-bit image can display 256 brightness levels. In a color image the number of distinct colors that can be represented by pixels depends on the number of bits per pixel (bpp). Some standard values are:

• 8 bpp provides 256 colors.
• 16 bpp provides 65,536 colors, referred to as Highcolor.
• 24 bpp provides 16,777,216 colors, referred to as Truecolor.

In full-color LCD, plasma, and OLED flat panels, as well as CRT monitors, each pixel is constructed from three closely spaced sub-pixels for the three colors. A unique technology is the Sony Trinitron, which has three closely spaced stripes of red, green, and blue (Figure 8-4c). Each sub-pixel has an intensity determined by its RGB color component value, and because of their close proximity they create the illusion of being one specifically tinted pixel.
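The bpp figures listed above are simply powers of two; a one-line sketch:

```python
def color_count(bpp):
    """Number of distinct colors representable at a given bits-per-pixel depth."""
    return 2 ** bpp

print(color_count(8))   # -> 256
print(color_count(16))  # -> 65536 (Highcolor)
print(color_count(24))  # -> 16777216 (Truecolor)
```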
A recent technique for increasing the apparent resolution of a color display is referred to as "sub-pixel font rendering." This technique uses knowledge of the pixel geometry to manipulate the three color sub-pixels separately, and it works best on LCDs. It also eliminates much of the anti-aliasing in some scenes and is used primarily to improve the appearance of text. Microsoft's ClearType™, available in Windows XP, is an example of this technology. The display resolution of a digital video monitor or computer display is the maximum number of pixels that can be displayed on the screen, usually given as the number of columns (horizontal, X) by the number of lines (vertical, Y). The horizontal number is always stated first. Common computer display resolutions are listed in Table 8-1. The 640 × 480 resolution was introduced by IBM and was in use from approximately 1990 to 1997 in their PS/2 VGA multicolor onboard graphics chips. This particular format was chosen partly for its 4:3 ratio. The 800 × 600 array has been the standard resolution from 1998 to the present, but 1024 × 768 is fast becoming the standard since it has not only the 4:3 ratio but higher resolution. Many websites and multimedia products are designed for this resolution. Windows XP is designed to run at 800 × 600 minimum, although it is also possible to run applications in the 640 × 480 format. With 15- and 17-inch digital monitors in use, 1024 × 768 resolution is standard. For 19-inch monitors 1280 × 1024 is the recommended standard. Good 21-inch monitors are capable of 1600 × 1200 resolution. There are also 24-inch wide-screen monitors that can often display 1900+ pixels horizontally.
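A quick way to check the aspect ratio of any pixel format discussed here is to reduce columns/rows to lowest terms; note that the 16:10 formats reduce mathematically to 8:5, and that 1280 × 1024 is actually 5:4 rather than 4:3. A sketch:

```python
from fractions import Fraction

def aspect_ratio(columns, rows):
    """Reduce a pixel format to its aspect ratio, e.g. 1024 x 768 -> '4:3'."""
    r = Fraction(columns, rows)  # Fraction reduces to lowest terms automatically
    return f"{r.numerator}:{r.denominator}"

print(aspect_ratio(1024, 768))   # -> 4:3
print(aspect_ratio(1920, 1080))  # -> 16:9
print(aspect_ratio(1280, 1024))  # -> 5:4 (not quite 4:3)
```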
Table 8-1 Digital Video Monitor Display Formats

COMPUTER      PIXEL         ASPECT   SCREEN SIZES, DIAGONAL (inch)*
STANDARD      FORMAT        RATIO    LCD              PLASMA
QVGA          320 × 240     4:3      —                —
VGA           640 × 480     4:3      15, 17, 20       —
SVGA          800 × 600     4:3      20               —
XGA           1024 × 768    4:3      15               42, 43
XGA+          1152 × 864    4:3      —                —
SXGA+         1400 × 1050   4:3      —                —
WSXGA         1680 × 1050   16:10    22               —
WUXGA         1920 × 1200   16:10    23               —
QXGA          2048 × 1536   4:3      —                —
HDTV 1080i    1920 × 1080   16:9     42, 45           42, 50
HDTV 720p     1280 × 720    16:9     17, 23, 27, 30   42, 50

8.3.2 Liquid Crystal Display (LCD)

In 1968 a group at RCA demonstrated the first operational LCD based on the dynamic scattering mode (DSM). In 1969 a former member of the RCA group at Kent State University discovered the twisted nematic field effect in liquid crystals, and in 1971 the ILIXCO Company produced the first LCD based on this effect. This technology has superseded the DSM type and is now used in most LCD displays. The LCD is a thin, lightweight panel consisting of an electrically controlled light-polarizing liquid sealed in cells between two transparent polarizing sheets (Figure 8-6). The polarizing axes of the two sheets are perpendicular to each other, and each cell is supplied with electrical contacts that allow electric fields to be applied to the liquid inside. In operation, when no electric field is applied, light is polarized by one sheet, rotated through the smooth twisting of the crystal molecules, and then passes through the second sheet. The entire assembly looks nearly transparent, with a slight darkening caused by light losses in the original polarizing sheet. When an electric field is applied to the panel, the molecules in the liquid align themselves with the field and inhibit the rotation of the polarized light. Since the light then impinges on the second polarizing sheet perpendicularly to its direction of polarization, all the light is absorbed in the cell and it appears dark. Most visible wavelengths—all colors—are rotated by LCDs in the same way. In a color LCD each pixel triad is divided into three sections having one red, one green, and one blue filter to project the individual colors. All colors are achieved by varying the relative brightness of the three sections.

FIGURE 8-6 Reflective twisted nematic LCD assembly: polarization overview (light source, polarizing filters, random polarization blocked); twisted nematic (TN) operation with the LCD off (light transmitted) and on (light blocked by the second polarizer under DC voltage); and construction of the TN display (upper polarizer, upper glass, patterned transparent electrodes, liquid crystal, lower glass, lower polarizer).

8.3.2.1 Brightness

The brightness of a display is specified by a unit of luminance called the "nit," which is often used to quote the brightness of a display. Typical displays have a luminance of 200–300 nits. Outdoor high-brightness displays can have luminance values in the range of 1000–1500 nits.

8.3.2.2 Liquid Crystal Display Modes of Operation

The LCD technology lends itself to several different modes of operation: it can be operated in a transmissive or a reflective mode. A transmissive LCD is illuminated from the back side and viewed from the front side. Activated cells therefore appear dark while inactive cells appear bright. Transmissive LCD technology is used in high-brightness indoor applications and for outdoor use. In this mode of operation a lamp assembly is used to illuminate the LCD panel, and it usually consumes more power than the LCD panel itself (Figure 8-7). The second LCD technology is the reflective type, which uses ambient light reflected off the display.
It has a lower contrast than the transmissive type and is not generally useful for video security applications except for battery-operated systems where very low power operation is required. It finds most application where small, handheld monochrome displays are required. Quality-control issues in manufacturing LCD panels are different from those for CRT monitors. Since the digital panels contain thousands of individual pixels, a defect in the panel is visible whenever one or more of the pixels is not operating. However, the panels may still be useful if only a limited number of pixels are not operating or if the defective pixels are not in the central part of the display. Several criteria are used to grade individual LCD panels and determine whether they are suitable for security applications. One criterion used for passing or failing LCD panels was developed by IBM to quality-check their ThinkPad laptop computers: if the panel had fewer than a specified number of defective (bright or dark) dots, it was passed; if more, it was rejected (Figure 8-8). The first-generation LCDs used passive matrix technology. This technology uses a simple conductive grid to deliver current to the liquid crystals in the target area. The second-generation active matrix display uses a grid of transistors with the ability to hold a charge for a limited time, much like a capacitor. Because of the switching action of the transistors, only the desired pixel receives a charge, improving the image quality over a passive matrix.
The thin-film transistors hold a charge, and therefore the pixel remains active until the next refresh occurs.

FIGURE 8-7 Thin-film transistor (TFT) liquid crystal array—LCD assembly. Each single pixel has a TFT at the intersection of a gate line and a data line with a pixel electrode; the full LCD array is backlit by lamps with a lamp power supply, and the panel drive electronics and video processor accept composite video, RGB, VGA, and DC power inputs.

FIGURE 8-8 Pass/fail quality-control criteria for LCD panels. Blemish criterion used by a major manufacturer for a 12-inch diagonal LCD display (allowed bright spots / dark spots / total spots by resolution): QXGA 15/16/16, UXGA 11/16/16, SXGA+ 11/13/16, XGA 8/8/9, SVGA 5/5/9. The example shows a 12-inch LCD panel with 6 black and 3 white spots.

The AMLCD (active matrix LCD) is in widespread use in the security industry and is the only choice of notebook computer manufacturers. These panels are used for their light weight, very good image quality, wide color range, and fast time response. The display contains an active matrix with polarizing sheets and cells of liquid crystal, and a matrix of thin-film transistors (TFTs). These transistors store the electrical state of each pixel in the display while the other pixels are being updated. This method provides a much brighter, sharper display than a passive matrix LCD of the same size. An important specification for these displays is the large viewing angle they can accommodate. These displays have refresh rates of around 75 Hz.

8.3.3 Plasma

The plasma display is an emissive flat panel in which light is created by phosphors excited by a plasma discharge between two flat panels of glass. The gas discharge contains no mercury (contrary to the backlights of the AMLCD panel); the plasma display uses a mixture of the noble gases neon and xenon. The neon and xenon gases in the plasma display are contained in hundreds of thousands of tiny cells sandwiched between the two plates of glass (Figure 8-9).
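The panel-grading criterion of Figure 8-8 can be sketched as a simple pass/fail check. The limits below are read from the figure; treating the listed numbers as inclusive maxima is an assumption, since the figure does not state whether a panel exactly at the limit passes:

```python
# Allowed (bright, dark, total) defective dots per Figure 8-8, by panel resolution.
BLEMISH_LIMITS = {
    "QXGA": (15, 16, 16), "UXGA": (11, 16, 16), "SXGA+": (11, 13, 16),
    "XGA": (8, 8, 9), "SVGA": (5, 5, 9),
}

def panel_passes(resolution, bright, dark):
    """Pass/fail an LCD panel against the per-resolution blemish criterion."""
    max_bright, max_dark, max_total = BLEMISH_LIMITS[resolution]
    return bright <= max_bright and dark <= max_dark and bright + dark <= max_total

# The example panel in Figure 8-8 has 3 bright (white) and 6 dark (black) spots:
print(panel_passes("XGA", bright=3, dark=6))   # passes at XGA (9 total <= 9 allowed)
```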
The control electrodes are also sandwiched between the glass plates, on both sides of the cells. Electronics external to the panel, behind the cells, address each of the pixel cells. To ionize the gas in a particular cell, the plasma display's control electronics charge the pair of electrodes that intersect at that cell. When the intersecting electrodes are charged with a voltage difference between them, electric current flows through the gas in the cell, stimulating the gas and causing it to release ultraviolet photons. The phosphors in the plasma display give off colored light when they are excited by these ultraviolet photons. As with other displays, every pixel is made up of three separate sub-pixels with different colored phosphors (RGB); the overall color of the pixel is produced by varying the pulses of current flowing through the different cells. Major attributes of the plasma display are that it is very bright (1000 cd/m² (nits) or higher), has a wide range of colors, and can be produced in large sizes, up to 80 inches diagonally. Another advantage of the plasma monitor over others is its high contrast ratio, often advertised as high as 4000:1. Since contrast is generally hard to define, an absolute value for the improvement over other technologies is difficult to state, but plasma panels produce a near-perfect black image, which is important when there is a need to discern picture content in LLL (low-light-level) scenes. The display panel itself is about one-quarter inch thick, and the total thickness including electronics can be less than 4 inches. Plasma displays use approximately the same power as a CRT or AMLCD monitor, but still cost more than the other digital display technologies. A main advantage of plasma display technology is its scalability: very large, wide screens can be produced using extremely thin materials. Plasma displays can reproduce as many as 1024 shades, resulting in a high-quality image.
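The cell-addressing scheme just described can be illustrated with a toy model. The electrode indices, pulse counts, and 8-bit mapping below are illustrative assumptions, not values from the text:

```python
# Toy model of plasma-cell addressing: a pixel's colour comes from
# pulsing current through its three (R, G, B) sub-pixel cells, each
# selected by the pair of electrodes that intersect at that cell.

def address_cell(row_electrode: int, col_electrode: int) -> tuple:
    """Select the single cell at the intersection of the two
    charged electrodes (only that cell ionizes)."""
    return (row_electrode, col_electrode)

def pixel_drive(rgb: tuple, max_pulses: int = 255) -> dict:
    """Map an 8-bit RGB value to per-sub-pixel current-pulse counts;
    more pulses -> more UV photons -> brighter phosphor."""
    r, g, b = rgb
    return {"R": r * max_pulses // 255,
            "G": g * max_pulses // 255,
            "B": b * max_pulses // 255}

# Drive the cell at electrode intersection (100, 200) to orange.
cell = address_cell(100, 200)
print(cell, pixel_drive((255, 128, 0)))
```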
Since each pixel is lit individually, the image is very bright and has a very wide viewing angle, and the image quality is nearly as good as that of the best CRT monitors. Figure 8-9 shows examples of some standard LCD and plasma monitors.

FIGURE 8-9 Standard LCD and plasma monitors: (A) 6.4-inch diagonal LCD in case; (B) 6.4-inch diagonal LCD uncased; (C) 17-inch diagonal LCD; (D) 42-inch diagonal plasma display.

8.3.4 Organic LED (OLED)

An OLED is an LED made of a semiconducting organic polymer (Figure 8-10). These devices promise to be much cheaper to fabricate than the inorganic LEDs used in other applications, and they can be fabricated in small or large arrays using simple screen-printing methods to create a color display. One of the greatest benefits of the OLED display over traditional LCDs is that it does not require a backlight to function. OLEDs draw far less power than LCDs and can be used in small portable battery-operated devices which in the past have used monochrome low-resolution LCDs to conserve power; they can therefore operate for long periods on the same battery charge. The first digital camera using an OLED display was shown by Kodak in 2003. This first OLED technology is usually referred to as small-molecule OLED. The second technology, an improvement over the first, was developed by Cambridge Display Technologies and is called light-emitting polymer (LEP). Although a latecomer, LEP is promising because it uses a more straightforward production technique (Figure 8-11): the LEP materials can be applied to the substrate by a technique derived from commercial inkjet printing, so LEP displays can be made flexible and inexpensive. Organic LEDs operate on the principle of electroluminescence; an organic dye is the key to the operation of an OLED.
To create the electroluminescence, a thin film of the dye is used and a current is passed through it in a special way. The radically different manufacturing process of OLEDs lends itself to many advantages over traditional flat-panel displays.

FIGURE 8-10 Organic LED (OLED) panel structure: (A) OLED display; (B) OLED structure (light output through a transparent anode, 5–10 V across the organic emitting stack, transparent cathode, glass/plastic substrate).

FIGURE 8-11 OLED flat-panel LEP technology: (A) LEP application using inkjet technology (moving inkjet cartridges deposit R, G, B LEP polymers onto the OLED film carrier on a rotating drum); (B) OLED structure (metal cathode, electron transport layer (ETL), hole injection layer (HIL), organic stack of three OLED emitters, ITO (indium-tin-oxide) anode, glass substrate, 2–10 VDC, light output through the substrate).

Since OLEDs can be printed onto a substrate using traditional inkjet technology, they have a significantly lower cost than LCDs or plasma displays, and this scalable manufacturing process opens the possibility of much larger displays. Unlike most security LCD monitors, which employ backlighting, the OLED is capable of showing true black (the element completely off). In this mode the OLED element produces no light, theoretically allowing an infinite contrast ratio. The range of colors and brightness possible with OLEDs is greater than that of LCDs or plasma displays. Needing no backlight, OLEDs require less than half the power of LCDs and are well suited for mobile applications where battery operation is necessary.

8.4 MONITOR DISPLAY FORMATS

There are several video formats used in different applications, but the predominant format remains 4:3, which is used in almost all video security surveillance applications. The newer format adopted for HDTV has not yet made any real impact in the security surveillance field (Figure 8-12).
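The pixel counts in Figure 8-12 are simply the product of the horizontal and vertical array sizes (the table rounds some values; VGA, for example, is exactly 307,200). A quick check, using a subset of the formats listed:

```python
# Pixel count and aspect ratio from the array size, for some of the
# display formats tabulated in Figure 8-12.
from math import gcd

FORMATS = {          # name: (horizontal, vertical)
    "VGA":   (640, 480),
    "SVGA":  (800, 600),
    "XGA":   (1024, 768),
    "UXGA":  (1600, 1200),
    "HDTV":  (1920, 1080),
    "WUXGA": (1920, 1200),
}

for name, (h, v) in FORMATS.items():
    g = gcd(h, v)
    print(f"{name:6s} {h}x{v}  {h*v:>9,d} pixels  aspect {h//g}:{v//g}")
```

Note that the 16:10 wide-screen formats reduce to 8:5 in lowest terms; the industry conventionally writes the ratio as 16:10.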
FIGURE 8-12 Monitor display formats:

  FORMAT              ARRAY SIZE (H x V)    PIXELS
  Standard (4:3)
    QVGA              320 x 240                76,800
    VGA               640 x 480               307,000
    SVGA              800 x 600               480,000
    XGA               1024 x 768              786,000
    XGA+              1152 x 864              995,000
    SXGA              1280 x 1024           1,310,000
    SXGA+             1400 x 1050           1,470,000
    UXGA              1600 x 1200           1,920,000
  HDTV (16:9)
    HDTV (standard)   1280 x 720              921,600
    HDTV              1920 x 1080           2,073,000
  Wide-screen (16:10)
    WXGA              1280 x 768              983,000
    WSXGA+            1680 x 1050           1,764,000
    WUXGA             1920 x 1200           2,304,000

8.4.1 Standard 4:3

The standard 4:3 video format (horizontal:vertical) has been in existence for many years and remains the predominant format at this time for the CRT and LCD. Liquid crystal and OLED displays are also manufactured in both the 4:3 and 16:9 formats.

8.4.2 High Definition 16:9

The 16:9 HDTV format was introduced as a new wide-screen display to satisfy the consumer and presentation markets. This format has not yet found widespread use in the video security sector, but could offer advantages in specific applications such as viewing wide-angle outdoor landscape scenes: parking lots, waterfronts, airport runways and aircraft parking areas, and public gathering places. High-definition TV has many different pixel formats and screen sizes that provide different resolutions and recommended viewing distances (Table 8-2).

Table 8-2 HDTV Screen Sizes vs. Viewing Distance

  16:9 DIAGONAL d        MINIMUM VIEWING     MAXIMUM VIEWING
  SCREEN SIZE (inch)     DISTANCE D (ft)     DISTANCE D (ft)
  20                      2.5                  5.0
  26                      3.3                  6.5
  30                      3.8                  7.6
  34                      4.3                  8.5
  42                      5.3                 10.5
  47                      5.9                 11.8
  50                      6.3                 12.5
  55                      6.9                 13.8
  60                      7.5                 15.0
  65                      8.1                 16.2

The HDTV display aspect ratio is 16:9, or about 1.78:1; standard analog VGA video is 4:3, or 1.33:1. HDTV is applicable to surveillance applications requiring a wide horizontal field of view, high resolution, and large screen size.

8.4.3 Split-Screen Presentation

Equipment that combines video images can produce a significant reduction in the number of monitors required in a security console room. While the monitors for these displays are the same as those for single-image displays, the image-combining electronics permit displaying multiple camera scenes on one monitor. Chapter 16 describes the hardware to accomplish this function. The hardware takes the form of electronic combining circuits and special image-combining optics to produce multiple images (from 2 to 32) on one monitor screen.

8.4.4 Screen Size, Resolution

Resolution specifications for monitors refer to a full camera image presented on the monitor. When a split-screen presentation is used, the resolution of each camera scene decreases in proportion to the decrease in its horizontal width and vertical height. A four-camera (quad) presentation on a monitor halves both the horizontal and the vertical resolution of each scene; likewise, nine camera scenes on a monitor reduce the horizontal and vertical resolution of each scene by a factor of three. When the screen is split to display 16 or 32 images, the horizontal and vertical resolutions decrease proportionately.
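The proportional resolution loss can be captured in a small helper. The square tiling (2 × 2 for quad, 3 × 3 for nine scenes) follows the text; the function name is illustrative:

```python
# Effective per-scene resolution in a split-screen display: with an
# n x n tiling, each scene gets 1/n of the monitor's horizontal and
# vertical resolution.
from math import ceil, sqrt

def per_scene_resolution(mon_h: int, mon_v: int, scenes: int):
    """Return the (horizontal, vertical) pixels available to each
    camera scene on a mon_h x mon_v monitor split into `scenes` tiles."""
    n = ceil(sqrt(scenes))       # tiles per row/column: 4 -> 2, 9 -> 3
    return mon_h // n, mon_v // n

print(per_scene_resolution(640, 480, 4))   # quad: (320, 240)
print(per_scene_resolution(640, 480, 9))   # 3x3:  (213, 160)
print(per_scene_resolution(640, 480, 16))  # 4x4:  (160, 120)
```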
8.4.5 Multistandard, Multi-Sync

Multistandard, multisync, and multivoltage television monitor/receiver combinations are available that operate on both the US NTSC (525 TV lines) and the European CCIR (625 TV lines) standards. Color systems operate on the NTSC, PAL, and SECAM formats. Multisync monitors are used primarily in computer displays, where the computer video signal has a scan rate different from the 60 Hz (or 50 Hz) rate and the monitor must synchronize to that other scan rate. Multivoltage monitors operate from 90 to 270 volts AC, 50–60 Hz, for worldwide use.

8.4.6 Monitor Magnification

The overall video system magnification depends on the lens, camera, and monitor parameters. Section 4.2.2 analyzes the magnification as a function of the camera sensor size (1/4-inch, 1/3-inch, etc.), the lens focal length, and the display monitor size (screen diagonal). Table 4-5 summarizes the magnification of the overall video system for various monitor, lens, and sensor sizes.

8.5 INTERFACING ANALOG SIGNAL TO DIGITAL MONITOR

Connecting analog video signals to a digital flat-panel display requires special consideration. All flat-panel products face a common problem: though these displays generate the video image using digital techniques, many video sources remain firmly entrenched in the analog world. There are numerous analog sources to contend with. Two pertaining to video security are:

1. Computer video sources with component video and separate digital synchronizing signals (R, G, B, Y, C)
2. Composite video sources, including NTSC and PAL signals and S-video.
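Once digitized, either kind of source must be resampled to the panel's native pixel grid; this scaling step can be sketched with nearest-neighbour resampling. Real scalers use multi-tap polyphase filters, so this is only a minimal illustration:

```python
# Sketch of the video-scaler role: a decoded analog frame (e.g.
# 640x480 from an NTSC decoder) is resampled to the flat panel's
# native resolution. Nearest-neighbour shown for clarity.

def scale_frame(frame, out_w: int, out_h: int):
    """frame: list of rows of pixel values; returns resampled frame."""
    in_h, in_w = len(frame), len(frame[0])
    return [[frame[y * in_h // out_h][x * in_w // out_w]
             for x in range(out_w)]
            for y in range(out_h)]

# Toy 4x3 "frame" scaled up to the 8x6 grid of a hypothetical panel.
src = [[10, 20, 30, 40],
       [50, 60, 70, 80],
       [90, 100, 110, 120]]
dst = scale_frame(src, 8, 6)
print(len(dst), len(dst[0]))  # 6 8
```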
FIGURE 8-13 Analog-to-digital interface problem, block diagram: inputs (computer or component video; CCTV composite or S-video) pass through a CCTV video encoder and video scaler to a display driver, producing digital video outputs (RGB, YCbCr; DVI/HDMI) for display technologies such as DLP, LCD, LCoS, and plasma, in front/rear-projection or flat-panel formats. DVI = Digital Visual Interface connector (DVI-D digital; DVI-A analog; DVI-I integrated digital/analog); HDMI = High-Definition Multimedia Interface; LCD = liquid crystal display; DLP = Digital Light Processing (projectors); LCoS = liquid crystal on silicon (projectors).

There are also numerous digital input signals that must be interfaced to the flat-panel display. The analog signals must be converted into digital form so they can be scaled and optimized for the performance of the targeted digital display device. A typical system block diagram is shown in Figure 8-13. Ideally a single integrated circuit (IC) would receive all of the different input video signals, perform the required scaling functions, and transmit the resulting data to the digital display subsystem. So far the challenging and often conflicting requirements of such a device have prevented development of a cost-effective single-chip solution to this problem. In operation the device captures RGB computer video at resolutions from VGA to UXGA, or YCbCr component video at resolutions from 480i to 1080i, including 720p; it can support resolutions as high as 1600 × 1275 at 75 Hz. The Digital Visual Interface (DVI) is a video connector (interface) made to maximize the display quality of flat-panel LCD (and other) computer monitors driven from high-end video cards. It was developed by an industry consortium, the Digital Display Working Group (DDWG). Existing EIA video standards are analog, as are the monitors they connect to; however, multifunctional LCD monitors and plasma screens internally use a digital signal.
Using VGA cabling, the computer signal is converted from the internal digital format to analog on the VGA cable and then back to digital in the monitor for display. This obviously reduces picture quality; DVI provides a better solution by supplying the original digital signal directly to the monitor. The three types of DVI connections are:

1. DVI-D (digital)
2. DVI-A (analog)
3. DVI-I (integrated digital/analog).

One shortcoming of DVI is that it lacks USB pass-through. The data format used by DVI is based on the PanelLink™ serial format devised by Silicon Image, Inc. A basic DVI-D link consists of four twisted pairs of wire (R, G, B, and clock) that transmit 24 bits per pixel. DVI is the only widespread standard that includes analog and digital transmission options in the same connector.

8.6 MERGING VIDEO WITH PCs

Many CCTV security applications require combining the CCTV image with computer text and/or graphics. To accomplish this, the video and computer display signals must be synchronized, combined, and sent to a monitor. Equipment is available to perform this integration (Chapter 16). As computer product technologies converge with video products, many theories have been advanced about how PCs and video should mesh, but the optimal blend has yet to be found. Attempts to design a merged PC and video display are complicated by differences in the design of displays used by PC and video applications. Computer monitors have been optimized for reading 10-point text and static images from a distance of about 2 feet. Accordingly, the physical display size is small and uses a low-brightness, non-interlaced scan with fast screen refresh to produce a crisp, high-definition image that is easy on the eyes at short viewing distances. Video surveillance imagery, on the other hand, has been optimized within the constraints of the relatively archaic NTSC system.
The system was designed to generate low-resolution, high-brightness images, using an interlaced scan with a low screen refresh rate (30 frames per second) for viewing moving images on a large display at distances of 3–6 or even 10 feet. Of particular interest are PC-graphics display technologies that allow the monitor to replace the analog video display without any degradation in picture quality. To produce high-fidelity pictures rivaling those of CRT video displays, PCs must incorporate digital video processing that adapts the video stream to the characteristics of the PC monitor. This processing must preserve the inherent fidelity of the video source while simulating video scan techniques on the PC display. It requires a combination of techniques to preserve the native resolution of the digital video stream while simultaneously handling the special de-interlacing and frame-rate conversion tasks necessary to produce high-fidelity digital video images. The nature of the NTSC display format creates enormous challenges. Analog color video produces fields at a constant rate of 59.94 Hz. With an image size of 640 by 240 pixels, each field contains only half of the full 480-line vertical resolution. Each field scans only alternate lines of the video display: adjacent fields each scan 240 lines, offset from each other on the screen by one half-line position. The first, or odd, field starts with a half-line scan, while the second, or even, field ends with a half-line. A full frame of NTSC video actually contains 525 lines. Approximately 480 of those 525 lines are used for the video picture (the active lines), while the remainder makes up the vertical blanking interval; there are therefore 262.5 lines in each field. This sequence of field pairs, scanning every other display line with a half-line offset between fields, creates an interlaced display scan.
Each pair of fields combines to form a full 640 by 480 video picture frame. Complicating matters, the two fields within a frame are separated in time by 1/60th of a second, representing two discrete instants in time. To present a true video-like picture, the PC must copy this display scanning technique as faithfully as possible. First, the native resolution of the digital video stream must be preserved from its origin, typically an MPEG-2 or composite video decoder, through to the digital-to-analog converters (DACs) of the graphics device. Second, the interlaced video source must be converted to a format suitable for the PC's progressive-scan display mechanism without introducing visible image glitches. Finally, the PC screen refresh rate must be locked to the field rate of the original video source to avoid display-rate conversion artifacts. The conventional, simplistic approach to the interlaced-to-non-interlaced conversion challenge is to capture both fields of the video frame in local memory and read both fields out simultaneously to the PC display. This technique ignores the fact that the individual fields within a frame are temporally different: they occur 1/60th of a second apart and visually represent two separate instants in time. A static object such as a circle causes no problems with this simple store-and-read de-interlacer, but if the object traverses horizontally across the screen, a distorted motion appears between the two fields within the frame. In the most simplistic de-interlacing PC display, the two fields are displayed at the same instant, and feathering, or inter-field motion artifacts, along the edges of horizontally moving objects are easily noticed. Another major problem that must be addressed is frame-rate conversion.
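Before turning to frame-rate conversion, the store-and-read ("weave") de-interlacer just described, and the feathering it produces on moving objects, can be sketched as:

```python
# Naive weave de-interlacing: odd and even fields (captured 1/60 s
# apart) are simply interleaved into one frame. For a horizontally
# moving object the two fields disagree, producing the feathered
# (comb) edges noted in the text. Rows of 0/1 values are illustrative.

def weave(odd_field, even_field):
    """Interleave two fields (lists of scan lines) into one frame."""
    frame = []
    for odd_line, even_line in zip(odd_field, even_field):
        frame.append(odd_line)   # display lines 0, 2, 4, ...
        frame.append(even_line)  # display lines 1, 3, 5, ...
    return frame

# An object occupying columns 2-3 in the odd field has moved to
# columns 4-5 by the time the even field is captured:
odd  = [[0, 0, 1, 1, 0, 0]] * 2
even = [[0, 0, 0, 0, 1, 1]] * 2
for line in weave(odd, even):
    print(line)
# Alternating rows place the object edge at different columns,
# which is exactly the feathering artifact.
```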
The field rate of the analog video source is fixed at approximately 60 fields per second, whereas PC displays are typically refreshed at 70 or 75 Hz (frames per second). The PC screen refresh rate should match that of the original video source to truly replicate analog video behavior. Special digital frame-locking techniques must be used to ensure that the screen refresh rate precisely tracks the actual field rate of the original digital video source. The most effective frame-locking technique is adaptive digital frame locking, rather than an analog phase-locked loop (PLL) technique.

8.7 SPECIAL FEATURES

Several special features may be incorporated into analog and digital display monitors. These include:

• Interactive touch screens, allowing the monitor operator to interact actively with the monitor for control and communication functions. Technologies used include resistive, infrared, and capacitive sensing.
• Anti-glare screens, which overlay the monitor display to provide higher contrast when reflections and glare from the surrounding environment are present.
• Sunlight-readable displays, using high-brightness flat-panel display technology.

8.7.1 Interactive Touch-Screen

The touch-screen system is particularly useful when guard personnel must react quickly and decisively to an activity. At present the technique is not in widespread use, but as system complexity and awareness of its availability increase, more security systems will incorporate these touch screens. There are many ways to input data into the video security system, ranging from keyboards to mouse to voice. One method that is becoming more popular is going directly to the source, using touch-screen technology: allowing the user to input information directly eliminates the need for a mouse or other pointing device, thereby simplifying the input process.
Many monitors in security applications display the outputs of video graphics and/or an alphanumeric database generated by a computer. Some advanced systems operate with computer software and hardware that permit interacting with the screen display: touching the screen at specific locations causes specific actions to occur. The devices, called touch-screen templates, are located at the front of the monitor. The touch screen permits the operator to activate a program or hardware change by touching a specific location on the screen. Touch-screen interaction between the guard and the video system has obvious advantages: it frees the guard from a keyboard and provides faster input-command response, and the guard does not have to memorize keyboard commands and type the correct keys. There is also less chance for error with touch-screen input, since the guard can point to a particular word, symbol, or location on the screen with good accuracy and reliability. Different types of touch screens are available, using different principles of operation.

8.7.1.1 Infrared

Infrared touch screens rely on the interruption of an IR light grid in front of the display screen. The technique uses rows of LEDs and photo-transistor detectors mounted on opposite sides of the screen to create an invisible grid of IR light in front of the monitor. When an IR beam is interrupted by a finger or other stylus, one or more photo-transistors detect the absence of light, and a signal with the X,Y coordinates of the interruption is returned to the computer electronics to perform a predetermined action. The space within the frame attached to the front of the monitor forms the touch-active area, and a microprocessor calculates where the person has touched the screen. Figure 8-14a shows such a touch screen installed on a monitor.
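The grid-interruption logic can be sketched as follows; the beam counts and coordinate convention are illustrative assumptions:

```python
# Sketch of IR touch detection: LED/photo-transistor pairs form a
# grid; a touch blocks one vertical and one horizontal beam, and the
# blocked-beam indices give the X,Y coordinate of the touch.

def locate_touch(h_beams, v_beams):
    """h_beams/v_beams: lists of booleans, True = light received.
    Returns (x, y) of the first blocked beam pair, or None."""
    x = next((i for i, lit in enumerate(v_beams) if not lit), None)
    y = next((i for i, lit in enumerate(h_beams) if not lit), None)
    if x is None or y is None:
        return None          # no beam interrupted: no touch
    return (x, y)

# A finger at grid position (3, 1) blocks vertical beam 3 and
# horizontal beam 1:
h = [True, False, True, True]
v = [True, True, True, False, True]
print(locate_touch(h, v))  # (3, 1)
```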
Since no film or plastic material is placed in front of the monitor, there is no change or reduction in the optical clarity of the displayed picture.

FIGURE 8-14 Monitor touch screens: (A) LED light source, retro-reflectors, and photodetector (CCD) array with sensing module and mirror, showing a typical probe point and light path, with single-cell and four-cell command areas; (B) conductive polyester overlay.

The IR technology has no limitations in terms of the objects that can be used to touch the screen. Its one disadvantage is that the screen may react before it is physically touched.

8.7.1.2 Resistive

A second type of touch-screen technology is resistive, which is in common use and inexpensive compared with other methods. One shortcoming of the resistive touch screen is that the indium tin oxide coating typically employed is relatively fragile. The resistive touch panel consists of a transparent, conductive polyester sheet over a rigid acrylic back-plane; both are affixed to the front of the display to form a transparent switch matrix (Figure 8-14b). The switch-matrix assembly has 120 separate switch locations that can be labeled with words or symbols on the underlying display, or a scene can be divided into 120 separate locations with which the operator interacts. Individual touch cells may be grouped together to form larger touch keys via programming commands in the software. Typical light transmission for the resistive touch screen is 65–75%, so not all the light passes through the screen to the operator and the picture has lower contrast. Resistive touch screens combine a flexible top layer with a rigid resistive bottom layer, separated from the top layer by insulating spacer dots. Pressing the flexible top layer creates contact with the resistive bottom layer, and the control electronics identify the point at which contact is made on the screen.
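The contact-location step can be illustrated with the generic four-wire read-out commonly used for such panels (a sketch under an assumed ADC resolution, not a detail given in the text): a voltage gradient is driven across one layer, the other layer acts as a wiper, and the voltage ratio on each axis gives the position.

```python
# Generic 4-wire resistive touch read-out sketch: raw ADC readings
# (proportional to the voltage picked up by the wiper layer) are
# mapped to screen coordinates. 10-bit ADC is an assumption.
ADC_MAX = 1023

def touch_position(adc_x: int, adc_y: int, width: int, height: int):
    """Convert raw ADC readings (0..ADC_MAX) on each axis into
    screen coordinates in pixels."""
    x = adc_x * width // ADC_MAX
    y = adc_y * height // ADC_MAX
    return (x, y)

# Mid-scale readings on a 640x480 screen map to its centre:
print(touch_position(512, 512, 640, 480))  # (320, 240)
```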
This technology provides the benefit of high resolution, and any type of pointing device can be used. One shortcoming of the resistive touch screen is its need for an overlay and spacer dots: it suffers from reduced brightness and optical clarity, and the flexible top layer can be prone to surface damage, scratches, and chemicals. If a touch screen is required in an outdoor sunlit application, be especially aware that relatively inexpensive analog resistive models can cut light transmission by as much as 20% and reduce the effective screen brightness.

8.7.1.3 Capacitive

A third type of interactive touch-screen accessory consists of an optically clear Mylar polyester membrane that is curved around the monitor's front glass screen and transparent to the user. When the conductive surface of the Mylar is pressed against the conductive surface of the glass, capacitive coupling draws current from each of the four electrodes to the touch point. The current drawn is proportional to the distance of the contact point from each electrode, allowing the X,Y location of the contact point to be determined. This change in voltage is detected by the monitor electronics, which signal the security system that the screen has been touched at a particular location. The conductive coating over the surface of the display screen is connected to electrodes at each of the edges. This technology offers good resolution, fast response time, and the ability to operate with surface contamination on the face of the monitor, but it is not suitable for gloved hands. Capacitive touch screens also suffer from some electronic drift, so periodic recalibration is required.

8.7.1.4 Projected Capacitance Technology (PCT)

Projected capacitance technology (PCT) uses embedded micro-fine wires within a glass laminate composite.
Each wire has a diameter of approximately one-third that of a human hair, so the wires are nearly invisible to the human eye when viewed against the powered-up display. When a conducting stylus such as a finger touches the glass surface of the sensor, a change in capacitance occurs, producing a measurable change in the oscillation frequency of the wires surrounding the contact point. The integrated controller calculates the new capacitance value and transfers the data to the host controller; software translates the sensor contact point to an absolute screen position. The polyurethane layer incorporating the touch-screen sensor array is sandwiched and protected between the glass layers and is therefore impervious to accidental and malicious damage, day-to-day wear and tear, and severe scratching. It accepts input from bare and gloved hands and needs no additional sealing to prevent the sensor from being affected by moisture, rain, dust, grease, or cleaning fluids.

8.7.2 Anti-Glare Screen

A common problem associated with monitor viewing is glare: ambient lighting located above, behind, or to the side of the monitor reflects off the front surface of the screen, reducing picture contrast and producing unwanted reflections. In well-designed security console rooms where the designer has taken monitor glare into consideration at the outset, glare will not significantly reduce screen intelligibility or cause viewer fatigue. For best results, face the monitor toward a darkened area of the room. Keep lights out of the direction the monitor faces, whether behind the person looking at the monitor or in the ceiling above. If there are windows through which bright sunlight may enter, point the monitors away from the windows and toward interior walls.
When this cannot be accomplished, and annoying glare would produce fatigue and reduce security, one of the various anti-glare filters available should be applied to the front of the monitor to reduce the glare and increase the contrast of the picture. With a well-designed anti-glare screen and proper installation, glare and reflection levels can be reduced significantly.

FIGURE 8-15 Monitor contrast enhancement using glare-reduction filters: (A) thin-film anti-glare filter; (B) the same display with and without the filter.

Figure 8-15 is an unretouched photograph showing the contrast enhancement (glare reduction) provided by one of these filters. These anti-glare optical filters are manufactured from polycarbonate or acrylic plastic materials and are suitable for indoor applications; polycarbonate filters withstand a wider temperature range and are therefore more suitable for outdoor applications. The filters come in a range of colors, the most common being neutral density (gray), green, yellow, and blue. The colored filters are used on graphic and data-display computer terminals, whereas the neutral-density types are used for monochrome and color monitors. In the case of color displays, the lighter gray filters should be used for glare reduction.

8.7.3 Sunlight-Readable Display

Display brightness is measured and reported in nits; one nit (1 cd/m²) is roughly equal to the brightness of one standard candle. Marketing departments often use terminology like "ultra-bright" and "sun-bright," but they do not always connect these terms back to engineering units. The following levels of brightness define some of the capabilities of these monitors:

• Bright (150–240 nits). The typical brightness of a home or office computer display, suitable for most indoor light conditions.
• High-Bright (250–340 nits). Brighter than typical panel displays; these can be situated in brightly lit rooms without reduced viewability.
• Ultra-Bright (350–790 nits). Suitable for some outdoor applications; these provide good visibility in highly illuminated environments where the light source or bright reflections would not allow Bright or High-Bright units to be easily read or images viewed.
• Sun-Bright (800 nits and up). Displays this bright are suitable for outdoor sunlit applications and can be read in direct sunlight.

Recent advances in LCD monitors have created new outdoor applications for sunlight-viewable displays, especially where CRTs are out of the question because of their bulk. Previously a display that provided 400 nits with a good reflective surface was acceptable; now the demand has increased to 1500 nits with a minimum display size of about 15 inches diagonal. A 15-inch display can consume as much as 50 W to produce 1500 nits, and displays larger than 10 inches diagonal pose thermal problems when used in a sun-loading environment. Manufacturers have typically used a brute-force method, increasing backlight power to increase display brightness. The high-power backlight in turn generates tremendous heat that can push an AMLCD above its safe operating temperature, and heat is detrimental to AMLCD survivability. To overcome the heating problem, massive heat sinks have been employed, along with expensive anti-reflective and IR-rejecting face glass laminated onto the display to block the heat generated by sunlight. The display parameters required for sunlight readability include:

• Brightness. The display must be bright enough to be legible under full sunlight; the required brightness ranges between 400 and 1500 nits.
• Readability. The display image must be discernible by the naked eye under all viewing conditions, from full sunlight to nighttime.

In addition to high monitor brightness, anti-reflective coatings are often used to minimize reflections caused by sunlight, and an infrared coating is used to reject the heat caused by sun loading.
The rejected heat spectrum covers the near-IR spectral region.

8.8 RECEIVER/MONITOR, VIEWFINDER, MOBILE DISPLAY

Various manufacturers produce small, lightweight CRT and LCD television monitors or receiver/monitors that accept base-band video signal inputs and/or VHF/UHF commercial RF channels and are powered by 6, 9, or 12 volts DC (Figure 8-16). These television monitors are particularly useful in portable and mobile surveillance, law enforcement, and servicing applications. Often a portable surveillance camera transmits the video signal (and perhaps audio) via an RF or UHF video transmitter operating on one of the commercial channels or at 900 MHz or 2.4 GHz. Small receiver/monitors with 1.5- to 5-inch-diagonal CRT or LCD displays can receive and display the transmitted video signal and provide a base-band video output for a VCR, DVR, or video printer at the receiver site. These devices usually have medium resolution (250–400 TV lines), often sufficient to provide useful security information and to support camera installation and testing.

FIGURE 8-16 Small flat-screen receiver/monitors: (A) high-resolution LCD monitor.

Shock, vibration, and dirt are probably the most common causes of failure for flat-panel displays in harsh environments or mobile applications. A typical 15-inch TFT LCD monitor weighs about 13 lbs, so an acceleration of 2 g effectively doubles that load to 26 lbs. Standard office-style LCDs cannot stand up to the kind of shock and vibration found in most mobile environments, and some degree of industrial hardening is required: at a minimum they should be rated for 1.5 g shock and 1 g vibration, with more severe environments calling for higher specifications. Aside from shock and vibration, monitors may be subject to higher levels of grit, grime, water, dust, and oil than would be expected in a normal office environment. This is where the industrial, environmental, and structural NEMA ratings are helpful.
A few key NEMA ratings for flat-panel displays used in mobile and harsh environments include the following:

• NEMA 3 enclosures are suitable for outdoor applications and repel falling dirt, rain, sleet, snow, and windblown dust; enclosure contents are undamaged by external ice formation.
• NEMA 4 adds protection from splashing and hose-directed water to the NEMA 3 standard.
• NEMA 6 is similar to NEMA 4, but the enclosure prevents ingress of water during occasional temporary submersion to a limited depth.
• NEMA 6P protects against water ingress after prolonged submersion.

FIGURE 8-16 (B) Tunable 900 MHz to 2.4 GHz 2.5-inch LCD receiver/monitor

8.9 PROJECTION DISPLAY

The digital projector is an electro-optical device that converts a video image or computer graphics and data into a bright image that is projected and imaged onto a distant wall or screen using a lens or lens-mirror system. The projector serves the following purposes:

• Visualization of video and stored computer data for monitoring or presentation.
• Replaces the whiteboard and written documents.
• Provides the ability to view video images and other data by many personnel at the same time.
• Provides the ability to play back images from a VCR, DVR, or digital video disk onto a large screen.

Digital projection technologies include:

• High-intensity CRT
• LCD projectors using LCD light gates
• Texas Instruments DLP technology.

The current dominant technology at the high end for portable digital projectors is the Texas Instruments DLP technology, with LCD projectors dominating the low end. Digital projectors take the form of a small tabletop portable projector using an external screen, or a rear-projection screen forming a single unified display device. The typical resolution for the portable projector is the SVGA standard (800 × 600 pixels), with more expensive devices supporting the XGA (1024 × 768 pixels) format. Projector costs are determined for the most part by their resolution and brightness.
Higher requirements cost more. For large conference rooms the brightness should be between 1000 and 4000 lumens. CRT devices are only suitable for fixed installations because of their weight.

8.10 SUMMARY

There are several monitor types that can be used in the security room. These include the standard analog CRT monitor and the digital flat-screen LCD, plasma display, and, to a lesser extent, the new organic light-emitting diode (OLED) displays. Where space permits, the standard CRT monitor is a cost-effective solution and can provide a bright, high-resolution monochrome or color image. In more confined spaces, or in any completely new installation, the flat-panel display should be considered.

The quality of the image displayed is a function of the number of TV lines the analog monitor can display, and of the number of horizontal and vertical pixel elements available in the digital flat-screen monitor. The standard video format is 4 units wide by 3 units high and is almost exclusively the format used in the video surveillance industry. A newer format designed for high-definition consumer television monitors has a 16 by 9 aspect ratio but has limited use in the security industry. Video monitors are available in multistandard, multisync configurations for use with all available voltages and scan rates.

The use of digital flat-screen monitors is increasing rapidly, and interfacing digital computer systems with the monitor has brought about problems that need solving. These include interfacing the analog signal provided by most cameras to the new digital monitors. The use of analog cameras and digital Internet cameras in the same surveillance system further complicates the interface of these cameras to the digital displays. Video monitors are available with special features such as interactive touch screens, anti-glare screens, and sunlight-readable displays.
In applications where the guard must make quick, accurate decisions and many cameras and security functions are involved, the touch screen can serve a very important function in making the guard more effective. Several technologies are available for touch-screen monitors, including infrared, resistive, capacitive, and projected capacitive technology (PCT). Under difficult indoor and outdoor lighting conditions in which reflections are prevalent on the monitor screen, anti-glare filters are available to reduce or eliminate this problem. Under extreme sunlight conditions, new flat-panel displays are available with sunlight-readable screens having very high brightness and high contrast.

There are many new applications in which automated video surveillance from remote sites or mobile monitoring stations is required. Using digital networks such as the LAN, WAN, and wireless LAN (WLAN, WiFi) and the Internet, cameras, laptop computers, PDAs, and other portable devices using flat-screen display technology are now available. When video monitoring systems must be portable or transportable, or must be set up rapidly, small, mobile, low-power flat-panel displays are available in monochrome or color. These monitors have low electrical power requirements so that they can operate for days using small rechargeable batteries. Using battery power, these displays are suitable for rapid-deployment video surveillance systems and for testing and maintenance applications. When the video scenes must be viewed by many personnel and a large-screen video image is required, video projectors are available for fixed installation or portable use.
Chapter 9
Analog, Digital Video Recorders

CONTENTS

9.1 Overview
    9.1.1 Analog Video Cassette Recorder (VCR)
    9.1.2 Digital Video Recorder (DVR)
        9.1.2.1 DVR in a Box
        9.1.2.2 Basic DVR
        9.1.2.3 Multiplex DVR
        9.1.2.4 Multi-channel DVR
        9.1.2.5 Network Video Recorder (NVR)
9.2 Analog Video Recorder
    9.2.1 Video Cassette Recorder
    9.2.2 VCR Formats
        9.2.2.1 VHS, VHS-C, S-VHS
        9.2.2.2 8 mm, Hi-8 Sony
        9.2.2.3 Magnetic Tape Types
    9.2.3 Time-Lapse (TL) VCR
    9.2.4 VCR Options
        9.2.4.1 Camera Switching/Selecting
        9.2.4.2 RS-232 Communications
        9.2.4.3 Scrambling
        9.2.4.4 On-Screen Annotating and Editing
9.3 Digital Video Recorder (DVR)
    9.3.1 DVR Technology
        9.3.1.1 Digital Hardware Advances
            9.3.1.1.1 Hard Disk Drive Storage
            9.3.1.1.2 Video Motion Detection (VMD)
            9.3.1.1.3 Optical-Disk Image Storage
            9.3.1.1.4 Non-Erasable, Write-Once Read-Many (WORM) Disk
            9.3.1.1.5 Erasable Optical Disk
            9.3.1.1.6 Digital Audio Tape (DAT)
        9.3.1.2 Digital Storage Software Advances
        9.3.1.3 Transmission Advances
        9.3.1.4 Communication Control
    9.3.2 DVR Generic Types
        9.3.2.1 DVR in a Box
        9.3.2.2 DVR Basic Plug-and-Play VCR Replacement
        9.3.2.3 Multiplex
        9.3.2.4 Multi-Channel
            9.3.2.4.1 Redundant Array of Independent Disks (RAID)
        9.3.2.5 Network Video Recorder (NVR)
        9.3.2.6 Hybrid NVR/DVR System
    9.3.3 DVR Operating Systems (OS)
        9.3.3.1 Windows 9X, NT, and 2000 Operating Systems
        9.3.3.2 UNIX
    9.3.4 Mobile DVR
    9.3.5 Digital Compression, Encryption
        9.3.5.1 JPEG
        9.3.5.2 MPEG-X
        9.3.5.3 Wavelet
        9.3.5.4 SMICT
    9.3.6 Image Quality
        9.3.6.1 Resolution
        9.3.6.2 Frame Rate
        9.3.6.3 Bandwidth
    9.3.7 Display Format—CIF
    9.3.8 Network/DVR Security
        9.3.8.1 Authentication
        9.3.8.2 Watermark
        9.3.8.3 Virtual Private Network (VPN)
            9.3.8.3.1 Trusted VPNs
            9.3.8.3.2 Secure VPNs
            9.3.8.3.3 Hybrid VPNs
        9.3.8.4 Windows Operating System
    9.3.9 VCR/DVR Hardware/Software Protection
        9.3.9.1 Uninterruptible Power Supply (UPS)
        9.3.9.2 Grounding
        9.3.9.3 Analog/Digital Hardware Precautions
        9.3.9.4 Maintenance
9.4 Video Recorder Comparison: Pros, Cons
    9.4.1 VCR Pros and Cons
    9.4.2 DVR Pros and Cons
9.5 Checklist and Guidelines
    9.5.1 Checklist
    9.5.2 Guidelines
9.6 Summary

9.1 OVERVIEW

9.1.1 Analog Video Cassette Recorder (VCR)

Prior to the 1970s, real-time video recording systems used magnetic reel-to-reel tape media and required manual changing of the magnetic tape reels. Operation was cumbersome and unreliable, and the tape was prone to damage or accidental erasure. The recorder required the operator to manually thread the tape from the tape reel through the recorder onto an empty take-up reel, similar to threading 8 mm and 16 mm film projectors. Not much video security recording was done in this era.

In the 1970s the first analog VCR was introduced to the security industry. The arrival of the VCR permitted easy loading and unloading of the tape cassette without the user contacting the tape. The VCR machines provided real-time 30 fps recording. VCRs found widespread use in security when the VHS tape format became the dominant consumer VCR format. However, most security applications require 24/7/365 operation without stopping. The consumer VCRs were not designed for continuous use, did not operate in a TL mode, and were not a reliable choice for security applications. At the security industry's urging, manufacturers designed industrial versions that have served the industry for many years. These TL VCRs were especially designed to withstand the additional burden of long-term continuous recording and the start-stop cycling of TL recording.

The VCR was the only viable technology until the late 1990s, and it is still a convenient method for recording security video images. During this period specialized functions were added that further enhanced their usefulness, including: (1) alarm activation and (2) T-160 24-hour real-time recording and 40-day extended TL recording (960 hours) on a single cassette.
The recorded video images are used for general surveillance of premises, to apprehend and prosecute thieves and offenders, and to train security personnel and correct their procedures. A played-back camera image from a high-quality recording system can be as valuable as a live observation. The video image on a monitor is fleeting, but the recorded image can be played back over and over again, and a hard copy can be printed for later use. The original VHS tape can easily be given to the police or other law enforcement agency for investigation or prosecution.

The TL VCR records single pictures at time intervals longer than the real-time 1/30-second frame time. This means that the TL mode conserves tape while still permitting the real-time recording of significant security events. When a security event of significance occurs, a VMD or alarm input signal causes the VCR to switch from TL to real-time recording. These real-time events are the ones the security guard would consider important and normally view and act on at a command monitor. TL recording permits the efficient use of a recorder so that changing of VCR tapes is minimized.

Video cassette recorders have some shortcomings, however. They can generate grainy, poor-quality images during playback because videotapes are frequently reused and record/playback heads become worn or misaligned, degrading the quality of the video image. VCR tapes are changed manually, which leaves room for human error: the tape may not be changed at the required time, or a previously recorded tape may be inserted into the machine, erasing the images already on it. Clean storage space for the videotapes can also become a problem, particularly in high-usage casino applications. The DVR eliminates all these problems.

9.1.2 Digital Video Recorder (DVR)

The movement from tape-based real-time and TL video recording to today's DVRs has been a vast improvement and a quantum jump forward in technology.
There is a new generation of video cameras with digital signal processing, IP cameras, and other digital devices that interface to DVRs, CRTs, and flat-screen LCD and plasma digital monitors. Today's DVRs do not represent the end of the technological advancement; rather, they are the beginning of intelligent recording devices that are more user-friendly and economical. The DVR was first introduced to the video industry in the early 1990s, recording the video image on a hard disk (HD) drive in digital form.

Why the change to DVR technology? The VCR has been in use for years but has been a weak link in the overall video security system. Important reasons include the VCR's requirement for excessive tape maintenance, the deterioration of tapes over time and use, the inability to reproduce the high resolution and image quality produced by digital cameras, and the excessive manpower required to review the tapes. Another disadvantage of VCR technology is the sheer volume of storage space required to archive the tapes in mid- to large-size systems at casinos, etc. This prompted many dealers, systems integrators, and end users to switch over to DVR equipment.

Another advantage is the significantly better image quality on playback. One reason for this is that the analog VCR records only a single field of the video image, while the DVR records a full frame of information. The VCR thereby reduces the detail of the video image during playback by one half, rendering a poorer image than the original live image. DVRs, on the other hand, record the full video image on the HD, do not introduce picture noise, and provide high stability and a higher-quality video image. The DVR also eliminates the need for head and tape replacement, thereby significantly reducing maintenance over its lifetime. The DVR converts the incoming video camera signal into a recorded magnetic form on a magnetic HD.
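The field-versus-frame difference above can be quantified with a small sketch. This is my illustration, not from the book; the roughly 480 active-line figure is a common NTSC assumption rather than a stated value.

```python
# Rough sketch (assumptions mine): vertical detail captured per stored
# picture by a field-based VCR vs. a frame-based DVR, for NTSC video.
# ~480 active picture lines per 525-line frame is a standard NTSC assumption.

NTSC_ACTIVE_LINES = 480  # approximate lines carrying picture detail per frame

def vertical_detail(record_full_frame: bool) -> int:
    """Lines of vertical detail captured per stored picture."""
    if record_full_frame:                # DVR: both interlaced fields kept
        return NTSC_ACTIVE_LINES
    return NTSC_ACTIVE_LINES // 2        # VCR: a single field, half the lines

print(vertical_detail(False))  # VCR field recording -> 240
print(vertical_detail(True))   # DVR frame recording -> 480
```

This is the "one half" detail loss the text describes: a single field carries only every other scan line of the frame.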
The recorder later reconstructs the video signal into a form suitable for display on a video monitor, for printing on a hard-copy video printer, or for transmission to a remote site. Most DVRs have more than one internal HD drive. The software controlling them automatically distributes the recorded images internally, so that if there is a failure only a portion of the data is lost. The average 80 GByte HD drive can store approximately 100 hours of data, while the VCR can store just 8 hours. Images on HD drives do not degrade and can be retrieved, copied, and reused hundreds of times without compromising picture quality. Important security images can also be stored permanently (archived) on HD drives, recorded on digital audio tape (DAT) recorders, or burned onto DVDs for future use.

Digital video recorders provide higher-quality images than VCRs, particularly during picture pause: the DVR exhibits no distortion and no picture tearing during pause, single-frame advance, rewind, or fast-forward modes. This is true from the first viewing onward, regardless of the number of times the digital image is viewed or copied. Switching from VCR to DVR machines eliminates all tape hassles: tapes do not have to be changed, there is no prying the tape out of the VCR slot, and no cleaning or replacing of tape heads.

Unlike the VCR, the DVR permits programming the picture resolution as required by the application. It can be programmed locally or remotely to record in real-time or TL modes, and can respond to video motion alerts or alarm inputs. The video image quality remains the same regardless of how many times the images are stored or re-recorded. The DVR hardware is available in several configurations: DVR in a box, basic (plug-and-play) DVR, multiplex DVR, and multi-channel DVR. Large systems networking to remote sites use network video recorders (NVR).
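As a back-of-envelope check on the storage figures quoted above, one can ask what average recorded bit rate they imply. The arithmetic is mine; only the 80 GByte and roughly 100-hour figures come from the text.

```python
# Back-of-envelope sketch (assumptions mine, not book specifications):
# what average recorded bit rate is implied by roughly 100 hours of video
# stored on an 80 GByte hard disk?

disk_bytes = 80e9        # 80 GByte drive (decimal gigabytes assumed)
record_hours = 100       # approximate storage quoted in the text

seconds = record_hours * 3600
avg_bits_per_sec = disk_bytes * 8 / seconds
print(f"Implied average rate: {avg_bits_per_sec / 1e6:.2f} Mb/s")
```

The implied rate, under 2 Mb/s, is plausible only because the DVR compresses each image before storage, as described later in the chapter.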
9.1.2.1 DVR in a Box

The DVR in a box is created by adding a printed circuit (PC) card to a PC computer, converting it into a DVR. This solution might be expedient but it does have limitations: it has very few user-friendly features. It is used in low-cost, small video systems.

9.1.2.2 Basic DVR

The single-channel DVR is the basic replacement for the real-time or TL VCR. It looks and feels like a VCR, has similar controls, and can be set by the operator from a local or remote site. It functions like a VCR but has many advantages over it. DVRs produce sharp images over long periods of time and after many copies have been made. The DVR is especially well suited to perform video motion detection and can be activated by external alarms.

A primary market for the basic DVR is as a VCR replacement in a legacy analog installation. In this application the video cabling infrastructure is already in place for transmission of video images from the cameras to the recorder, and the most cost-effective solution is the basic DVR. A single-channel DVR is a cost-effective replacement for a VCR and provides long-term, maintenance-free recordings that can be viewed locally or remotely and transmitted anywhere over wired or wireless networks. The switch from the traditional VCR to the DVR is now affordable and can be made with no changes to the rest of the system.

9.1.2.3 Multiplex DVR

Midsize DVRs using multiplexer technology provide high-quality recording capabilities for 4–16 cameras. For midsize surveillance systems, DVRs with built-in multiplexers operate like traditional multiplexers connected to VCRs. The equipment offers on-screen menus and requires only simple keystrokes to find images or events by alarm input, time, date, camera number, or other identifiers. These recorded images can be stored for days, weeks, or months.
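A built-in multiplexer time-shares one recording channel among its 4–16 cameras, so each camera's update rate drops as cameras are added. A minimal sketch of that arithmetic follows; the illustration is mine, with the 60 field-per-second channel rate being the NTSC figure used throughout the chapter.

```python
# Sketch (not from the book): per-camera recording rate of a multiplexed DVR
# that time-shares a single 60 field-per-second channel among N cameras.

CHANNEL_IPS = 60  # NTSC fields (images) per second on the shared channel

def per_camera_ips(num_cameras: int) -> float:
    """Images per second recorded for each camera when time-shared."""
    return CHANNEL_IPS / num_cameras

for n in (4, 8, 16):
    print(f"{n:2d} cameras -> {per_camera_ips(n):5.2f} IPS per camera")
```

At 16 cameras each camera is updated fewer than 4 times per second, which is why multiplexed playback can look jerky compared with a multi-channel unit that records every camera at the full rate.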
DVR multiplexers are available with Ethernet connections to provide high-quality remote transmission to other sites on the network. This makes the video images available at local monitoring sites or at a central monitoring station, thereby providing instant access to critical recordings. The images can be retrieved using standard IP addresses and PCs, making the remote monitoring and recordings available to authorized personnel anywhere on the network.

Multiplexed systems are used in midsize systems to record multiple cameras. While recording multiple cameras, the multiplexed DVR cannot record all cameras in real-time; it time-shares, recording the cameras in sequence. DVR time-sharing operates in the same way as a standalone video multiplexer. Multiplexed recorders usually have a maximum storage capacity of 480–600 GByte, representing 600–750 hours of real-time recording (depending on resolution). The combination of multiplexer and DVR has the advantage that the interface between the multiplexer and the recorder is already accomplished by the manufacturer.

9.1.2.4 Multi-channel DVR

The multi-channel DVR technology requires significantly more HD drives and error-correcting codes to reliably store and manipulate the images. The result is a system where all cameras are recorded at all times without any loss of video image information. Large enterprise systems using multi-channel machines can record images from hundreds of cameras for months or more, storing high-resolution video image scenes in real-time, near real-time, or TL. To store a large number of images the video image files are compressed by removing redundant data in the image file. The number of bytes in each image file is reduced significantly so that the files can be stored efficiently on the HD. The multi-channel DVR records each camera on an individual channel (not multiplexed) and is designed for applications with a large number of cameras, as in an enterprise system.
This system can be expanded to a nearly unlimited number of video channels by adding HD memory and appropriate software control. The multi-channel DVR permits storing 60 images per second per camera, whereas the multiplexed unit divides the 60 images by the number of cameras. The multiplexed scanning often causes a jerky motion of the image during playback, especially noticeable when the images per second (IPS) for the cameras fall below 15 IPS.

Most DVRs are triplex rather than duplex in design. This means that they can simultaneously: (1) display and record live video, (2) display the recorded video locally, and (3) display the recorded video remotely. In the case of remote viewing, a standard Web browser connected to the Internet via an ISP is all that is needed to view the video images.

Digital video recorders allow fast searching of recorded information based on time, date, video motion detection, alarm input, or other external events. This permits fast retrieval of video images and avoids wading through countless frames of video information as required with standard VCR technology. The operator can view the information of interest in a matter of seconds. This is a primary advantage of DVR technology over VCR.

9.1.2.5 Network Video Recorder (NVR)

The NVR records video and audio data streams received over Ethernet networks using the TCP/IP protocol. The NVR receives compressed video data streams from the transmission channel and transfers the streams to an internal HD for storage. The NVR technique uses the Ethernet networks already in place in most buildings, and features such as motion detection, scene analysis, and alarm notification are employed. These features have added to the growing popularity of network surveillance. All digital video sources, or analog cameras connected to video servers, feed their digital data streams into the network. A computer with sufficient storage capacity serves as the DVR.
The DVR accesses the video data streams of the remote network cameras and video servers and stores them on the HD.

9.2 ANALOG VIDEO RECORDER

9.2.1 Video Cassette Recorder

The innovation of the video cassette and the VCR resulted in wide acceptance of this recording medium for over 25 years. VCRs used the Video Home System (VHS) video cassette as the recording medium. The newer Sony 8 mm format gained some popularity in the security market because of its small, compact size, while the VHS-C format found limited use. Present real-time VCR systems record 2, 4, or 6 hours of real-time monochrome or color video with about 300 lines of resolution on one VHS or 8 mm cassette. TL recorders can have total elapsed recording times of up to 960 hours. Most TL recorders have alarm input contacts that switch the recorder to real-time recording when an alarm condition occurs.

The VCR has always been the weakest link in the video security system with respect to image quality and reliability. Both the VHS and 8 mm recorders fall short in that they do not record high-resolution camera images. The main reason is that both record a field rather than a frame, thereby losing half the camera resolution. The enhanced VHS and 8 mm formats, called S-VHS and Hi-8, increased resolution and picture quality but still do not meet the resolution capabilities of most monochrome and color cameras.

TL recording makes maximum use of the space available on the video cassette by recording individual images at a slow, pre-selected rate. Instead of recording at the normal 30 fps, the TL VCR records one picture every fraction of a second or every few seconds. Prior to the use of DVRs based on computer HD drives, all security installations recorded video images using VCRs.
The TL tape machine may take about two to three minutes to search from the beginning to the end of the tape, since the tape in the TL recorder is advanced and reversed linearly by mechanical devices and motors.

9.2.2 VCR Formats

Almost all security VCRs use the 1/2-inch-wide magnetic tape format. A compact cassette called VHS-C, sold by the JVC Company, found little use in the security industry. In the 1990s Sony developed compact tape formats using an 8 mm (1/4-inch)-wide tape cartridge. While most security video recorders use the standard VHS cassette format, many portable systems use the more compact Sony 8 mm and Hi-8 cassettes.

9.2.2.1 VHS, VHS-C, S-VHS

Standard real-time continuous recording times for VHS tapes are 2, 4, and 6 hours. When these cassettes are used in TL mode, where a single image or a selected sequence of images is recorded, 8, 24, 40, and up to 960 hours can be recorded on a single 2-hour VHS cassette.

VCRs record the video scene on magnetic tape using the same laws of physics as audiotape recorders (Figure 9-1). The challenging aspect of recording a video picture on magnetic tape is that the standard US NTSC video signal has a wide bandwidth, including frequencies from above 4 MHz (4 million cycles per second) down to 30 Hz, compared with an audio signal with frequencies between 20 and 20,000 Hz. To record the high video frequencies the tape must slide over the recording head at a speed of approximately 6 meters per second or faster. All VCRs have a helical-scan design, in which the magnetic tape wraps about half a turn around a rapidly rotating drum carrying the magnetic record, playback, and erase heads, and is pulled slowly past it (Figure 9-2). This design allows the cassette tape speed to be an order of magnitude (one-tenth) slower than the head-to-tape recording speed. The audio is recorded conventionally along one edge of the tape as a single (monaural) or dual (stereo) channel. Along the other tape edge is the control track, normally a 30 Hz square-wave signal (NTSC system) that synchronizes the VCR to the monitor during playback. Some VCR machines have a full-track erase head on the drum to erase any prerecorded material on the tape.

The VHS-C tape format makes use of a small tape cartridge slightly larger than the Sony 8 mm and uses the VHS electronics and encoding scheme. The VHS-C cartridge is played back on a standard VHS machine with a VHS-C-to-VHS cartridge adapter.

9.2.2.2 8 mm, Hi-8 Sony

Sony developed the smaller 8 mm and Hi-8 format video technology (Figure 9-3). The format and cassette are significantly smaller than the VHS but maintain image quality and system capability similar to those of the larger format. Cassette running times are 1/2 hour, 1 hour, and 2 hours. The 8 mm configuration is particularly suitable for covert applications requiring a small, lightweight recorder.

The resolution obtained with standard color VHS and 8 mm VCRs, whether operating in real-time or TL mode, is between 230 and 240 TV lines. This is not sufficient for many security applications. Monochrome TL recorders provide 350 TV-line resolution. The newer color S-VHS and Hi-8 format real-time and TL recorders increase the horizontal resolution to more than 400 TV lines, suitable for facial identification and other security applications. As with the standard VHS and 8 mm systems, there is no compatibility between the S-VHS and Hi-8 formats. There is some compatibility between VHS and S-VHS, and between 8 mm and Hi-8. Some important differences between and features of the standard VHS and 8 mm, and the S-VHS and Hi-8 formats are as follows:

• S-VHS and Hi-8 recordings cannot be played back on conventional VHS or 8 mm machines.
• S-VHS and Hi-8 video cassettes require high-coercivity, fine-grain cobalt-ferric-oxide and metal tapes to record the high-frequency, high-bandwidth signals.
• All S-VHS and Hi-8 recorders can record and play back in standard mode. The cassettes have a special sensing notch that automatically triggers the VCR to switch to the correct mode.

FIGURE 9-1 VHS video cassette recorder geometry and format

FIGURE 9-2 VHS recorder technology: helical tape path design and tape wrap configuration

FIGURE 9-3 Sony 8 mm video cassette recorder and format

9.2.2.3 Magnetic Tape Types

The magnetic tape grade plays a critical role in determining the final quality of the video picture and the life of the recorder heads. Manufacturers have improved tape materials, resulting in significant improvements in picture quality and in maintaining "clean" pictures (low signal dropout and noise) over long periods of time and after many tape replays. For security applications it is important to choose a high-quality tape with characteristics matched to the VCR equipment and format used. Most security videotape formats, when grouped by size, fall into two categories: 8 mm and VHS (Table 9-1).

Video cassette recorders record multiple 2:1 interlaced cameras best when the cameras are synchronized sequentially using a 2:1 sync generator.
This technique provides stability, enhances picture quality, and prevents picture roll, jitter, tearing, and other disturbances and artifacts. If random-interlace cameras are used, they should be externally synchronized. Table 9-2 summarizes the physical parameters and record/play times of the VHS and 8 mm tape cassettes.

The real-time or TL VCR provides a means for recording consecutive video images over a period of time ranging from seconds to many hours, recording thousands of individual video pictures on magnetic tape. A 2-hour VHS cassette records 216,000 images of video (2 hr @ 30 frames/sec = 216,000 frames). If camera identification (ID), time, and date are coded on the tape, equipment is available for the operator to enter the camera number, time, and date to retrieve the corresponding images. However, to locate a specific frame, many minutes may be needed to shuttle the tape, play it back, and display the image. Locating a specific frame or time on the tape is a lengthy process, since the video cassette tape is a serial medium and retrieval time is related to the location of the picture on the tape. The random-access nature of the DVR or optical disk performs this task easily, resulting in quick retrieval of any image anywhere on the disk (Section 9.3.1).

9.2.3 Time-Lapse (TL) VCR

The TL recorder is a real-time VCR that pauses to record a single video field (or frame) every fraction of a second or every few seconds, based on a predetermined time interval (Figure 9-4). Standard VCRs record the video scene in real-time: the fields or frames displayed by the camera are sequentially recorded on the tape and then played back in real-time, in slow motion, or one frame at a time. In the TL mode, the VCR records only selected fields (or frames) a fraction of the time. Time-lapse recorders have the ability to record in both real-time and a variety of TL ratios, which are operator-selected either manually or automatically.
The automatic switchover from TL mode to real-time mode is triggered by an auxiliary input to the VCR. When the signal from an alarm device or VMD is applied to the VCR input, the VCR records in real-time mode for a predetermined length of time after an alarm is received, and then returns to the TL mode until another alarm is received.

Table 9-1  VHS, S-VHS, 8 mm, and Hi-8 Parameter Comparison

  TAPE FORMAT   LUMINANCE (Y)     CHROMINANCE (C)          RESOLUTION
                BANDWIDTH (MHz)   CENTER FREQUENCY (kHz)   (TV LINES)
  VHS           3.4–4.4 (1.0)     629                      240
  VHS-C         3.4–4.4 (1.0)     629                      240
  S-VHS         5.4–7.0 (1.6)     629                      400
  8 mm          4.2–5.4 (1.2)     743                      270
  Hi-8          5.7–7.7 (2.0)     743                      430
  DIGITAL 8     13.5 *            3.375 **                 500
  DV            13.5              3.375                    500
  MINI DV       13.5              3.375                    500

  * SAMPLING RATE   ** SAMPLING RATE FOR UV CHROMINANCE SIGNALS

Table 9-2  Video Cassette Recorder Tape Physical Parameters and Formats

  TAPE FORMAT      MAXIMUM      TAPE WIDTH    CASSETTE SIZE      PLAYING TIME (HRS)
                   RESOLUTION   mm (inches)   L × W × H (mm)     STANDARD  LONG  EXTENDED
                   (TV LINES)                                    (SP)      (LP)  (EP)
  VHS-C            240          12.7 (1/2)    188 × 104 × 25     0.33      0.66  1.0
  VHS: T-60        240          12.7 (1/2)    188 × 104 × 25     1         2     3
  VHS: T-120       240          12.7 (1/2)    188 × 104 × 25     2         4     6
  S-VHS: T-120 *   400          12.7 (1/2)    188 × 104 × 25     2         4     6
  8 mm: P6-60 **   270          8.0 (0.31)    95 × 62.5 × 15     1         —     —
  8 mm: P6-120     270          8.0 (0.31)    95 × 62.5 × 15     2         —     —
  Hi-8: P6-60 *    430          8.0 (0.31)    95 × 62.5 × 15     1         2     —
  Hi-8: P6-120     430          8.0 (0.31)    95 × 62.5 × 15     2         4     —
  DIGITAL 8        500          8.0 (0.31)    95 × 62.5 × 15     2         —     —
  DV               500          6.35 (0.25)   125 × 78 × 14.6    3         —     —
  MINI DV ***      500          6.35 (0.25)   66 × 48 × 12       1         1.5   —

  * TAPES AVAILABLE—15, 30, 90, 120 MINUTES
  ** TAPES AVAILABLE—30, 60, 80 MINUTES (SP MODE)
  *** TAPES AVAILABLE—45, 90, 120 MINUTES (LP MODE)

[FIGURE 9-4  Time-lapse (TL) video cassette recorder (VCR). Standard real-time video recording stores 30 frames/sec (60 fields/sec); in time-lapse recording only selected fields are stored (e.g. fields 1, 7, and 14 at T = 0, T = 6/60 = 1/10 sec, and T = 12/60 = 1/5 sec), at a time interval programmed by the operator.]

Time-lapse video recording consists of selecting specific video images to be recorded at a slower rate than they are being generated by the camera. The video camera generates 30 frames (60 fields) per second. One TV frame consists of the interlaced combination of all the even-numbered lines in one field and all the odd-numbered lines in the second field. Each field is essentially a complete picture of the scene but viewed with only half the vertical resolution (262 1/2 horizontal lines). Therefore, by selecting individual fields (as most TL VCRs do) and recording them at a rate slower than 60 per second, the TL VCR records less resolution than is available from the camera. When the TL-recorded tape is played back and viewed on the monitor at the same speed at which it was recorded, the pictures on the monitor appear as a series of animated still scenes.

Table 9-3  Time-Lapse Recording Times vs. Playback Speeds

  TOTAL RECORDING   TIME-LAPSE   RECORDING INTERVAL        RECORDING/PLAYBACK
  PERIOD *          RATIO        1 FIELD      1 FRAME      (PICTURES/SECOND)
  HOURS    DAYS                  PER __ SEC   PER __ SEC   FIELDS   FRAMES
  2 **     0.083    1:1          0.017        0.034        60       30
  12       0.50     6:1          0.1          0.2          10       5
  24       1        12:1         0.2          0.4          5        2.5
  48       2        24:1         0.4          0.8          2.5      1.25
  72       3        36:1         0.6          1.2          1.7      0.85
  120      5        60:1         1.0          2.0          1.0      0.5
  180      7.5      90:1         1.5          3.0          0.66     0.33
  240      10       120:1        2.0          4.0          0.50     0.25
  360      15       180:1        3.0          6.0          0.33     0.17
  480      20       240:1        4.0          8.0          0.25     0.13
  600      25       300:1        5.0          10.0         0.20     0.10
  720      30       360:1        6.0          12.0         0.16     0.08
  960      40       480:1        8.0          16.0         0.12     0.06

  * TAPE CASSETTE: T-120   ** STANDARD REAL-TIME VIDEO

Table 9-3 presents a comparison of TL modes as a function of TL ratio, total recording period, recording interval, and fields per second recorded. It is apparent that the larger the TL ratio, the fewer the pictures recorded over any period of time. For example, at a TL ratio of 6:1 the recorder captures 10 images (fields) per second, whereas in real time (1:1) it captures 60.
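The relationships behind Table 9-3 all follow from the 60 field/sec source rate and the 2-hour T-120 cassette. A minimal sketch of that arithmetic (the constant and function names are illustrative, not from the original):

```python
# Time-lapse arithmetic for a T-120 cassette (2 hours at real time),
# following the relationships behind Table 9-3. Names are illustrative.

REAL_TIME_FIELDS_PER_SEC = 60   # NTSC: 60 fields/sec = 30 frames/sec
BASE_TAPE_HOURS = 2             # T-120 cassette at standard (SP) speed

def tl_row(ratio: int):
    """Return (recording_period_hours, field_interval_sec, fields_per_sec)
    for a given time-lapse ratio such as 6 (i.e. 6:1)."""
    period_hours = BASE_TAPE_HOURS * ratio
    fields_per_sec = REAL_TIME_FIELDS_PER_SEC / ratio
    field_interval = 1.0 / fields_per_sec
    return period_hours, field_interval, fields_per_sec

# A 24:1 ratio stretches the 2-hour tape to 48 hours, recording one
# field every 0.4 sec (2.5 fields/sec), matching the 24:1 table row.
print(tl_row(24))   # (48, 0.4, 2.5)
```

Any row of Table 9-3 can be checked the same way, e.g. `tl_row(360)` gives the 720-hour (30-day) entry.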
Although the recorder is only recording individual fields spaced out in time, if nothing significant is occurring during these times, no information is lost. The choice of the particular TL ratio for an application depends on various factors, including the following:

• Length of time the VCR will record on a 2-, 4-, or 6-hour video cassette
• Type, number, and duration of significant alarm events likely to occur
• Elapsed time period before the cassette can be replaced or reused
• TL ratios available on the VCR.

To minimize tape usage and maximize information recorded, select the lowest TL ratio consistent with the requirement. By carefully analyzing operating conditions and requirements and using TL, it is possible to record events without sacrificing important information, and at substantially less tape cost than real-time recording. In the TL recording mode the videotape speed is much slower than in real time, since the video pictures are being recorded intermittently. This maximizes the use of tape storage space and eliminates the inconvenience of having to change the cassette every few hours. To review the tape, it is scanned at faster-than-normal playback speed. When more careful examination of a particular series of images is required, the playback speed is slowed or paused (stopped) and the tape is scrutinized more closely.

9.2.4 VCR Options

The following are some VCR options:

• Built-in camera switcher
• Time/date generator
• Sequence or interval recording of multiple cameras on one VCR
• Interface with other devices: cash register, ATM, etc.
• Remote control via RS-232
• 12-volt DC power operation for portable use.

9.2.4.1 Camera Switching/Selecting

Time-lapse VCRs allow recording of multiple cameras and selected playback of numerically coded cameras.
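When several cameras time-share one recorder at the standard 30 frame/sec rate, each camera's scene is refreshed only once per cycle through all the cameras. A short sketch of that arithmetic (the function name is illustrative):

```python
# Per-camera update interval when N cameras are multiplexed onto one
# 30 frame/sec recorder. Illustrative sketch, not from the original text.

FRAME_PERIOD = 1.0 / 30   # seconds between successive recorded frames

def update_interval(num_cameras: int) -> float:
    """Seconds between successive recorded frames of any one camera."""
    return FRAME_PERIOD * num_cameras

# With 8 cameras sharing the recorder, each camera is refreshed
# every 8/30 sec, i.e. roughly 0.27 sec.
print(round(update_interval(8), 3))   # 0.267
```

The same formula shows why heavily multiplexed systems look jerky: at 16 cameras the per-camera interval doubles to about 0.53 sec.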
[FIGURE 9-5  Multiplexing multiple cameras onto one time-lapse VCR. A 16-camera video encoder feeds the VCR or DVR in record mode; with 8 cameras connected, the scene from camera 5 is stored on videotape every 8th frame (interval = 1/30 sec × 8 = 0.266 sec). On playback, a 16-camera video decoder selects camera 5, and scene 5 is displayed continuously, updated every 0.266 sec.]

Figure 9-5 shows the technique for a 16-camera input system using 8 cameras. The VCR multiplexes up to 16 cameras onto one videotape, reducing equipment cost by eliminating the need for one VCR per camera input. The VCR separates the recordings from each camera and displays the fields from one camera. When many cameras are recorded on one VCR, rather than sorting through scenes from all the cameras when only one is of interest, the operator can select a particular camera for viewing. To locate a specific video image, the operator shuttles the tape, advancing or backing up one image at a time. During real-time or TL video recording, the VCR electronics inserts a binary synchronizing code into the video signal for every image, uniquely identifying each camera. During playback the scenes from any one of the 16 cameras can be chosen for display. In Figure 9-5 camera scene 5 has been chosen for presentation, and the pictures are updated every 0.266 seconds.

9.2.4.2 RS-232 Communications

Video cassette recorders interface and communicate two-way data with computer systems via an RS-232 port, enabling the computer to communicate with the VCR and control it. The RS-232 port permits the recorder to communicate over digital networks, telephone lines, dedicated two-wire, or wireless channels. The computer becomes a command post for remote control of recorder functions: real-time or TL mode, TL speeds, stop, play, record, fast rewind, fast-forward, scan, reverse-scan, pause, and advance.
These remote functions are put at the fingertips of the security operator whether in the console room, at a computer across the street, or at a distant location.

9.2.4.3 Scrambling

Video recordings often contain highly sensitive security information, creating a need to scramble the video signal on the recording. Equipment is available to scramble the videotape signal to prevent unauthorized viewing of video recordings, making it very difficult to reconstruct an intelligible picture (Figure 9-6). The scrambling technology safeguards the video signal as it is recorded or transmitted, producing a secure signal that is unusable in its scrambled form. The scrambling code is changed constantly and automatically, rendering frame-by-frame decoding fruitless. It is password-protected so that only personnel entitled to view the tape can gain access to it. Time access codes can be programmed to restrict descrambling to a scheduled time interval. The complete system consists of an encoder connected at the camera output and a decoder connected at the monitor location.

[FIGURE 9-6  Video recording scrambling. The baseband camera video passes through a scrambler and is then transmitted over a wired or wireless link, or recorded in scrambled form on a VCR or DVR; a descrambler at the receiving or playback end restores the baseband video for the monitor. Modes of operation: (1) wireless transmission, (2) wired transmission, (3) video cassette recording.]

9.2.4.4 On-Screen Annotating and Editing

Video cassette recorders have built-in alphanumeric character generators to annotate tape with time, date, day of week, recording speed, alarm input, camera identifier, and on/off status information. In retail applications the recorder can annotate the video image with the cash register dollar amount to check a cashier's performance. In a bank ATM application a video image is annotated with the transaction number to identify the person performing the transaction. The RS-232 interface permits the operator to control the on-screen editing of the videotape whenever changes are required. The following are typical text entries made by a security operator:

• Superimposed listing of cash register transactions
• Personal identification number (PIN) of an individual using an ID card
• Verification of the authenticity of ID cards used at remote ATMs, gas pumps, retail stores, and cash-dispensing machines
• Record of action taken by security personnel reacting to alarms or video camera activity.

9.3 DIGITAL VIDEO RECORDER (DVR)

There are several key differences between DVRs and VCRs that result in significant advantages for DVR users. The most notable difference is the medium used for recording the video images. VCRs record images on magnetic tape, while digital systems use HD drives, DATs, or DVDs. This difference has significant implications for video image quality, speed of information retrieval, image transmission speed, and remote monitoring capabilities. Digital video systems using DVRs can be accessed over LANs, intranets, and the Internet. This permits security personnel to monitor remote sites across the street, across town, or hundreds or thousands of miles away. Using an Internet browser or other application software on any PC or laptop allows security personnel or corporate management to view recorded digital video images at a secure IP (Internet protocol) address from anywhere in the world.
Security systems using DVRs can play a major role in alarm verification. The ability to perform video assessment from remote locations means the system can be used to prevent false-alarm responses by security and police personnel. Because the alarm site can be viewed remotely and instantly, if a review of the video images indicates there are no intruders, a false alarm can be declared and no law enforcement personnel need be notified. The digital video images are stored on HD drives similar to those used in the PC industry, with storage capacities measured in hundreds of megabytes or in gigabytes, providing a low-cost storage medium for the compressed video files. Small and medium-size systems use several HD drives, while large enterprise systems use a large number of HD drives. These HD drives are synchronized and shared so that images from many video cameras are stored reliably and remain available for rapid access by the user. The DVR has high reliability compared to the VCR and provides higher image quality than its VCR predecessor. The DVR video images can be downloaded to an external medium; a Zip disk or an email sent over the Internet is the easiest route.

9.3.1 DVR Technology

The technology difference between analog and digital recording is that the analog tape recorder uses a magnetic field to align the magnetic particles on the surface of the VHS tape to correspond to the video signal image. In contrast, the DVR converts the analog signal into a digital signal of ones and zeros, compresses this digital signal, and then stores it on the magnetic DVR HD drive, DVD, or DAT. The combination of affordable image-compression technologies and large-capacity HD drives has made the development of the DVR a reality. Although HD DVR recording, like VHS, still uses a magnetic recording medium, the digital nature of the data ensures that all retrieved footage is an identical copy of the originally recorded signals.
Standard DVRs have some shortcomings when used in mid-range and large-size enterprise systems. Since video inputs are local to the DVR, every camera source has to be wired to the location of the DVR, resulting in a significant investment in cable. DVRs are rapidly replacing the VCR as the preferred storage/retrieval medium for video security systems. An obvious difference between the two technologies is that VCRs use standard VHS-format magnetic tape, while DVRs store images on the DVR HD, DVD, DAT, or any combination of these media. The operator of a DVR can search for recorded information based on time, date, video image, camera input, alarm, or video motion. Operators achieve much faster retrieval times than with VCRs: rather than wading through countless frames of video information, the DVR operator can locate the desired images in a fraction of a second. The DVR is superior to the VCR in image quality. The VCR records only every other field of the video image, while the DVR records a full frame (two fields per frame), producing twice the resolution. The DVR digital image does not deteriorate on playback or re-recording, whereas with the VCR the image deteriorates each time a new copy is made. The DVR requires far less servicing than the VCR with all its mechanical drives, and VCR magnetic tape is prone to failure. DVRs offer additional features such as remote video retrieval, a multiplexer combined with the DVR, pre- and post-image recording, retrieval on alarm, and networking capabilities. The basic block diagram of a DVR is shown in Figure 9-7. The analog video signal from the camera is converted into a digital signal by the analog-to-digital (A/D) converter at the front end of the DVR. Following the A/D converter is the digital compression electronics with its programmed compression algorithm.
The amount of compression (compression ratio) is based on the compression algorithm chosen: JPEG (Joint Photographic Experts Group), MPEG-4 (Moving Picture Experts Group), Wavelet, H.263, H.264, etc. The compression algorithm chosen is based on the DVR storage capacity and on the image rate and quality of the images required. Following the compression electronics is the authentication electronics, which embeds a security code into each image. The digitized video signal is then ready for storage on the HD drive. The HD drive stores the compressed video image and other data on a magnetic coating on the HD. A magnetic head held by an actuator arm is used to write and read the data. The disk rotates at constant rpm, and data is organized on the disk in cylinders and tracks. The tracks are divided into sectors (Figure 9-8). Hard disk storage capacity is measured in hundreds of megabytes, gigabytes (1000 MByte), or terabytes (1000 GByte). Video image retrieval is fast but not instantaneous. There is a delay between the time an operator inputs a command to retrieve an image and when the image is displayed on the monitor screen. With DVR systems this time is a small fraction of a second; with VCRs it is seconds to minutes. Image retention time refers to how long the DVR can record before it begins to write over its oldest images. The amount of recording time required depends on the application: local codes, regulations, or business classification. Mandated storage times generally range from a week to a month or months.

[FIGURE 9-7  Digital video recorder (DVR) basic block diagram. Input: analog video passes through an analog/digital converter, video compression, and security authentication into digital storage. Output: the digital data is retrieved (accessed), decompressed, and passed through a digital/analog converter back to analog video. The compression ratio is based on the storage capacity as well as the number and quality of images stored; the storage medium is a magnetic hard disk.]
To record for longer periods of time or to archive, a compact disk (CD), additional HD drives, or DAT recorders are used. Since real-time recording and high image resolutions consume significant HD space quickly, the newer compression schemes using MPEG-4, H.264, and others are needed to offset these higher image-per-second (IPS) recording rates and image quality requirements. In summary, DVRs offer these advantages over analog recorders: (1) better picture quality, (2) less maintenance, (3) random-access search, (4) pre- and post-alarm recording, (5) built-in or optional multiplexer, (6) expandable storage for longer recording time, (7) real-time and TL event recording modes, (8) network interface through LAN, WAN, and Internet, (9) motion detection, and (10) password protection.

9.3.1.1 Digital Hardware Advances

9.3.1.1.1 Hard Disk Drive Storage. Hard disk drive storage capacity and speed have increased dramatically in the past years and the trend will continue. DVRs can include built-in 40-, 80-, 160-, or 250-GByte HD drives that can provide storage of high-resolution monochrome or color images for days, weeks, or months. To achieve reliability, the older small computer system interface (SCSI) drives were previously specified as the choice for DVR applications. Now integrated drive electronics (IDE) drives offer similar performance and reliability at a much lower cost. The IDE HD drives found in mid-range and enterprise recorders provide storage in the terabyte (1000 GByte) range. These enterprise-class recorders can have almost unlimited storage using external HDs, including configurations that can tolerate a failed drive without losing any recorded video images. The IDE HD drives have narrowed the gap in speed and reliability compared to the relatively expensive SCSI HD drives, making IDE ATA100 thermally compensated drives a popular storage medium for DVRs. Figure 9-9 summarizes digital video storage media.
9.3.1.1.2 Video Motion Detection (VMD). Every video scene at some time has motion caused by a person moving, an object moving, or some other activity. Many DVRs have VMD built into them. Digital signal processing (DSP) is used to detect the motion in the video image and cause some form of alarm or video representation on the monitor screen. This feature enables the DVR to remain in an inactive or TL operational mode until activity occurs, then increase the recording rate and display the alarm on-screen in the form of an outline of the moving person or other activity in the camera field of view. This technology increases overall video image storage time, since the DVR does not have to record non-events, or records them at a slower rate. When activity occurs it becomes visible on the video screen or causes some other alarm notification. One should be aware that for prosecutorial applications, images acquired through a motion-detection DVR may be inadmissible if there is no recording made prior to and after the time of the event. The ability to respond to alarm inputs, whether individual contact closures or software-generated procedures, is a major feature of DVRs, and these important capabilities should be included in the design. The ability to incorporate immediate automatic recording on alarm is one of the features that puts the basic DVR a step above off-the-shelf PCs equipped with video capture cards and base-level software for setting parameters.

[FIGURE 9-8  Magnetic hard disk (HD) digital video storage. The camera scene is digitized (one field per 1/60 sec, one frame per 1/30 sec), compressed, and written by magnetic recording heads onto tracks of the rotating hard disk; on playback (retrieval) the data is read, decompressed, and converted back to an analog video signal for the monitor.]
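As a rough illustration of the DSP involved, a minimal VMD scheme can compare successive frames and flag the scene as active when enough pixels change. This sketch uses NumPy; the function name and both thresholds are illustrative assumptions, not values from the text:

```python
import numpy as np

def motion_detected(prev_frame: np.ndarray,
                    curr_frame: np.ndarray,
                    pixel_threshold: int = 25,
                    area_fraction: float = 0.01) -> bool:
    """Flag motion when more than `area_fraction` of the pixels change
    by more than `pixel_threshold` grey levels between frames."""
    # Widen to int16 so the subtraction of uint8 frames cannot wrap around.
    diff = np.abs(curr_frame.astype(np.int16) - prev_frame.astype(np.int16))
    changed = np.count_nonzero(diff > pixel_threshold)
    return bool(changed > area_fraction * diff.size)

# A static 240 x 320 scene with a small bright object in one corner:
prev = np.zeros((240, 320), dtype=np.uint8)
curr = prev.copy()
curr[0:40, 0:40] = 200          # 1600 of 76,800 pixels change (~2%)
print(motion_detected(prev, curr))   # True
```

Commercial DVR VMD is considerably more sophisticated (zone masks, object outlining, noise filtering), but the frame-comparison principle is the same.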
Digital video recorders with internal VMD create a searchable audit trail by camera every time there is motion. Unlike when using the VCR, security personnel can quickly find the video images of interest on the DVR by date, time, image motion activation, or alarm input.

9.3.1.1.3 Optical-Disk Image Storage. For very long-term video image recording and archiving, an optical-disk medium is chosen. Optical storage media are durable, removable disks that store video images in digital format. There are two generic systems available: (1) non-erasable write-once read-many (WORM) and (2) erasable. These two electro-optical storage systems are described in the following sections. The optical disk recorder stores the video image on a disk using optical recording media rotating at high speed. The picture is stored and identified by coding the signal with the specific camera and the time and date at which it was put on disk. At a later time the stored picture can be retrieved in random access at high speed. Most optical disks used in security applications are WORM disks, since these are admissible in law enforcement investigation and prosecution cases.

9.3.1.1.4 Non-Erasable, Write-Once Read-Many (WORM) Disk. The WORM optical-disk recording system provides a compact means to store large volumes of video images. The drive uses an 800-megabyte, double-sided, removable diskette, which is rugged and reliable. In security applications, a WORM drive has a significant advantage over magnetic recording media because the optical image cannot be overwritten, eliminating the risk of accidental or intentional removal or deletion of video pictures.
This is important in law enforcement applications. The WORM disk containing the video images is removable and can therefore be secured under lock and key, stored in a vault when the terminal is shut down or the system is turned off, or sent to another location or person. Reliability is extremely high, with manufacturers quoting indefinite life for the disk and a minimum mean time between failures (MTBF) of greater than 10 years. The reason for this longevity is that nothing touches the disk itself except the light beam used to write onto and read from the disk.

[FIGURE 9-9  Digital video storage media. An analog camera feeds an analog-to-digital converter and compression stage (an IP digital camera feeds the compression stage directly); the compressed video is written to the recording medium, and on playback (retrieval) is decompressed for a CRT monitor (analog) or flat-panel display (digital). Media summarized: computer hard drive (60 or 120 GByte internal; 160 or 500 GByte external), hours to months of recording from 4, 8, 16, or 32 (expandable) cameras at up to 120 fps; CD-ROM/CD-R/CD-RW (650 MByte) as backup of the hard drive; mobile hard drive (60 or 120 GByte internal; 160 or 500 GByte external); removable Zip disk (250 MByte, recording time depending on the data).]

9.3.1.1.5 Erasable Optical Disk. Erasable optical-disk media are now available that can be erased (as on present magnetic media) and overwritten with new images (Figure 9-10). Each image stored on the optical disk is uniquely identified and may be retrieved in random access in less than 1 second. Optical disks store huge amounts of data: approximately 31 reels of data tape are equivalent to one single 5¼-inch-diameter optical disk, the size of an ordinary compact disc. Standard optical disks can store many terabytes of information. While most optical disks used in security are WORM, erasable optical disks are also in use. Erasable disks use the principle of magneto-optics to record the video information onto the disk in digital form.
The video image data or other information is erasable, allowing the same disk to be reused many times, just like a magnetic HD. Reading, writing, and erasing the information on the optical disk is done using light energy, not magnetic heads that touch or skim across the recording material. Therefore, magneto-optical disks have a much longer life and higher reliability than magnetic disks. They are immune to the wear and head crashes that occasionally occur in magnetic HD drives. Such a catastrophic event occurs when a sudden vibration or a dust particle causes the mechanical head in the drive to bump into the recording material, thereby damaging it. In the case of the optical disk, the magneto-optic layer storing the information is embedded within a layer of plastic or glass, protecting it from dust and wear. The optical disk is an excellent medium when large amounts of high-resolution video images need to be stored and retrieved for later use.

9.3.1.1.6 Digital Audio Tape (DAT). Digital audio tape is a format for storing or backing up video data (originally for music) on magnetic tape. It was co-developed in the mid-1980s by the Sony and Philips Corporations. DAT uses a rotary-head (or helical-scan) format in which the read/write head spins diagonally across the tape, as in a VCR. It uses a small 4-mm-wide tape with a signal quality that can surpass that of a CD, and it can record data (video images) at a rate of 5 MBytes/minute. The DAT storage capacity is 6 GBytes on a standard 120-minute cartridge. DAT decks have both analog and digital inputs and outputs.
[FIGURE 9-10  Erasable optical disk recording. Video in modulates a laser whose beam, focused by a lens, writes bits onto the magneto-optic layer of the optical disk (a bias coil supplies the magnetic field; the layer is protected by a bonding agent). On readout, the reflected beam passes through an optical polarizer and analyzer to a detector, producing the digital video out.]

9.3.1.2 Digital Storage Software Advances

Digital technology, faster microprocessors, high-density inexpensive solid-state memory, and the availability of larger and cheaper HD drives have made DVRs affordable in security video applications. The combination of affordable image-compression technologies and large-capacity HD drives has made the development of the DVR a reality. Although HD DVR recording, like VHS, still uses a magnetic recording medium, the digital nature of the HD data permits transmitting the video images over networks to remote sites and ensures that all retrieved images are identical copies of the originals. Digital video images can be stored on an HD, but several things must be considered, since around-the-clock recording of pictures requires a vast amount of storage space. To overcome this, a start-and-stop recording mode can be implemented so that recording occurs only when there is motion in the scene. Alternatively, only a few frames per second can be stored, much as the TL recorder does in the analog regime. Table 9-4 gives examples of the image sizes and storage requirements for five different image resolutions and image recording rates. The formula for calculating these storage requirements and image rates is:

  storage time (days) = HD storage capacity / (image size × pictures per day)

where pictures per day = images per second × 86,400 seconds per day.

Example: Calculate the recording time of a DVR with a 250-GByte drive at an image rate of 15 images per second with the picture quality set to standard (18 KByte per image):

  250 GByte / (18 KByte × 15 ips × 86,400 sec) = 10.7 days = 10 days and 17 hours

This calculation is based on the DVR recording continuously at the selected recording speed for a 24-hour period (86,400 sec).
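This storage-time arithmetic can be sketched in a few lines (a hedged example; the function name is illustrative, and GByte/KByte are taken as decimal 10^9 and 10^3 bytes):

```python
# Recording-time estimate for a DVR hard drive: capacity divided by the
# bytes consumed per day at the chosen image size and rate.

def recording_days(hd_bytes: float, image_bytes: float, ips: float) -> float:
    """Days of continuous recording before the drive wraps around."""
    return hd_bytes / (image_bytes * ips * 86_400)   # 86,400 sec per day

# The worked example: 250 GByte drive, 18 KByte "standard" images, 15 ips.
days = recording_days(250e9, 18e3, 15)
d, h = int(days), int((days % 1) * 24)
print(f"{days:.1f} days = {d} days and {h} hours")   # 10.7 days = 10 days and 17 hours
```

The same function reproduces other entries of Table 9-4, e.g. `recording_days(250e9, 12e3, 10)` gives about 24.1 days for basic quality at 10 ips.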
This is a worst-case scenario, since the DVR can be programmed to record only if motion is present or only at selected times of the day. Both of these settings will dramatically increase the unit's storage potential and eliminate the storage of unneeded or useless video images.

Table 9-4  Digital Storage Requirements and Images Per Second (IPS) for Five Different Image Resolutions on a 250-GByte Hard Drive

  PICTURE                      DVR RECORDING SPEED (IMAGES/SEC) **
  QUALITY *   60       30       15       10       7.5      5        3        1
  HIGHEST     1D/3H    2D/6H    4D/9H    6D/14H   8D/9H    13D/4H   22D/0H   66D/2H
  HIGH        1D/17H   3D/3H    6D/14H   9D/21H   13D/4H   19D/19H  33D/0H   99D/4H
  STANDARD    2D/16H   5D/1H    10D/17H  15D/9H   20D/12H  30D/19H  51D/9H   154D/7H
  BASIC       4D/0H    7D/16H   15D/9H   24D/0H   30D/19H  46D/4H   77D/2H   231D/9H
  LOW         8D/0H    15D/10H  30D/19H  46D/4H   61D/16H  92D/12H  154D/7H  462D/20H

  * IMAGE SIZES: HIGHEST = 42 KByte, HIGH = 28 KByte, STANDARD = 18 KByte, BASIC = 12 KByte, LOW = 6 KByte
  ** ALL RECORDING TIMES BASED ON A 250-GByte HARD DRIVE. RECORDING TIME 1D/3H = 1 DAY AND 3 HOURS.
  IMAGES/DAY = IMAGES/SEC × 86,400 (e.g. 60 ips = 5.184 million images/day; 15 ips = 1.296 million; 1 ips = 86,400)

  EXAMPLE: HARD DISK STORAGE = 250 GByte, IMAGE SIZE = 12 KByte, RECORDING RATE = 10 ips
  RECORDING TIME = 250 GByte / (12 KByte × 10 ips × 86,400 sec) = 24.1 DAYS

9.3.1.3 Transmission Advances

A fast-growing application for DVRs and digital storage systems is remote video retrieval via modem or network using LAN, WAN, and wireless (WiFi). Transmission speeds are increasing, compression algorithms are improving, and remote video solutions implementing automated video surveillance (AVS) at remote sites are being installed.
New software that allows viewing of multiple IP-addressable digital recorders from a central location is increasingly available and will become a must-have feature. Many companies are implementing digital recording for remote viewing in video systems using LANs, WANs, and Web-based systems. A major advantage of an IP-addressed network is its ability to receive video signals anywhere, using equipment ranging from a simple Internet browser to special client-based application software. Using these networks eliminates the need to run new cabling and provides an easy solution for future system expansion. Cellular transmission is the slowest method for video transmission and is not widely used in the video security market. However, in areas that offer no other service it is the only way to provide remote surveillance. The transmission speed of the cellular system is 9.6 Kbps (bits per second) and is increasing with time. Dial-up or public switched telephone network (PSTN) is the most common of the available DVR transmission methods, but since it was designed for the human voice and not high-speed video transmission, it does not provide high bandwidth or speed of transmission. This transmission mode has a maximum speed of 56 Kbps, but in spite of the relatively slow service, its cost and availability are the major factors for its continued use. Integrated services digital network (ISDN) is a digital phone line with two 64-Kbps channels. Competition from cable and digital subscriber line (DSL) service has reduced ISDN pricing to acceptable levels for the video security market. DSL technology has sufficient bandwidth for high-speed access to the Internet and live video monitoring. This digital broadband link directly connects a premise to the Internet via existing copper telephone lines. The DSL speed is listed as nearly 1.5 Mbps but depends on the routing, the distance from the network hub, and the number of people on the network.
A very high-speed, expensive digital system using dedicated lines is the AT&T T1 network, transmitting up to 1.544 Mbps. T3 lines have almost 30 times the capacity of T1 lines and can handle 44.736 Mbps of data. The widest transmission network is achieved using a fiber-optic optical carrier (OC) transmission channel. The OC designation specifies the speed of fiber-optic networks conforming to synchronous optical network (SONET) standards. SONET is a physical-layer network technology designed to carry large volumes of traffic over relatively long distances on fiber-optic cabling. SONET was originally designed by the American National Standards Institute (ANSI) for the public telephone network in the mid-1980s. FireWire is an ultra-high-speed serial data connection developed by Apple Computer. The technology provides a high-speed serial input/output bus for computer peripherals that can transfer data at speeds of 400 Mbps. It is especially well suited for transferring very large DVR video image files for viewing or archiving. Table 9-5 summarizes the parameters of available digital transmission channels.

Table 9-5  Parameters of Digital Transmission Channels for DVR Use

  TRANSMISSION      TYPICAL          TRANSMISSION TIME FOR    MAX. FRAME RATE FOR        CONNECTION
  TYPE              DOWNLOAD SPEED   25 KByte IMAGE (sec)     25 KByte IMAGE (frames/sec MODE
                                                              UNLESS NOTED)
  PSTN              45 Kbps          6                        10 frames/min              DIAL-UP
  ISDN              120 Kbps         2                        0.5                        DIAL-UP
  IDSL              150 Kbps         2                        0.06                       DIRECT CONNECTION
  ADSL (LOW END)    640 Kbps         0.3                      3                          DIRECT CONNECTION
  ADSL (HIGH END)   5 Mbps           0.05                     20                         DIRECT CONNECTION
  HDSL              1.5 Mbps         0.2                      6                          DIRECT CONNECTION
  VDSL              20 Mbps          0.01                     80                         DIRECT CONNECTION
  CABLE MODEM       750 Kbps         0.3                      3                          DIRECT CONNECTION
  T1                1.5 Mbps         0.2                      6                          DIRECT CONNECTION
  T3                45 Mbps          0.007                    180                        DIRECT CONNECTION
  10BaseT           5 Mbps           0.05                     20                         DIRECT CONNECTION
  100BaseT          50 Mbps          0.005                    200                        DIRECT CONNECTION
  1000BaseT         500 Mbps         0.0005                   2000                       DIRECT CONNECTION
  OC3               155 Mbps         0.0019                   620                        DIRECT CONNECTION
  OC12              622 Mbps         0.0005                   2500                       DIRECT CONNECTION
  FIREWIRE *        400 Mbps         0.0008                   1600                       DIRECT CONNECTION

  * APPLE COMPUTER'S VERSION OF IEEE STANDARD 1394
  IDSL: ISDN DSL   HDSL: HIGH BIT-RATE DSL   ADSL: ASYMMETRIC DSL   VDSL: VERY HIGH DATA RATE DSL
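The transmission-time entries of Table 9-5 follow from dividing the image size in bits by the channel rate. A short sketch (illustrative names; it ignores protocol overhead, so it gives slightly better figures than the table):

```python
# Transmission time and ceiling frame rate for a 25 KByte image over a
# channel of a given bit rate, as in Table 9-5. Illustrative sketch.

IMAGE_BYTES = 25_000

def tx_time_sec(bits_per_sec: float) -> float:
    """Seconds to move one 25 KByte image over the channel."""
    return IMAGE_BYTES * 8 / bits_per_sec

def max_frame_rate(bits_per_sec: float) -> float:
    """Upper bound on images per second sent back to back."""
    return 1.0 / tx_time_sec(bits_per_sec)

# A T1 line (1.544 Mbps) moves the image in roughly 0.13 sec, so at most
# about 7-8 images per second; a 56 Kbps dial-up modem needs ~3.6 sec.
print(round(tx_time_sec(1.544e6), 2), round(tx_time_sec(56e3), 1))   # 0.13 3.6
```

The same two functions make it easy to size a channel for a required image rate and quality before choosing among the services in Table 9-5.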
9.3.1.4 Communication Control

All the functions available on VCR and DVR machines can be controlled remotely using communications via the RS-232 port(s) on the devices, transmitted bidirectionally over the network to remote locations. Likewise, camera functions (zoom, focus, iris, presets, etc.), alarms, pan/tilt, and any internal DVR programming can be handled remotely.

9.3.2 DVR Generic Types

DVRs can be divided into four groups, or hardware implementations:

• DVR in a box (PC card and a PC)
• DVR basic plug-and-play VCR replacement
• DVR multiplex
• DVR multi-channel.

9.3.2.1 DVR in a Box

The DVR in a box is implemented by adding a PC board card to a standard PC, instantly turning the PC into a DVR. The PC card has four video inputs, providing a four-channel DVR. It seems simple to do, but it does have limitations. The DVR should be a dedicated system operating alone: mixing and matching the DVR with other software programs can cause the total system to crash. Another shortcoming of many DVR cards is that they do not supply alarm inputs or outputs, creating a very limited application machine.

9.3.2.2 DVR Basic Plug-and-Play VCR Replacement

The DVR basic plug-and-play differs from the DVR in a box in that it is a separate component designed and built specifically to be a DVR. The DVR basic is a self-contained unit having all the front-panel controls of the standard industrial real-time/TL VCR. These DVRs generally have a single- or four-channel video input capability and offer a minimum of setup parameters to permit the user to customize the picture quality, picture size, or alarming features to meet the particular application. This DVR has been designed as a drop-in replacement for an existing analog VCR.

9.3.2.3 Multiplex

The multiplex DVR is the largest of the four groups of DVR types used for video recording. The machine combines an 8- or 16-channel multiplexer with the DVR unit. This multiplex DVR shares the video input in the same way as the standalone video multiplexer. The combined DVR and multiplexer has the advantage that the installer no longer has to worry about the interface wiring and the compatibility of setup programs between the two devices. Some of the features that have been included in the multiplex DVR are: (1) motion or activity detection, (2) remote video retrieval by a modem over a digital channel, (3) alarm inputs (contact closures or software generated), and (4) the ability to adjust the IPS recorded.

Recorders equipped with a multiplexing capability allow users to watch live and recorded images on one monitor while the multiplexing DVR continues to record. Multiplex DVR technology should be capable of multitasking, duplexing, and triplexing, performing the record, playback, and live-viewing functions simultaneously. VCRs cannot do that, but most DVRs can. The operator using a multiplex DVR with triplex functionality can simultaneously review and archive the video images without interrupting the recording process. Uninterrupted recording ensures that no event goes unrecorded or missed. One shortcoming of the multiplexed video recorder is that it does not record the images from every camera connected to the system simultaneously; it incorporates a time-share system to record multiple camera inputs one at a time.

The multiplex DVR technology allows a single unit to replace not only the recorder but also all the accessory items needed to run a VCR-based video system. There is no need for separate multiplexers, switchers, or any devices other than the camera, lens, and monitor. Other features available are the ability to connect and perform remote video retrieval via modem, wired LAN, WAN, Internet, or wireless WiFi.

9.3.2.4 Multi-Channel

The multi-channel DVR is designed for high-end applications having many cameras and monitors. Applications using these systems require multiple month-long storage times, real-time video recording, and a very large number (hundreds) of video inputs. Multi-channel DVRs allow cameras to be recorded at 60 IPS, whereas in the multiplex unit the cameras are time-shared between the images displayed. The primary difference between a multiplexed DVR and a multi-channel DVR is that the multiplex recorder uses only one display while the multi-channel DVR has multiple displays, either split screen or multiple monitors. Instead of time-sharing the recorded information, the multi-channel unit records all camera images at 30 IPS simultaneously. The system offers the highest performance and playback in a multiple-camera system. The multi-channel DVR units have large HD drives with the capability to store in excess of 480 GBytes of data, with expanded storage derived from additional external HD memory and from DAT and jukebox storage systems controlled by RAID controllers.

Both the multiplexed and multi-channel DVR systems use a system called redundant array of independent disks (RAID) to control the multiple HD drives and provide management and distribution of the data across the system. Different RAID levels are used depending on the application to optimize fault tolerance, speed of access, or the size of the files being stored. RAID Levels 1 and 5 are the most commonly used in video security applications. Multi-channel DVRs using many HD drives require coordination and control. In order to store and protect as much information as possible, the RAID must control the HD drives or a DAT jukebox system. The RAID capability controls and protects the HD drive data and provides immediate online access to data despite a single disk failure. Some RAID storage systems can withstand two concurrent disk failures.
RAID capability also provides online reconstruction of the contents of the failed disk onto a replacement disk.

9.3.2.4.1 Redundant Array of Independent Disks (RAID)

A redundant array of independent disks is a system using multiple HD drives to: (1) share or replicate data among the drives and/or (2) improve performance over a single drive. Originally RAID was used to connect inexpensive disks, taking advantage of the ability to combine multiple low-cost devices using older technology into an array that together offered greater capacity, reliability, and/or speed than was affordable in a single device using the newest technology. At its simplest level, RAID is a way of combining multiple hard drives into a single logical unit, so that the operating system sees only one storage device. For the purposes of video security applications, any system that employs the basic concept of recombining physical disk space for reliability or performance is a RAID system. The system was first patented by IBM in 1978. In 1988, RAID Levels 1 through 5 were formally defined in a paper by Patterson, Gibson, and Katz. The original RAID specification suggested a number of prototype RAID levels, or combinations of disks. Of the many combinations and levels, only two are in general use in security systems: RAID Level 1 and Level 5.

A RAID Level 1 array creates an exact copy (or mirror) of all data on two or more disks. This is useful for systems where redundancy is more important than using the maximum storage capacity of the disks. An ideal RAID Level 1 set contains two disks, which increases reliability by a factor of two over a single disk. RAID Level 1 implementations can also provide enhanced read performance (playback of the video image), since many implementations can read from one disk while the other is busy. The RAID Level 5 array uses block-level striping with parity data distributed across all member disks.
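The two schemes can be illustrated in a few lines (an illustrative sketch, not a disk controller): RAID Level 1 simply keeps a full copy of every block on each mirror disk, while RAID Level 5's parity block is the XOR of the data blocks in a stripe, which suffices to rebuild any single lost block.

```python
# Illustrative sketch, not a disk controller: RAID Level 1 stores every block
# verbatim on each mirror disk, so recovery is just reading the other copy.
# RAID Level 5 stores, per stripe, the XOR of the data blocks as parity;
# XOR-ing the parity with the surviving blocks rebuilds the missing one.

def raid5_parity(blocks: list) -> bytes:
    """XOR all blocks of a stripe together."""
    parity = bytes(len(blocks[0]))
    for block in blocks:
        parity = bytes(a ^ b for a, b in zip(parity, block))
    return parity

def raid5_recover(surviving: list, parity: bytes) -> bytes:
    """Rebuild the single missing block from parity plus surviving blocks."""
    return raid5_parity(surviving + [parity])

stripe = [b"AAAA", b"BBBB", b"CCCC"]          # data blocks on three disks
parity = raid5_parity(stripe)                 # parity stored on a fourth disk
rebuilt = raid5_recover([stripe[0], stripe[2]], parity)  # disk 2 failed
assert rebuilt == b"BBBB"                     # single-disk failure survived
```

A second concurrent failure defeats single-parity RAID 5, which is why, as noted above, some storage systems add further redundancy to withstand two failures.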
RAID Level 5 is one of the most popular RAID levels and is frequently used in both hardware and software implementations; virtually all storage arrays offer RAID Level 5. Summarizing the two most common RAID formats found in DVR video security systems:

1. RAID Level 1 is the fastest fault-tolerant RAID configuration and probably the most commonly used. RAID Level 1 is the only choice in a two-drive system. In this system the mirrored pair mirror each other and look like one drive to the operating system. The increased reliability of this configuration means that if one drive fails, the video image data is still available from the other drive.
2. RAID Level 5 provides data striping at the byte level and also stores stripe error-correction (parity) information. This results in excellent performance and good fault tolerance. Level 5 provides better storage efficiency than Level 1 but performs a little more slowly.

9.3.2.5 Network Video Recorder (NVR)

The future of digital video recording will be based on current information technology (IT) infrastructure, namely networking. By employing automatic network replenishment technology, the NVR can cope with network downtime without sacrificing recording integrity. The concept of a virtual HD eliminates the concern over HD sizes. Figure 9-11 shows a block diagram of the NVR system. In a new installation, where the designer has free rein to design the solution, an NVR serving the storage requirements of an entire enterprise is a choice to consider. One issue to consider if an NVR is used is that video images require large storage data files, and for the NVR installation a separate dedicated network for security may be necessary. Another consideration is that, even to simply receive video, knowledge of the setup parameters of the individual camera is necessary, and the NVR must be programmed accordingly.
Moving to a digital recording solution, whether DVR, NVR, or a hybrid DVR/NVR combination, requires careful planning and design. The NVR solution is a system that uses digital cameras, or analog cameras converted to IP cameras using a network server. The digital data is delivered to a network in accordance with the TCP/IP transport protocol and recorded by an NVR. The HD is usually controlled by a RAID Level 5 controller, which can be expanded to other HD drives for increased storage capacity.

To overcome storage shortcomings in midsize and larger systems, the NVR is used. A DVR's capacity is based on the number of HD drives and the storage capacity of each HD. For large numbers of cameras and long archiving times, separate DVR units are required. Image retrieval across separate units becomes impractical, since most multiplex DVRs are of a one-channel design. To accommodate 4, 9, 16, or more video inputs, internal or external multiplexers are used, and the requirement to time-share the cameras means the IPS usually drops to a few. Dedicated DVRs also do not take advantage of common IT principles like RAID storage.

An NVR is basically a standard networked PC with a software application that controls the flow of digital video data. Thanks to the availability of a network interface, a concept called the virtual HD drive can be realized. The virtual memory concept is commonplace in today's computer systems: the central processing unit (CPU) is made to accept a larger virtual size because of a logic unit, the memory management unit (MMU), which is responsible for loading and unloading just the section of memory that the CPU currently needs. The same concept is applied to digital video recording. Data that has been successfully copied over the network may then be erased from the local HD, which frees capacity on the local HD drive. The net effect is that the local HD will never fill up as long as the network storage device can accept the data.
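The virtual-HD spooling behavior described above can be sketched as follows (all class and method names are hypothetical, for illustration only): segments buffer on the local disk and are erased as soon as a copy reaches network storage.

```python
# Sketch of the "virtual HD" idea: recorded segments land in a small local
# disk buffer; once a segment has been copied to network storage it is
# erased locally, so the local disk never fills up as long as the network
# store keeps accepting data. Names are hypothetical, for illustration.

from collections import deque

class VirtualDisk:
    def __init__(self, local_capacity: int):
        self.local_capacity = local_capacity   # bytes of local buffer
        self.local = deque()                   # (segment_id, size) awaiting upload
        self.local_used = 0
        self.network_store = {}                # segment_id -> size (remote copy)

    def record(self, segment_id: str, size: int) -> bool:
        """Buffer a new video segment locally; fail if the buffer is full."""
        if self.local_used + size > self.local_capacity:
            return False
        self.local.append((segment_id, size))
        self.local_used += size
        return True

    def flush_one(self) -> None:
        """Copy the oldest segment to network storage, then free it locally."""
        segment_id, size = self.local.popleft()
        self.network_store[segment_id] = size
        self.local_used -= size
```

As long as flushing keeps pace with recording, local usage stays bounded while the network store grows, which is the "single disk of sufficient capacity" the operator sees.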
The virtual HD makes the retrieval of recorded video footage especially convenient: instead of searching over several physical disk volumes, the user always sees a single disk of sufficient capacity.

The most important question that must be considered before attempting remote video surveillance is whether the available network has sufficient bandwidth for video transmission to the remote site. Bandwidth requirements for quality video transmission range from 256 Kbps to 1 Mbps, depending on the video compression method used, the image quality required, and the image refresh rate (IPS) used by the application.

[Figure 9-11 Network video recorder (NVR) system block diagram: analog cameras (through servers) and digital IP cameras at several sites connect through routers to the Internet/LAN/WAN/WiFi and to a network video recorder with sufficient storage to support all sites, security authentication, and a RAID Level 5 controller for expanded storage capacity. Video is carried as digital compressed video (MJPEG, MPEG-2, MPEG-4).]

Most LAN or WAN systems can operate successfully using industry-standard 10BaseT or 100BaseT Ethernet supported by standard computer operating systems. If the remote viewing system does not use the Web for its connection it is called an intranet; intranet IP-address assignments and network parameters are controlled by the in-house network manager.

9.3.2.6 Hybrid NVR/DVR System

The hybrid DVR/NVR system incorporates elements of both the DVR and the NVR. This type of system uses a distributed architecture with analog cameras connected to IP video servers and IP cameras connected directly to the network. The IP video from the IP cameras and the IP video servers is stored on a server connected to the network.
The NVR solution may be the most cost-effective if installed on an existing shared network, but it is highly debatable whether many facilities would allow the increased network traffic created by the video images. Hybrid DVR/NVR solutions open up exciting possibilities in that they can use legacy analog cameras and existing video cabling as well as IP cameras. The hybrid solution permits centralization of the system configuration, leading to greater flexibility in locating the equipment where it is most convenient.

9.3.3 DVR Operating Systems (OS)

The terms operating system (OS) and platform are familiar terms associated with computer systems. DVRs are computers designed to record video information, with other features specifically tailored to the security application. Fundamentally an OS does several things: (1) manages the hardware of the computer system, defined as the CPU processor, memory, and disk space; (2) manages software and other housekeeping functions; and (3) provides a consistent way for applications to interface with the hardware without having to know all the details about that hardware. Today's OSs take several forms, including the Microsoft Windows family and embedded Linux formats. Manufacturers of DVR products have built their systems on a variety of proprietary OS platforms. Windows systems include the Windows 9X and XP families, Windows NT, and Windows 2000. Embedded proprietary OS platforms like UNIX are specifically designed to run on a unique DVR product. By nature, such a proprietary OS is used by a single manufacturer unless licensed to competitors. One concern regarding these proprietary platforms is that their distribution is limited, and the user should question how extensively the product has been field-tested against a wide variety of software security threats.
9.3.3.1 Windows 9X, NT, XP, and 2000 Operating Systems

The Microsoft Windows 9X family of OSs, namely Windows 95, 98, and ME, has been the primary platform used by DVR manufacturers. Several reasons this family of OSs has been so popular with the creators of DVR products are that the product is relatively stable, familiar to most users, and less expensive than its client–server counterparts. Since no major changes in the OS have occurred over the lifetime of this family of products, developers of DVR software have been able to avoid making costly software rewrites. There is a downside, however, to the security aspect of the Microsoft 9X family of OSs: the Windows 9X family has a fundamental flaw that has been recognized by Microsoft and by the manufacturers of DVR products (see Section 9.3.8).

Significant improvements in Windows 2000 were modifications to the OS core to prevent crashes, to enhance dynamic system configuration, to increase system uptime, and to provide a unique method of self-repair. Windows 2000 has built-in tools that make OS applications easier to deploy, manage, and support. Centralized management utilities, troubleshooting tools, and support for self-healing applications make the management of an IT infrastructure easier. Windows 2000 also offers improved use of HD drives and an enhanced level of hardware support. These improvements include the largest database of third-party drivers and support for the latest hardware standards, making it easier for HD drive users to upgrade to the latest versions of software released by the HD drive manufacturer.

9.3.3.2 UNIX

The UNIX OS is designed to be used by many people at the same time. UNIX is a multi-user, multi-tasking OS developed at the Bell Telephone Laboratories, NJ, by Ken Thompson and Dennis Ritchie in 1969. It is widely used as the master control program in workstations and servers, and as an embedded OS in proprietary DVRs.
It has the Internet TCP/IP protocol built in and is the most common OS for servers on the Internet.

9.3.4 Mobile DVR

Mobile DVRs are embedded systems designed specifically for use in vehicles, in rapid deployment systems (RDS), and for short-term security installations (see Chapter 21). These mobile DVRs are small and rugged, vibration and shock resistant, and the choice for vehicle security, RDS, and surveillance applications when portability is necessary. The mobile DVR system can connect 1 to 4 video cameras and record and display at a full 30 IPS, with audio recording as an option. LAN and Internet interface connections are available, and vehicle status and speed can be displayed on the recording. The camera images can be displayed individually at full screen or in a quad format. Auto switching from camera to camera is supported, and the dwell time per camera can be set by the user. Four input alarm contacts allow each camera to be recorded on receipt of an alarm signal. Video can be recorded continuously, on motion, on alarm, or by schedule. The VMD zones can be set in each camera, or the whole field of view of each camera can be used as the motion criterion. Video files can be searched by date, time, and alarm state in a single or quad configuration. ID and password protection are provided at different levels, from system manager to system operator. A standard Internet browser with user ID protection can monitor the mobile DVR site remotely through the Internet, LAN, WAN, or wireless WiFi. These mobile DVR systems provide far superior performance to traditional analog VHS VCRs, which are prone to hardware failure in humid and dusty environments and from shock and vibration. These DVRs contain rugged 30 GByte HD drives with input/output RS-232 control ports.
DVRs are available with a microcontroller that translates pushbutton commands into the Sony/Odetics control protocol for full configuration and control through the RS-232 control port. A serial connector allows Windows control software to be used for PC-based control and configuration. Figure 9-12 shows examples of fixed and mobile DVRs.

[Figure 9-12 Compact PC-based fixed and mobile DVRs: (A) fixed DVR, (B) small mobile DVR, (C) hardened mobile DVR.]

9.3.5 Digital Compression, Encryption

Video compression is the science of eliminating as much digital data from the video signal as possible without the loss being evident to the observer viewing the image. Today's systems have compression ratios ranging from 10:1 to 2400:1, making it possible to transmit or record huge amounts of video data. Basic video compression methods can be classified into two major groups: lossy and lossless. Lossy techniques reduce data both through complex mathematical algorithms and through selective removal of visual information that our eyes and brain usually ignore. Lossless compression, by contrast, discards only redundant information, making it possible to reconstruct the exact original video image signal.

The need to record and store days of video image scenes requires that the signals be compressed to reduce the file size. Several different compression algorithms are utilized in DVRs, mostly derived from the JPEG, MPEG, Wavelet, H.263, and H.264 algorithms. Both JPEG and MPEG are based on the discrete cosine transform (DCT), in which blocks of 8 by 8 pixels are grouped and then transformed into the frequency domain. The Wavelet algorithm transforms the entire picture into the frequency domain, resulting in relatively small file sizes compared with the DCT-based algorithms. The H.263 and H.264 algorithms are designed for low bit-rate systems.
In a typical video signal one image is similar to the next, and it is possible to make a good prediction of what the next frame or field in the sequence will look like. It is also possible to bi-directionally interpolate images based on those that came before and after. The method is to compare the most recent image with the previous image, determine whether there was a change, and, if so, decide whether or not to store the frame.

There are many techniques used to compress the video image for storage in a DVR. One method is redundancy reduction, accomplished by removing duplication from the signal source before it is compressed and stored. Three forms of redundancy reduction are:

1. Spatial: correlation between neighboring pixel values
2. Spectral: correlation between different color planes or bands
3. Temporal: correlation between adjacent frames in the sequence.

A second form of reduction is called irrelevancy reduction. This method omits parts of the signal that will not be noticed by the observer. Two such areas described by the human visual system (HVS) model are the low-frequency visual response and color recognition.

9.3.5.1 JPEG

The JPEG compression algorithm was developed by the Joint Photographic Experts Group and uses DCT compression (a transform first proposed in 1974) operating on blocks of 8 × 8 pixels. This is the algorithm primarily used to download images over the Internet, and it can achieve compression ratios up to 27:1. It is designed to exploit known limitations of the human eye, notably the fact that small color changes are perceived less accurately than small changes in brightness. Using these compression methods reduces the file storage size while maintaining a high-quality stored video image.

9.3.5.2 MPEG-X

The MPEG compression algorithm uses the same DCT compression found in JPEG.
The difference is that MPEG compression is based on motion-compensated, block-based transform coding techniques. The primary technique used in this algorithm is conditional refresh, where only changes in the image scene are compressed and stored, reducing the amount of storage required. This is called inter-frame compression. MPEG uses the same algorithms as JPEG to create one I-frame and then removes the redundancy from successive frames by predicting them from the I-frame and coding only the differences from its predictions (P-frames). B-frames are bi-directionally interpolated. MPEG compression allows for three types of frames:

1. I-frames (compressed entirely within a frame)
2. P-frames (based on predictions from a previous frame)
3. B-frames (bi-directionally interpolated from previous and succeeding frames).

MPEG-1, MPEG-2, MPEG-4, and the latest MPEG-4 AVC (H.264) are the four basic MPEG forms used in video compression. Each has a different compression ratio:

1. MPEG-1 = 25 to 100:1
2. MPEG-2 = 30 to 100:1
3. MPEG-4 = 50 to 100:1
4. MPEG-4 AVC = 50 to 200:1 (or more).

MPEG compression forms have large file sizes, and therefore many of today's DVR manufacturers have modified this standard to meet the needs of the video security industry. The related standards H.263 and H.264 are designed for low bit-rate communications. H.263 is better than MPEG-1 or MPEG-2 for low-resolution and low bit-rate images. MPEG-4 AVC (H.264) is now considered the best video compression standard.

9.3.5.3 Wavelet

Wavelet compression technology operates on full-frame information and is based on frequency, not on 8 × 8 pixel blocks. It compresses the entire image, both the high and low frequencies, and repeats the procedure several times. Wavelet compression can provide compression ratios up to 350:1.
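The conditional-refresh idea described above for MPEG, storing a frame only when it differs appreciably from the current reference, can be sketched as follows (frames are modeled as flat lists of 8-bit luma values; the thresholds are illustrative, not taken from any codec):

```python
# Sketch of temporal-redundancy reduction (conditional refresh): compare each
# frame with the current reference and store it only when enough pixels have
# changed noticeably. Thresholds below are illustrative, not from a standard.

def frame_changed(prev: list, curr: list,
                  pixel_delta: int = 8, changed_fraction: float = 0.02) -> bool:
    """True if the fraction of noticeably changed pixels exceeds the threshold."""
    changed = sum(1 for p, c in zip(prev, curr) if abs(p - c) > pixel_delta)
    return changed / len(curr) > changed_fraction

def conditional_refresh(frames: list) -> list:
    """Return indices of frames that would be stored; frame 0 is always kept."""
    stored = [0]
    reference = frames[0]
    for i, frame in enumerate(frames[1:], start=1):
        if frame_changed(reference, frame):
            stored.append(i)
            reference = frame   # the newly stored frame becomes the reference
    return stored
```

With a static scene only the first frame is stored; when an object enters, the changed frame is stored and becomes the new reference, mirroring how only scene changes consume storage.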
9.3.5.4 SMICT

Super motion image compression technology (SMICT) is a proprietary video compression technology that produces a small file size for high-resolution image reproduction. The OS is Windows 2000 or Windows XP. SMICT can provide compression ratios from 40:1 up to 2400:1. Typical single-image file sizes range up to 2500 bytes. The SMICT compression algorithm includes video authentication. One manufacturer's system can record 16 cameras at a rate of 3 IPS, at 220 horizontal TV lines and a 320 × 240 format size, onto a single 75 GByte HD for between 30 and 60 days. Recording time is longer if there is less activity in the video image. Using a PSTN connection with a 56 Kbps modem, four cameras can be viewed at 2 IPS each in quad mode.

9.3.6 Image Quality

Video image quality from a DVR depends on the resolution of the camera image, the compression algorithm and ratio, and the IPS displayed.

9.3.6.1 Resolution

The term resolution is often misused and misunderstood in the security industry. In analog systems the recorded image almost always fills the entire monitor screen. The resolution for analog video cameras and VHS recorders is defined as (1) the number of TV lines in a horizontal width of the screen equal to the height of the screen, or (2) the total number of horizontal TV lines across the width of the monitor. Digital resolution refers to spatial resolution: the number of pixels per line and the number of lines or rows per image, again defined in pixels. Digital resolution is also defined as the total number of pixels on the screen. This definition affects not only the overall resolution of the system but also the overall size of the displayed image. The common digital monitor image sizes are defined as 1/4 CIF, CIF, and 4 CIF (see Section 9.3.7).

9.3.6.2 Frame Rate

Most video security applications require that at least 2–5 IPS per camera be recorded to ensure that enough images are captured to clearly identify a person, an object, or activity.
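Recording rate, per-image file size, and disk capacity together determine how long a recorder can store video. A sketch using the SMICT example figures above (16 cameras, 3 IPS, roughly 2,500-byte images, 75 GByte disk); note that it assumes continuous worst-case recording, whereas the 30–60 day figure quoted above relies on activity-based recording lowering the average data rate:

```python
# Storage-duration arithmetic: days of recording a disk can hold for a
# given camera count, image rate, and per-image file size. This models
# continuous worst-case recording; motion-based recording lasts far longer.

def storage_days(cameras: int, ips: float, image_bytes: int,
                 disk_gbytes: float) -> float:
    bytes_per_second = cameras * ips * image_bytes
    bytes_per_day = bytes_per_second * 86_400      # seconds per day
    return disk_gbytes * 1e9 / bytes_per_day

days = storage_days(cameras=16, ips=3, image_bytes=2500, disk_gbytes=75)
print(f"{days:.1f} days of continuous worst-case recording")
```

Continuous recording at these settings fills the 75 GByte disk in roughly a week, which shows how much the quoted 30–60 day figure depends on low scene activity.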
A recording rate of 15 IPS is perceived as nearly real-time. This is the minimum rate when all motions and activities are required to be recorded, as in locations such as casinos, retail stores, and banks, where fast motion and sleight of hand must be detected. When basic DVRs or multiplex DVRs are used, the number of camera inputs will affect the IPS recording rate for each camera. Today's midsize recorders can record at rates from approximately 60 to 480 IPS. Dividing the IPS rate by the number of cameras in the multiplex system gives the average per-camera IPS recording speed. In large enterprise systems a multi-channel DVR or NVR recording system is required. These systems can record a large number of cameras simultaneously, so that a rate of 5 IPS or higher can be achieved. The ability to change the number of recorded IPS per video input is important, since the main purpose of any DVR and multiplexer is to provide a simple and cost-effective method to monitor live and recorded images via a multi-screen display. This form of TL recording eliminates the gaps between video scenes created by conventional sequential switchers.

9.3.6.3 Bandwidth

When video images from DVRs are transmitted to remote locations, the image frame rate and resolution are directly affected by the bandwidth of the transmission network. As a rule of thumb, the wider the network bandwidth, the higher the IPS and the better the resolution (more pixels). Bandwidth requirements for quality video transmission range from 256 Kbps to 1 Mbps, depending on the video compression method used, the image quality required, and the IPS used by the application. Cellular phone is the slowest transmission method and is not widely used in the video security industry; its bandwidth is 3000 Hz, with a 9.6 Kbps data rate. Dial-up or PSTN with a modem is the most common transmission method and has a maximum data rate of 56 Kbps.
While relatively slow, its low cost and availability contribute to its continued use. ISDN is a digital phone line with two 64 Kbps channels. It costs more than the PSTN, but with competition from the cable network and digital subscriber line (DSL), pricing is acceptable to the video security industry. Typical DSL network speeds are up to 1.544 Mbps but depend on cable routing, distance, and the number of other clients using the same line. Much wider bandwidth choices include the AT&T T1 and T3 lines and the OC3–OC12 optical fiber networks.

9.3.7 Display Format—CIF

Digital images from DVRs or other sources can be displayed on monitors at full size or at a fraction of the monitor screen size. Compression technology is critical and a significant factor in determining the storage required and the final resolution obtained in the digital video image. The CIF image size determines the size (number of pixels) of the captured image. The larger the picture size, the larger the storage required on the hard drive.

The abbreviation CIF has two definitions. The first is Common Intermediate Format (CIF), a standard developed by the International Telecommunications Union (ITU) for video teleconferencing, and the standard in current use throughout the digital video security industry. Table 9-6 defines the CIF pixel format, aspect ratio, and bit rate for NTSC and PAL systems. The original CIF is also known as Full CIF (FCIF); quarter CIF is designated QCIF and four CIF 4CIF. The 4CIF image improves the resolution by a factor of four over the 1CIF image by doubling the number of pixels on both the vertical and horizontal axes. 4CIF uses all the camera pixels and reproduces the best image quality from a high-resolution camera. The ability to identify persons, objects, and activities greatly affects the required stored image format and consequently the resolution of the image. The 1CIF image can be used to identify faces, license plates, and other detail only under favorable conditions. The 4CIF display is the format of choice and uses one of the MPEG-4, H.264, or other high-compression standards.

Also shown in Table 9-6, but not to be confused with the Common Intermediate Format, is the Common Image Format, also abbreviated CIF, which is the standard frame size for digital video based on Sony's D1 format that defines the two standard SDTV frames.

9.3.8 Network/DVR Security

9.3.8.1 Authentication

An important requirement in any local or remote video monitoring system is the need to keep the video information secure and error-free. DVR image degradation can be caused by equipment failure or produced by man-made activity (hackers, viruses). The security provider must be diligent and make every effort to ensure that the information is accurate and the system tamperproof. State-of-the-art image authentication software has increased the reliability of digital video monitoring by preventing tampering with the signal. The safeguards can be incorporated with either special compression methods using date/time stamping or the summation of pixel changes, all of which will ensure the acceptance of the digital video record in a court of law. Some standards of authentication include:

• Images must be from the original VCR tape or DVR hard drive
• Images should be recorded on a WORM drive
• Images should have a check sum error-checking methodology
• Images should have a date/time digital signature.
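The bit rates listed for the Common Intermediate Formats in Table 9-6 follow from pixel count × frame rate × bits per pixel; assuming 12 bits per pixel (8-bit luma with 4:2:0 chroma subsampling, an assumption consistent with the table's PAL figures) reproduces them:

```python
# Where the Table 9-6 bit rates come from: uncompressed rate equals
# width * height * frame rate * bits per pixel. 12 bits/pixel (4:2:0
# chroma subsampling) is assumed; it matches the table's PAL figures.

def uncompressed_mbps(width: int, height: int,
                      fps: float = 30, bits_per_pixel: int = 12) -> float:
    return width * height * fps * bits_per_pixel / 1e6

for name, (w, h) in {"CIF": (352, 288), "QCIF": (176, 144),
                     "4CIF": (704, 576), "16CIF": (1408, 1152)}.items():
    print(f"{name}: {uncompressed_mbps(w, h):.1f} Mbps")
```

These uncompressed rates also show why compression is mandatory: even CIF at 36.5 Mbps far exceeds the 256 Kbps to 1 Mbps transmission budgets discussed earlier.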
COMMON INTERMEDIATE FORMAT* (CIF)

FORMAT                    SCREEN AREA   PIXEL FORMAT                           ASPECT RATIO    BIT RATE AT 30 FRAMES/SEC (Mbps)
CIF                       FULL          352 × 288 (PAL), 352 × 240 (NTSC)      1.222           36.5
QCIF (QUARTER CIF)        1/4           176 × 144 (PAL), 176 × 120 (NTSC)      1.222           9.1
SQCIF (SUB QUARTER CIF)   —             128 × 96 (PAL)                         1.333 (4 × 3)   4.4
2CIF (2 × CIF)            1/2           704 × 240 (NTSC), 704 × 288 (PAL)      1.222           18.3
4CIF (4 × CIF)            FULL          704 × 576 (PAL), 704 × 480 (NTSC)      1.222           146.0
16CIF (16 × CIF)          FULL          1408 × 1152 (PAL), 1408 × 960 (NTSC)   1.222           583.9

* COMMON INTERMEDIATE FORMAT (CIF) DEVELOPED BY THE INTERNATIONAL TELECOMMUNICATIONS UNION (ITU) IN STANDARD H.261 FOR VIDEO TELECONFERENCING. THIS FORMAT IS IN CURRENT USE THROUGHOUT THE DIGITAL VIDEO SECURITY INDUSTRY. MPEG COMPRESSION STANDARDS ARE BASED ON THE CIF FORMATS.

COMMON IMAGE FORMAT**

FORMAT             SCREEN AREA   PIXEL FORMAT
VGA                FULL          640 × 480
1/4 VGA            1/4           320 × 240
1/16 VGA           1/16          160 × 120
D1 (SONY FORMAT)   FULL          720 × 480 (NTSC), 720 × 576 (PAL)

** NOT TO BE CONFUSED WITH CIF ABOVE, COMMON IMAGE FORMAT IS A STANDARD FRAME SIZE FOR DIGITAL VIDEO BASED ON SONY'S D1 FORMAT.

Table 9-6 Common Intermediate and Common Image Format (CIF) Parameters

The WORM format allows the operator to review video images as often as required, but the images can never be altered. The checksum method records the number of levels and pixels per recorded line and stores this information in the recorder's program. On review the sum is recomputed and compared; if the two are not equal, an alarm or visual cue notifies the operator that a change has occurred. Authentication should also include a date/time stamp or digital signature inserted on all recorded video images.

A network authentication protocol called Kerberos is designed to provide strong authentication for client/server applications by using secret-key cryptography. The protocol was developed by the Massachusetts Institute of Technology (MIT) in the mid-1980s, is freely available, and has been implemented in many commercial products.
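The checksum comparison described above can be sketched directly. The function names, the 16-bit modulus, and the sample frame are illustrative, not taken from any DVR product:

```python
# Sketch of per-line checksum authentication: at record time a sum of
# pixel levels is stored for each video line; on review the sums are
# recomputed and any mismatch flags a possible alteration.
# All names and the 16-bit modulus here are illustrative assumptions.

def line_checksums(frame):
    """frame: list of rows, each row a list of 0-255 pixel levels."""
    return [sum(row) % 65536 for row in frame]    # one 16-bit sum per line

def verify(frame, stored_sums):
    """Return the line numbers whose checksum no longer matches."""
    return [i for i, (new, old) in
            enumerate(zip(line_checksums(frame), stored_sums))
            if new != old]

frame = [[10, 20, 30], [40, 50, 60]]
stored = line_checksums(frame)     # saved in the recorder's program

frame[1][0] = 99                   # simulate tampering with one pixel
print("Altered lines:", verify(frame, stored))   # -> Altered lines: [1]
```

On a real recorder the comparison would trigger the alarm or visual cue mentioned above rather than a printout.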
It was created by MIT as a solution to network security problems. The Kerberos authentication system uses a series of encrypted messages to prove to a verifier that a client is running on behalf of a particular user. It uses strong cryptography so that a client can prove its identity to a server (and vice versa) across an insecure network connection. Kerberos requires a trusted path through which passwords are entered: if the user enters a password in a program that has already been modified by an attacker (a Trojan horse), the attacker may obtain sufficient information to impersonate the user. After a client and server have used Kerberos to prove their identity, they can also encrypt all of their communications to assure privacy and data integrity as they go about their business. Version 5 of the protocol was designed in 1989 and is in use in many systems today.

9.3.8.2 Watermark

A digital watermark is a digital signal or pattern inserted into a digital image. It is inserted into each unaltered copy of the original image and may also serve as a digital signature for the copies. For law enforcement and prosecution purposes it is critical that digital tapes and disks be watermarked, since digital information can easily be altered and modified through software manipulation. The law in most countries requires that information recorded by DVRs not be altered or modified. An example of such a watermarking technique is used in the Panasonic digital disk recorder, which applies a proprietary algorithm to detect whether the image has been altered or modified. If the image has been changed in any way, the word "altered" appears on the monitor during playback, indicating that the original image is not being viewed.

9.3.8.3 Virtual Private Network (VPN)

The security of digitally transmitted information has existed in the IT world for many years.
With the rapid increase in the use of digital video hardware and transmission networks, the security industry looks to the IT community for additional technologies to make video transmission more secure and safe from external attack. Data security requirements have changed significantly in the past ten years: the Internet has grown, vastly more companies have come to rely on it for communications, and security solutions are therefore necessary.

A VPN is a private data network that makes use of the public telecommunications infrastructure, maintaining privacy and providing security through the use of a tunneling protocol and security procedures. The VPN provides an encrypted connection between a user's distributed sites over a public network such as the Internet. By contrast, a private network uses dedicated circuits and possibly encryption. The VPN stands in contrast to a system of owned or leased lines that can be used by only one company. The primary purpose of a VPN is to give the company the same capabilities as private leased lines but at a much lower cost. By using the shared public infrastructure, companies today are looking at using VPNs for both extranets and wide-area intranets. There are three basic classifications of VPN technologies: (1) trusted VPN, (2) secure VPN, and (3) hybrid VPN.

9.3.8.3.1 Trusted VPNs

Before the Internet became nearly universal, a VPN consisted of one or more communication circuits leased from a communications provider, where each leased circuit acted like a single wire controlled by the customer. The basic idea was that a customer could use these leased circuits in the same way as physical cables in the local network. The privacy afforded by these legacy VPNs was only that the communications provider assured the customer that no one else would use the same circuit.
The VPN customer trusted the VPN provider to maintain the integrity of the circuits and to use the best available business practices to avoid snooping of the network traffic. This methodology offers no real security.

9.3.8.3.2 Secure VPNs

Networks that are constructed using encryption are called secure VPNs. Vendors created protocols that allow traffic to be encrypted at the edge of one network or at the originating computer, move over the Internet like any other data, and then be decrypted when it reaches the corporate network or a receiving computer. The encrypted traffic acts like a tunnel between the two networks. Even if an attacker can see the traffic, he cannot read it, and he cannot alter it or make use of the data without the changes being detected by the receiving party, who would then reject the data. The encrypted tunnel provides a secure path for network applications and requires no changes to the application.

9.3.8.3.3 Hybrid VPNs

The hybrid VPN runs a secure VPN as part of a trusted VPN, creating a third type of VPN. The secure parts of a hybrid VPN can be controlled by the customer or by the same provider that supplies the trusted part. Sometimes an entire hybrid VPN is secured with the secure VPN, but more commonly only a part of it is.

9.3.8.4 Windows Operating System

Within a year of the Windows 95 release, Microsoft identified a major security problem that could not be fixed without a complete software rewrite. Microsoft then embarked on the development of a completely new platform, Windows NT, built around the concept of a high-level network-security OS. However, the majority of DVR manufacturers continued to use Windows 95 and Windows 98 rather than take the costly route of rewriting their software for the more secure Windows NT.
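The secure-VPN behavior of Section 9.3.8.3.2 (encrypt at one network edge, verify at the other, reject anything modified in transit) can be modeled minimally. The XOR keystream below is a deliberately toy cipher for illustration only; a real VPN would use IPsec- or TLS-grade cryptography, and the pre-shared key is hypothetical:

```python
# Toy model of a secure-VPN tunnel: traffic is encrypted at one edge,
# crosses the untrusted Internet, and is rejected by the far edge if
# modified in transit. NOT real VPN cryptography; illustration only.
import hashlib, hmac

KEY = b"shared-tunnel-key"            # hypothetical pre-shared key

def keystream(n):
    """Deterministic toy keystream derived from the shared key."""
    out, counter = b"", 0
    while len(out) < n:
        out += hashlib.sha256(KEY + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:n]

def encapsulate(packet):
    """Encrypt a packet and append an integrity tag (encrypt-then-MAC)."""
    ct = bytes(a ^ b for a, b in zip(packet, keystream(len(packet))))
    tag = hmac.new(KEY, ct, hashlib.sha256).digest()
    return ct + tag                    # what actually crosses the Internet

def decapsulate(blob):
    """Verify the tag and decrypt; reject tampered traffic."""
    ct, tag = blob[:-32], blob[-32:]
    if not hmac.compare_digest(hmac.new(KEY, ct, hashlib.sha256).digest(), tag):
        raise ValueError("tampered traffic rejected")
    return bytes(a ^ b for a, b in zip(ct, keystream(len(ct))))

blob = encapsulate(b"camera 3 video payload")
assert decapsulate(blob) == b"camera 3 video payload"

tampered = bytes([blob[0] ^ 1]) + blob[1:]    # attacker flips one bit
try:
    decapsulate(tampered)
except ValueError as e:
    print(e)                           # prints "tampered traffic rejected"
```

The receiving edge never sees the modified data; it simply discards the packet, which is the rejection behavior described above.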
The newer Windows 2000, the 32-bit OS built on Windows NT technology, provides users of hard-disk drives many comprehensive security features that protect sensitive video and other security data. These enhanced security features provide local protection in addition to securing information as it is transmitted over a LAN, WAN, WiFi, phone line, or the Internet. With Windows 2000, the system administrator and authorized users can select from multiple levels of security. For advanced users, Windows 2000 also supports standard Internet security features such as IP security, Kerberos authentication, the Layer 2 Tunneling Protocol, and VPNs. Many large companies have migrated to Windows 2000 to take advantage of this secure OS.

9.3.9 VCR/DVR Hardware/Software Protection

Both VCRs and DVRs require various types of protection and handling to avoid hardware and software failures, and they require periodic maintenance. In particular, the VCR and its VHS magnetic tape cassettes require special care because of the complex mechanical tape-handling mechanism and the vulnerable videotape cassette.

9.3.9.1 Uninterruptible Power Supply (UPS)

Main-power protection via power conditioning is a must for VCRs and DVRs. As with computer systems, voltage-surge protection and power-line filtration must be included in any recorder installation. Installations in areas prone to lightning or other electrical disturbances require extra precautions. Power protection must not be treated as "just another box" to be included with the system; unfortunately, it is usually only after a major failure that most people realize this protection is critical. Appropriate protection includes a UPS and surge protector (Chapter 23).

9.3.9.2 Grounding

Like other electronic equipment, DVRs require proper electrical grounding to ensure that transient voltages on the power line are safely directed to ground.
This ground connection greatly reduces the possibility of damage to the recorder and its internal hard-disk drive. Such grounding is especially important in high-risk areas that experience lightning storms and in applications where electromagnetic interference (EMI) or radio-frequency interference (RFI) may be present. The grounding wire on a three-pronged power cord is sufficient for grounding the recorder, but a check should be made that the AC power socket into which it is plugged is connected to earth ground. This can be tested with an ohmmeter by measuring the resistance between this location and the earth-ground location; it should measure near zero ohms.

9.3.9.3 Analog/Digital Hardware Precautions

Both analog and digital video recorders contain mechanical moving parts, so they should be treated with care when being installed, moved, or relocated. Do not handle them roughly. VCRs and DATs have many mechanical parts that can become misaligned or damaged if the machine is dropped or mishandled, or if tape insertion or removal is performed carelessly; this can render the machine inoperative. Digital video recorders have one or more hard-disk drives for storage of the video image. Do not unnecessarily jar or drop the machine, as this could damage the drive or reduce its lifetime. After powering down a DVR, it should not be moved for a few minutes, to ensure that the hard-disk platter has come to a complete stop and that the head that reads and writes information on the disk has reached its parked position.

9.3.9.4 Maintenance

Since VCR, time-lapse VCR, DAT, and DVR recorders are used over long periods of time, preventive maintenance for these devices is important. This is especially true for the VCR video heads. The video heads rotate at 1800 rpm; they gradually wear out, head-to-tape contact is reduced, and the result is a noisy picture.
To ensure reliable operation, VCRs must be operated in a dust-free, humidity- and temperature-controlled environment. If the VCR tape fails or the cassette jams, retrieve the cassette, carefully remove the broken tape by hand, and splice the tape to salvage the remaining recorded information. In the case of a DVR, the PC-based operating system (OS) may crash, so it is wise to back up the recorded information onto external backup storage or to use a RAID-configured hard-disk system; short of these measures, it is a challenge to retrieve the information from the drive. If DAT recorders are used for backup, head clogging is often difficult to detect because of the powerful error correction built into these machines; they will operate even with only one head working. To test for head clogging, turn off the error correction, read the error rate of the unit, and see whether it is within the manufacturer's specifications. If not, clean or replace the heads.

9.4 VIDEO RECORDER COMPARISON: PROS, CONS

Although the VCR has served the video surveillance industry well for several decades, VCR technology has several shortcomings. These were brought into the limelight with the introduction of the DVR in the late 1990s. The following is a list of pros and cons for the DVR and VCR.

9.4.1 VCR Pros and Cons

The criteria used to assess the analog VCR and the digital DVR cross several boundaries, including cost, size of system, hardware already in place, availability of recording media, manpower to administer, and maintenance. The analog VCR has served the security industry well over the last decades, but the digital DVR will clearly replace it swiftly.

VCR Pros

• Low-cost, proven technology with a long history of service
• Easy to copy and provide as evidence to law enforcement
• Difficult to alter video images (as compared to digital recorders).
VCR Cons

• Tape heads need regular maintenance and eventually wear out and need replacement
• The tape-handling mechanism has many precision mechanical parts that can go out of alignment or fail
• Tape needs to be changed on a regular basis (daily, depending on application), requiring manpower
• Tape is sensitive to humidity, dust, chemicals, and high-level magnetic fields.

9.4.2 DVR Pros and Cons

DVR Pros

• Produces a permanently clear and crisp record on a hard-disk drive
• Serves as a long-term backup device and requires no additional data-management costs
• Reproduces the original picture quality after many copies are made
• Provides multi-channel recording in real time when required
• Simultaneous recording, viewing, and transmitting to a remote site
• Intelligent motion detection acts as an alarm sensor
• A multiplexer can be integrated with the DVR
• Remote control of pan/tilt and camera zoom/focus/iris
• Nonstop recording limited only by the hard-disk storage space available
• Remote access by LAN, WAN, WiFi, ISDN, DSL, modem, or PSTN.

DVR Cons

• Eventual hard-disk failure
• OS and/or application program crash
• Digital data are more easily altered unless watermarking or other high-level security is built in.

9.5 CHECKLIST AND GUIDELINES

There is a large variety of VCR and DVR hardware from which to choose to record the video image. Prior to the late 1990s the VCR was the only technology choice; the DVR is now the major technology choice. This checklist and guideline lists some of the factors to consider when choosing a video recording system.

9.5.1 Checklist

• How many cameras must be recorded?
• How many IPS per camera?
• What quality (resolution) of images is required?
• What is the size of the image: 1/4 CIF, CIF, or 4CIF?
• What length of storage is required?
• What is the location for monitoring?
• How many sites are there?
• Does the recording system permit limiting the amount of bandwidth required to transmit video across the network?
• Does the remote viewing software allow viewing cameras from multiple systems on the same screen at the same time?
• Does the remote viewing software allow searching for recorded video and playing it back from multiple systems at the same time, on the same screen?
• Can the system be administered remotely?
• How much training will the staff require to use it efficiently?
• Can the system record at different frame rates, quality settings, and video settings for individual cameras?
• Does the system record video prior to the beginning of an event (pre-alarm recording)?
• Upon an alarm condition, can the system send an e-mail notification?
• Can different recording schedules be programmed for each hour or day?
• Can the system send video to a remote location for automatic display upon an alarm condition?
• Can multi-camera views be created and then automatically sequenced on the video monitor?
• Can the system automatically archive video data to a network storage device?
• Can pan-tilt-zoom cameras be controlled from both the system and the remote software?

9.5.2 Guidelines

• Initially install DVRs in highly sensitive areas to improve image quality, image retrieval, and searching time.
• Enable remote video monitoring for authorized personnel. This can cut travel costs, improve operational efficiency, and make the DVR investment more cost-effective.
• Choose a basic DVR or multiplexed DVR for small- to medium-size installations.
• Choose a multi-channel DVR or NVR for large systems.
• Choose a security level that matches the security requirement.

9.6 SUMMARY

The VCR or DVR records video images to establish an audit trail for the video surveillance activity. The recording can be viewed at a convenient time by security, law enforcement, or corporate personnel to identify a person, determine the activity that occurred, or assess the responses of security personnel.
The video recording provides a permanent medium with which to establish credible evidence for prosecuting a person involved in, or suspected of, criminal activity, and for use in a criminal trial, civil litigation, or dismissal. The video recording also provides a basis of comparison with an earlier recording, to establish whether there was a change in condition at a particular location, such as moved or removed equipment, or personnel patterns, including times of arrival and departure.

Video cassette recorders and digital video recorders are excellent tools for training and for evaluation of personnel performance. They serve as a source of feedback when evaluating employee performance: by reviewing the recording, management can determine which employees are working efficiently and which are not performing up to standards, without on-site supervision.

Magnetic hard-disk DVRs and optical disk recorders have a clear advantage over VCRs when video images must be retrieved quickly from a large database of stored images. Retrieved video images can be printed on thermal, ink-jet, or laser printers when: (1) a hard-copy audit trail of video images is required for dismissal, courtroom, or insurance purposes; (2) a guard needs a hard-copy printout when dispatched to apprehend an individual at a suspected crime scene; or (3) a permanent hard copy of an activity or accident is needed for insurance purposes. The video record offers the ability to instantly replay a video image and print it, a feature that is important in real-time pursuit and apprehension scenarios.

The Internet has changed, and will continue to change, how video images are recorded and distributed locally and to remote sites. The hardware, software, and transmission channels already exist to enable security personnel, corporate management, and government organizations to perform automated video security (AVS).
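The storage questions in the checklist of Section 9.5.1 (number of cameras, IPS per camera, image size, retention period) can be combined into a rough capacity estimate. A sketch with illustrative per-image sizes; the kilobyte figures and the example site are hypothetical, not vendor data:

```python
# Rough DVR storage estimate from the checklist parameters:
# cameras, images per second (IPS), image format, and retention days.
# Per-image sizes are assumed compressed-frame figures for illustration.

KBYTES_PER_IMAGE = {"QCIF": 4, "CIF": 12, "4CIF": 45}   # assumed values

def storage_gb(cameras, ips, fmt, days):
    """Approximate hard-disk storage in GB for continuous recording."""
    images = cameras * ips * 86400 * days    # 86400 seconds per day
    return images * KBYTES_PER_IMAGE[fmt] / 1e6

# Hypothetical site: 16 cameras at 5 IPS, CIF images, 30-day retention
print(f"{storage_gb(16, 5, 'CIF', 30):.0f} GB")   # prints "2488 GB"
```

Motion-triggered or scheduled recording (also on the checklist) can cut this figure substantially, since the multiplier of 86,400 seconds per day then shrinks to the actual recorded time.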
Chapter 10
Hard Copy Video Printers

CONTENTS

10.1 Overview
10.2 Background
10.3 Printer Technology
     10.3.1 Thermal
            10.3.1.1 Monochrome
            10.3.1.2 Wax Transfer
            10.3.1.3 Color-Dye Diffusion
     10.3.2 Ink Jet, Bubble Jet
     10.3.3 Laser, LED
     10.3.4 Dye-Diffusion, Wax-Transfer
     10.3.5 Film
10.4 Printer Comparison
     10.4.1 Resolution and Speed
     10.4.2 Hardware, Ink Cartridge Cost Factors
     10.4.3 Paper Types
10.5 Summary

10.1 OVERVIEW

Hard-copy printout of live monitor, VCR/DVR, or other transmitted surveillance images is a necessity in the video security system. Monochrome and color printers permit good- to excellent-quality reproduction of the scene image on a hard-copy printout. The printed hard-copy image is used by security personnel for apprehending offenders, responding to security violations, and keeping a permanent record of a scene, activity, object, or person.

The video printer is a device that accepts (1) an analog video signal from a camera or VCR or (2) a digital signal from a computer, a DVR, or an IP camera, and transfers the information to paper (or film). The information can be text, graphics, or video, and can be printed in either color or monochrome depending on the data content. Printers vary greatly in their technology, sophistication, speed, and cost.

10.2 BACKGROUND

The three most popular printer technologies for video applications are thermal, ink jet, and laser.

Thermal. Early models of thermal hard-copy printers produced crude facsimiles of the monitor picture, with low resolution and poor gray-scale rendition. Today's advanced technology enables printers to produce excellent monochrome or color image prints with resolution approaching that of a high-quality camera. Of the several monochrome and color printout technologies available to the security industry, the monochrome thermal printer is the most popular because of its low hardware and paper costs. It needs no toner or ink, only a special paper.

Ink Jet, Bubble Jet.
The present ink-jet printer was built on the progress made by many earlier versions and has a long history of development. Among the contributors to this evolution, the Hewlett-Packard (HP) and Canon companies claim a substantial share of the credit for the development of the modern ink jet. In 1979 Canon invented the drop-on-demand ink-jet method, in which ink drops are ejected from a nozzle by the rapid growth of an ink-vapor bubble on the top surface of a small heater; Canon named this bubble jet technology. In 1984 HP commercialized the first low-cost ink-jet printer based on the bubble jet principle, naming the technology thermal ink jet. Since then, HP and Canon have continuously improved the technology, and thermal ink-jet printers now dominate the major segment of the color printer market. The four major manufacturers accounting for the majority of ink-jet printer sales are Canon, HP, Epson, and Lexmark. Ink-jet printers are a common type of computer printer used for video security applications.

Laser, LED. Laser and LED (light-emitting diode) printers provide an alternative to the ink-jet and bubble jet printers for producing hard-copy video printouts. Laser and LED printers rely on technology similar to the dry-process photocopier first introduced by Xerox Corp. This process, known as electro-photography, was invented in 1938 and later developed for copier machines by Xerox and Canon. The first laser printer was created by Xerox researcher Gary Starkweather by modifying a Xerox copier in 1971, and the technology was offered as a product in the Xerox Star 8010. The first commercially successful laser printer was the HP LaserJet, an 8-page-per-minute (ppm) model released in 1984; it used a Canon printing engine controlled by HP-developed software. The laser printer uses a rotating mirror to form the image on the drum.
The HP LaserJet was quickly followed by laser printers from Brother Industries, IBM, and others. The Okidata Company developed, and has for many years produced, printers using LED technology instead of a laser. Okidata and Panasonic now produce LED printers using an array of small LEDs to form the latent image on the drum; no mirror scanner is required. This LED technology offers some potential advantages over the laser system.

Other. Two other technologies used to produce high-quality color images are (1) the thermal transfer printer (TTP), using thermal plastic wax, and (2) the thermal sublimation printer (TSP), using dye diffusion. Both techniques produce brilliant colors and excellent resolution, but the printer cost is high and the ink cartridges are expensive, so these printers are not in wide use in security applications.

In addition to the standard monochrome laser printers that use a single toner, there are also color laser printers that use four toners to print in full color. Color laser printers tend to be about five to ten times as expensive as monochrome ones, and because of their high equipment and cartridge-replacement costs compared with the other available technologies, they are not used in the video security industry.

Polaroid film technology has been used in the video industry for many years and is still used for special applications, but it has lost its popularity to the newer thermal, ink-jet, laser, and LED technologies. Dot-matrix printers are not suitable for monochrome or color video image printing because of their low resolution, slow speed, and high noise levels.

10.3 PRINTER TECHNOLOGY

Most video image printouts are still made with monochrome thermal printers. The reason is the significantly lower cost of the printer hardware and of the hard-copy printout, since no ink cartridge, head, or ink is required for the monochrome video printer.
However, the overwhelming use of color video cameras in security monitoring systems has motivated manufacturers to provide cost-effective solutions for printing color images. In a color video system, the lens collects the color picture information and the color camera converts the light image into three electrical signals corresponding to the red, green, and blue (R, G, B) color components in the scene. These three signals, presented to an RGB monitor, produce a color image on the monitor. In a color printer, the three primary colors in the video signal, R, G, and B, must be reversed to obtain their complementary colors: cyan, magenta, and yellow.

10.3.1 Thermal

The three thermal technologies for producing hard-copy printout are (1) monochrome, (2) wax transfer, and (3) color-dye diffusion.

10.3.1.1 Monochrome

The monochrome thermal video printer is the most popular type used in the security industry. The primary reason is that it can produce resolution comparable to that of the cameras at a printing speed sufficient for video security applications. Another reason for its popularity is that the costs of the hardware, printout paper, and printer head are lower than those of other printer technologies. Figure 10-1 shows a monochrome thermal video printer and hard-copy printout.

FIGURE 10-1 Thermal video printer

Thermal monochrome printers create an image by selectively heating coated paper as the paper passes over the thermal printer head (Figure 10-2).

FIGURE 10-2 Thermal printer block diagram

The coating turns black in the areas where it is heated, creating the image.
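The RGB-to-complement relationship described at the start of this section can be stated in a few lines. A minimal sketch for 8-bit pixels:

```python
# RGB (additive, monitor) to CMY (subtractive, printer) conversion:
# each ink is the complement of one RGB primary, so full red needs
# no cyan ink but full magenta and yellow.

def rgb_to_cmy(r, g, b):
    """8-bit RGB pixel -> CMY ink amounts (0 = no ink, 255 = full ink)."""
    return 255 - r, 255 - g, 255 - b

print(rgb_to_cmy(255, 0, 0))      # red   -> (0, 255, 255)
print(rgb_to_cmy(255, 255, 255))  # white -> (0, 0, 0): no ink on paper
```

Real printers add refinements (black-toner generation, color profiles), but the complement rule is the basis of all three thermal color techniques described below.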
Care must be taken with the handling and storage of the thermal paper, as it is sensitive to heat and abrasion, which can darken the paper, and to light, which can fade it.

The thermal printer converts the video signal from the camera into a digital signal and stores it in random-access memory (RAM) or another storage device. The video freeze-frame module captures and "freezes" the image as a snapshot of a moving video scene. This temporary storage allows the printer to operate at a much slower speed than the real-time video frame rate. After the video image has been captured, it is converted to an electrical drive signal for the thermal head located adjacent to the paper. Depending on the video drive-signal level, the paper is locally heated, causing the wax on the paper to melt and turn black (or another color). Depending on the amount of heat applied, a larger or smaller dot is produced, providing a gray-scale level in the image. As the video information is scanned across the slowly moving paper, the image is "burned in," creating a facsimile of the video image. Scanning an entire monochrome video image one pixel at a time takes approximately 8 seconds. Since the video image is stored in the printer until a new frame is captured, multiple copies can be made. The printed video image is recorded on a treated paper that resists fading from sunlight and physical tearing. Figure 10-3 shows a monochrome thermal video printer hard-copy printout.

10.3.1.2 Wax Transfer

In the color TTP, a plastic-wax, single-color-coated ribbon (the width of the paper roll) is inserted between the thermal print head and the paper (Figure 10-4). The ribbon is heated locally from behind, causing the wax-based ink coating to melt and the image to transfer to the paper. Full-color prints are produced in the thermal plastic color printer through multiple passes of three ribbons in the colors cyan, magenta, and yellow.
The inking paper is divided into three sections with different-colored ink; these three sections pass the thermal printer platen in sequence. As each color passes over the thermal head, an electrical signal proportional to the amount of the respective color in the video signal heats the head so that ink of the required color is deposited on the paper. Depending on the amount of heat applied, a larger or smaller amount of ink is transferred from the base film to the print paper. The first time the paper passes the head, yellow is deposited on it, then magenta, then cyan. By printing these three colors so that they are superimposed exactly on each other, the printer is able to produce a high-resolution print with excellent color rendition.

FIGURE 10-3 Thermal printer quality: (A) surveillance, (B) facial identification

FIGURE 10-4 Plastic wax thermal transfer printer (TTP) block diagram (requires three passes of inked ribbon: yellow, magenta, cyan)

By this principle, each dot on the final print copy is transferred from the base-film ink layer to the print paper. Reproducing a satisfactory color image requires precisely engineered mechanical components so that absolute registration among the three printed colors is maintained. It also requires precise electronics to accurately combine the timing, signal, and video fidelity and so ensure a faithful video image.

10.3.1.3 Color-Dye Diffusion

The TSP dye-diffusion printing media use three ink-dye papers (Figure 10-5).
The TSP printer operates through the use of a polyester-based substrate (donor element) containing a dye-and-binder layer which, when heated from the back side, sublimates (becomes gaseous) and transfers to the paper, where the dye then diffuses into the paper itself. The ink paper consists of a cartridge containing three sequential printing-ink colors (cyan, magenta, and yellow).

10.3.2 Ink Jet, Bubble Jet

There are two different types of ink-jet printers: the continuous-jet printer and the drop-on-demand printer (Figure 10-6). The continuous-jet printer uses a steady stream of ink droplets emanating from print nozzles under pressure. An electric charge is selectively applied to the droplets, causing some to be deflected toward the print paper and others away from it. The printout is the composite of all the individual dots produced in this manner. The drop-on-demand printer is a simpler and more popular ink-jet printer. It forms droplets of ink in the nozzle and ejects them through appropriate timing of electronic signals, producing the desired image on the paper. The majority of ink-jet printers produce a single dot size for each dot; higher-resolution types use a technique called dithering to increase the apparent resolution and smooth jagged edges in text, in the lines of graphs, and in video images. Ink-jet printers have found a significant market in the surveillance field and have good resolution, color rendition, and speed per copy.

Most current ink jets work by having a print cartridge with a series of tiny electrically heated chambers constructed using photolithography. The printer produces an image by driving a pulse of current through the heating elements. A steam explosion in the chamber forms a bubble, which propels droplets of ink onto the paper; Canon named this the Bubble Jet.
When the bubble condenses, surplus ink is sucked back up from the printing surface, and the ink's surface tension pumps another charge of ink into the chamber through a narrow channel attached to an ink reservoir. Epson's micro-piezo technology uses a piezo-crystal in each nozzle instead of a heating element. When current is applied, the crystal bends, forcing a droplet of ink from the nozzle. The greatest advantages of ink-jet printers are quiet operation, the ability to produce color images with near-photographic quality, and low printer prices. One downside is that although ink-jet printers are generally cheaper to buy than lasers, they are far more expensive to operate on a cost-per-page basis: the ink cartridges make them many times more expensive than laser printers per print.

FIGURE 10-5 Dye-diffusion thermal sublimation printer (TSP) block diagram. Dyes: cyan, magenta, yellow, black (optional); thermal dye sublimation (solid state to gaseous); requires multiple passes of the paper; 256 temperature levels give near-continuous tone

FIGURE 10-6 Ink jet, bubble jet printer technology: thermal technology and piezo-electric technology

10.3.3 Laser, LED

Laser printers provide an alternative to the ink-jet and bubble-jet printers for producing hard-copy video printouts. The laser printer can produce high-quality monochrome images with excellent resolution (300 dots per inch, dpi) and grayscale (halftone) rendition.
Laser and LED printers rely on the same technology used in the first photocopying machines. This process, known as electro-photography, was invented in 1938 and later developed commercially by Xerox and Canon. The electro-photographic process used in laser printers involves six basic steps:

1. A photosensitive surface (photo-conductor) is uniformly charged with static electricity by a corona discharge.
2. The charged photo-conductor is exposed to an optical image to discharge it selectively and form a latent, invisible image.
3. The latent image is developed by spreading toner, a fine powder, over the surface; the toner adheres only to the charged areas, making the latent image visible.
4. An electrostatic field transfers the developed image from the photosensitive surface to the sheet of paper.
5. The transferred image is fixed permanently to the paper by fusing the toner with pressure and heat.
6. All excess toner and electrostatic charges are removed from the photo-conductor to make it ready for the next printing cycle.

In operation the laser printer uses a laser beam to produce an image on a drum (Figure 10-7). Because an entire page is transmitted to the drum before the toner is applied, laser printers are sometimes called page printers. Figure 10-8 shows the schematic diagram of the laser page printer and Figure 10-9 the rotating-mirror scanning mechanism. Laser printing is accomplished by first projecting an electric charge onto a revolving drum from a primary charged roller. The drum has a surface of a special plastic or garnet. The printer electronics drive a system that writes light onto the drum; the light causes the electrostatic charge to leak from the exposed parts of the drum, so the laser alters the electrical charge wherever it strikes.
The surface of the drum then passes through a bath of very fine particles of dry plastic powder, or toner. The charged parts of the drum electrostatically attract the particles of powder, and the drum then deposits the powder onto the sheet of paper. The paper passes through a fuser, which bonds the plastic powder to the paper with heat and pressure. Each of these steps offers numerous technical choices. One of the more interesting is that some "laser" printers actually use a linear array of LEDs, rather than a laser, to write the light onto the drum. The toner is essentially ink that also includes either wax or plastic; its composition is plastic-based or wax-based so that the particles melt as the paper passes through the fuser assembly. The fuser can be an infrared oven, a heated roller, or, in some very fast and expensive printers, a xenon strobe light. The laser printer relies on the laser beam and scanner assembly to form a latent image on the photo-conductor bit by bit. The scanning process is similar to the electron-beam scanning used in a CRT monitor. The laser beam, modulated by electrical signals from the printer's controller, is directed through a collimating lens onto a rotating polygon mirror that reflects it toward the drum. The beam then passes through a scanning lens system which applies corrections and scans the beam across the photo-conductor on the drum. This technology is the major key to ensuring high precision in the laser spot at the focal plane; accurate dot generation at a uniform pitch (spacing) ensures the best printer resolution. Figure 10-9 shows the light path through the laser printer from the laser source to the photo-conductor on the drum.
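One of the corrections the scanning lens applies can be illustrated with simple geometry. A polygon mirror rotating at constant speed sweeps the beam angle linearly, but an ordinary lens of focal length f would place the spot at x = f*tan(theta), so dot spacing would grow toward the page edges; scan lenses in this class of printer are commonly designed so the spot lands at x = f*theta instead ("f-theta" behavior). The focal length below is an illustrative value, not a figure from the text.

```python
# Why constant-speed mirror scanning needs lens correction:
# compare uncorrected (f*tan(theta)) and corrected (f*theta)
# spot positions for increasing scan angles.
import math

f = 200.0  # focal length in mm (illustrative assumption)

for deg in (0, 10, 20, 30):
    th = math.radians(deg)
    plain = f * math.tan(th)   # uncorrected spot position (mm)
    ftheta = f * th            # corrected spot position (mm)
    print(f"{deg:3d} deg: plain {plain:7.2f} mm, corrected {ftheta:7.2f} mm")
```

At 30 degrees the uncorrected spot is about 10% too far out (115.5 mm vs 104.7 mm), which is why uniform dot pitch cannot be achieved without this correction.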
FIGURE 10-7 Laser page printer schematic diagram. Basic electro-static copying process: (1) charging, (2) image exposure (laser), (3) developing, (4) transfer, (5) fixing, (6) drum cleaning

FIGURE 10-8 Laser printer rotating mirror scanning mechanism

A second type of page printer falls under the category of laser printers even though it does not use lasers at all. It uses the radiation from a linear array of LEDs to expose the image onto the drum (Figure 10-10). Once the drum is charged, however, the LED printer operates like the laser printer. The LED printers developed by Okidata and Panasonic use an array of small LEDs, instead of a laser, to form the latent image on the drum. In this technology a light source controlled by the printer's CPU illuminates a light-sensitive drum, creating an attractive charge on the drum; no mirror scanner is required. As the drum rotates, it attracts toner particles wherever it has been illuminated. The toner is then transferred from the drum to the paper, and the image is fused onto the paper (Figure 10-11). The LED array consists of thousands of individual digital LED light sources spanning the width of the image drum, directing light through focusing lenses directly onto the drum surface. This methodology can have an advantage over the laser light source system, in which a single light source, a complex system of fixed lenses and mirrors, and a rotating mirror deflect the laser beam across the drum as it rotates.
Complex timing is used to ensure that the laser produces a linear horizontal track across the drum surface, and careful parallax correction must be employed since the edges of the drum are farther from the laser than the center. The LED array technology eliminates any possibility of parallax or timing errors, since the LEDs are fixed and arranged across the entire page width. The laser and solid-state LED implementations yield approximately the same resolution, although the LED seems to have a slight edge: laser heads can produce dot sizes of 60 micrometers (µm), whereas LED technology can produce dot sizes as small as 34 µm. Inherently the LED light source should be more reliable than the laser system since it has no moving parts; these LED machines are guaranteed for five full years. The LED design is also inherently faster. There is a limit to how fast the drum in the laser system can be rotated while still maintaining horizontal scanning integrity, but with no scanning parts the LED design can print faster at higher resolutions. As shown in Figure 10-12, the resolution of the LED design at 600 or 1200 dpi remains constant independent of the page print speed, whereas the resolution of the laser design drops as the print speed is increased. Another advantage of the LED design over the laser is a straight-line paper path that is less susceptible to jams. One of the chief attributes of these printers is resolution: laser printers can print between 300 and 1200 dpi, produce very high-quality print, and are capable of printing an almost unlimited variety of fonts.
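Those dot sizes can be sanity-checked against the quoted dpi range with a back-of-the-envelope calculation. The assumption here (dots laid exactly edge to edge, no overlap) is a simplification; real print engines deliberately overlap dots.

```python
# Maximum addressable resolution implied by a given dot diameter,
# assuming dots placed edge to edge (simplifying assumption).
MICRONS_PER_INCH = 25_400

def max_dpi(dot_size_um):
    return MICRONS_PER_INCH / dot_size_um

print(round(max_dpi(60)))  # laser head, 60 um dots -> ~423 dpi
print(round(max_dpi(34)))  # LED array, 34 um dots  -> ~747 dpi
```

This shows why the smaller LED dot supports the higher end of the 300-1200 dpi range more comfortably than a 60 µm laser spot, even before overlap and modulation tricks are considered.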
FIGURE 10-9 Laser printer light path from laser source to drum photo-conductor

Most laser printers come with a basic set of fonts called internal or resident fonts, but additional fonts can be added in one of two ways:

1. Laser printers have slots for font cartridges utilizing read-only memory (ROM), onto which fonts have been pre-recorded. The advantage of font cartridges is that none of the printer's memory is used.
2. All laser printers come with a certain amount of RAM, which can be expanded by adding memory boards in the printer's expansion slots. Fonts can then be copied from a disk to the printer's RAM. This is called downloading fonts, and such fonts are often referred to as soft fonts, to distinguish them from the hard fonts available on font cartridges. The more RAM a printer has, the more fonts that can be downloaded at one time.

Laser printers can print text, graphics, and video images. Significant amounts of printer memory are required to print high-resolution graphics and images: printing a full-page graphic or image at 300 dpi requires at least 1 MByte of printer RAM, and a 600 dpi image at least 4 MBytes. Laser and LED printers are non-impact printers and are therefore very quiet. The speed of laser printers ranges from about 4 to 20 text pages per minute (ppm); a typical rate of 6 ppm is equivalent to about 40 characters per second for text printing. Laser printers are controlled through page description languages (PDLs), with two de facto standards:

1. Printer Control Language (PCL), developed by HP
2. PostScript, developed by Adobe and adopted by Apple for the Macintosh computer.
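The RAM figures quoted above follow from simple arithmetic, assuming a US Letter page (8.5 x 11 inches) rasterized at 1 bit per dot for a monochrome page; the page size and bit depth are assumptions, since the text does not state them.

```python
# Printer RAM needed to hold one full-page raster.
def page_ram_bytes(dpi, width_in=8.5, height_in=11.0, bits_per_dot=1):
    dots = (width_in * dpi) * (height_in * dpi)
    return dots * bits_per_dot / 8  # bytes

print(page_ram_bytes(300) / 2**20)  # ~1.0 MByte at 300 dpi
print(page_ram_bytes(600) / 2**20)  # ~4.0 MBytes at 600 dpi
```

Doubling the linear resolution quadruples the dot count, which is why the 600 dpi figure is four times the 300 dpi figure.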
PostScript has become the de facto standard for Apple Macintosh printers and for most desktop publishing systems. Most software can print using either of these PDLs; PostScript has some features that PCL lacks, and some printers support both. For video applications in particular there is an increased demand for print quality (image resolution, sharpness, and color rendition), and printer manufacturers have devoted considerable time and money to technology advancements, focusing especially on those that eliminate smear, steps, or other jagged edges on straight lines in the video image or graphics.

FIGURE 10-10 LED printer schematic diagram with fixed LED page illumination (gallium arsenide, GaAs, LED array light source)

FIGURE 10-11 Light emitting diode (LED) printer block diagram

FIGURE 10-12 Resolution vs. printing speed for the LED and laser printers. LED resolution remains constant over the range of print speed; laser resolution decreases as print speed increases

The laser and ink-jet technologies both place dots of ink on the paper. To smooth out these dots along the edges of text, graphics, and images, manufacturers have implemented technologies that change the size and placement of the dots to fill in and smooth out the boundaries of letters, straight lines, and curves in the images (Figure 10-13).
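The idea of varying dot size along a boundary can be sketched as follows. The four available dot sizes and the coverage values are hypothetical; actual resolution-enhancement implementations are proprietary and vary by manufacturer.

```python
# Edge smoothing by variable dot size (illustrative sketch).
# A straight edge crosses a row of dot cells. A binary engine prints
# a full dot wherever the edge covers >= 50% of the cell; a smoothing
# engine instead picks the nearest of four dot sizes, so the printed
# edge ramps gradually rather than stepping.

DOT_SIZES = [0.0, 1 / 3, 2 / 3, 1.0]  # four dot sizes (assumed)

def pick_dot(coverage):
    """Return the available dot size closest to the cell coverage."""
    return min(DOT_SIZES, key=lambda s: abs(s - coverage))

# Coverage of successive cells as a shallow edge passes through them:
coverages = [0.05, 0.20, 0.40, 0.60, 0.80, 0.95]

binary = [1.0 if c >= 0.5 else 0.0 for c in coverages]   # hard step
smoothed = [pick_dot(c) for c in coverages]              # graded ramp

print(binary)    # [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]
print(smoothed)
```

The binary row jumps abruptly from no ink to full ink, producing the visible stair-step; the graded row fills in intermediate values, which is the fill-in-and-smooth effect described above.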
In one technology, as many as four different-sized dots are produced and grouped in various combinations along the edges of boundaries to smooth out the image. The result is a crisper, better-looking image with sharper edges, smoother curves, and none of the jagged edges. Both laser and LED printers offer an excellent solution for video image printing, producing high-quality images at high speeds. Table 10-1 compares the laser printer and LED printer specifications.

10.3.4 Dye-Diffusion, Wax-Transfer

The high-resolution thermal laser printer uses an entirely different and more complex principle to produce extremely high resolution, continuous-tone laser-printed images. Typical systems have a resolution of 500-600 pixels, have 64 levels of gray scale, and require 60-80 seconds to print out. These printers carry a very high price tag and are normally used for printing still images, and therefore have not found their way into the surveillance field. Thermal wax-transfer monochrome and color printers function by adhering a wax-based ink onto the paper. As the paper and ribbons travel in unison beneath the thermal print head, the wax-based ink from the transfer ribbon melts onto the paper; when cool, the wax is permanent. This type of thermal printer uses a full-size panel of ribbon for each page to be printed, regardless of the contents of the page. Monochrome printers have a black panel for each page, while color printers have three (CMY) or four (CMYK) colored panels per page. Unlike dye-sublimation printers, these printers cannot vary the dot intensity, which means the image must be dithered. These printers are not in widespread use in video security applications.

10.3.5 Film

Hard-copy video images can be printed on black-and-white or color photographic film such as the instant prints developed by Polaroid Corp.
The image is first captured in a freeze-frame image storage device; then the film is exposed and developed with the Polaroid film back. While the resolution and rendition of the image is quite good, Polaroid film is more expensive and more difficult to work with than thermal paper.

FIGURE 10-13 LED and laser printer smoothing. The smaller dots and dithering allow the dots to fill in and smooth out the boundaries of letters, graphics, and photos

Table 10-1 Comparison of Laser Printer and LED Printer

                     LASER PRINTER                           LED PRINTER
TECHNOLOGY           (a) electrophotography                  (a) electrophotography
                     (b) scanning laser beam                 (b) linear LED array
                     (c) ink toner in cartridge              (c) ink toner in cartridge
PRINT SPEED (ppm)    standard: 4-50; industrial: up to 1000  10-26
RESOLUTION (dpi)     300-2400                                300-1200, maintained at high speeds
MOVING PARTS         paper handling, drum, rotating          paper handling, drum
                     mirror scan wheel                       (stationary LED array)
NOISE LEVEL          low                                     very quiet
PRINTER COST         $200-8000                               $250-8000
PRINT COST/PAGE      $0.03-0.09 color;                       $0.03-0.09 color;
                     $0.01-0.03 black/white                  $0.01-0.03 black/white

10.4 PRINTER COMPARISON

Several criteria should be considered when choosing a video printer: resolution, speed, initial cost of equipment, cost of paper, toner, or cartridge, and of course the quality of the final printed hard copy.

10.4.1 Resolution and Speed

The thermal printer is in widespread use and can print with a resolution of 250-500 TV lines. This printer is probably the best choice for reproducing monochrome images with reasonable continuous-tone printing. Monochrome thermal printers provide a fast means (8 seconds per print) for obtaining a hard-copy printout from any video signal, and they are relatively inexpensive to operate.
Ink-jet printers are capable of producing high-quality print approaching that of laser printers. Typical models provide a resolution of 300 dpi, but models offering higher resolutions are available. The laser printer can produce high-quality monochrome images with excellent resolution (300 dpi) and halftone (grayscale) rendition. The cost of operating a laser printer is a combination of costs: paper usage, toner replacement, drum replacement, and other consumables such as the fuser assembly and transfer assembly. Laser and LED printers can print from a low resolution of 300 dpi to a high resolution of 1200 dpi; by comparison, offset printing usually prints at 1200 or 2400 dpi. Some laser printers achieve higher resolutions using special techniques. Resolution for thermal dye-diffusion and wax-color video printers is typically 500 dots horizontal, with a printout time of approximately 80 seconds per print. Since each point (pixel) in the color video image is composed of three separate color dots, the actual detail resolution of the image is one-third the number of dots, or typically less than 200 TV lines for the printed color image. While this is significantly less than the 500 or 600 TV-line resolution of the monochrome image, the addition of color to the print adds useful information. The print paper roll produces 3- by 4-inch pictures.

10.4.2 Hardware, Ink Cartridge Cost Factors

The thermal printer enjoys popular demand for printing monochrome and color video images because of its ruggedness, convenience, and reasonable price. Monochrome thermal printers cost from $1100 to $1600. The typical video thermal printer (Figure 10-1) holds a roll of plastic wax-coated thermal paper sufficient to produce one hundred and twenty 3 x 5-inch video pictures. There are two main design philosophies in ink-jet head design, each with strengths and weaknesses.
The fixed-head philosophy uses a built-in print head designed to last for the entire life of the printer. Consumable ink cartridge costs are typically lower in this design; if, however, the head is damaged, it is usually necessary to replace the entire printer. Epson has traditionally used fixed print heads, though disposable heads have proven to be equally good and are used in machines from HP and other popular manufacturers. The disposable-head philosophy makes the print head part of the replaceable ink cartridge: every time the printer runs out of ink, the entire cartridge is replaced with a new one. This adds substantially to the cost of consumables, but it also means that a damaged or empty print head is only a minor problem; the user simply buys a new cartridge. HP has traditionally favored the disposable print head, as did Canon in its early models. Canon now uses replaceable print heads in most models that are designed to last the life of the printer but can be replaced at any time by the user if they become clogged or inoperative. The ink tanks are separate fo