Dissertation
submitted to the
Combined Faculties for the Natural Sciences and for Mathematics
of the Ruperto-Carola University of Heidelberg, Germany
for the degree of
Doctor of Natural Sciences

Put forward by
M.Sc. Muhammad Atif
Born in: Sialkot, Pakistan
Oral examination: 14-10-2013

Optimal Depth Estimation and Extended Depth of Field from Single Images by Computational Imaging using Chromatic Aberrations

Advisor: Prof. Dr. Bernd Jähne
Prof. Dr. Karl-Heinz Brenner

Abstract

This thesis presents a thorough analysis of a computational imaging approach that uses axial chromatic aberrations for optimal depth estimation and extended depth of field from a single image. To assist the camera design process, a digital camera simulator is developed that can efficiently simulate different kinds of lenses for a 3D scene. The main contributions of the simulator are a fast implementation of space-variant filtering and an accurate simulation of optical blur at occlusion boundaries. The simulator also includes sensor modeling and digital post-processing to facilitate a co-design of optics and digital processing algorithms.

A low-cost algorithm is developed to estimate depth from color images whose channels are defocused by different amounts due to axial chromatic aberrations. Because contrast varies across colors, a blur measure that is independent of local contrast is proposed. The normalized ratios between the blur measures of all three colors (red, green and blue) are used to estimate depth over a larger distance range. An analysis of the depth errors shows the limitations of depth from chromatic aberrations, especially for narrowband object spectra. Since the blur, and hence the estimated depth, changes over the field, a simple calibration procedure is developed to correct the field-varying behavior of the estimated depth. A prototype lens is designed with an optimal amount of axial chromatic aberration for a focal length of 4 mm and an F-number of 2.4.
Real captured and synthetic images show depth measurements with a root mean square error of 10% in the distance range of 30 cm to 2 m.

Taking advantage of the chromatic aberrations and the estimated depth, a method is proposed to extend the depth of field of the captured image. An imaging sensor with white (W) pixels in addition to red, green and blue (RGB) pixels is used together with a lens exhibiting axial chromatic aberrations to overcome the limitations of previous methods. The proposed method first restores the white image with a depth-invariant point spread function, and then transfers the sharpness information of the sharpest color or white image to the blurred colors. Because the color filter responses are broadband, the blur of each RGB color at its focus position is larger for a lens with chromatic aberrations than for a lens corrected for chromatic aberrations. The restored white image therefore helps to obtain a sharper image at these positions, and also for objects in which the sharpest color information is missing. An efficient implementation of the proposed algorithm achieves better image quality with low computational complexity.

Finally, the performance of depth estimation and extended depth of field is studied for different camera parameters. Criteria are defined for selecting the optimal lens and sensor parameters to achieve the desired results with the proposed digital post-processing algorithms.

Keywords: Digital Camera Simulator, Depth Estimation, Extended Depth of Field, Computational Photography, Chromatic Aberrations

Zusammenfassung

The dissertation describes, in a thorough analysis, a computational image processing approach for estimating the optimal depth and the extended depth of field from a single image using longitudinal chromatic aberration. A digital camera simulator was developed to support the design process of the camera system through efficient optical simulation of a 3D scene for different lens systems.
The most important contributions to the simulator are the fast implementation of spatially varying filters and the accurate simulation of optical blur at occlusion boundaries. The simulator also includes sensor modeling and digital post-processing to enable a co-design of the optics with the digital processing algorithms.

A low-cost (inexpensive, lean) algorithm is developed to estimate depth from color images that are defocused to different degrees by axial chromatic aberrations. Since contrast varies across the colors, a determination of blur that is independent of local contrast is proposed. A normalized measure of blur between all three colors (red, green and blue) is used to estimate depth over a larger range. An analysis of erroneous depth values is carried out, which reveals the limitations of depth estimation via longitudinal chromatic aberration, particularly for narrowband object spectra. Since the blur, and with it the depth, changes over the field, a simple calibration is developed to correct the field-dependent behavior of the depth estimate. A prototype lens with a focal length of 4 mm and an F-number of 2.4 is constructed, which has an optimal amount of longitudinal chromatic aberration. Captured and computer-generated images show a deviation in depth determination of 10% with respect to the mean values for a distance range of 30 cm to 2 m.

A method for extending the depth of field of an image using longitudinal chromatic aberration and the estimated depth is proposed. To overcome the limitations of existing methods, an image sensor with white pixels in addition to the red, green and blue (RGB) pixels and a lens with corresponding longitudinal chromatic aberration are used.
The proposed method first restores the white image with a depth-invariant point spread function and then transfers the sharpness information of the sharpest color or of the white image to the blurred colors. Owing to the broadband properties of the color filters, the blur at the focus position of each RGB color is larger for optics with longitudinal chromatic aberration than for achromatic optics. Here the restored white image helps to obtain a sharper color image at these positions. This also holds for objects in which the sharpest color information is missing. An efficient implementation of the proposed algorithm achieves better image quality at low computational cost.

Finally, the performance of depth estimation and extended depth of field is examined for different camera parameters. The criteria are defined so that optimal lens and sensor parameters are selected to obtain the desired results with the proposed digital algorithms.

Keywords: Digital camera simulator (opto-digital simulator), depth estimation, extended depth of field, computational photography, chromatic aberrations

Contents

1 Introduction
  1.1 Motivation
  1.2 Overview
    1.2.1 Simulation of Physical Image Formation
    1.2.2 Depth From Chromatic Aberrations
    1.2.3 Extended Depth of Field Using Chromatic Aberrations
    1.2.4 Optimal Camera Parameters Selection
  1.3 Outline
  1.4 Contribution
2 Photo-realistic Simulation of Physical Image Formation Process
  2.1 Introduction
  2.2 Physical Image Formation
  2.3 Motivation
  2.4 Related Work
  2.5 Simulating Spatially Varying Lens Blur
    2.5.1 Simulating Optics for a 2D Object plane
      2.5.1.1 Fast Approximation of Space Variant Convolution
      2.5.1.2 Computing Basis Point Spread Functions
    2.5.2 Simulating Optics for a 3D Object Space
      2.5.2.1 Simulating Partial Occlusions
  2.6 Optical Simulation
    2.6.1 Accuracy and Computational Complexity
      2.6.1.1 Rotational Symmetric Blur
      2.6.1.2 Real Lens Simulations
  2.7 Sensor Simulation
    2.7.1 Noise Sources in Image Sensors
      2.7.1.1 Temporal Noise
      2.7.1.2 Fixed Pattern Noise
    2.7.2 Noise Model
    2.7.3 Sensor MTF and Sampling
    2.7.4 Color Filter Array
  2.8 Digital Simulation
  2.9 Conclusion and Outlook
3 Depth From Chromatic Aberrations
  3.1 Overview of Depth Estimation Methods
    3.1.1 Passive Depth Estimation
      3.1.1.1 Stereoscopy
      3.1.1.2 Depth from Focus/Defocus
    3.1.2 Active Depth Estimation
      3.1.2.1 Depth from Time of Flight
      3.1.2.2 Depth from Active Stereoscopy
    3.1.3 Depth Estimation by Computational Imaging
  3.2 Depth From Defocus
  3.3 Depth From Axial Chromatic Aberration
    3.3.1 Axial Chromatic Aberrations
    3.3.2 Depth Estimation
      3.3.2.1 Blur Measures Methods
      3.3.2.2 Comparison of Blur Measures
      3.3.2.3 Contrast Independent Blur Measure
      3.3.2.4 Depth Estimation from Blur Measures
    3.3.3 Analysis of Depth Errors
      3.3.3.1 Practical Issues in DFCA
      3.3.3.2 Theoretical Issues in DFCA
    3.3.4 Field Dependent Depth Correction
  3.4 Dense Depth Map
    3.4.1 Dense Depth Map by Segmentation
    3.4.2 Dense Depth Map by Optimization
  3.5 Results and Discussion
  3.6 Conclusion and Outlook
4 Extended Depth of Field from Chromatic Aberrations
  4.1 Introduction
  4.2 Related Work
  4.3 Extended DOF Using Axial Chromatic Aberration
    4.3.1 Depth Dependent Deconvolution
    4.3.2 Sharpness Transport Across Color Channels
    4.3.3 Spectral Focal Sweep
    4.3.4 Eliminating Color Difference
  4.4 RGBW Sensor with Chromatic Aberrations
  4.5 Low Cost and Efficient Implementation of Proposed Algorithm
    4.5.1 Relative Blur Estimate
    4.5.2 Adaptive High-Pass Filtering
    4.5.3 Deconvolution
    4.5.4 Contrast Dependent Sharpness Transport
  4.6 Results and Discussion
  4.7 Physical and Practical Limitations
    4.7.1 Narrowband Object Reflectance Spectra
    4.7.2 Loss of Contrast at Higher Frequencies
  4.8 Conclusion and Outlook
5 Optimal Lens and Sensor Characteristics for Depth and extended DOF using Chromatic Aberrations
  5.1 Axial Resolution of Depth from Defocus
    5.1.1 Depth Resolution of DFD using Two Images
  5.2 Optimal Parameters for Depth Estimation
    5.2.1 Focal Length and F-number
    5.2.2 Chromatic Focal Shift
    5.2.3 Sensor Resolution
    5.2.4 Spectral Response of Color Filter Arrays
  5.3 Optimal Parameters for Extended DOF
  5.4 Conclusion
6 Conclusion and Outlook
  6.1 Summary
  6.2 Conclusion
  6.3 Outlook
Bibliography
List of Publications

List of Figures

2.1 Processing flow of the digital camera simulation.
2.2 Physical image formation process.
2.3 A larger aperture gives a narrower depth of field than a small aperture.
2.4 A color filter array (Bayer pattern) filters the light before it hits the photo sensors.
2.5 The right image shows the depth-dependent blur applied to the input image according to the gathering method. Dark and bright shadows can be seen around the boundaries. The input image (left) and its corresponding depth map (middle) are also shown.
2.6 The process flow of the proposed scattering method, showing the basis functions (eigenPSFs) and their weights. The symbol '*' represents convolution.
2.7 The left and right images are blurred according to the gathering and scattering approaches, respectively. The scattering result is visually better than the gathering result, as the blur around object boundaries is smooth.
2.8 Foreground image along with its depth map.
2.9 Background image along with its depth map.
2.10 The plots show the results of the gathering and scattering methods for the one-dimensional case. A correct blur is applied at the boundary using the information of the background signal.
2.11 The image and its two-layer depth map.
2.12 The image simulated with the gathering method (left) and with the scattering method, with (middle) and without (right) background blending.
2.13 The image is blurred with a Gaussian filter. The standard deviation of the filter changes with depth.
2.14 SSIM value computed between the reference image (exhaustive blurring) and PCA-based filtering with 5 (left) and 3 (right) basis PSFs.
2.15 Image (right) shows the result of distortions and relative illumination applied to an ideal image (left).
2.16 Simulated real lens image (right) along with the original image (left).
2.17 Cropped region shows the input image (left) and the result of optical simulations (right). The effect of optical blur and chromatic aberrations is visible with optics simulation.
2.18 The combined MTF of the sensor detector footprint and sampling is shown for different pixel sizes.
2.19 Spectral sensitivity of color filter arrays.
2.20 Digital image post-processing chain.
3.1 Image formation by a thin lens approximation.
3.2 Dispersion of white light into monochromatic light after passing through a prism.
3.3 Refractive index of glass (BK7) and plastic (polycarbonate) materials.
3.4 Blur diameter for RGB colors, which focus at different distances due to chromatic aberrations.
3.5 The image with and without chromatic aberrations.
3.6 A step edge blurred with Gaussian blur of varying standard deviation.
3.7 Normalized blur values of different types of blur measures.
3.8 Normalized blur values of different blur measure methods plotted with the RMSE for low CNR.
3.9 Flow diagram of the depth from chromatic aberrations algorithm.
3.10 Ratios of the blur measure for different combinations of colors.
3.11 Two images captured under strong red light (top left) and white light (bottom left) illumination. The images with white color correction are shown on the right.
3.12 An example of a color filter with broadband spectral responses and the reflectance spectrum of an object with strong red content.
3.13 Edge profile of color channels for red and white light illumination.
3.14 Estimated depth for smooth edges is removed by applying the condition given in equation 3.17.
3.15 Left: Sagittal and tangential orientations are shown as black and gray edges, respectively. Right: A PSF which produces different blur for different orientations of the edges.
3.16 Field-dependent depth correction of the estimated depth map from the DFCA algorithm.
3.17 Relative depth measured for different lenses at different image locations.
3.18 Field-dependent depth correction of the estimated depth map from the DFCA algorithm using only one image for the calibration.
3.19 Dense depth map generated by the optimization-based method followed by joint bilateral upsampling.
3.20 The dense depth map generated by the segmentation and optimization based methods.
3.21 (a) Simulated image with chromatic aberrations, (b) ground truth depth map used for testing the DFCA algorithm, (c) depth map generated with the algorithm described in section 3.3.2 (depth is estimated only at edges, and given in mm).
3.22 RMSE between true depth and estimated depth at different distances and differences of contrast between colors.
3.23 Depth estimation from axial chromatic aberrations for real captured scenes. First row: input images captured with a lens having large chromatic aberrations. Second row: raw depth estimate using the algorithm proposed in this work. Third row: dense depth after propagating the raw depth to the surroundings.
3.24 Depth estimated with the DFCA camera for differently colored objects. (a) Original image, (b) DFCA depth at edges only, and (c) DFCA depth after propagation to neighboring regions. For comparison, the depth from a ToF camera (d) is also shown. All depth maps are shown in cm.
4.1 Blur diameter versus distance for different focal lengths and lens focus positions.
4.2 Left: Blur diameter for RGB colors in the case of axial chromatic aberrations. Right: Minimum blur diameter among all colors at each distance.
4.3 MTF for a spatial frequency of 90 line pairs per millimeter [lp/mm] versus distance.
4.4 (a) Spectral response of the color filter array, (b) chromatic focal shift in the visible range of the spectrum and (c) the blur diameter of the red, green and blue colors, calculated by averaging the blur diameter of all wavelengths according to the spectral response of the color filter array.
4.5 Processing flow of the extended DOF algorithm using an RGBW sensor.
4.6 Weights versus local edge contrast to reduce the strength of sharpness transport at low contrast levels.
4.7 Simulation of optical and sensor properties.
4.8 Simulation results of a lens without (a,c,e) and with chromatic aberrations (b,d,f).
4.9 Simulation results of a lens without chromatic aberrations (a,c,e), representing a conventional lens. Images b, d and f show the extended DOF result generated with the proposed extended DOF algorithm.
4.10 Chromatic aberrations are corrected in the image that is captured with the lens exhibiting large chromatic aberrations.
5.1 Depth resolution versus object distance for different focal lengths of the lens.
5.2 Normalized depth parameter versus object distance for different sensor resolutions.
5.3 Blur diameter is less than the circle of confusion for the complete range of the depth of field.
List of Tables

2.1 Execution time for exhaustive filtering and PCA based filtering.
2.2 Execution time for PCA based scattering filtering.
3.1 Absolute depth estimates from the DFCA camera for the images shown in figure 3.24. The error relative to the depth from ToF is also given in percent.

Chapter 1
Introduction

1.1 Motivation

Traditional cameras capture the visible light spectrum on photographic film. Usually, a lens is used to focus some portion of the incoming light from a scene onto the film. Digital cameras use an electronic sensor to capture and store the light information in digital format. Although conventional digital cameras have been in use for a long time, some of their capabilities remain limited, for example their dynamic range, the loss of 3D information, fixed focus and the depth of field. To overcome these limitations, novel and unconventional imaging devices are designed to produce enhanced and meaningful images beyond what traditional cameras can deliver. This has led to a new emerging field called "computational photography".

Computational photography can produce features that are highly desirable and beyond conventional imaging. In most cases, however, the camera design is complex, which makes the camera expensive. Additionally, there is always a tradeoff between conventional imaging quality and the new features added through low-cost computational photography. These challenges directly motivate the investigation of low-cost computational imaging methods that provide acceptable image quality along with the additional features.

The field of computational photography requires a modification of the conventional camera design; it is therefore desirable to evaluate the camera system performance during the design phase.
This helps in a co-design of optics, sensor and digital processing. Hence, a digital camera simulator is required to design and evaluate the computational camera.

In this work, a low-cost computational photography method is investigated for two applications: depth imaging and extended depth of field (DOF) imaging. Specifically, enhanced axial chromatic aberrations are introduced in the lens to estimate depth from color defocus and to produce an extended DOF image by correcting the chromatic aberrations (CA) according to the focused color. Another aim of this work is to develop a digital camera simulator that can efficiently simulate optics for a 3D scene.

1.2 Overview

Computational photography may be divided into two groups: one in which multiple conventionally photographed images, taken with different camera settings, are fused to extract the desired features, and another in which the camera optics or sensor is no longer conventional but is modified to achieve the desired functionality in combination with digital processing. Examples of the first group are extended DOF through focus stacking and high dynamic range imaging through exposure bracketing. Examples of the second group are wavefront coding, coded aperture photography and light field cameras. The work in this thesis belongs to the second group, as a conventional lens is designed to deliberately contain enhanced axial chromatic aberrations, followed by digital processing to retrieve depth and an extended DOF image. A detailed overview of the thesis and its related work is given in the following sections.

1.2.1 Simulation of Physical Image Formation

A few image simulation tools have already been developed to simulate different modules of a camera system. The most comprehensive simulator is the Image Systems Evaluation Toolkit (ISET) [11]. ISET is a software package that simulates the capture and processing of a scene.
It allows users to control the physical characteristics of a scene and to simulate the optics, sensor and image processing pipeline. One limitation of ISET is its optics simulation, in which only a single image plane is simulated, so occlusions cannot be simulated. With the growth of computational photography, more complex lenses are being designed, such as wavefront coded lenses [9]. To simulate such complex lenses along with traditional lenses, an efficient and flexible optical simulation module is required. In [26] and [6], the lens is integrated into the digital simulation with a spatially varying point spread function (PSF). In these methods, either each pixel of the image is convolved with its own PSF, or the image is divided into small regions within which the PSF is considered constant. These methods produce more realistic lens blur for a 2D image plane, but they are computationally expensive for larger images. Moreover, the current methods simulate only one plane of a scene, and none of them considers simulating the lens in 3D space.

In this work, a simulation of the physical image formation process is presented that covers the optics, sensor and digital processing modules. The main focus of the simulator is the optics, owing to the limitations of the state-of-the-art methods. Generally, the optical transfer function of a lens varies spatially across the field of view and also longitudinally for objects at varying distances, which makes the simulation computationally complex. An algorithm is presented in this work to simulate the lens blur with a substantial reduction in computational complexity and without a significant loss of accuracy. Moreover, occlusions are modeled quite accurately for synthetic scenes.

1.2.2 Depth From Chromatic Aberrations

Depth estimation refers to algorithms that aim to estimate the distance of objects in a scene from the camera.
The distance and geometry information is lost in conventional imaging because the rays from a 3D scene are projected onto a 2D plane. There are many different approaches to recovering the lost depth information. They can be divided mainly into two categories, active and passive. Examples are depth from defocus, stereoscopy and depth from time of flight.

Recently, many computational imaging approaches to depth estimation have been presented. Ng et al. [30] presented light field capture with a plenoptic camera, which also yields depth information. Another method for depth estimation was proposed by Zhou et al. [45], in which a diffuser is inserted between the scene and the camera to encode the depth information in the defocus blur. The method is similar to depth from defocus, but higher depth accuracy can be achieved with a smaller aperture. Several other methods based on aperture coding have also been proposed. Bando et al. [3] suggested using a color-filtered aperture to shift the color images, and hence estimating depth from the disparity between the color images. One limitation of these methods is the loss of light. Chakrabarti et al. [5] modified the aperture to generate a different depth of field for each color and inferred depth from the blur difference between the color images.

This thesis also investigates a computational imaging method that uses axial chromatic aberrations to estimate depth. The idea of estimating depth information using chromatic aberrations was first proposed by Molesini [27] in 1984. In 1994, Tiziani and Uhde [39] proposed a chromatic confocal microscope for 3D image sensing. For conventional photography, Garcia et al. [12] proposed using chromatic aberrations for depth estimation and autofocusing.
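The principle behind depth from axial chromatic aberrations can be illustrated with a small thin-lens sketch: each color channel is in focus at a different object distance, so the relative blur of the channels changes with distance. The code below is not the algorithm of this thesis. The per-channel focus distances in FOCUS_M are hypothetical values chosen for illustration, the blur-circle formula is the standard geometric thin-lens approximation, the function names are introduced here, and only the focal length (4 mm) and F-number (2.4) are taken from the abstract.

```python
# Minimal sketch (assumptions: thin lens, geometric optics, no diffraction).
F_M = 0.004      # focal length in metres (4 mm, from the abstract)
F_NUMBER = 2.4   # F-number (from the abstract)

# Hypothetical in-focus object distances per channel, in metres.
# With axial CA, blue typically focuses nearer than red for a simple lens.
FOCUS_M = {"blue": 0.4, "green": 0.8, "red": 2.0}


def blur_diameter(s, s_focus, f=F_M, n=F_NUMBER):
    """Geometric blur-circle diameter (metres) for an object at distance s
    when the channel is focused at distance s_focus."""
    return (f * f / n) * abs(s - s_focus) / (s * (s_focus - f))


def channel_blurs(s):
    """Blur diameter of every color channel for an object at distance s."""
    return {c: blur_diameter(s, sf) for c, sf in FOCUS_M.items()}


def sharpest_channel(s):
    """Name of the channel with the smallest blur at distance s."""
    blurs = channel_blurs(s)
    return min(blurs, key=blurs.get)


def normalized_ratio(blur_a, blur_b):
    """Contrast-like ratio in [-1, 1] comparing the blur of two channels."""
    return (blur_a - blur_b) / (blur_a + blur_b)


if __name__ == "__main__":
    for distance in (0.35, 0.8, 3.0):
        print(distance, sharpest_channel(distance), channel_blurs(distance))
```

In this toy model the sharpest channel moves from blue to green to red as the object moves away, which is the cue that a blur-ratio-based depth estimate exploits.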
Although the method of extracting depth using CA is quite old, it has not yet been studied for depth estimation of natural scenes. Moreover, all previous work has shown the potential of the method only through limited experiments. In this thesis, depth from CA is investigated comprehensively. An algorithm is proposed to estimate the depth of natural scenes. A calibration procedure is developed to compensate for the field-varying depth estimate, which arises mainly from optical aberrations and manufacturing tolerances. Finally, the limitations of the approach are discussed.

1.2.3 Extended Depth of Field Using Chromatic Aberrations

Depth of field is the distance range in a scene over which the captured image is considered to be in focus, i.e. the amount of blur introduced by the optics is not perceivable under normal viewing conditions. In some cases, a larger depth of field is desirable. In other cases, a narrow depth of field helps to emphasize the desired object in a scene.

One of the earliest approaches to extending the depth of field, proposed by Häusler [15], is to make the blur of the object invariant to depth and to restore the sharp image through digital processing. He obtained the depth-invariant blur by moving the object along the optical axis during the exposure time of the camera. A very common approach to achieving extended DOF is focus stacking. In this method, multiple images are taken at different focus distances and combined digitally to extend the DOF: for each image position, the sharpest pixel among the multiple images is selected. A well-known method in computational photography, proposed by Dowski et al. [9], is wavefront coding. Here, the pupil function is modified through phase modulation by inserting a non-absorbing optical element such as a cubic phase or cosine-form phase mask. It is then possible to obtain a PSF that is insensitive to defocus.
In the second step, standard deconvolution is used to restore the image with only one digital filter. The advantage of this method is that no light is lost, but it suffers from a loss of SNR, as the PSF spreads over a larger area to make it insensitive to defocus. A depth-invariant PSF can also be achieved through polarization separation, by placing a birefringent plate between the lens and the sensor [44]. The plate is designed such that the two polarization states contain the in-focus far-field and in-focus near-field information of a scene and are superimposed to form the image. In this case, digital restoration is required. All these methods have a complex optical design or complex post-processing, and moreover they suffer from a loss of SNR.

Some alternative approaches are based on color separation and are more closely related to this thesis. Guichard et al. [14] proposed using chromatic aberrations to capture color images with different focus positions. Since different colors appear sharp at different distances, digital processing transfers the sharpness information of the sharpest color to the other colors to produce an extended DOF image. In other words, the sharpness transport technique takes advantage of the spectral redundancy inherent in images to recover information that has been lost to chromatic blurring. The advantage of this method is the use of conventional optics without any light loss or degradation of SNR. However, the digital processing required to remove all chromatic aberrations is quite challenging. The method trades off the extension of the DOF against the loss of high-frequency chrominance. An alternative solution was proposed by Kay et al. [17], who use a color aperture that restricts the blue light, giving the blue image a larger depth of field. The other colors are then sharpened using the sharpness information of the blue image.
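The sharpness transport idea described above can be sketched in one dimension: the high-frequency detail of the sharpest channel is added to a blurred channel. This is a minimal illustration only, not the method of [14] or of this thesis. The box filter, the function names and the assumption that the edge has the same contrast in both channels are simplifications introduced here.

```python
def box_blur(signal, radius):
    """Moving-average low-pass filter with edge replication."""
    n = len(signal)
    out = []
    for i in range(n):
        # Clamp indices so the window replicates the border samples.
        window = [signal[min(max(j, 0), n - 1)]
                  for j in range(i - radius, i + radius + 1)]
        out.append(sum(window) / len(window))
    return out


def transport_sharpness(blurred, sharp, radius):
    """Add the high-pass detail of `sharp` (sharp minus its low-pass
    version) to the `blurred` channel."""
    low = box_blur(sharp, radius)
    return [b + (s - l) for b, s, l in zip(blurred, sharp, low)]


def max_step(signal):
    """Largest difference between neighbouring samples (edge steepness)."""
    return max(abs(b - a) for a, b in zip(signal, signal[1:]))


if __name__ == "__main__":
    # A gray step edge: the green channel is in focus, the red one is blurred.
    green_sharp = [0.0] * 8 + [1.0] * 8
    red_blurred = box_blur(green_sharp, 2)
    red_restored = transport_sharpness(red_blurred, green_sharp, 2)
    print(max_step(red_blurred), max_step(red_restored))
```

For a gray edge and a matching filter radius the blurred channel is restored exactly. When the edge contrast differs between channels, the transported detail no longer matches the blurred channel, which hints at why removing all color artifacts is challenging.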
The methods proposed by Bando [3] and Kim [18] code the disparity information in color images through a color filter aperture. The extended DOF image is then produced through deconvolution based on the estimated depth. If a lens that exhibits axial chromatic aberrations is used with a black and white (B&W) sensor, it produces a depth-invariant PSF, as shown by Cossairt et al. [8]. For color images, it is shown that the luminance of the image has a depth-invariant PSF. In this work, the methods of [14] and [8] are combined using a sensor that captures white (W) light along with the RGB colors. Moreover, on the digital side, robust algorithms are proposed to produce high-quality color images.

1.2.4 Optimal Camera Parameters Selection

A depth from chromatic aberrations system estimates the depth from the relative defocus blur between two defocused color images. Since the amount of defocus blur depends on the lens parameters, selecting an optimum lens for the required depth performance is important. Only a few papers have addressed this issue. Blayvas et al. [4] derived a formula relating the depth resolution to the camera parameters. Schechner et al. [33] computed the optimal axial interval between two defocused images to estimate the depth reliably. Subbarao [38] analyzed the noise sensitivity of the depth from defocus method for a specific spatial domain approach. In this work, the relationship between the camera parameters and the performance of depth estimation and extended DOF imaging is analyzed. An equation is derived to relate the lens parameters to the depth resolution. The effect of sensor color filter arrays is studied specifically for the depth from CA method. Criteria are given for the selection of optimal camera parameters for desired depth requirements. Moreover, the optimal amount of chromatic aberrations required for a certain focal length to obtain an extended DOF image is also formulated.
1.3 Outline

The work presented in this thesis has four main goals:
• Develop a digital camera simulator that can accurately simulate the optics in a 3D space with low computational complexity.
• Develop a low cost algorithm to estimate the depth of natural scenes using axial CA. Study the limitations of the system in different imaging conditions.
• Establish a camera system for extended DOF imaging using axial CA. Develop an algorithm that can efficiently reduce the color artifacts which appear due to CA.
• Develop criteria for selecting the optimal camera parameters for depth and extended DOF imaging using CA.

The content of this work is organized as follows:

Chapter 2 A digital camera simulator is described in this chapter. The simulator models the optics, the sensor and the digital processing steps. An algorithm is presented which can simulate the optics in a 3D space. A simple and effective method is proposed which blends multiple layered images to simulate occlusions. The accuracy and computation time are compared with an exhaustive filtering approach. Sensor characteristics such as noise, color filter response and sensor MTF are also modeled in the sensor simulation to make a complete digital camera simulator.

Chapter 3 This chapter discusses the depth from CA method and starts with an overview of state-of-the-art depth imaging methods. The basics of the depth from defocus method are discussed in detail, as it is the basis of depth from CA. Different blur measures are compared, and a new blur measure is proposed for depth from CA which is independent of varying image contrast. Theoretical and practical limitations of the depth from CA method are also discussed. For low cost imaging, where field dependent aberrations are relatively large, a calibration method is proposed to estimate the correct depth at any field position.
As the defocus blur, and hence the depth, can be measured only in textured areas, some state-of-the-art methods are used to generate dense depth maps. Some low cost solutions are described to reduce the computational complexity of these methods.

Chapter 4 In this chapter, a combination of two state-of-the-art methods for extended DOF is proposed, using an RGB sensor with an extra panchromatic pixel. Moreover, an efficient and low cost algorithm is implemented to correct color bleeding artifacts.

Chapter 5 The effect of optical and sensor parameters on the depth and extended DOF performance is studied here. Based on the derived relationships, criteria for selecting the optimal camera parameters are discussed.

Chapter 6 A detailed summary of the thesis work is given in this chapter. Moreover, some tasks are described as future work to make the outcome of this thesis more robust.

1.4 Contribution

The novel contributions of this thesis are as follows:
• A low cost, spatially varying simulation of lens defocus blur.
• A simple and efficient solution for simulating the blur at occlusion boundaries.
• A local contrast independent blur measure, and normalized ratios between the blur measures, to estimate the depth from defocused color images. Analysis of the limitations of depth from CA in different imaging conditions.
• A simple calibration procedure to correct the field varying behavior of the estimated depth.
• A combination of two state-of-the-art methods, using an RGBW sensor, to produce an extended DOF image. The proposed method reduces the shortcomings of the original methods.
• Analysis of the effect of lens and sensor parameters on depth performance, based on which criteria are defined to select optimum camera parameters.
Chapter 2 Photo-realistic Simulation of Physical Image Formation Process

2.1 Introduction

To overcome the limitations of traditional cameras, novel and unconventional imaging devices are designed to produce enhanced and more meaningful images. This has led to a new emerging field called "computational photography". However, with the progress in the field of computational photography, it is becoming difficult to evaluate the performance of a camera system during the design phase. In this chapter, an image simulator is presented that can be used to evaluate the performance of computational cameras.

Figure 2.1: Processing flow of the digital camera simulation.

The simulator deals with spatially varying blur in 3D space to overcome the main challenge of accurate and fast simulation of complex optical designs, especially in the field of computational photography. The simulation tool also includes effects from the sensor, such as noise, sensor MTF and color filter array (CFA) sampling. Post processing algorithms relevant to specific cameras, e.g. deconvolution for a wavefront coding system or depth estimation from image defocus, are also incorporated in the post processing chain. The main blocks of the simulator are shown in figure 2.1. Lens data is retrieved from optical design software and fed into the optical processing block, where lens distortions, relative illumination and optical blur are simulated for a given all-in-focus input image. Sensor noise, sensor MTF and image sampling to the sensor resolution, according to the color filter array, are performed in the sensor simulation block. Finally, the desired functionalities of the camera are tested through specific post processing algorithms.

Figure 2.2: Physical image formation process.

2.2 Physical Image Formation

The physical image formation process consists of two parts, the geometry of image formation and the physics of light.
The former determines the position and the latter determines the brightness of the projection of light in the image plane. A simple model is shown in figure 2.2, where a scene is illuminated by a light source, objects in the scene reflect the light towards the camera, and the lens in the camera focuses the light onto the image sensor, which captures the light information and converts it into a digital image.

Figure 2.3: A larger aperture shows narrow depth of field as compared to small aperture.

In cameras, an aperture is used to control the amount of light acquired. If the aperture opening is as small as a pinhole, a completely sharp image is formed. In practice, however, a wide aperture is used to capture more light. In this case, not all rays of light focus on the image plane, because conventional camera lenses are designed to focus accurately at only one distance. Therefore, objects closer to or farther from the focused distance appear blurred, and a point in the scene spreads in the image plane, forming a circle of confusion (CoC). The CoC defines the depth of field (DOF), the range of object distances that appear acceptably sharp in the image. For a smaller aperture, the DOF is larger than for a wider aperture, because a wider aperture admits rays at larger angles, so that the CoC grows faster away from the focused distance. The DOF effect for two different aperture sizes is shown in figure 2.3.

A digital image sensor consisting of an array of pixels captures photons (light) and converts them into an electrical signal. There are two kinds of image sensors, CMOS and CCD, and the difference between the two lies only in the readout process. Figure 2.4 shows a CMOS sensor consisting of an array of pixels. In addition, a color filter array (CFA) is needed in front of the photo sensors to differentiate between colors. The most popular CFA is the Bayer pattern (shown in figure 2.4), which captures the three primary colors red, green and blue (RGB).

Figure 2.4: A color filter array (Bayer pattern) filters the light before it hits the photo sensors.
During image capture, noise is also added, mainly due to the random arrival of photons and fluctuations in the readout of the electrical signal.

2.3 Motivation

Traditionally, each element of a camera system is designed and tested individually. Therefore, before manufacturing, it is quite difficult to evaluate the performance of each module after it has been integrated into a complete imaging system. If all modules of a system behave linearly, the performance is predictable. However, for non-linear and highly adaptive algorithms, the final image quality can hardly be predicted. Hence, evaluating the performance of an individual module separately does not guarantee optimal system performance. In most computational photography applications, the image is optically coded and then computationally decoded. This results in unconventional imaging properties which are very difficult to predict before manufacturing the lens. Lens designers widely use optical design programs (e.g. Zemax and CODE V) to design and analyze the optical system. The software allows the user to analyze the performance of a lens by different means, e.g. modulation transfer function, distortion and chromatic focal shift. However, the final image cannot be visualized and, in particular, the behavior of the optical blur at occlusion boundaries cannot be predicted before manufacturing the lens.

2.4 Related Work

Some work has been done in the past on image simulation tools that simulate the different modules of a camera system. One of these simulation tools is the Image Systems Evaluation Toolkit (ISET) [11]. ISET is a software package that simulates the capture and the processing of a scene. It allows users to control the physical characteristics of a scene and to simulate the optics, the sensor and the image processing pipeline.
One of the limitations of ISET is its simulation of the optics, where only a single image plane is simulated, which means that occlusions are not simulated. With the growth of the field of computational photography, more complex lenses are being designed, such as wavefront coded (WFC) lenses [9]. To simulate these kinds of complex lenses along with traditional lenses, an efficient and flexible optical simulation module is required. In [26] and [6], the lens is integrated into the digital simulation with a spatially varying PSF. In these methods, each pixel of the image is convolved with its own PSF, or the image is divided into smaller regions within which the PSF is considered to be constant. These methods produce more realistic lens blur for a 2D image plane, but they are computationally expensive for larger images. Moreover, the current methods simulate only one plane of a scene, and none of them considers simulating the lens in 3D space.

In this chapter, the simulation of the physical image formation process is presented. Generally, the optical transfer function of a lens varies spatially across the field of view and also longitudinally for objects at varying distances, which makes the simulation computationally complex. An algorithm is presented in this chapter which substantially reduces the computational complexity of simulating the lens blur without significantly sacrificing accuracy. Moreover, occlusions are also modeled quite accurately. The chapter starts with the algorithm for spatially varying lens blur in section 2.5. The work flow of the complete optical simulation, including distortion and relative illumination, is presented in section 2.6. Finally, the sensor and the digital simulation are discussed in sections 2.7 and 2.8, respectively.

2.5 Simulating Spatially Varying Lens Blur

The response of an optical system to a point source is described by the point spread function (PSF). Optical design programs provide the PSF of a lens, which models different optical effects, e.g.
defocus blur, chromatic aberrations and vignetting. In general, the PSF varies for each point in space due to optical aberrations. Calculating the PSF for each location and using it for the simulation is therefore not practical for a large number of pixels. To reduce the computational complexity, the PSFs can be sampled at different points in space and modeled as a weighted sum of basis point spread functions. Missing PSFs can be approximated through interpolation of the weights. In the following sections, the algorithms for filtering that varies over a 2D plane and over a 3D space are discussed in detail.

2.5.1 Simulating Optics for a 2D Object Plane

The output of a lens can be represented by the convolution of a pinhole image $I_{ideal}(x, y)$ with a PSF $P(x, y)$,

$$I(x, y) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} I_{ideal}(u, v)\, P(x - u, y - v)\, du\, dv, \tag{2.1}$$

where $x, y$ are the output image coordinates and $u, v$ the source image coordinates. For a spatially varying PSF, the output is described by a space variant convolution integral,

$$I(x, y) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} I_{ideal}(u, v)\, P(x, y, u, v)\, du\, dv. \tag{2.2}$$

If all objects in a scene are considered to lie in a single plane, the PSF varies only in the $x$ and $y$ directions inside that plane. To make the simulation more realistic, the image has to be simulated for a 3D space. Before the details of the 3D simulation are discussed, the method used in this work for applying the space variant blur is described.

2.5.1.1 Fast Approximation of Space Variant Convolution

For a large number of pixels, the space variant convolution is computationally complex, mainly for two reasons: firstly, a large number of PSFs has to be generated and stored in memory and, secondly, each pixel has to be processed with its own PSF. The simplest method for reducing the complexity (in other words, the data dimension) is to divide the image into sections and to assume a constant PSF inside each section.
However, the minimum number of sections required for fast processing and acceptable blur accuracy depends strongly on how the PSF varies across the image. An attractive method for reducing the dimensionality of the data (in our case, the PSFs) is to project the data onto fewer dimensions while preserving as much information as possible. This is exactly what principal component analysis (PCA) does: it computes a set of basis functions that minimizes the squared error in reconstructing the original data. Hence, the spatially varying PSF can be represented as a weighted sum of basis PSFs,

$$P(x, y, u, v) = \sum_{i=1}^{N} w_i(x, y)\, P_i(u, v), \tag{2.3}$$

where $P_i$ are the basis PSFs, $w_i$ are the corresponding weights computed through PCA and $N$ is the number of basis PSFs that contribute most to the reconstruction of the actual PSF. Equation 2.2 can now be written as

$$I(x, y) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} I_{ideal}(u, v) \sum_{i=1}^{N} w_i(x, y)\, P_i(x - u, y - v)\, du\, dv, \tag{2.4}$$

$$I(x, y) = \sum_{i=1}^{N} w_i(x, y) \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} I_{ideal}(u, v)\, P_i(x - u, y - v)\, du\, dv. \tag{2.5}$$

Equation 2.5 expresses the space variant convolution as $N$ conventional convolutions whose results are summed according to the weights at each location in space.

If only a subset of the basis PSFs is used to reconstruct the original PSFs, the energy of the PSFs is not preserved. As a result, overshoots and undershoots may occur in the final image. These can be corrected with normalization values, which are generated by blurring an image whose pixel values are all equal to one with the same basis PSFs that are used to blur the image:

$$I_n(x, y) = \sum_{i=1}^{N} w_i(x, y) \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} P_i(x - u, y - v)\, du\, dv, \tag{2.6}$$

$$I(x, y) = \frac{I(x, y)}{I_n(x, y)}. \tag{2.7}$$

Although PCA significantly reduces the data dimensionality, and hence the computational complexity, a large number of PSFs still has to be generated and processed for the PCA computation.
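Equations 2.5 to 2.7 can be illustrated with a minimal one-dimensional sketch. It is not the simulator's implementation: the two basis PSFs and the linear weight ramp are hand-picked rather than computed by PCA, the borders are zero padded, and all function names are hypothetical. The sketch only shows that summing weighted conventional convolutions reproduces per-pixel filtering with the combined PSF, and that dividing by the blurred all-ones signal restores a flat input.

```python
def convolve_same(signal, kernel):
    """Conventional 1D convolution (zero padded); output has the length of signal."""
    half = len(kernel) // 2
    out = []
    for x in range(len(signal)):
        acc = 0.0
        for j, tap in enumerate(kernel):
            u = x - (j - half)          # source sample read by kernel tap j
            if 0 <= u < len(signal):
                acc += signal[u] * tap
        out.append(acc)
    return out


def gather_blur(signal, bases, weights):
    """Equation 2.5: one conventional convolution per basis PSF, summed with
    the weight w_i(x) of each output position; weights[i][x] belongs to basis i."""
    out = [0.0] * len(signal)
    for basis, w in zip(bases, weights):
        blurred = convolve_same(signal, basis)
        for x in range(len(signal)):
            out[x] += w[x] * blurred[x]
    return out


def normalized_gather_blur(signal, bases, weights):
    """Equations 2.6 and 2.7: divide by the blur of an all-ones signal."""
    blurred = gather_blur(signal, bases, weights)
    norm = gather_blur([1.0] * len(signal), bases, weights)
    return [b / n for b, n in zip(blurred, norm)]


def exhaustive_blur(signal, bases, weights):
    """Reference: build the full PSF of every output pixel (equation 2.3)
    and filter that pixel with it."""
    out = []
    for x in range(len(signal)):
        psf = [sum(w[x] * b[j] for b, w in zip(bases, weights))
               for j in range(len(bases[0]))]
        out.append(convolve_same(signal, psf)[x])
    return out


# Hypothetical example data (assumption, not taken from the thesis).
n = 16
bases = [[0.0, 0.0, 1.0, 0.0, 0.0],      # sharp basis
         [0.2, 0.2, 0.2, 0.2, 0.2]]      # box blur basis
ramp = [x / (n - 1) for x in range(n)]
weights = [[1.0 - r for r in ramp], ramp]
signal = [1.0] * 8 + [2.0] * 8

fast = gather_blur(signal, bases, weights)
reference = exhaustive_blur(signal, bases, weights)
flat = normalized_gather_blur([3.0] * n, bases, weights)
```

In this toy case the two results agree up to rounding because the filtering is linear in the PSF; with PCA bases truncated to the most significant components, the agreement would only be approximate.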
One way to reduce the number of PSFs that must be generated is to sample them at different points in space and then to approximate the missing PSFs through interpolation. Since PCA represents the PSF as a linear combination of basis PSFs, the missing PSFs can simply be approximated by interpolating the weights $w$ of the basis PSFs over the entire image plane. The accuracy of the PCA based reconstruction of the actual PSF with and without sampled data is given in section 2.6.1.

2.5.1.2 Computing Basis Point Spread Functions

Principal component analysis is a standard method for computing basis functions, which are called principal components. The eigenvectors of the covariance matrix of the source PSFs represent the basis PSFs (hereafter, the basis functions for PSFs are called basis PSFs). One has to consider how many basis PSFs are sufficient to reconstruct the source PSFs without significant error. This can easily be determined from the eigenvalues, which represent the distribution of the source PSF energy among the eigenvectors. More details about PCA may be found in [37].

2.5.2 Simulating Optics for a 3D Object Space

Simulating the optics for a single object plane is comparatively simple, but it does not allow the effects of occlusion or the perception of depth of field to be analyzed. In this section, the extension of the 2D simulator to a complete 3D simulation of the lens is presented.

Figure 2.5: The right image shows the depth dependent blur applied to the input image according to the gathering method. The dark and the bright shadows can be seen around the boundaries. The input image (left) and its corresponding depth map (middle) are also shown in the figure.

Equation 2.2 describes the space variant convolution, which can be regarded as gathering the light from the surrounding pixels according to the PSF.
For a 3D space, the space variant filtering is described as

$$I(x, y) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} I_{ideal}(u, v)\, P(x, y, z, u, v)\, du\, dv. \tag{2.8}$$

The PSF now also depends on a third variable $z$, which represents the depth value. The depth information can be obtained from the depth map associated with the image. Figure 2.5 shows the image after applying the depth dependent blur according to equation 2.8. The input image and its depth map are also shown in the figure. The result is not realistic, as dark and bright shadows are visible around the object boundaries. The artifacts are stronger where the amount of blur changes rapidly from one object to the other, e.g. around the boundaries of focused objects in front of defocused objects.

In real imaging, the light from a point is scattered to the surrounding pixels according to the PSF of that point. This phenomenon differs from conventional space variant filtering and can be described as

$$I(x, y) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} I_{ideal}(u, v)\, P(u, v, x - u, y - v, z)\, du\, dv. \tag{2.9}$$

In contrast to the gathering method (equation 2.8), in the scattering method (equation 2.9) the PSF of each neighboring pixel contributes to the output of a center pixel. For this reason, equation 2.9 is computationally more complex than equation 2.8. The scattering method was presented by Kosloff [21] to simulate the depth of field effect in images. However, only specific types of PSF are used for blurring, and special filtering algorithms are used to speed up the processing. A minor modification of the PCA based method for space variant filtering discussed in section 2.5.1.1 implements the scattering method. If the image is multiplied by the weights of each basis PSF before it is convolved with that basis PSF, the solution of the scattering equation 2.9 is obtained.
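The difference between gathering (equation 2.8) and scattering (equation 2.9) can be illustrated with a small one-dimensional sketch. It is an assumption-laden toy example rather than the simulator's implementation: the kernels are evaluated per pixel instead of through basis PSFs, the PSF is a truncated Gaussian, the borders are zero padded, and the function names and numbers are hypothetical.

```python
import math


def gaussian_kernel(sigma, radius):
    """Truncated Gaussian PSF, normalised to unit sum."""
    taps = [math.exp(-0.5 * (k / sigma) ** 2) for k in range(-radius, radius + 1)]
    total = sum(taps)
    return [t / total for t in taps]


def gather(signal, sigmas, radius):
    """Gathering: every OUTPUT pixel x collects light with its own PSF."""
    n = len(signal)
    out = [0.0] * n
    for x in range(n):
        psf = gaussian_kernel(sigmas[x], radius)
        for k in range(-radius, radius + 1):
            u = x - k
            if 0 <= u < n:
                out[x] += signal[u] * psf[k + radius]
    return out


def scatter(signal, sigmas, radius):
    """Scattering: every SOURCE pixel u spreads its light with its own PSF."""
    n = len(signal)
    out = [0.0] * n
    for u in range(n):
        psf = gaussian_kernel(sigmas[u], radius)
        for k in range(-radius, radius + 1):
            x = u + k
            if 0 <= x < n:
                out[x] += signal[u] * psf[k + radius]
    return out


# Hypothetical depth edge: strongly blurred pixels (sigma 3) on the left,
# nearly focused pixels (sigma 0.5) on the right, one bright focused point.
RADIUS = 5
sigmas = [3.0] * 10 + [0.5] * 11
impulse = [0.0] * 21
impulse[10] = 1.0

gathered = gather(impulse, sigmas, RADIUS)
scattered = scatter(impulse, sigmas, RADIUS)
```

With scattering, the focused point keeps its narrow PSF and its total energy. With gathering, the strongly blurred neighbours also pull light from the focused point, so the total energy changes, which is the kind of artifact seen at depth boundaries. For a spatially constant sigma both functions give the same result.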
The scattering method using PCA based filtering is described as

$$P(x, y, z, u, v) = \sum_{i=1}^{N} w_i(x - u, y - v, z)\, P_i(u, v),$$

$$I(x, y) = \sum_{i=1}^{N} \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} w_i(u, v, z)\, I_{ideal}(u, v)\, P_i(x - u, y - v)\, du\, dv. \tag{2.10}$$

Since there is only a single depth value $z$ for each location $u, v$, the weights $w$ are first multiplied with the image $I_{ideal}$, followed by convolution with the basis PSFs. This method also produces overshoots and undershoots in intensity at boundaries where the depth changes rapidly. This problem can be solved by the same normalization approach discussed in section 2.5.1.1: the image is divided by the normalization values, which are generated according to equation 2.10. Figure 2.6 shows the process flow diagram of the proposed scattering method. The symbol '*' represents the convolution operation. Figure 2.7 shows the image blurred with the scattering method. The final image looks much better than the result of the gathering method, but the problem is not solved completely and the boundaries are not blurred accurately. For an accurate simulation of the boundaries, the occlusions have to be handled, which is discussed in the next section.

Figure 2.6: The process flow of the proposed scattering method. $P_i$ and $W_i$ represent the basis functions (eigenPSF) and the weights, respectively. The symbol '*' represents the convolution.

2.5.2.1 Simulating Partial Occlusions

All light rays in the field of view of the lens contribute to the formation of the final image. For a wide aperture lens, the image also captures some light coming from the occluded regions. The occluded regions are the regions of background objects which are hidden by the foreground object.
However, the information about the occluded regions is missing in the input image, which is considered to be an all-in-focus image, because the image is assumed to be captured either with a narrow field of view or with a pinhole camera. Without any information about the occluded regions, it is almost impossible to blur the objects accurately. To acquire the information about the occluded regions, let us assume that it is possible to capture two images along with their depths, such that the first image is a normal pinhole image with occluded regions, and the second image is captured for the same scene but without any occlusions, by removing the foreground objects. Figures 2.8 and 2.9 show an example of two such images and their depth maps. There are no occluded regions in the background image.

Figure 2.7: The left and right images are blurred according to gathering and scattering approach, respectively. The scattering result is visually better than gathering as there is a smooth blur around object boundaries.

Figure 2.8: Foreground image along with its depth map.

The blur is first applied to both images according to their depth maps. The scattering method, as discussed in the previous section, is used to blur the images. After blurring, both images must be blended appropriately to obtain a final image with a correct simulation of occlusions (blur on the object boundaries). The foreground image needs the normalization step discussed before, but the background image does not, as its depth changes continuously and there are no overshoots and undershoots. The normalization values are very valuable here, as they provide the weighting factor for blending the foreground and background images.
Two images can be blended as

$$I(x, y) = I_{fg}(x, y) + I_{bg}(x, y)\,(1 - I_n(x, y)), \tag{2.11}$$

where $I_{fg}$ and $I_{bg}$ are the foreground and background images blurred according to the scattering method, and $I_n$ are the normalization weights generated by blurring the all-white image according to the depth map associated with $I_{fg}$. Blending the images in this way works accurately because the normalization weights represent the amount of light contributed by the occluded objects to the foreground objects.

Figure 2.9: Background image along with its depth map.

To better illustrate gathering, scattering and the accurate simulation of occlusions, a one-dimensional signal is blurred with these methods. Figure 2.10(a) shows the signal (black) blurred according to the gathering (blue) and scattering (red) methods. Pixels 41 to 50 and 51 to 60 are considered as the foreground and background signal, respectively. A Gaussian blur with $\sigma = 3$ and $\sigma = 0.5$ is used to blur the foreground and the background, respectively. Neither result shows the correct blurred signal, as the edge must be blurred entirely according to the foreground blur. Figure 2.10(b) shows the scattering method blended with the background signal. It is assumed that the background signal is available at all pixel locations from 41 to 60. The final blurred signal (blue) has a smooth blur according to the foreground blur.

The partial occlusions are simulated for a computer generated image and its depth map, as shown in figure 2.11. Gaussian PSFs with a standard deviation that changes with the depth are applied to blur the image. The focus position is set to the background. Figure 2.12 shows the comparison between the gathering and scattering methods with and without occlusion handling. As the results show, both the gathering and the scattering method produce unrealistic blurring of the pencil (the foreground object), and its boundaries appear too sharp.
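The blending of equation 2.11 can be tried on a one-dimensional signal in the spirit of figure 2.10. The sketch below rests on an interpretation that is an assumption, not a statement of the thesis implementation: the foreground layer is taken to be zero outside the foreground region, and the normalization weights are obtained by scattering the foreground mask (the all-white image restricted to the foreground). The intensities 1.7 and 1.0, the signal length and all function names are hypothetical.

```python
import math


def gaussian_kernel(sigma, radius):
    """Truncated Gaussian PSF, normalised to unit sum."""
    taps = [math.exp(-0.5 * (k / sigma) ** 2) for k in range(-radius, radius + 1)]
    total = sum(taps)
    return [t / total for t in taps]


def scatter(signal, sigmas, radius):
    """Scattering blur: every source pixel spreads its light with its own PSF."""
    n = len(signal)
    out = [0.0] * n
    for u in range(n):
        psf = gaussian_kernel(sigmas[u], radius)
        for k in range(-radius, radius + 1):
            x = u + k
            if 0 <= x < n:
                out[x] += signal[u] * psf[k + radius]
    return out


def blend_layers(fg_values, fg_mask, fg_sigmas, bg_values, bg_sigmas, radius):
    """Equation 2.11 under the stated assumptions: I = I_fg + I_bg * (1 - I_n)."""
    fg_layer = [v * m for v, m in zip(fg_values, fg_mask)]
    i_fg = scatter(fg_layer, fg_sigmas, radius)    # blurred foreground layer
    i_n = scatter(fg_mask, fg_sigmas, radius)      # blurred foreground coverage
    i_bg = scatter(bg_values, bg_sigmas, radius)   # blurred, occlusion-free background
    return [f + b * (1.0 - w) for f, b, w in zip(i_fg, i_bg, i_n)]


# Hypothetical scene: foreground on pixels 0..29 (sigma 3, defocused),
# background known at every pixel (sigma 0.5, close to focus).
N, EDGE, RADIUS = 60, 30, 9
fg_mask = [1.0 if x < EDGE else 0.0 for x in range(N)]
fg_values = [1.7] * N
bg_values = [1.0] * N
fg_sigmas = [3.0] * N
bg_sigmas = [0.5] * N

blended = blend_layers(fg_values, fg_mask, fg_sigmas, bg_values, bg_sigmas, RADIUS)
```

Away from the edge the blended signal equals the foreground or the background intensity, and across the edge it falls off smoothly with the width of the foreground blur. Values within `RADIUS` pixels of the signal ends are affected by the zero padding and should be ignored.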
In contrast, the scattering method with background blending accurately simulates the occlusion, and the text occluded by the pencil is visible through the boundaries.

Figure 2.10: The plots show the results of gathering and scattering method for one dimensional case (axes: pixel location, intensity). A correct blur is applied on the boundary with the information of background signal. (a) A signal blurred according to the gathering and scattering methods with a Gaussian blur of $\sigma = 3$ and $\sigma = 0.5$ from pixel location 41 to 50 and 51 to 60, respectively. (b) A foreground signal is blended with the background signal (pixels 41 to 50 and 51 to 60 are considered as foreground and background signal, respectively).

2.6 Optical Simulation

Optical system design tools can generate the PSF data for different types of lenses, at any field point across the field of view and at different distances from the camera. The PSF describes the defocus blur, chromatic aberrations, astigmatism and spherical aberration. However, it does not account for the effects of lens distortion and relative illumination; the amount of local image distortion and the relative illumination can be obtained separately from optical design tools. Distortion is a deviation from the rectilinear projection and is mainly of two kinds, barrel and pincushion. Relative illumination only decreases the intensity of the image. The data provided by the design tool is used to simulate the lens. Distortions are simulated through a mapping of the sampling grid, and the relative illumination is simulated by scaling the intensity according to the spatial position. In the next step, the image is blurred with the PSF.
The space variant blurring method discussed in the previous section is used to apply the blur. The following steps summarize the simulation process:
• PSFs are generated for a sampled space with the optical design tool. Distortion and relative illumination data are also calculated.
• Distortions are simulated through re-sampling of the image grid.
• Relative illumination is applied to the image by scaling the intensity at each image position.
• Basis PSFs and the corresponding weights are computed using PCA. Only the most significant basis PSFs are selected with the help of the eigenvalues.
• Weights are interpolated over the image domain to produce the effect of a spatially varying PSF at each location.
• The image is blurred according to the scattering method discussed in section 2.5.2.

Figure 2.11: The image and its two layers depth map.

Figure 2.12: The image simulated with the gathering (left) and scattering, with (middle) and without (right) background blending, methods.

2.6.1 Accuracy and Computational Complexity

In this section, the accuracy and the speed optimization of the algorithm for different applications are discussed. For a 3D scene, the blur is assumed to change across the field of view and also along the optical axis. In some applications, adding only the lens blur to the image is sufficient, e.g. simulating the depth of field in computer generated images, especially in games. Lens designers, in contrast, require a complete optical simulation of a real lens design. Most optical systems use lenses which exhibit radial symmetry, i.e. rotating the system about the optical axis does not alter its behavior. The radial symmetry of a lens can be utilized to reduce the computational complexity. Moreover, if only the depth of field effect is to be simulated, the blur can also be considered rotationally symmetric, which further reduces the computation time.
The implementation details of the algorithm for a radially symmetric system, with rotationally symmetric and non-rotationally symmetric blur, are discussed next.

2.6.1.1 Rotational Symmetric Blur

Most applications in the field of computer graphics only need to simulate the depth of field effect in the image. In that case, the system can be considered radially symmetric and the blur rotationally symmetric. However, the blur can be of any type, with different shapes of the aperture, and can vary across the field of view.

As an example, consider a Gaussian blur which varies across the field of view and along the optical axis according to the depth value of each pixel. The image shown in figure 2.8 is used to simulate the depth of field effect. The standard deviation $\sigma$ of the Gaussian filter is directly related to the geometrical blur diameter of the lens. Hence, it is computed directly from the depth map as

$$\sigma = k \left( \frac{1}{depth} - \frac{1}{d_f} \right), \tag{2.12}$$

where $d_f$ is the focus position and $k$ is a constant depending on the lens parameters.

Figure 2.13: The image is blurred with the Gaussian filter. The standard deviation of filter changes with the depth. (a) Image blurred according to the gathering method using the 5 most significant basis PSFs. (b) Standard deviation $\sigma$ of the Gaussian filter at each pixel location.

Figure 2.13a shows the simulated result of the gathering method using PCA based filtering (equation 2.8). The standard deviation of the Gaussian filter at each pixel location is shown in figure 2.13b. For the PCA based filtering, Gaussian filters of size 31x31 are created for 100 equally spaced sigma values, and the basis PSFs are computed from them using PCA. The weights of the basis PSFs, which are only available for the sampled sigma values, are interpolated over the entire image. In the current example, although the PSF varies with depth, it is a function of the sigma value only.
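The chain from depth to sigma (equation 2.12), to basis PSFs, to interpolated weights can be sketched in one dimension as follows. Several choices here are assumptions made for illustration only: an absolute value is added to equation 2.12 so that sigma is non-negative on both sides of the focus position, the PSFs are 1D Gaussians sampled at 19 sigma values instead of 31x31 filters at 100 values, the basis PSFs are the leading eigenvectors of the uncentred second-moment matrix found by power iteration, and the weights are interpolated linearly. The lens constant, the distances and all function names are hypothetical.

```python
import math


def blur_sigma(depth, focus_depth, k):
    """Equation 2.12 with an added absolute value (assumption)."""
    return k * abs(1.0 / depth - 1.0 / focus_depth)


def gaussian_psf(sigma, radius):
    taps = [math.exp(-0.5 * (j / sigma) ** 2) for j in range(-radius, radius + 1)]
    total = sum(taps)
    return [t / total for t in taps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def basis_psfs(psfs, count, iterations=200):
    """Leading eigenvectors of the second-moment matrix of the PSFs,
    found by power iteration with Gram-Schmidt deflation."""
    size = len(psfs[0])
    bases = []
    for b in range(count):
        v = [math.cos((b + 1) * (j + 1)) for j in range(size)]  # arbitrary fixed start
        for _ in range(iterations):
            proj = [dot(p, v) for p in psfs]                    # v <- (A^T A) v
            v = [sum(c * p[j] for c, p in zip(proj, psfs)) for j in range(size)]
            for e in bases:                                     # remove found bases
                c = dot(e, v)
                v = [x - c * y for x, y in zip(v, e)]
            norm = math.sqrt(dot(v, v))
            v = [x / norm for x in v]
        bases.append(v)
    return bases


def weights_for(psf, bases):
    return [dot(psf, e) for e in bases]


def reconstruct(weights, bases):
    return [sum(w * e[j] for w, e in zip(weights, bases)) for j in range(len(bases[0]))]


def interpolate_weights(sigma, sample_sigmas, sample_weights):
    """One dimensional (linear) interpolation of the weights over sigma."""
    pairs = zip(sample_sigmas, sample_sigmas[1:], sample_weights, sample_weights[1:])
    for s0, s1, w0, w1 in pairs:
        if s0 <= sigma <= s1:
            t = (sigma - s0) / (s1 - s0)
            return [(1.0 - t) * a + t * b for a, b in zip(w0, w1)]
    raise ValueError("sigma outside the sampled range")


RADIUS = 15
sample_sigmas = [0.5 + 0.25 * m for m in range(19)]      # 0.5 ... 5.0
psfs = [gaussian_psf(s, RADIUS) for s in sample_sigmas]
bases = basis_psfs(psfs, count=5)
sample_weights = [weights_for(p, bases) for p in psfs]

sigma = blur_sigma(depth=0.8, focus_depth=2.0, k=4.1)    # hypothetical values
approx_psf = reconstruct(interpolate_weights(sigma, sample_sigmas, sample_weights), bases)
exact_psf = gaussian_psf(sigma, RADIUS)
max_error = max(abs(a - b) for a, b in zip(approx_psf, exact_psf))
```

Here `max_error` measures how well five basis PSFs and interpolated weights reproduce a Gaussian at an unsampled sigma. How many basis PSFs are needed in practice would be judged from the eigenvalues and from an image quality measure such as the SSIM used in the text.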
For this reason, only one dimensional interpolation is used here. To estimate the accuracy of the PCA based filtering, a reference image is created by filtering each pixel with its own PSF. The execution times of the different algorithms are compared in table 2.1 for an image size of 1600x1200. All simulations are performed in Matlab on a 64 bit Windows system with an Intel Core i5-2400 CPU with a clock speed of 3.1 GHz and 4 GB RAM. The PCA based method is much faster than the exhaustive filtering, without any significant loss of accuracy.

Table 2.1: Execution time for exhaustive filtering and PCA based filtering.

Method                 Number of basis PSFs   Execution time, spatial domain (s)   Execution time, frequency domain (s)   SSIM
Exhaustive filtering   -                      825                                  -                                      -
PCA based filtering    5                      44                                   2.7                                    0.999
PCA based filtering    3                      28                                   3.2                                    0.994

Figure 2.14 shows the local SSIM value for PCA based filtering using 5 and 3 basis PSFs out of 100. The SSIM value is very high in both cases, so the human eye will not perceive a significant difference. However, if a PSF other than a Gaussian is used for blurring, the SSIM value will be different for the same number of basis PSFs.

Figure 2.14: SSIM value computed between the reference image (exhaustive blurring) and PCA based filtering with 5 (left) and 3 (right) basis PSFs.

The execution times for the PCA based scattering method are also computed for the same example. These are shown in table 2.2. The occlusion processing with the background image takes about twice as much time as the processing without occlusion handling, but it is still significantly faster than the exhaustive blurring algorithm.

Table 2.2: Execution time for PCA based scattering filtering.

Occlusion processing   Number of basis PSFs   Execution time, spatial domain (s)   Execution time, frequency domain (s)
Yes                    5                      85                                   7.5
No                     5                      42                                   3.7

2.6.1.2 Real Lens Simulations

Generally, conventional lenses exhibit radial symmetry, and the PSFs along any one radial line could be used to blur a complete image.
In this case, only one dimensional interpolation is needed to approximate the PCA weights for the complete image field. However, the PSF must be rotated to get the appropriate blur at each pixel, which is a very time consuming process. Since the depth changes for each pixel, PSFs are instead generated at sampled 3D Cartesian coordinate points through optical design software. The sampling points in the $x$ and $y$ plane are selected according to the field varying behavior of the modulation transfer function (MTF), whereas in the $z$ direction the sampling points are selected according to the change in the blur spot size. The information about the field varying MTF and the depth varying spot size can be obtained from the optical design software. After generating the PSFs at the sampled locations, the basis PSFs and the corresponding weights are computed through PCA. To approximate the missing weights, interpolation must be performed in three dimensions, which makes the algorithm slower than for the rotationally symmetric system (the example case discussed before).

Before applying the blur, distortions and relative illumination are applied to the image separately. Relative illumination decreases the intensity locally at each pixel location and can be simulated by scaling the data based on spatial position.

Figure 2.15: Image (right) shows the result of distortions and relative illumination applied to an ideal image (left).

Distortions are mainly of two types, barrel and pincushion distortion. In barrel distortion, the image magnification decreases with the distance from the optical axis. In pincushion distortion, on the other hand, the image magnification increases with the distance from the optical axis. Distortion can be summarized as a local non-uniform stretching or shrinking of the image. The amount of distortion induced in the image is not only a function of the radial image pixel coordinates, but also depends on the scene depth.
Therefore, distortion values are computed for different distances at sampled points on a radial line. Interpolation is used to obtain the distortion value at each pixel location according to its depth, which is taken from the depth map. Finally, the distortions are simulated by mapping the sampling grid to new coordinates, followed by resampling to a rectangular grid. Figure 2.15 shows an example of distortions and relative illumination applied to an image. It can be seen that the corners of the image are darker and that the barrel distortions are stronger than the pincushion distortions.

A lens is designed in Zemax software with a focal length of 4 mm, an f-number of 2.4 and a horizontal field of view of 60°. At the outermost field points, the relative illumination is 50%. Distortions are less than 2%. Lateral chromatic aberrations are well corrected, whereas longitudinal chromatic aberrations are deliberately enhanced for an application use case. PSFs are generated for the sampled grid points through an automated Matlab program. The numbers of sampling points selected in the $x$, $y$ and $z$ directions are 9, 7 and 10 respectively. In total, 630 PSFs are generated for the complete 3D space. Basis PSFs and corresponding weights are computed for these PSFs, and the 50 most significant basis PSFs are used for the simulations. Up to this step everything can be performed offline, and only once for each lens design.

Figure 2.16: Simulated real lens image (right) along with the original image (left).

The optical simulation starts by simulating the distortions for both the image and its depth map. Then the relative illumination is applied to the image only. Finally, the image is blurred according to the PCA based scattering method discussed in section 2.5.2. Figure 2.16 shows the simulated real lens image. The vignetting (relative illumination) effect is visible at the corners, which are relatively darker. The cropped regions are shown in figure 2.17 at full resolution.
The effect of optical blur and chromatic aberration is visible in the simulated output.

Figure 2.17: Cropped region shows the input image (left) and the result of the optical simulations (right). The effect of optical blur and chromatic aberrations is visible with the optics simulation.

2.7 Sensor Simulation

In digital cameras, sensors capture the incident light and convert it into a digital signal for displays. Image sensors are mainly of two types, charge-coupled device (CCD) and CMOS sensors. Due to their lower power consumption and low cost, CMOS sensors are widely used commercially. Besides capturing the image, the sensor also adds noise to it. In the following sections, the noise model used to simulate the noise characteristics of the image sensor is discussed. Moreover, the sensor MTF is applied to the image before downsampling it to the sensor resolution according to the color filter array (CFA) pattern. Images with a resolution higher than that of the sensor are used to simulate the effect of aliasing.

2.7.1 Noise Sources in Image Sensors

In CMOS image sensors, a photo sensor captures the incident light consisting of photons and converts it into an electrical signal. During this process, noise is added to the captured image due to the random nature of photon arrival and fluctuations in the electrical signal. The noise appears as a random variation in the intensity of the image. It plays an important role in defining the dynamic range and responsivity of the image sensor. Noise is described either by its variance or by its standard deviation (rms value). If there is more than one noise source, their variances are added to get the total amount of noise. Noise in image sensors is typically divided into two categories, temporal noise and fixed pattern noise.

2.7.1.1 Temporal Noise

The most dominant random noise in sensors is shot noise, which appears due to random fluctuations of the charge units (electrons). This noise follows from the basic laws of physics and is the same for all types of sensors.
Shot noise is statistically described by the Poisson distribution. However, when the number of electrons is large, the Poisson distribution approaches a Gaussian distribution. Other significant temporal noise sources are readout noise and dark current noise.

2.7.1.2 Fixed Pattern Noise

Fixed pattern noise (FPN) appears due to manufacturing mismatches in the active transistor. FPN remains constant in time for each pixel but varies spatially from pixel to pixel. It can appear due to dark signal non-uniformity (DSNU), which is always present even without any illumination, and photo response non-uniformity (PRNU), which is signal dependent.

2.7.2 Noise Model

To model the noise in our simulation, the linear noise model described in the EMVA1288 standard [1] and in [16] is followed. The model is described here briefly. The photons hitting the light sensitive area are converted into electrons, amplified, and converted into a digital signal $I$ by an analog to digital converter (ADC). The whole process is assumed to be linear and can be described with the overall system gain $K$. Then the mean signal $\mu_I$ is given as

$$\mu_I = \mu_{I,\mathrm{dark}} + K\mu_e, \qquad (2.13)$$

where $\mu_{I,\mathrm{dark}}$ is the mean dark signal without light and $\mu_e$ is the average number of captured electrons. If the complete camera is considered as a black box, it is sufficient to consider only three noise sources: shot noise, readout noise (which also includes amplifier noise) and quantization noise introduced by the ADC. These noise sources are represented by their variances $\sigma_e^2$, $\sigma_d^2$ and $\sigma_q^2$ respectively. The variances of all noise sources add up to the total temporal noise variance $\sigma_I^2$ of the digital signal. According to the laws of error propagation, it is given as

$$\sigma_I^2 = K^2\left(\sigma_d^2 + \sigma_e^2\right) + \sigma_q^2. \qquad (2.14)$$

Using equation 2.13 and the relationship $\sigma_e^2 = \mu_e$, the noise can be related to the mean digital signal as

$$\sigma_I^2 = \underbrace{K^2\sigma_d^2 + \sigma_q^2}_{\text{offset}} + \underbrace{K}_{\text{slope}}\left(\mu_I - \mu_{I,\mathrm{dark}}\right). \qquad (2.15)$$

As there is a linear relationship between the noise variance and the mean signal value, the overall system gain $K$ can be computed from the slope, and the dark noise variance from the offset. The methods described in EMVA1288, section 6.6, are employed to measure the gain $K$ and the dark noise variance $\sigma_d^2$, and these values are used to simulate a sensor.

2.7.3 Sensor MTF and Sampling

A sensor consists of an array of pixels arranged in a matrix structure. Therefore, the incident light is spatially sampled, and the Nyquist sampling theorem applies. The actual sensor MTF depends on the shape and size of a pixel. For a pixel of size $pp$, all spatial frequencies above the Nyquist frequency,

$$f_{Ny} = \frac{1}{2\,pp},$$

cannot be resolved and hence result in aliasing. For a rectangular pixel, the sensor MTF is the Fourier transform of a two dimensional rectangular window, which is a 2D sinc function. Figure 2.18 shows the one dimensional MTF for different pixel sizes $pp$, where $\Delta x$ represents the spatial resolution of the lens. Smaller pixels resolve more detail than larger pixels.

Figure 2.18: The combined MTF of sensor detector footprint and sampling is shown for different pixel sizes. (Plot of sensor MTF versus spatial resolution normalized by $\Delta x$, for $pp = \Delta x$, $pp = 2\Delta x$ and $pp = 4\Delta x$.)

For our simulations, the input scene is taken with a resolution at least two times higher than the sensor resolution. After applying the sensor MTF in the frequency domain, the image is sampled to the actual sensor resolution with the nearest neighbor method. In this way, aliasing effects can be simulated.

2.7.4 Color Filter Array

Photo sensors do not differentiate between wavelengths of light (color). Therefore, bandpass color filters must be used to capture color information. Most sensors capture the three primary colors red, green and blue. As an example, the spectral response of commercially available Kodak color filters is shown in figure 2.19.
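Returning to the linear noise model of section 2.7.2, the relation between noise variance and mean signal can be illustrated with a short simulation. This is a sketch under stated assumptions, not the simulator's code: shot noise is drawn from the Gaussian approximation of the Poisson distribution, the quantization variance is taken as 1/12 (rounding to integer digital numbers), and the gain, dark noise and dark level values are hypothetical.

```python
import math
import random

def expected_variance(mu_i, gain, sigma_d, mu_dark, sigma_q2=1.0 / 12.0):
    """Temporal noise variance of the digital signal, as in equation 2.15."""
    return gain ** 2 * sigma_d ** 2 + sigma_q2 + gain * (mu_i - mu_dark)

def simulate_pixel(mu_e, gain, sigma_d, mu_dark, n, rng):
    """Draw n digital samples of one pixel for a mean electron count mu_e."""
    samples = []
    for _ in range(n):
        electrons = rng.gauss(mu_e, math.sqrt(mu_e))   # shot noise, variance = mu_e
        electrons += rng.gauss(0.0, sigma_d)           # dark/readout noise in electrons
        samples.append(round(mu_dark + gain * electrons))  # gain and ADC quantization
    return samples

def mean_and_variance(samples):
    m = sum(samples) / len(samples)
    v = sum((s - m) ** 2 for s in samples) / (len(samples) - 1)
    return m, v

def fit_line(xs, ys):
    """Least-squares line y = slope * x + offset."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return slope, my - slope * mx

# Hypothetical sensor parameters (not taken from the thesis).
GAIN, SIGMA_D, MU_DARK = 0.1, 10.0, 20.0

rng = random.Random(0)
means, variances = [], []
for mu_e in (1000, 2000, 4000, 6000, 8000):            # exposure levels in electrons
    m, v = mean_and_variance(simulate_pixel(mu_e, GAIN, SIGMA_D, MU_DARK, 20000, rng))
    means.append(m)
    variances.append(v)

gain_est, offset_est = fit_line(means, variances)
print("estimated gain from slope:", gain_est)
```

Fitting a line to the measured variance over the mean signal recovers the gain from the slope, which mirrors the measurement idea referred to in the text.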
To correctly simulate the effect of the color filter array, the simulations are performed with hyperspectral images. After applying the optical blur, sensor noise and sensor MTF, the hyperspectral images are summed up according to the color filter array responses. The hyperspectral simulation is very important for a correct simulation of axial chromatic aberrations, because the blur depends on the color of the objects in the scene.

Figure 2.19: Spectral sensitivity of color filter arrays. (Plot of quantum efficiency versus wavelength from 400 nm to 700 nm for R($\lambda$), G($\lambda$) and B($\lambda$).)

Generally, sensors also perform spatial sampling of the colors and capture only one color at each pixel. Specific patterns of pixels are used to get each primary color (red, green and blue) in a local neighborhood. The most common pattern is the "Bayer" pattern shown in figure 2.4. This pattern has twice as many green pixels as red or blue pixels.

2.8 Digital Simulation

The digital simulation implements the standard color image post-processing algorithms commonly used in conventional cameras. This includes algorithms such as black level subtraction, white balancing, demosaicing, color correction and gamma curve [28]. Figure 2.20 shows the color processing chain of the simulator.

Figure 2.20: Digital image post processing chain. (Bayer image, black level subtraction, white balance, demosaicing, color correction, gamma, RGB image.)

Algorithms related to a specific camera design or application are also integrated into the processing chain at the appropriate position. For example, a deconvolution algorithm to restore the image is incorporated in the chain to retrieve an extended depth of field image. For a lens exhibiting longitudinal chromatic aberration, a depth estimation algorithm and an algorithm to generate the extended DOF image are implemented. The details of some of these post processing algorithms are discussed in the next chapters.
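The order of the conventional post-processing stages in section 2.8 can be sketched for a single RGB pixel. The sketch is only illustrative: demosaicing and color correction are omitted, and the black level, white level, white balance gains and gamma value are hypothetical placeholders, not values from the simulator.

```python
def process_pixel(raw_rgb, black_level=64.0, white_level=1023.0,
                  wb_gains=(1.8, 1.0, 1.5), gamma=2.2):
    """Apply black level subtraction, white balance and a gamma curve to one pixel.

    raw_rgb holds linear sensor values for (R, G, B); the result lies in [0, 1].
    """
    out = []
    for value, gain in zip(raw_rgb, wb_gains):
        # Black level subtraction and normalization to the usable signal range.
        linear = (value - black_level) / (white_level - black_level)
        # White balance gain, followed by clipping to the valid range.
        linear = min(max(linear * gain, 0.0), 1.0)
        # Gamma curve for display.
        out.append(linear ** (1.0 / gamma))
    return tuple(out)

print(process_pixel((300.0, 500.0, 400.0)))
```

Application specific steps such as deconvolution would be inserted between these stages at the position that suits the algorithm.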
2.9 Conclusion and Outlook

In this chapter a digital camera simulator is presented. The simulator models the complete digital camera processing chain, i.e. optics, sensor and digital processing. The main contribution of the simulator lies in the optical simulation. It allows a user to simulate conventional and unconventional optics with a correct modeling of occlusions. The blur induced by the optics is generated for a sampled 3D space with commercially available lens design tools. The missing blur information is then approximated at each pixel of the image through PCA based interpolation. For 3D scenes, the true depth map is used to blur the image according to the depth and location of each pixel. It is shown that the two methods of filtering mentioned in the literature, scattering and gathering, can be efficiently implemented in the Fourier domain using PCA based filtering. An efficient algorithm for space variant filtering using PCA makes the simulation time substantially smaller. Although the aim of the simulator is to simulate cameras, the low cost method of space variant filtering can also be used to add a depth of field effect in real time to computer generated scenes, which is useful and in demand in gaming applications. The sensor simulation includes noise addition, sampling and color filter array effects. The digital part implements the traditional camera post processing steps.

Chapter 3

Depth From Chromatic Aberrations

In this chapter a computational imaging technique is discussed to estimate the depth from a single image using a conventional camera. The optics is designed to introduce a significant amount of axial chromatic aberrations, and the depth is estimated digitally through a post-capture algorithm. Due to axial chromatic aberrations (ACA), different colors are focused at different distances along the lens axis. Hence, the relative blur between two color images helps in estimating the depth.
The basic principle is similar to the well known depth from defocus (DFD) method. The chapter starts with an overview of different depth estimation methods in section 3.1. Then the depth from defocus method is discussed in section 3.2, followed by the proposed method of depth from chromatic aberrations (DFCA) in section 3.3. Section 3.4 discusses the interpolation methods to create a dense depth map. Finally, the results are discussed and analyzed in the last section.

3.1 Overview of Depth Estimation Methods

Depth estimation refers to algorithms which aim to estimate the distance of objects in a scene from the camera. The distance and geometry information is lost in conventional imaging, because a set of rays from a 3D scene is projected onto a 2D plane. There are many approaches which try to estimate the lost depth information in different ways. These techniques can be categorized mainly into two classes, active and passive.

3.1.1 Passive Depth Estimation

In passive methods, depth is estimated from images captured with different camera settings in natural lighting conditions. The basic methods of passive depth imaging are based on triangulation. If a point in a scene is observed from different views by changing the camera position, its depth can be estimated.

3.1.1.1 Stereoscopy

The stereoscopic method is based on the principle of triangulation. Triangulation is the process of determining the distance of a point by measuring the angles to it from known points at the ends of a baseline. In stereoscopy, this principle is applied by using two cameras separated by a fixed baseline to image a specific point. As a result, the point is shifted in the imaging plane. This shift is known as disparity, and the depth is determined from it. Depth estimation algorithms identify the corresponding regions in the two stereo images to determine the disparity.
However, this matching problem is computationally expensive, and its accuracy depends on the local image features. Depth from stereoscopy does not always provide dense depth maps. In homogeneous regions of the image the disparity cannot be computed, and interpolation techniques are therefore required to produce dense depth maps.

3.1.1.2 Depth from Focus/Defocus

The limited depth of field of an optical system helps in determining the depth. In the depth from focus (DFF) method, multiple images are captured with different focus settings, by moving either the lens or the sensor. As a result, objects at different distances are in focus in different images. The depth estimation algorithm computes the sharpness of all images at each pixel location and determines the depth from the image in which the pixel is sharpest. The main disadvantage of this method is the need to capture multiple images. Studies indicate that ten or more images are required for a reasonable depth accuracy. Another drawback is that the depth can be determined only in regions where some local features exist. For homogeneous regions, interpolation is required to compute a dense depth map.

In contrast to DFF, the depth from defocus (DFD) method uses only two images to estimate the depth. The amount of defocus blur changes continuously with the distance of the object. The measured value of the blur in the image directly gives the depth information. Similar to other passive depth methods, DFD only provides the depth at textures or edges in the image. To generate a dense depth map, interpolation is required.

3.1.2 Active Depth Estimation

Active methods project some form of energy onto a scene, and the sensor detects the depth by processing the returned energy. These methods are more accurate and also help in providing the ground truth. In ultrasound imaging, depth is perceived through the transmission of sound waves and the interpretation of the intensities of the reflected echoes.
However, in most cases infrared (IR) or incandescent light is used for active illumination of a scene. Today, IR light is mostly used for depth imaging, for example in the Kinect or in time of flight cameras. The advantage is that IR light is invisible to the human eye. The disadvantage is that traditional imaging sensors cannot be used, and specific sensor elements are required to capture IR light.

3.1.2.1 Depth from Time of Flight

An active light source projects light onto the scene, and the camera measures the time delay of a light beam traveling a certain distance. The maximum measurable distance depends on the frequency of the projected light pulses. The method provides very accurate depth results, but it requires a specific hardware device along with an active illumination source, i.e. it has a higher power consumption.

3.1.2.2 Depth from Active Stereoscopy

The correspondence matching problem can be reduced by replacing one camera of the stereo system with an active light source which projects a specific light pattern onto the scene. The depth is computed from the distortion of the projected light pattern. The depth accuracy increases compared to the passive method. To further reduce the ambiguities in the matching problem, more than one pattern is projected through time or color multiplexing.

3.1.3 Depth Estimation by Computational Imaging

Computational photography is a recent development in the field of imaging to acquire image features which are not available through conventional imaging. It modifies the optics and/or the sensor to code the image in a specific way, so that digital post processing can recover the conventional image along with other useful features, e.g. depth map, extended depth of field, high dynamic range, super resolution, etc. Ng et al. [30] presented a method of light field capture with a plenoptic camera, which also provides the depth information.
Another method for depth estimation is proposed by Zhou et al. [45], where a diffuser is inserted between the scene and the camera to code the depth information in the defocus blur. The method is similar to depth from defocus, but a higher depth accuracy can be achieved with a smaller aperture. Several other methods based on aperture coding have also been proposed to estimate the depth. Bando et al. [3] suggested the use of a color filtered aperture to shift the color images and hence estimate the depth from the disparity between the color images. One of the limitations of these methods is the loss of light. Chakrabarti et al. [5] modified the aperture to generate a varying depth of field for different colors and infer the depth from the blur difference between the color images.

This thesis also investigates a computational imaging method which uses axial chromatic aberrations to estimate the depth. The following sections describe the details of the method and discuss the pros and cons of this approach for depth estimation. Before that, a detailed description of the depth from defocus method is given, which is the basis of depth from axial chromatic aberration.

3.2 Depth From Defocus

The limited depth of field of imaging optics helps in determining the distance from the defocus blur. In section 2.2, the physical image formation process is described. In optics, if the thickness of a lens is much smaller than its focal length, it can be considered as a thin lens represented by a single principal plane. For a thin lens, in the paraxial ray approximation, the relationship between the distance of the object to the lens $d$ and the distance of the image sensor to the lens $d_i$ is given as

$$\frac{1}{f} = \frac{1}{d_i} + \frac{1}{d}, \qquad (3.1)$$

where $f$ is the lens focal length. These quantities are illustrated in figure 3.1.

Figure 3.1: Image formation by a thin lens approximation. (Sketch showing the object at distance $d$, the aperture stop with diameter $A$, the focal length $f$, the image distance $d_i$, and the blur $b$ on the sensor.)
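Equation 3.1 can be checked numerically with a small sketch. The focal length of 4 mm and the distance range of 30 cm to 2 m are taken from the prototype described in the abstract, while the function name is hypothetical.

```python
def image_distance(f, d):
    """Solve the thin lens equation 1/f = 1/d_i + 1/d for d_i.

    f is the focal length and d the object distance, both in millimetres.
    """
    if d <= f:
        raise ValueError("object must be farther away than the focal length")
    return 1.0 / (1.0 / f - 1.0 / d)

for d in (300.0, 1000.0, 2000.0):
    print(d, "mm ->", image_distance(4.0, d), "mm")
```

For a short focal length the in-focus sensor position moves by only a few tens of micrometres over this distance range, which indicates how small the defocus differences are that a depth from defocus method has to resolve.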
In real aperture photography, only the objects at the focus position of the lens appear sharp in the image plane. All other objects, closer to or farther from the focus position, appear blurred. The amount of blur $b$ depends on the distance of the object from the focus position and relates to the camera parameters as

$$b = A\,d_i\left(\frac{1}{f} - \frac{1}{d_i} - \frac{1}{d}\right), \qquad (3.2)$$

where $A$ is the aperture diameter and $d$ is the object distance from the lens. The blur measured in the image directly gives the distance information through equation 3.2. Although the equation represents the geometrical blur only, it helps to approximately model the behavior of the optics. However, an ambiguity appears in distinguishing objects on both sides of the camera's focus position, because they are blurred by a similar amount. This ambiguity can be eliminated by focusing the lens at the nearest distance of the desired depth range. Another challenge is the accurate measurement of the blur in the presence of varying local image features. For example, it is difficult to distinguish between a defocused region of a sharp edge and a focused region of a smoothly varying edge.

To eliminate these ambiguities, two differently defocused images are used in the DFD method. The images are captured by changing camera parameters such as the aperture size or the lens/sensor position. The depth estimation algorithm then measures the relative blur between the two defocused images to determine the distance. This solves the problem of the dependency of the blur measure on local image features. The other ambiguity, distinguishing closer from farther objects, can only be solved by changing the lens/sensor position instead of the aperture size when capturing the defocused images. Two images with different focus positions allow the depth to be inferred without ambiguity, but this adds another problem. Different focus positions result in different image magnifications, which leads to a misalignment of image features.
For a thin lens, the magnification $m$ depends on the focal length and the object distance,

$$m = \frac{f}{d - f}. \qquad (3.3)$$

Since the object distance is the same for both defocused images, only the focal length affects the magnification difference between the two images. To avoid the problem of magnification, most authors have suggested changing the aperture size to capture the two defocused images. For images captured with different focus settings, the images must be registered before computing the relative blur. One possible solution is to register the images through digital processing, for example warping. The amount of magnification may be determined through a calibration procedure or estimated through local image analysis. However, the image registration is not always accurate and also adds complexity to the DFD method. Watanabe et al. have proposed to make the optics telecentric, so that the magnification of the image is independent of the lens focus position [41]. They have shown that any conventional optics can be made telecentric by adding an aperture. In this case, the effective and nominal F-numbers are the same. This property gives the same magnification for different defocused images. However, the main disadvantage of this approach is the loss of light.

3.3 Depth From Axial Chromatic Aberration

In the present thesis, the goal is to capture differently defocused images with a single shot. Since traditional image sensors capture the three primary colors RGB, it is possible to defocus each color by a different amount. Fortunately, lenses naturally provide differently focused images for different colors, because the refraction of a lens is color dependent. In the next sections, first the behavior of a lens with respect to color is discussed, followed by the details of the algorithm to extract depth from the color channels.

Figure 3.2: Dispersion of white light into monochromatic light after passing through a prism. (White light is split into red, orange, yellow, green, blue and indigo on a screen.)
3.3.1 Axial Chromatic Aberrations

The focal length of a thin lens is given as

$$\frac{1}{f(\lambda)} \approx \left(n(\lambda) - 1\right)\left(\frac{1}{R_1} - \frac{1}{R_2}\right), \qquad (3.4)$$

where $R_1$ and $R_2$ are the radii of curvature of the two surfaces of the lens and $n$ is the refractive index of the lens material, which depends on the wavelength of light $\lambda$. Equation 3.4 shows the dependency of the focal length on the wavelength, which arises from the dispersion property of the lens material. Figure 3.2 shows white light passing through a prism and being dispersed into different colors. The amount of dispersion decreases from shorter to longer wavelengths. The empirical relationship between refractive index and wavelength of light is described by Cauchy's equation, which is quite accurate for the visible range of light. Figure 3.3 shows the behavior of the refractive index for BK7 glass and polycarbonate plastic. The variation of the refractive index is larger for the plastic material than for the glass, which is the reason for the larger chromatic aberrations in cheap optics made of plastic.

Figure 3.3: Refractive index of glass (BK7) and plastic (Polycarbonate) materials. (Plot of refractive index versus wavelength from 0.4 µm to 1 µm.)

Since the focal length changes with the wavelength, the blur diameter given by equation 3.2 also depends on the wavelength. For the three primary colors RGB $(\lambda_r, \lambda_g, \lambda_b)$, the blur diameter is given as

$$b(\lambda_r) = A\,d_i\left(\frac{1}{f(\lambda_r)} - \frac{1}{d_i} - \frac{1}{d}\right), \qquad (3.5)$$

$$b(\lambda_g) = A\,d_i\left(\frac{1}{f(\lambda_g)} - \frac{1}{d_i} - \frac{1}{d}\right), \qquad (3.6)$$

$$b(\lambda_b) = A\,d_i\left(\frac{1}{f(\lambda_b)} - \frac{1}{d_i} - \frac{1}{d}\right). \qquad (3.7)$$

The blur diameter for the different colors is plotted versus the object distance in figure 3.4 for arbitrary lens parameters. The relative blur between the colors directly provides the relative depth information. As a result of chromatic aberrations, it is possible to capture three differently defocused color images in a single shot with an RGB image sensor.
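The wavelength dependence of the blur in equations 3.4 to 3.7 can be sketched in a few lines. The sketch makes several assumptions that are not from the thesis: a two-term Cauchy equation with commonly quoted approximate coefficients for BK7, a symmetric biconvex thin lens scaled so that green focuses at 4 mm, representative wavelengths for red, green and blue, and the absolute value of the bracket so that the result reads as a diameter.

```python
def refractive_index(wavelength_um, b=1.5046, c=0.00420):
    """Two-term Cauchy equation n = B + C / lambda^2 (lambda in micrometres).

    B and C are approximate textbook values for BK7 (illustrative only).
    """
    return b + c / wavelength_um ** 2

def focal_length(wavelength_um, r1, r2):
    """Thin lens focal length from equation 3.4 (lengths in mm)."""
    n = refractive_index(wavelength_um)
    return 1.0 / ((n - 1.0) * (1.0 / r1 - 1.0 / r2))

def blur_diameter(wavelength_um, d, d_i, aperture, r1, r2):
    """Magnitude of the geometrical blur of equations 3.5 to 3.7 (in mm)."""
    f = focal_length(wavelength_um, r1, r2)
    return abs(aperture * d_i * (1.0 / f - 1.0 / d_i - 1.0 / d))

# Hypothetical representative wavelengths in micrometres.
WAVELENGTHS = {"red": 0.65, "green": 0.55, "blue": 0.45}

# Symmetric biconvex lens whose radii give f(green) = 4 mm.
R1 = 8.0 * (refractive_index(WAVELENGTHS["green"]) - 1.0)
R2 = -R1
F_GREEN = focal_length(WAVELENGTHS["green"], R1, R2)
APERTURE = F_GREEN / 2.4                          # F-number 2.4
D_I = 1.0 / (1.0 / F_GREEN - 1.0 / 1000.0)        # sensor placed so green focuses at 1 m

for d in (300.0, 1000.0, 2000.0):
    blurs = {c: blur_diameter(w, d, D_I, APERTURE, R1, R2) for c, w in WAVELENGTHS.items()}
    print(d, blurs)
```

Because the refractive index is higher at short wavelengths, blue focuses closer to the lens than red, so the three channels have their minimum blur at different object distances.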
Image quality is degraded by chromatic aberrations, and color bleeding artifacts appear in the image. Therefore, chromatic aberrations are usually corrected in lenses by designing a compound lens from a combination of multiple materials. There are also digital processing algorithms which correct chromatic aberrations. The details of the image quality are discussed in chapter 4.

Figure 3.4: Blur diameter for RGB colors which focus at different distances due to chromatic aberrations. (Plot of blur diameter in mm versus object distance in mm on a logarithmic scale, for blue, green and red.)

3.3.2 Depth Estimation

Depth estimation algorithms estimate and compare the blur of each defocused image to compute the relative depth. In a DFCA system, measuring the blur is more challenging because the local image content differs between the color images. In the next section, different blur measures are discussed and compared in order to select the best blur measure for a DFCA system.

3.3.2.1 Blur Measures Methods

Many different types of blur measures have been proposed in the literature. They can be mainly categorized into statistical, derivative and energy based blur measures. Statistical operators compute the standard deviation, variance, central moments, etc. to estimate the amount of blur. Derivative based blur measures estimate the blur from the image contrast, or from the first or second derivative of the image combined with smoothing operations. The energy of the signal, computed through the discrete cosine transform (DCT) or the discrete Fourier transform (DFT), is also a basis for computing the blur. There are many other proposed operators, but most of them are based on these fundamental blur measures.

Figure 3.5: The image with and without chromatic aberrations.

In this work, different blur measures are analyzed according to their ability to distinguish the minimum amount of blur in the presence of noise. Some blur measures are discussed here.
Sum of Squared Gradient: Defocus blur acts as a low pass filter which suppresses the higher spatial frequencies in the image. Therefore, it is desirable to estimate the amount of blur through a function which responds to high frequencies. Derivative operators provide this functionality. The first derivative of the image can be computed through gradient filters, and the magnitude of the gradient vector provides the information about the blur. Since gradient filters enhance high frequencies and hence the noise, a smoothing operation is combined with them to reduce the effect of noise. The most commonly used blur measure is the Tenengrad operator proposed by Krotkov [22], given as

$$B_M = \sum_{x=1}^{M}\sum_{y=1}^{N}\left(g_x^2 + g_y^2\right), \qquad (3.8)$$

where $g_x$ and $g_y$ are the components of the gradient vector of an image $I$ in the $x$ and $y$ directions. The well known Sobel operator is used to compute the gradients in the $x$ and $y$ directions. Summing up the neighboring gradients provides further smoothing.

Gaussian Derivative: The smoothing operation can be combined with the gradient filter to make a more robust blur measure. Geusebroek et al. [13] have proposed to combine the Gaussian filter with the gradient filter into a Gaussian derivative filter, which is more robust against noise. The 2D Gaussian derivative filter in the $x$ direction is given as

$$G_x = -\frac{x}{2\pi\sigma^4}\, e^{-\frac{x^2 + y^2}{2\sigma^2}}. \qquad (3.9)$$

$G_y$ is defined similarly in the $y$ direction. The value of $\sigma$ defines the amount of smoothing.

Sum of Modified Laplacian: The second derivative of the image is more sensitive to blur. The Laplacian operator provides the second derivative of the image, and the blur measure is defined as

$$B_M = \sum_{x=1}^{M}\sum_{y=1}^{N}\left(g_{xx} + g_{yy}\right)^2, \qquad (3.10)$$

where $g_{xx}$ and $g_{yy}$ are the second derivatives of an image $I$ in the $x$ and $y$ directions respectively. Nayar et al. [29] have noted that the second derivatives in the two directions can cancel each other due to opposite signs.
Therefore, they modified the Laplacian operator by adding the absolute values of the two directional second derivatives,

$$B_m = \sum_{x=1}^{N} \sum_{y=1}^{N} \left( |g_{xx}| + |g_{yy}| \right). \tag{3.11}$$

Energy of Discrete Cosine Transform (DCT) Coefficients: A local estimate of the energy provides information about the blur, because the high frequency energy of the image decreases as the amount of blur increases. Shen et al. [35] suggested using the DCT coefficients to estimate the energy. The blur estimate is defined as the ratio between the low frequency coefficients and the high frequency coefficients. As shown by the authors, this blur measure is more robust against noise.

3.3.2.2 Comparison of Blur Measures

A blur measure method must fulfill the following properties:

- independence of image content,
- monotonic behavior against the blur,
- large variation for a minimal change in blur,
- robustness against noise,
- low computational complexity.

To compare the blur measures, an ideal step edge is blurred with a Gaussian filter of varying sigma, equally spaced from 0.1 to 5, as shown in figure 3.6. Figure 3.7 shows the amount of blur estimated with the different methods: Gaussian derivative, sum of squared gradients, sum of modified Laplacian and DCT ratio. The sum of modified Laplacian and the sum of squared gradients have a steeper response for lower sigma values. For larger sigma values, however, they vary less, which makes them more sensitive to noise. The Gaussian derivative and the DCT ratio provide a more linear response over the full range of sigma values.

Figure 3.6: A step edge blurred with Gaussian blur of varying standard deviation. (Axis: normalized intensity. The edge height (contrast) and the edge range are marked.)

The most important aspect of a blur measure method is its accuracy in the presence of strong noise.
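
The noise free comparison on a blurred step edge can be reproduced with a few lines of NumPy. The following is a minimal sketch, not the implementation used in this work: the image size, the 4 sigma kernel truncation and all function names are assumptions chosen for illustration, and only the Tenengrad operator (equation 3.8) and the sum of modified Laplacian (equation 3.11) are included.

```python
import numpy as np

def gaussian_kernel(sigma):
    """Normalized 1D Gaussian kernel, truncated at about 4 sigma (assumed)."""
    radius = max(1, int(np.ceil(4 * sigma)))
    t = np.arange(-radius, radius + 1)
    k = np.exp(-t**2 / (2.0 * sigma**2))
    return k / k.sum()

def blurred_step_edge(sigma, size=65):
    """Vertical step edge (0 -> 1) blurred along x with a Gaussian of width sigma."""
    row = (np.arange(size) >= size // 2).astype(float)
    k = gaussian_kernel(sigma)
    pad = len(k) // 2
    # Edge padding keeps the flat regions flat after the convolution.
    row = np.convolve(np.pad(row, pad, mode="edge"), k, mode="valid")
    return np.tile(row, (size, 1))

def tenengrad(img):
    """Sum of squared Sobel gradients (equation 3.8), image border excluded."""
    gx = (img[:-2, 2:] + 2 * img[1:-1, 2:] + img[2:, 2:]
          - img[:-2, :-2] - 2 * img[1:-1, :-2] - img[2:, :-2])
    gy = (img[2:, :-2] + 2 * img[2:, 1:-1] + img[2:, 2:]
          - img[:-2, :-2] - 2 * img[:-2, 1:-1] - img[:-2, 2:])
    return float(np.sum(gx**2 + gy**2))

def sum_modified_laplacian(img):
    """Sum of |g_xx| + |g_yy| (equation 3.11), image border excluded."""
    c = img[1:-1, 1:-1]
    gxx = img[1:-1, 2:] - 2 * c + img[1:-1, :-2]
    gyy = img[2:, 1:-1] - 2 * c + img[:-2, 1:-1]
    return float(np.sum(np.abs(gxx) + np.abs(gyy)))

# Sigma from 0.1 to 5 with equal spacing, as in the comparison above.
sigmas = np.linspace(0.1, 5.0, 50)
ten = np.array([tenengrad(blurred_step_edge(s)) for s in sigmas])
sml = np.array([sum_modified_laplacian(blurred_step_edge(s)) for s in sigmas])
ten /= ten.max()   # normalized blur measures, largest value for the sharpest edge
sml /= sml.max()
```

Plotting `ten` and `sml` against `sigmas` gives curves that fall quickly for small sigma and flatten for large sigma, which is the behavior described for these two operators.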

Figure 3.7: Normalized blur values of different types of blur measures: Gaussian derivative, Tenengrad, Laplacian and DCT ratio. (Axes: standard deviation of the Gaussian $\sigma$; normalized blur measure.)

For the comparison of blur measures in the presence of noise, an ideal step edge is defined with a low contrast of 50 gray values, ranging from 900 to 950 for a 10 bit digital sensor. Since sensor noise is intensity dependent, a high mean value with low contrast results in higher noise, i.e. a lower contrast to noise ratio (CNR). In the next steps, Gaussian blur of varying sigma is applied to the edge, followed by noise addition according to the noise model discussed in section 2.7. Figure 3.8 shows the root mean square error (RMSE) of the blur measures discussed before. The solid line shows the blur measure value for the noise free case. As can be seen, the Tenengrad and Laplacian methods have a large RMSE at smaller sigma values. The Gaussian derivative and DCT methods are less sensitive to noise, and their mean value is equal to the noise free case, whereas the Tenengrad and Laplacian methods have a different mean value in the presence of noise. This shift in the mean value makes it impossible to retrieve the correct blur value through smoothing operations in post processing.

Although the Gaussian derivative and DCT based methods provide a robust blur estimate in the presence of noise, the spatial resolution of their blur estimate is lower, as these operators work on a relatively large neighborhood. On the other hand, the Tenengrad and Laplacian operators provide a higher spatial resolution due to their small operator range, but there is an offset in the blur estimate. However, depth from defocus algorithms estimate the relative blur between two images. Therefore, the offset in the blur estimate of the individual images does not affect the relative blur estimate.
For that reason, the Tenengrad operator, with a small modification for contrast invariance, is used in this work to estimate the depth.

Figure 3.8: Normalized blur values of different blur measure methods plotted with the RMSE for low CNR. Panels: (a) Gaussian derivative, (b) DCT ratio, (c) Tenengrad operator, (d) sum of modified Laplacian. (Axes: sigma; normalized blur measure.)

3.3.2.3 Contrast Independent Blur Measure

As the goal of this work is to estimate the depth from color channels, the blur measure must be independent of the varying contrast of the color images. The first choice is to normalize the image content to an equal level. This can be done by scaling each edge between its maximum and minimum value, whose difference is the local contrast. If the local contrast is computed with a window size equal to twice the edge range, the normalization is consistent over the complete edge. The normalization process is defined as

$$I_{min} = \min_{x=1}^{N} \min_{y=1}^{N} I(x, y), \tag{3.12}$$

$$I_{max} = \max_{x=1}^{N} \max_{y=1}^{N} I(x, y), \tag{3.13}$$

$$I_n(x, y) = \frac{I(x, y) - I_{min}}{I_{max} - I_{min}}, \tag{3.14}$$

where $I(x, y)$ is the image and $I_n(x, y)$ is the normalized pixel. The difference of the maximum and minimum value, $(I_{max} - I_{min})$, defines the contrast of the edge. This contrast measurement is very sensitive to noise. It can be improved by using p-quantile filters instead of the max-min operator. Since gradient operators are bandpass filters and remove the DC value, the computed gradients can be normalized with the local contrast instead of normalizing the image. For gradient based operators, the final blur measure is given as

$$B_m(x, y) = \sum_{x=1}^{N} \sum_{y=1}^{N} \frac{g_x^2 + g_y^2}{(I_{max} - I_{min})^2}. \tag{3.15}$$

3.3.2.4 Depth Estimation from Blur Measures

Figure 3.9 shows the flow diagram of the DFCA algorithm.
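
As an illustration of the contrast normalized measure of equation 3.15, which forms the first stage of this flow, the following minimal sketch evaluates the measure on a single window and checks that it does not change when the same edge is rescaled to a different contrast and offset. The function names, the `q` parameter for the quantile based contrast and the handling of flat windows are assumptions made for this example.

```python
import numpy as np

def sobel_gradients(img):
    """Sobel gradients g_x and g_y on the interior pixels of img."""
    gx = (img[:-2, 2:] + 2 * img[1:-1, 2:] + img[2:, 2:]
          - img[:-2, :-2] - 2 * img[1:-1, :-2] - img[2:, :-2])
    gy = (img[2:, :-2] + 2 * img[2:, 1:-1] + img[2:, 2:]
          - img[:-2, :-2] - 2 * img[:-2, 1:-1] - img[:-2, 2:])
    return gx, gy

def local_contrast(window, q=0.0):
    """I_max - I_min over the window; q > 0 uses q-quantiles instead of min/max."""
    return float(np.quantile(window, 1.0 - q) - np.quantile(window, q))

def normalized_blur_measure(window, q=0.0, eps=1e-12):
    """Equation 3.15 on one window: sum(g_x^2 + g_y^2) / (I_max - I_min)^2."""
    contrast = local_contrast(window, q)
    if contrast < eps:
        return 0.0  # flat window: treated as "no edge" (a convention assumed here)
    gx, gy = sobel_gradients(window)
    return float(np.sum(gx**2 + gy**2) / contrast**2)

# The same smooth edge profile at two different contrasts and offsets.
x = np.arange(-16, 17)
unit_edge = np.tile(0.5 * (1.0 + np.tanh(x / 3.0)), (33, 1))
low_contrast = 900.0 + 50.0 * unit_edge     # 50 gray values on a bright background
high_contrast = 100.0 + 700.0 * unit_edge   # same blur, much higher contrast

b_low = normalized_blur_measure(low_contrast)
b_high = normalized_blur_measure(high_contrast)
```

Because the gradients scale linearly with the edge height, `b_low` and `b_high` agree up to rounding, whereas the unnormalized sum of squared gradients would differ by a factor of (700/50)^2.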
Normalized gradients are computed for each color channel and multiplied with the edge map to consider only the sharp edges of the image. Then the ratio of all three blur measures is taken to estimate the relative depth. Conventional color sensors capture three colors: red, green and blue. Therefore, three defocused images are available for depth estimation, which makes it possible to estimate the depth over a larger distance range than in a DFD system, where normally only two images are used. Here, it is proposed to take the normalized ratio of all three colors to get a single depth map for a broader range in the following way:

$$C(\mathrm{depth}) = \frac{B_{mg}^2 - (B_{mr} \times B_{mb})}{B_{mg}^2 + (B_{mr} \times B_{mb})}, \tag{3.16}$$

where $B_{mc}$ are the blur measures of the color images $c = \{r, g, b\}$ and $C$ is the calibration curve used to estimate the absolute depth. Figure 3.10 shows different combinations of ratios of the blur measures for distances from the focus position of the blue image to the focus position of the red image. The combined ratio of all three colors is the best in terms of a steep slope over a large distance range.

Figure 3.9: Flow diagram of depth from chromatic aberrations algorithm. (Stages: RGB image; for each channel edge map, local contrast, gradient, division, averaging and multiplication; normalized ratio; calibration curve; interpolation; depth map.)

Figure 3.10: Ratios of blur measure with different combinations of colors. (Axes: distance [mm]; blur ratios. Curves: the combined ratio of all three colors and the three pairwise normalized ratios.)

3.3.3 Analysis of Depth Errors

3.3.3.1 Practical Issues in DFCA

The effect of noise on the accuracy of depth estimation has been discussed in the section on blur measures. There are some other practical issues which affect the accuracy of depth estimation.

Window Size of Blur Measure: The different kinds of blur operators work in a local neighborhood (window) to compute the amount of blur. The size of the window is kept fixed to keep the computational complexity at a minimum. Since depth changes across the image, it is desirable to use the smallest possible window size for a high resolution depth map, but a small window results in a higher depth error. Another problem arises due to the different blur sizes in two defocused images. If a fixed window size is used, one of the images contains spurious data at the border of the edges. Moreover, two edges at different distances may be separated by fewer pixels than the size of the window. In this case the depth estimate is not reliable. The window size could be made adaptive at the cost of computational complexity.

Texture Dependent Blur Measure: Most of the blur measures depend on the texture of the scene. Since the difference of the MTFs varies with the spatial frequency in the image and blur measure operators behave as broadband filters, textures containing different frequencies result in different depth values. Watanabe et al. [42] designed rational filters to make the operators texture invariant. They modeled the filter for a box type defocus blur. For a real lens design, it is hard to design texture invariant operators because the change in the MTF difference is not linear.

3.3.3.2 Theoretical Issues in DFCA

Similar to other depth estimation methods, there are some theoretical problems which affect the performance of the DFCA method. These issues are discussed as follows.

Figure 3.11: Two images captured under strong red light (top left) and white light (bottom left) illumination. The images with white color correction are shown on the right.

Narrowband Object Spectrum: The depth from chromatic aberrations method estimates the depth from color images.
However, if the object does not contain the colors of the complete human visible spectrum, it is hard to estimate depth. This property puts a constraint on the object's reflectance spectrum: it has to be broadband. Fortunately, natural scenes have broadband object spectra, as studied by [32]. The lighting or illumination of a scene also changes the reflectance spectra. In the case of narrowband object spectra, the depth estimate is not accurate due to the broadband spectral response of color filter arrays. Figure 3.12 shows broadband color filter responses and a narrowband object reflectance spectrum with a dominant red content. In this case the amount of blur in the green channel of the image would be very similar to that of the red channel, because the major part of the green signal comes from longer wavelengths.

The effect is analyzed by capturing a black and white object with a lens exhibiting strong chromatic aberrations. In the first case, the object is illuminated with white light, and in the second case strong red light is projected on the object. Figure 3.11 shows the captured images of both cases, along with the white balanced images. The edge profile is plotted in figure 3.13 for both cases, after balancing the white color. Note that in the case of white illumination, the green edge has a steeper transition (a sharper edge) than the red edge. On the other hand, the amount of blur in the red and green channels is similar for red light illumination.

Figure 3.12: An example of color filter with broadband spectral responses and the reflectance spectrum of an object with strong red content. (Curves: B($\lambda$), G($\lambda$), R($\lambda$) and the object spectrum. Axes: wavelength [nm], 400 to 700; quantum efficiency.)

One of the solutions to avoid this problem is to use color filters with a narrowband spectral response, but this can make the correct color reproduction of a scene difficult. Another solution is to reject or correct the depth values through a confidence measure based on the ratio of colors.
Larger ratios mean narrowband object spectra and vice versa.

Figure 3.13: Edge profile of color channels for red and white light illumination. (Panels: white illumination and red illumination, each showing the red and green edge profiles.)

Field Varying Blur: The optical blur, known as the point spread function, varies from the center of the image to the outer regions of the image due to field dependent optical aberrations. The main causes of these aberrations are coma and astigmatism. As a result of the field dependent blur, the depth estimate also changes across the field due to different blur ratios. A calibration method is described in section 3.3.4 to correct the field dependent depth errors.

Varying Magnification: In the DFCA method, different colors may have different magnifications due to lateral chromatic aberrations. Therefore, the images must be aligned before computing the relative blur. However, it is possible during lens design to minimize the lateral chromatic aberrations to such an extent that they do not affect the depth estimation. In the next chapter, a lens design is discussed where it is shown that lateral CA is reduced during the lens design optimization process.

Smooth Edges: The DFCA method (as well as DFD) assumes a sharp edge, blurred according to the defocus blur, for depth estimation. In some cases, the edge is not a step edge, and the depth estimate is not correct. This can happen due to shadows or motion blur. A simple confidence measure can reject these kinds of edges from the estimated depth map. As seen in figure 3.4, there is at least one sharp color at each distance, which is the basic requirement of depth estimation. Based on this requirement, all depth estimates can be rejected for which all three colors have a large blur, i.e. greater than the maximum blur of all colors.
We write this condition as

$$C(\mathrm{depth}) = \mathrm{unknown} \quad \mathrm{if} \quad (B_{mr} > B_T) \;\&\; (B_{mg} > B_T) \;\&\; (B_{mb} > B_T), \tag{3.17}$$

where $B_T$ is the maximum blur of all colors, which can be calculated from the lens and sensor information. Figure 3.14 shows a relative depth estimate with and without this condition. As can be seen, most of the wrong depth estimates are rejected with the help of this condition. Normal edges become thinner, but the depth values are preserved at the center of the edges.

Figure 3.14: Estimated depth for smooth edges is removed by applying the condition given in equation 3.17. Panels: (a) smooth edges give wrong depth, (b) smooth edges removed.

3.3.4 Field Dependent Depth Correction

Since the optical blur varies across the field of view, the estimated depth also varies depending on the relative change in blur. Besides, the blur also changes with the orientation of the image features. Figure 3.15 shows an image with edges pointing in the direction of the center (black edges), called sagittal orientation, and edges which are perpendicular to the sagittal ones (gray edges), called tangential orientation. These two perpendicular planes, sagittal and tangential, have different foci due to astigmatism in the optics, which results in different blur for different orientations of edges. The effect of astigmatism increases from the center of the image to the outer field positions. A PSF which produces this kind of effect is shown in figure 3.15. As can be seen, the PSF is not rotationally symmetric, which results in different amounts of blur in perpendicular directions.

Figure 3.15: Left: Sagittal and tangential orientations are shown as black and gray edges respectively. Right: A PSF which produces different blur for different orientations of the edges.

As a result of these field varying aberrations, the computed blur ratios are a function of three variables: distance, orientation of the edge and location in the image.
Since optical aberrations are rotationally symmetric, it is assumed that the blur measure changes with the image height only:

$$C(\mathrm{depth}, \theta, r) = \frac{B_{mg}^2 - (B_{mr} \times B_{mb})}{B_{mg}^2 + (B_{mr} \times B_{mb})}. \tag{3.18}$$

Here, $\theta$ is the orientation of the edge in the image and $r$ is the distance of a pixel from the center of the image. The equation shows that the calibration curve is a three dimensional lookup table (LUT) defined for different distances, edge orientations and image heights. To interpolate between the LUT values, the indices of the LUT have to be created from the estimated depth and the image features. The image gradients provide the sagittal and tangential orientation of the edge as

$$\theta = \tan^{-1}\left(\frac{y}{x}\right) - \tan^{-1}\left(\frac{g_y}{g_x}\right) - 90, \tag{3.19}$$

where $x$ and $y$ are the Cartesian image coordinates and $g_x$ and $g_y$ are the image gradients in the horizontal and vertical directions respectively. The computed value $\theta$ represents the edge orientation in degrees, with $\theta = 0$ representing sagittal and $\theta = 90$ representing tangential orientation.

In the calibration process, test images are captured for different values of $\theta$, $r$ and distance. Using these test images and the DFCA algorithm, a three dimensional LUT is created for the relative blur measures $C$ for known values of $\theta$, $r$ and distance.

To verify the performance of the algorithm, a target image is positioned at a distance of 60 cm from the camera. Figure 3.16a shows the depth computed with the DFCA algorithm. The depth map shows inconsistent depth values at different image positions and edge orientations, whereas figure 3.16b shows the depth map after the field and edge orientation dependent depth correction. The depth values are more consistent across the field of view.

Figure 3.16: Field dependent depth correction of the estimated depth map from DFCA algorithm. Panels: (a) depth map estimated with DFCA, (b) field dependent depth correction.
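
The lookup described by equations 3.18 and 3.19 can be sketched as follows. This is an illustrative example under stated assumptions, not the calibration code of this work: the LUT layout `lut[k, i, j]`, the nearest neighbor selection of the calibrated orientation and image height, the folding of the orientation to the range 0 to 90 degrees and all function names are hypothetical. `np.arctan2` is used in place of the inverse tangent of a quotient to avoid division by zero.

```python
import numpy as np

def normalized_ratio(b_r, b_g, b_b, eps=1e-12):
    """Right-hand side of equations 3.16 and 3.18 for red, green, blue blur measures."""
    num = b_g**2 - b_r * b_b
    den = b_g**2 + b_r * b_b
    return num / np.maximum(den, eps)

def edge_orientation_deg(x, y, gx, gy):
    """Equation 3.19 in degrees, folded to [0, 90]: 0 = sagittal, 90 = tangential.

    x, y are pixel coordinates relative to the image center. The folding to
    [0, 90] is an assumption of this sketch.
    """
    radial = np.degrees(np.arctan2(y, x))
    gradient = np.degrees(np.arctan2(gy, gx))
    theta = np.mod(radial - gradient - 90.0, 180.0)
    return np.minimum(theta, 180.0 - theta)

def depth_from_lut(c, theta, r, lut, distances, thetas, radii):
    """Look up the depth for ratio c using the nearest calibrated theta and r.

    lut[k, i, j] holds the calibrated ratio at distances[k], thetas[i], radii[j];
    each curve lut[:, i, j] is assumed to be monotonic in distance.
    """
    i = int(np.abs(thetas - theta).argmin())
    j = int(np.abs(radii - r).argmin())
    curve = lut[:, i, j]
    order = np.argsort(curve)            # np.interp needs increasing x values
    return float(np.interp(c, curve[order], distances[order]))
```

A full implementation would interpolate between neighboring orientations and image heights as well; the nearest neighbor choice here only keeps the sketch short.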

The above calibration process is laborious, as multiple images at different distances are required to create a calibration LUT. To simplify this process, the behavior of the relative depth is measured for different lenses. Figure 3.17 shows the change of relative depth versus distance, measured at different image locations for multiple lenses. The results show that a single calibration image would be sufficient to predict the depth behavior at any image location. Therefore, if a set of measured depth curves is available, one of them can be selected on the basis of a single image captured at any distance (preferably at the focus position of the green image, 40 cm in figure 3.17). This makes the process very simple, as only one calibration image is used to calibrate all distances in the camera's field of view.

Figure 3.17: Relative depth measured for different lenses at different image locations. (Axes: normalized blur measure ratios; actual depth [cm].)

Figure 3.18a shows a depth map computed for an object distance of 85 cm. It can be seen that the center of the image shows the correct mean depth, whereas all other image positions give wrong depth values. The mean depth value of the complete depth map is 53 cm, which deviates strongly from the central region with its mean depth value of 86 cm. Figure 3.18b shows the depth map after field dependent correction. In this case, only one image is used to calibrate the DFCA method, instead of the multiple images used for figure 3.16b. The mean depth value of the complete depth map, 83 cm, is much closer to the actual distance of the object.

Figure 3.18: Field dependent depth correction of the estimated depth map from DFCA algorithm using only one image for the calibration. Panels: (a) depth map estimated with DFCA, (b) field dependent depth correction.

3.4 Dense Depth Map

All passive depth estimation methods compute the depth from image features.
In most cases, a scene contains large homogeneous regions where no depth estimation is possible. However, some applications, such as object extraction and scene re-rendering, require a dense depth map. Therefore, to create a dense depth map, depth values from the neighboring regions must be interpolated. Many methods have been proposed in the literature to compute dense depth maps. Many of them use image segmentation to assign a single smoothed depth value to a complete segment. Bae et al. [2] proposed to use the colorization using optimization method [24] to fill the depth maps, after refining the depth map with the cross bilateral filter. In this work, two methods are investigated: the segmentation based method and the propagation using optimization based method.

3.4.1 Dense Depth Map by Segmentation

Image segmentation methods divide the image into different regions of similar features. After segmentation, it is known which pixels belong to which objects. Details of different segmentation techniques can be found in [34]. Since this work does not focus on image segmentation, it is assumed that the image can be segmented with any suitable algorithm. Hence, a segmented image is available, together with the corresponding depth map at the edges of the image only. The task is to compute a high resolution, smooth, dense depth map. The method used in this work includes the following steps:

1. Create the histogram of depth values for each image segment.
2. Smooth the histogram by applying a running average filter.
3. Search for the maximum peak of the histogram and select the corresponding depth value.
4. Assign the selected depth value to all pixels belonging to that segment.
5. Repeat steps 1 to 4 for all segments in the image.
6. Apply the cross bilateral filter [10] to the filled depth map. The weights of the range filter are taken from the intensity image.
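
Steps 1 to 5 above can be sketched in NumPy as follows. This is a minimal illustration, not the implementation used in this work: the function name, the number of histogram bins, the length of the running average and the rule for resolving plateaus after smoothing are assumptions. The cross bilateral filtering of step 6 is left out, and at least one valid depth value is assumed to exist.

```python
import numpy as np

def fill_segments(depth, valid, labels, bins=64, smooth=5):
    """Fill each segment with the depth at the peak of its smoothed histogram.

    depth  : 2D array, meaningful only where valid is True (e.g. at edges)
    valid  : 2D boolean array marking pixels with an estimated depth
    labels : 2D integer array, one label per image segment
    """
    dense = np.full(depth.shape, np.nan)
    lo, hi = depth[valid].min(), depth[valid].max()
    kernel = np.ones(smooth) / smooth
    for seg in np.unique(labels):          # step 5: loop over all segments
        mask = labels == seg
        samples = depth[mask & valid]
        if samples.size == 0:
            continue                       # no estimate in this segment: stays NaN
        # Step 1: histogram of the depth values inside the segment.
        hist, edges = np.histogram(samples, bins=bins, range=(lo, hi))
        # Step 2: running average over neighboring bins.
        hist = np.convolve(hist, kernel, mode="same")
        # Step 3: maximum peak; a plateau of equal values is resolved at its middle bin.
        ties = np.flatnonzero(np.isclose(hist, hist.max()))
        peak = ties[len(ties) // 2]
        # Step 4: assign the center of the peak bin to every pixel of the segment.
        dense[mask] = 0.5 * (edges[peak] + edges[peak + 1])
    return dense
```

Taking the histogram peak rather than the mean makes the filled value insensitive to a small number of outlying depth estimates inside a segment, at the price of a quantization to the bin width.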

Although the segmentation based method provides a high resolution dense depth map, its accuracy depends on the performance of the segmentation algorithm. Most automatic segmentation techniques fail to provide a good segmentation of the image.

Figure 3.19: Dense depth map is generated by optimization based method followed by joint bilateral upsampling. (Stages: intensity image and depth map; median based downsampling; depth fill using optimization method; joint bilateral upsampling; dense depth map.)

3.4.2 Dense Depth Map by Optimization

Levin et al. [24] proposed to fill a gray scale image with given colors using an optimization technique. The colors are provided by the user as scribbles. Bae et al. [2] applied the same method to fill depth maps, where the estimated depth values at edges or textured areas are considered as scribbles. The optimization problem minimizes the difference between the estimated depth value and the weighted average of the neighboring depth values. The weights are assigned according to the intensity and color of the neighboring pixels. The imposed constraint is that neighboring pixels have a similar depth value if their color and intensity are similar. The optimization problem is solved through least squares optimization of a linear system of equations. The computational complexity of such an algorithm is high for high resolution images, as the number of equations is equal to the number of pixels in the image.

Figure 3.20: The dense depth map is generated by the segmentation and optimization based methods. Panels: (a) original image, (b) depth from CA, (c) segmentation based, (d) optimization based.

The optimization problem can be solved on a downsampled, low resolution depth map, followed by joint bilateral upsampling. This reduces the computational complexity. The joint bilateral upsampling method was proposed by Kopf et al. [20] to produce high resolution outputs.
In our case, joint bilateral upsampling is performed by taking the spatial weights of the bilateral filter from the low resolution depth map and the weights of the range filter from the high resolution intensity image. The algorithm flow diagram is shown in figure 3.19.

Figure 3.20 shows the results of the two approaches. The actual estimated depth values at strong edges are shown in figure 3.20b. The dense depth maps generated with the two algorithms are shown in figures 3.20c and 3.20d.

Figure 3.21: (a) Simulated image with chromatic aberrations, (b) ground truth depth map used for testing the DFCA algorithm, (c) depth map generated with the algorithm described in section 3.3.2 (depth is estimated only at edges, and given in mm).

3.5 Results and Discussion

To verify the DFCA algorithm, a lens with F-number 2.4, focal length 4 mm and a chromatic focal shift of 50 µm is simulated to estimate the depth from 30 cm to 2 m. The synthetic image shown in figure 3.21a is blurred according to the true depth shown in figure 3.21b. The image is then converted to Bayer format after adding the sensor noise. Figure 3.21c shows the depth map computed with the proposed algorithm. For the analysis of depth accuracy, only the actual estimated depth values at edges are shown.

The root mean square (RMS) error between the actual and the estimated depth is computed for the detected depth regions. Figure 3.22 shows the RMS error at different distances and for different differences of local contrast between the color images. The depth is estimated with an RMS error of 6 to 10% for a small focal length of 4 mm.

Figure 3.22: RMSE between true depth and estimated depth at different distances and differences of contrast between colors. (Axes: actual depth [mm] and difference of contrast between colors; RMSE [mm].)
The local contrast $H_c$, $c = \{r, g, b\}$, is defined as the height of the edge, and the difference of contrast is defined as $H_{diff} = \sqrt{|H_g^2 - H_r H_b|}$ according to equation 3.16. For edges where one of the color edges is missing due to similar foreground and background color ($H_c \approx 0$), depth estimation is not possible. Otherwise, the proposed normalized blur measure works well for color edges, and the average RMS error of different color edges for a distance range of 40 cm to 2 m is mostly less than 100 mm.

Figure 3.23 shows the depth maps computed for real images captured with a lens having axial chromatic aberrations. The lens has a focal length of 2.66 mm and an F-number of 2.8. The chromatic focal shift is 50 µm in the wavelength range of 450 nm to 600 nm. The results show that depth propagation works quite well for detected objects, but large homogeneous regions are not filled correctly. However, for applications such as 2D to 3D image conversion or digital refocusing, homogeneous regions do not have any effect on the image.

Figure 3.23: Depth estimation from axial chromatic aberrations for the real captured scenes. First row: input images captured with a lens having large chromatic aberrations. Second row: raw depth estimation using the algorithm proposed in this work. Third row: dense depth after propagating the raw depth to surroundings.

The DFCA results are compared with the depth maps generated with a time of flight (ToF) camera. This time a different lens is used, with a focal length of 4.4 mm and an F-number of 2.4. The chromatic focal shift is 66 µm in the wavelength range of 450 nm to 600 nm. Both cameras, DFCA and ToF, are placed at the same position to estimate the depth. The results are shown in figure 3.24 with color coding. The depth estimate of DFCA is close to that of the ToF camera at the object boundaries. The depth propagation fills the local objects quite well, but the large homogeneous regions do not show the correct depth.
This is one of the drawbacks of all passive depth estimation methods. The average depth values of each object, along with the average depth values from the ToF camera, are given in table 3.1. The error of DFCA relative to the ToF camera is also given in percent. All objects which have well defined edges show the smallest depth error, whereas the error for low contrast edges and textured regions is comparatively larger.

Figure 3.24: Depth estimated with the DFCA camera for different colored objects. (a) Original image, (b) DFCA depth at edges only, and (c) DFCA depth after propagating to neighboring regions. For comparison the depth from ToF camera (d) is also shown. All depth maps are shown in cm.

3.6 Conclusion and Outlook

This chapter analyzes a method for depth imaging that is similar to depth from defocus. Instead of capturing multiple defocused images, axial chromatic aberrations are used to capture the defocus information in multiple colors with a single shot. The previous related literature has only shown the feasibility of this approach for limited experimental setups. Therefore, a thorough analysis is performed to develop a method which works well for different imaging conditions. The existing focus/blur measures were evaluated, and it is shown that these measures are contrast dependent, which makes them unsuitable for a DFCA system. A new blur measure based on normalized gradients is proposed which is independent of the local image contrast. The absolute depth is inferred from the blur measures by taking the normalized ratios and applying a calibration procedure.

The depth error analysis has shown that there are some common challenges in DFD and DFCA systems, for example low texture areas, sensor noise and field varying blur.
For a field varying depth correction, a simple calibration procedure is proposed. The test chart image used for the calibration setup contains sagittal and tangential edge orientations at different field positions. The analysis shows that the behavior of the depth change is similar for different lenses. Therefore, if this behavior is measured once or obtained from the lens design data, a single image is sufficient to determine the field varying depth error and its correction.

The main advantage of the DFCA method over DFD is that it is a single shot depth imaging system, which avoids misregistration problems and also varying image magnifications, because the lateral color shift can be reduced during the lens design process. The major challenge for the DFCA method is narrowband object reflectance spectra. In this case, either the color information is missing or the blur is different compared to the broadband spectra case.

Since the depth from defocus method only measures the depth for textured regions, a low cost implementation of two existing methods is used to create dense depth maps. One method is based on image segmentation and filling each segment with the median depth value of this segment. The other solution uses the optimization method to propagate the depth to neighboring regions of the same intensity. Since the optimization problem is computationally complex, it is proposed to propagate the depth at very low resolution, followed by joint bilateral upsampling.

Table 3.1: Absolute depth estimates from DFCA camera for the images shown in figure 3.24. The error with the depth from ToF is also given in percentage.

Image   Object        DFCA Depth [cm]   ToF Depth [cm]   Error [%]
1       Cylinder 1    30                30               0
1       Cylinder 2    41                42               2.4
1       Cylinder 3    52                54               3.7
1       Cylinder 4    63                67               6
1       Cylinder 5    98                97               1
2       Green Ball    32                32               0
2       Red Ball      62                61               1.6
2       Leaves        106               100              5.7
3       Bird          31                36               14
3       Leaves        50                60               17
3       Pumpkin       90                99               9

Chapter 4 Extended Depth of Field from Chromatic Aberrations

4.1 Introduction

Depth of field represents the distance range in a scene for which the captured image is considered to be in focus, i.e. the amount of blur introduced by the optics is not perceivable under normal viewing conditions. In some cases, it is desirable to have a larger depth of field (DOF). In other cases, a narrow depth of field is helpful in emphasizing the desired object in a scene. The maximum blur diameter which is still perceived as a point by the human eye is called the circle of confusion, and it defines the range of the depth of field. Equation 3.2 is useful in determining the dependency of the depth of field on the lens parameters. In terms of the lens F-number $F\#$, the blur diameter is reformulated as

$$b(d) = \frac{f\, m_{d_f}}{F\#} \left( \frac{d - d_f}{d} \right), \tag{4.1}$$

where $m_{d_f}$ is the lens magnification for the lens focus position $d_f$, given by equation 3.3. The effect of the lens parameters can be better observed in the plot of blur diameter versus object distance shown in figure 4.1. As an example, let the circle of confusion $c = 0.004$ mm define the DOF. For the distance range where $b \leq c$, the blur is imperceptible and the details are within the DOF. The lens parameters which affect the DOF are the F-number, the focal length and the lens focus position. For a smaller F-number, a larger focal length and a closer focus position, the amount of blur changes rapidly. Therefore, the DOF is smaller in these cases and vice versa.

Figure 4.1: Blur diameter versus distance for different focal lengths and lens focus position. (Curves: f = 4 mm, d_f = 1 m; f = 8 mm, d_f = 1 m; f = 8 mm, d_f = 4.5 m. Axes: object distance [mm], log scale; blur diameter [mm]. The circle of confusion and the resulting DOF are marked.)

4.2 Related Work

As observed in figure 4.1, the blur changes with the distance; hence it is difficult to restore a sharp image without the depth information.
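
The dependence of the blur on the distance, and the DOF limits that follow from equation 4.1, can be evaluated numerically. The following is a minimal sketch under stated assumptions: the magnification is taken as the thin lens value m = f / (d_f - f), since equation 3.3 is not restated here, the absolute value of (d - d_f) is used so that the blur diameter is positive on both sides of the focus position, and the F-number of 2.4 in the example is assumed rather than taken from figure 4.1.

```python
import numpy as np

def blur_diameter(d, f, f_number, d_f):
    """Equation 4.1: blur diameter [mm] for an object at distance d [mm].

    The magnification m = f / (d_f - f) is the thin-lens value and is an
    assumption of this sketch.
    """
    m = f / (d_f - f)
    return (f * m / f_number) * np.abs(d - d_f) / d

def depth_of_field(f, f_number, d_f, c):
    """Near and far distances [mm] at which the blur diameter equals c [mm]."""
    k = f * (f / (d_f - f)) / f_number     # blur diameter for d -> infinity
    near = d_f / (1.0 + c / k)             # solves k * (d_f / d - 1) = c
    far = d_f / (1.0 - c / k) if c < k else np.inf   # solves k * (1 - d_f / d) = c
    return near, far

# Example: f = 4 mm focused at 1 m, circle of confusion 0.004 mm, F-number 2.4 (assumed).
near_4, far_4 = depth_of_field(4.0, 2.4, 1000.0, 0.004)
# Doubling the focal length at the same focus position narrows the DOF.
near_8, far_8 = depth_of_field(8.0, 2.4, 1000.0, 0.004)
```

With these assumed values, the 4 mm lens is within the circle of confusion from roughly 0.63 m to 2.5 m, and the 8 mm lens over a much shorter range around the focus position, which matches the qualitative behavior described for figure 4.1.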
Even if the depth is known, it is hard to recover the image in case of large blur, as the frequency response of the point spread function (PSF) has zeros. Therefore, many computational imaging approaches have emerged that extend the depth of field by joint optimization of the optical system and the digital post-processing. Some of the previous methods are discussed here briefly.

A common approach to achieving extended DOF is focus stacking. In this method, multiple images are taken with different focus distances and combined digitally to extend the DOF. For each image position, the sharpest pixel among the multiple images is selected. The disadvantage of this approach is that it requires multiple images.

Most of the computational methods try to make the PSF invariant to distance. One of the earliest approaches, proposed by Häusler [15], makes the blur of the object invariant to depth and restores the sharp image through digital processing. He obtained the depth invariant blur by moving the object along the optical axis during the exposure time of the camera. A depth invariant PSF can also be achieved through apodization [31]. Apodization is typically obtained by a central or annular obstruction in the aperture of the system. The pupil is then shaped in a way that its frequency response reaches zero at higher frequencies compared to the unaltered pupil. The major drawback is the reduced light transmission through the optical system, which leads to lower SNR.

A well known method in the field of computational photography is wavefront coding, proposed by Dowski et al. [9]. Its main advantage is that it does not suffer from limited light transmission. In wavefront coding, the pupil function is modified through phase modulation by inserting a non-absorbing optical element such as a cubic-phase or cosine-form phase mask. It is then possible to obtain a PSF which is insensitive to defocus.
In the second step, standard deconvolution is used to restore the image with only one digital filter. Although there is no light loss, the method suffers from a loss of SNR, as the PSF spreads over a larger area to make it insensitive to defocus. A depth invariant PSF can also be achieved through polarization separation, by placing a birefringent plate between the lens and the sensor [44]. The plate is designed such that the two polarization states contain the in-focus far field and the in-focus near field information of a scene and are superimposed on each other to form the image. In this case, digital restoration is required. The major drawback of these approaches is the complex optical design, as the phase mask has to be integrated with the lens design.

Some alternative approaches are based on color separation. Guichard et al. [14] proposed to utilize chromatic aberrations to capture color images with different focus positions. As different colors appear sharp at different distances, digital processing transfers the sharpness information of the sharpest color to the other colors. In this way, the resultant image is sharp over a broader distance range. The sharpness transport technique takes advantage of the spectral information redundancy inherent in images to recover information that has been lost due to chromatic blurring effects. The advantage of this method is the use of conventional optics without any light loss or degradation of SNR. However, the digital processing required to remove all chromatic aberrations is quite challenging. The method trades off the extension of the DOF against the loss of chrominance high frequencies. Kay et al. [17] have proposed to use a color aperture which stops down the blue light to obtain a larger depth of field for the blue color image. The other colors are then made sharper using the sharpness information of the blue color image. The methods proposed by Bando [3] and Kim [18] code the disparity information in color images through a color filter aperture.
The extended DOF image is produced through deconvolution based on the estimated depth. If a lens exhibiting axial chromatic aberrations is used with a black and white (B&W) sensor, it produces a depth invariant PSF, as shown by Cossairt et al. [8]. For color images, it is shown that the luminance of the image has a depth invariant PSF. However, a deconvolution process is required to recover the sharp image.

In this work, the shortcomings of the methods proposed by Guichard et al. [14] and Cossairt et al. [8] are reduced using a sensor that captures panchromatic white (W) light along with the RGB colors. Moreover, on the digital side, robust algorithms are proposed to produce high quality color images.

4.3 Extended DOF Using Axial Chromatic Aberration

In the previous chapter, it is shown that multiple focused images can be captured in a single shot by the optimal introduction of axial chromatic aberrations in the lens. Therefore, the focus stacking method may be employed on the multiple color images to produce a larger DOF effect. However, this is not as simple as traditional focus stacking, because only one color is sharp at each image location. Therefore, instead of selecting the sharpest pixel, we need to sharpen the other colors according to the sharpness information of the sharpest color. Figure 4.2 shows the blur diameters of the RGB colors in case of axial chromatic aberrations, and the minimum blur diameter among all colors at each distance. If each color is made sharper according to the sharpest color, the depth of field becomes larger, as shown in the right side plot. In the following sections, a few existing methods are described which digitally restore the extended DOF image in case of axial chromatic aberrations.

[Two plots: blur diameter (mm) versus object distance (mm, log scale)]
Figure 4.2: Left: Blur diameter for RGB colors in case of axial chromatic aberrations.
Right: Minimum blur diameter among all colors at each distance.

4.3.1 Depth Dependent Deconvolution

We know from the previous chapter that depth can be estimated if the lens exhibits axial CA. Therefore, the simplest solution to restore an extended DOF image is to deconvolve the image with a point spread function which varies according to depth. However, in case of larger blur, the MTF of the optical system drops to zero at higher spatial frequencies and it becomes impossible to restore the spatial resolution. Figure 4.3 shows the MTF versus distance at an image resolution of 90 line pairs per millimeter. As can be seen, the MTF of the red color and of the blue color drops to zero for near and far distances, respectively. Lim et al. [25] have described this approach and shown that deconvolving each color channel separately does not provide good restoration. Hence, they have proposed an alternative method where the resolution of the luminance channel is enhanced by adding the high frequency information of the sharpest color to the luminance channel. Their results show the restoration of higher spatial frequencies of the blurred colors, but overshoots and undershoots are also visible. Moreover, for color images, color bleeding artifacts cannot be reduced with this approach because only the luminance channel is processed.

[Plot: normalized through-focus MTF of the blue, green and red colors versus object distance (mm, log scale)]
Figure 4.3: MTF for spatial frequency of 90 line pairs per millimeter [lp/mm] versus distances.

4.3.2 Sharpness Transport Across Color Channels

Guichard et al. [14] proposed the method of sharpness transport across channels to produce extended DOF using chromatic aberrations. The method includes the following steps:
- Estimate the sharpness of all color channels.
- Select the sharpest color at each image location. Due to axial CA, only one color is sharp at each image location.
- Copy the sharpness information from the sharpest color to the other colors.
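The three steps listed above can be illustrated with a deliberately simplified one-dimensional sketch. It is not the method of Guichard et al.: it assumes a global gradient-energy sharpness measure, selects one sharpest channel for the whole signal instead of per image location, and uses a single fixed Gaussian high-pass filter. The function names and the parameters `sigma_hp` and `w` are hypothetical.

```python
import math

def gaussian_kernel(sigma):
    """Normalized 1D Gaussian taps covering about three standard deviations."""
    radius = max(1, int(3 * sigma + 0.5))
    taps = [math.exp(-(k * k) / (2.0 * sigma * sigma)) for k in range(-radius, radius + 1)]
    total = sum(taps)
    return [t / total for t in taps]

def blur(signal, sigma):
    """Gaussian low-pass filter with edge replication."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, weight in enumerate(kernel):
            j = min(max(i + k - radius, 0), n - 1)
            acc += weight * signal[j]
        out.append(acc)
    return out

def sharpness(signal):
    """Gradient energy, a simple stand-in for a sharpness measure."""
    return sum((b - a) ** 2 for a, b in zip(signal, signal[1:]))

def sharpness_transport(channels, sigma_hp=2.0, w=1.0):
    """Step 1: measure sharpness. Step 2: pick the sharpest channel.
    Step 3: add its high frequencies to the other channels."""
    sharpest = max(channels, key=lambda name: sharpness(channels[name]))
    reference = channels[sharpest]
    low = blur(reference, sigma_hp)
    high = [r - l for r, l in zip(reference, low)]
    result = {}
    for name, signal in channels.items():
        if name == sharpest:
            result[name] = list(signal)
        else:
            result[name] = [s + w * h for s, h in zip(signal, high)]
    return sharpest, result

# Toy example: a step edge that is sharp in green and blurred in red and blue.
edge = [0.0] * 16 + [1.0] * 16
channels = {"R": blur(edge, 2.0), "G": edge, "B": blur(edge, 4.0)}
sharpest, restored = sharpness_transport(channels, sigma_hp=2.0)
```

In this toy case the red channel is restored almost exactly because the high-pass filter matches its blur, while the blue channel is only partially sharpened, which illustrates why the filter strength has to follow the relative sharpness.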
The process of sharpness transport is written as,

$$I_{bs} = I_b + w \times \mathrm{HPF}(C_{s,b}) * I_s, \tag{4.2}$$

where $I_b$ and $I_{bs}$ are the un-sharp color images before and after sharpness transport. $I_s$ is the sharpest color, whose higher spatial frequencies are extracted through the high-pass filter $\mathrm{HPF}$ and added to the un-sharp colors after multiplication with the weight $w$. The strength of the high-pass filter $\mathrm{HPF}$ depends on the relative sharpness $C_{s,b}$ between the sharpest and the blurred color.

Although the process seems very simple, in practice there are many difficulties in this restoration. The coefficients and weights of the high-pass filter depend on the lens properties, which must be determined through calibration of the lens. The weights depend not only on the lens properties but also on the contrast levels in the color channels. Moreover, simple gradient based relative sharpness measures are prone to errors due to varying spectral features in color images, which results in color artifacts due to over- or under-compensation.

4.3.3 Spectral Focal Sweep

Focal sweep methods produce a depth invariant blur by moving the sensor mechanically during the exposure time of the image, and restore the extended DOF image through digital processing. Cossairt and Nayar [8] have shown that if a lens exhibiting axial chromatic aberrations is used in combination with a B&W sensor, the mechanical focal sweep effect can be achieved. They named this process the spectral focal sweep (SFS), because multiple color images which are focused at multiple distances are integrated on the B&W image sensor during the exposure time.

For the SFS camera, the most important requirement is a broadband object reflectance spectrum. The broader the spectrum, the wider the DOF. The study of Parkkinen et al. [32] on the reflectance spectra of Munsell colors shows, however, that most real-world objects have broadband spectra.
For the SFS camera it is shown that the PSF is depth invariant in most cases and that 95% of these colors produce a minimal deblurring error that does not introduce any significant artifacts. Although the SFS camera is primarily proposed for B&W images, results for RGB images are also shown through deblurring of the image luminance. The images look sharper, but color bleeding effects are visible, which makes this method unsuitable for high quality imaging.

4.3.4 Eliminating Color Difference

Many methods have been developed in the past to correct chromatic aberrations through digital image processing. One of the most promising methods is suggested by Chung et al. [7]. Their method first analyzes the color difference signal at the edges and, after detecting the chromatic aberration, removes the color difference to correct the CA. The method is very similar to the one suggested by Konvesky [19]. Both methods consider the green channel as the reference channel for the correction. In this work, these methods are extended by taking the sharpest channel as the reference channel for the color difference correction. As a result, the sharpest color information is always preserved and hence extended DOF is achieved.

The advantage of these approaches is that they work only at edges, therefore noise is not enhanced. Also, the correction process has no parameters which depend on the error-prone image sharpness measure. However, the methods work best when only one edge exists in a local region. For multiple edges or higher spatial frequencies, these methods are unable to correct axial CA. The reason is that the correction value cannot be calculated correctly due to the loss of contrast at higher spatial frequencies.

4.4 RGBW Sensor with Chromatic Aberrations

All state of the art methods either have some limitations or are computationally complex for high quality image restoration.
In this section, a method is suggested which produces better image quality without introducing any significant color artifacts. Basically, the best of two methods, SFS and sharpness transport, is achieved through the use of an RGBW sensor [23]. The RGBW sensor contains one white (panchromatic) pixel, which is analogous to a B&W image sensor, along with the RGB pixels. Hence, the complete benefits of the SFS approach may be utilized in combination with the correction of color bleeding artifacts to produce high quality images with wider DOF.

Figure 4.4a shows the color filter (CF) spectral response of the RGBW sensor. As the spectral response of the white pixels is broadband, i.e. more light is captured, the white pixel was introduced in the image sensor with the aim of producing less noisy images under low lighting conditions [23]. Wang et al. [40] have shown the use of an RGBW sensor to produce high quality deblurring of images.

Figure 4.4c shows the geometrical blur diameter for the color filter responses and the chromatic focal shift shown in figures 4.4a and 4.4b, respectively.

[Three plots: (a) quantum efficiency versus wavelength (nm), (b) chromatic focal shift versus wavelength (nm), (c) blur diameter (mm) versus object distance (mm, log scale)]
Figure 4.4: a) Spectral response of color filter array, b) chromatic focal shift in the visible range of spectra and c) the blur diameter of red, green and blue colors calculated by averaging the blur diameter of all wavelengths according to the color filter array spectral response.

Two different CF responses are used to observe their effect on the blur. One CF response is broadband, which is the normal case in RGB cameras, and the other is narrowband. The blur diameter of each color (R, G, B and W) is calculated by a weighted summation of the blur diameters of all wavelengths according to the color filter response.
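The weighted summation described above can be sketched as follows. The sketch assumes a geometric blur model based on the thin lens formula, a hypothetical linear chromatic focal shift of 0.112 mm over 400 nm to 700 nm around a 4 mm focal length, and hypothetical Gaussian color filter responses; none of these curves are the measured data behind figure 4.4.

```python
import math

def blur_diameter(d, f_lambda, sensor_dist, f_number):
    """Geometric blur diameter (mm) at object distance d (mm) for one wavelength,
    c = f^2 / F# * |1/d_f - 1/d|, with 1/d_f taken from the thin lens formula."""
    inv_focus = 1.0 / f_lambda - 1.0 / sensor_dist
    return f_lambda ** 2 / f_number * abs(inv_focus - 1.0 / d)

def channel_blur(d, wavelengths, response, focal_length_of, sensor_dist, f_number):
    """Blur diameter of one color channel: response-weighted average over wavelengths."""
    total = sum(response)
    weighted = sum(r * blur_diameter(d, focal_length_of(w), sensor_dist, f_number)
                   for w, r in zip(wavelengths, response))
    return weighted / total

wavelengths = list(range(400, 701, 10))  # 31 samples in nm

def focal_length_of(w):
    # Hypothetical linear chromatic focal shift, 0.112 mm in total.
    return 4.0 + 0.112 * (w - 550) / 300.0

# Sensor placed so that 550 nm is in focus at 1 m (assumption).
sensor_dist = 1.0 / (1.0 / 4.0 - 1.0 / 1000.0)

# Hypothetical filter responses: broadband green, narrowband green, flat white.
broad_green = [math.exp(-0.5 * ((w - 550) / 50.0) ** 2) for w in wavelengths]
narrow_green = [math.exp(-0.5 * ((w - 550) / 10.0) ** 2) for w in wavelengths]
white = [1.0] * len(wavelengths)

blur_broad = channel_blur(1000.0, wavelengths, broad_green, focal_length_of, sensor_dist, 2.4)
blur_narrow = channel_blur(1000.0, wavelengths, narrow_green, focal_length_of, sensor_dist, 2.4)
blur_white = channel_blur(1000.0, wavelengths, white, focal_length_of, sensor_dist, 2.4)
```

Under these assumptions the broadband green response gives a larger blur at the green focus position than the narrowband one, because it averages over wavelengths that are out of focus at that distance.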
Note that the blur is not the same for the two CF responses. The CF with broadband response has a larger blur, especially at the best focus position. Thus, sharpness transport cannot produce sharp images with a broadband CF response, even at the best focus positions.

The blur diameter of the white channel is mostly invariant over a large range of distances. Hence, the deconvolution of the white channel with its PSF helps in restoring the sharp white image. Since the spectral response of the white image is broadband and has higher sensitivity, the white image can be restored with relatively higher SNR.

There are different possible ways to combine the restored white channel with the color channels. In this work, the restored white channel is only used to extract higher spatial frequencies to sharpen the RGB colors. The algorithm contains the following steps:
- Restore the white channel through deconvolution with its depth invariant PSF. The restored white channel is considered as the fourth color image.
- Compute the sharpness of all four images.
- Divide the image into up to four regions, where in each region one color is sharp.
- For each pixel, compute the relative sharpness between the sharpest color and the other colors.
- Select three high-pass filters according to the relative sharpness measures.
- Extract the higher spatial frequencies of the sharpest color and copy them to the other colors.

The proposed algorithm gives a larger depth of field than the individual SFS and sharpness transport methods. As can be seen in figure 4.4c, the restored white image would still be blurred at the focus position of the blue color. Moreover, sharpness transport can restore the images only to the best sharpness of each color, which is already blurred due to chromatic aberrations. Therefore, the combination of the restored white image with the sharpness transport among all colors produces a sharper image for near distances and also for farther distances.
4.5 Low Cost and Efficient Implementation of Proposed Algorithm

The proposed algorithm requires the computation of the relative blur, the extraction of higher spatial frequencies and a deconvolution. Figure 4.5 shows the detailed flow diagram of the algorithm. Each computational step is discussed here individually.

4.5.1 Relative Blur Estimate

The relative blur measure is similar to the relative depth measurement discussed in section 3.3.2.4. Here, however, instead of measuring the depth, we require only an estimate of the relative blur between colors. Since different colors contain varying contrast levels, the sharpness is first measured according to equation 3.15, which is independent of the image contrast. The measured sharpness values are converted to the standard deviation $\sigma$ of a Gaussian blur, because the lens blur can be approximated by a Gaussian blur. The $\sigma$ and sharpness values are directly related as,

$$\sigma_i = k \times \frac{1}{BM_i}, \tag{4.3}$$

where $k$ is a constant factor and $i \in \{R, G, B\}$. The standard deviation of the blur difference between the sharpest and the blurred image is given as,

$$\sigma_{s,b} = \sqrt{\sigma_b^2 - \sigma_s^2}, \tag{4.4}$$

which is used to copy the sharpness information from the sharpest color to the blurred color.

4.5.2 Adaptive High-Pass Filtering

The sharpness transport method extracts the higher spatial frequencies by applying a high-pass filter and copies them to the other colors. The goal is to make the sharpness/MTF of the blurred color similar to that of the sharpest color. Therefore, the high-pass filter must be designed to extract the high frequencies which represent the difference between the MTFs of the two colors. For this reason, the blur difference $\sigma_{s,b}$ is used as the basis for selecting the high-pass filter characteristics.

[Flow diagram: deconvolution of W; sharpness measure on W, R, G, B; extract higher spatial frequencies of sharpest color; add high frequencies to blurred colors]
Figure 4.5: Processing flow of extended DOF algorithm using RGBW sensor.
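The relative blur estimate of section 4.5.1 (equations 4.3 and 4.4) can be sketched in a few lines. The sketch assumes that a contrast independent blur measure is already available for each color; the function names, the constant `k` and the clamping at zero are illustrative assumptions.

```python
import math

def sigma_from_blur_measure(blur_measure, k=1.0):
    """Equation 4.3: the Gaussian standard deviation is inversely
    proportional to the blur (sharpness) measure of a color."""
    return k / blur_measure

def relative_sigma(sigma_sharp, sigma_blurred):
    """Equation 4.4: standard deviation of the Gaussian that turns the
    sharpest color into the blurred one. Negative differences caused by
    measurement noise are clamped to zero (an assumption of this sketch)."""
    return math.sqrt(max(sigma_blurred ** 2 - sigma_sharp ** 2, 0.0))

# Example: sharpest color with sigma = 3 px, blurred color with sigma = 5 px.
sigma_sb = relative_sigma(3.0, 5.0)
```

The square root of the difference of squares follows from the fact that the variances of cascaded Gaussian blurs add.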
Higher spatial frequencies are extracted with the help of a Gaussian low-pass filter as,

$$I_H = I - \mathrm{LPF}_G(C) * I, \tag{4.5}$$

where $I_H$ are the high frequencies of the image $I$, and $\mathrm{LPF}_G$ is the Gaussian filter depending on the relative blur. The equation of the two dimensional Gaussian filter is,

$$G(x, y) = \frac{1}{2\pi\sigma^2}\, e^{-\frac{x^2 + y^2}{2\sigma^2}}, \tag{4.6}$$

where $x$ is the distance from the origin along the horizontal axis, $y$ is the distance from the origin along the vertical axis, and $\sigma$ is the standard deviation of the Gaussian distribution. In case of three colors, the image can be divided into three regions, where in each region one of the colors is sharp. Afterwards, the sharpness information is transferred to the blurred colors. Since the blur varies continuously with depth, only one filter for each sharp color is insufficient to completely remove the blur difference. To implement continuously varying sharpness transport, the coefficients of the Gaussian filter may be computed at run time for each relative blur value, or stored in a lookup table for sampled values of relative sharpness.

To reduce the computational complexity, the filtering is applied through interpolation of Gaussian blur using principal component analysis (PCA). Gaussian filters are designed for sampled sigma values in the range from $\sigma = 0$ to $\sigma = 8$, and these are modeled as a weighted summation of basis Gaussian filters which are computed through PCA. The missing Gaussian filters are then approximated at run time through interpolation of the weights. Details of the method are described in section 2.5.1.1. The PCA based method is much faster than other methods, as shown in section 2.1.

4.5.3 Deconvolution

The image captured by the camera can be written as a convolution of the input image with the camera blur. Deconvolution is the process used to reverse the effect of the convolution.
Similar to the SFS method discussed before, the blur of the white image is distance invariant over a specific range of distances. Hence, the deconvolution of the captured white image with its blur, i.e. the point spread function $p_w$ of the white image, restores the sharp image. The process may be written as,

$$I_{wr} = I_w * p_w^{-1}, \tag{4.7}$$

where $I_{wr}$ is the deconvolved image and $p_w^{-1}$ is the inverse of the PSF. Theoretically, the deconvolution process restores the exact image if the correct inverse PSF is used. In practice, however, it is impossible to reverse the convolution exactly because of the noise added in the capturing process. The restoration becomes worse in case of large blur, where the frequency response of the blur approaches zero at higher frequencies. In this case, the restoration results in ringing artifacts around the edges due to over-enhancement of frequencies which are lost in the capturing process.

The frequency response of the PSF of the white image does not approach zero over a larger distance range. Hence, the image can be restored without artifacts but with lower SNR, as the noise is enhanced in the restoration process. Since the white image is used only to extract high frequencies to sharpen the blurred colors, the noise enhancement does not have much effect on the final image quality. The loss in SNR depends on the strength of the inverse filter. As given by Sherif et al. [36], the SNR of the restored image with respect to the unrestored image is defined as

$$SNR_{wr} = \frac{SNR_w}{\sqrt{\sum_x \sum_y \left( p_w^{-1}(x, y) \right)^2}}, \tag{4.8}$$

where $SNR_{wr}$ is the SNR of the restored image and $SNR_w$ is the SNR of the captured white image.

4.5.4 Contrast Dependent Sharpness Transport

The local contrast of an image varies between colors. Therefore, the weights $w$ given in equation 4.2 must be adapted according to the contrast levels. The contrast of the image may be measured by taking the difference of the local maximum and minimum values, as shown in equations 3.12 and 3.13.
The ratio of the contrasts of two colors is then used as a weighting factor during the transfer of higher spatial frequencies. The weights are calculated as,

$$w_1 = \frac{I^b_{max} - I^b_{min}}{I^s_{max} - I^s_{min}}, \tag{4.9}$$

where $I^s$ and $I^b$ represent the sharpest and blurred colors respectively. In case of very low contrast in the sharpest or the blurred color, the sharpness transport can produce color artifacts. The same happens when one of the colors is completely missing.

[Plot: weights (0 to 1) versus edge contrast (0 to 250)]
Figure 4.6: Weights versus local edge contrast to reduce the strength of sharpness transport at low contrast levels.

We introduce another weighting factor, which reduces the strength of the sharpness transfer in case of low contrast. The weights are defined as,

$$w_2 = \left( 1 - e^{-\frac{I^s_{max} - I^s_{min}}{\alpha}} \right) \times \left( 1 - e^{-\frac{I^b_{max} - I^b_{min}}{\alpha}} \right). \tag{4.10}$$

Figure 4.6 shows the weights $w_2$ versus edge contrast. All edges with a contrast higher than $\alpha$ get relatively higher weights, which means that the sharpness transport at these edges is hardly affected by this weighting factor. The total weights $w$ are the product of $w_1$ and $w_2$,

$$w = w_1 \times w_2. \tag{4.11}$$

4.6 Results and Discussion

The algorithm is verified with synthetic and real captured images. Multispectral images are used to precisely simulate the optics and sensor properties. Point spread functions for 31 equally spaced wavelengths between 400 nm and 700 nm are generated and applied to the multispectral images. In the next step, the sensor MTF is applied and noise is added. Finally, the multiple color images are averaged according to the color filter responses to produce the RGB and white (W) color images. Figure 4.7 shows the simulation flow diagram.

[Flow diagram blocks: multispectral images; PSF for each spectrum; optical blur to each spectral image; multispectral to CFA colors; sensor MTF and noise; raw image]
Figure 4.7: Simulation of optical and sensor properties.
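The contrast dependent weights of section 4.5.4 (equations 4.9 to 4.11) can be sketched as follows. The sketch assumes that the first weight is the contrast of the blurred color divided by the contrast of the sharpest color, that the local minimum and maximum values are already available, and that `alpha` is the contrast scale of figure 4.6; the default value of `alpha` and the guard `eps` are hypothetical.

```python
import math

def transport_weight(sharp_min, sharp_max, blur_min, blur_max, alpha=50.0, eps=1e-6):
    """Total weight w = w1 * w2 for the sharpness transport at one location."""
    contrast_sharp = sharp_max - sharp_min
    contrast_blur = blur_max - blur_min
    # w1: ratio of local contrasts (eps avoids division by zero; assumption).
    w1 = contrast_blur / max(contrast_sharp, eps)
    # w2: suppresses the transport when either color has low contrast.
    w2 = ((1.0 - math.exp(-contrast_sharp / alpha))
          * (1.0 - math.exp(-contrast_blur / alpha)))
    return w1 * w2

# High contrast edge: w2 is close to 1, so w is close to the contrast ratio.
w_strong = transport_weight(0.0, 200.0, 0.0, 100.0, alpha=10.0)
# Low contrast edge: the transport is strongly attenuated.
w_weak = transport_weight(0.0, 10.0, 0.0, 10.0, alpha=50.0)
```

With this form the weight falls to zero when a color is missing locally, which is the situation in which sharpness transport would otherwise create color artifacts.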
To verify the performance of the algorithm, a lens with a focal length of 4 mm and an F-number of 2.4 is used with a sensor of pixel size 2.2 µm. For the lens exhibiting axial chromatic aberrations, the total chromatic shift in the visible range of wavelengths is 112 µm. The spectral sensitivity of the color filters is shown in figure 4.4a. The multispectral images are taken from the CAVE multispectral database and contain 31 images at equally spaced wavelengths ranging from 400 nm to 700 nm. Figure 4.8 shows the results of a lens without chromatic aberrations along with the images of a lens which exhibits axial chromatic aberrations. Figure 4.9 shows the results of the extended DOF algorithm and of the conventional lens. The focus positions of the lenses are set such that objects at infinity are sharp. For that reason, the extended DOF effect is more visible at near distances. The extended DOF images are much sharper and free of color artifacts, especially at 30 cm, whereas the conventional lens shows a blurred image at the same distance. Figure 4.10 shows the result of extended DOF for a real captured image.

[Image panels: (a) 30 cm, (b) 30 cm, (c) in-focus, (d) in-focus, (e) 3 m, (f) 3 m]
Figure 4.8: Simulation results of a lens without (a,c,e) and with chromatic aberrations (b,d,f).

[Image panels: (a) 30 cm, (b) 30 cm, (c) in-focus, (d) in-focus, (e) 3 m, (f) 3 m]
Figure 4.9: Simulation results of a lens without chromatic aberrations (a,c,e) representing a conventional lens. The images b, d, and f show the extended DOF result generated through the proposed extended DOF algorithm.

4.7 Physical and Practical Limitations

The method discussed in the previous sections is a simple way to restore an extended DOF image using chromatic aberrations. As discussed before, the algorithm works best in case of broadband object reflectance spectra. Another important factor which determines the performance of sharpness transport is the accuracy of the sharpness measure.
If the sharpness value is computed wrongly, color artifacts appear in the final image. The physical and practical limitations of the system and their impact on the image quality are discussed here.

4.7.1 Narrowband Object Reflectance Spectra

Narrowband object spectra result in a color image in which one of the RGB colors is dominant compared to the other colors. In case of chromatic aberrations, if the blue or red color is weak at near and far distances respectively, the sharpness information is completely lost. As a result, the sharpness transport method fails to produce sharper objects in these cases. The SFS method also suffers from deblurring artifacts, because the amount of blur differs from that of an object with a broadband spectrum.

4.7.2 Loss of Contrast at Higher Frequencies

The weights calculated for sharpness transport are not optimal at higher spatial frequencies. If structures or textures in the image are finer than the amount of blur in any one of the colors, the contrast information is lost. As a result, the calculated $w$ doesn't represent the original contrast of the image. This limitation calls for a more sophisticated algorithm for the sharpness transport method.

4.8 Conclusion and Outlook

In this chapter, a method of obtaining an extended depth of field image using chromatic aberrations is described. Since some work has already been done on these methods, solutions to some of the existing challenges are presented. One of the existing methods [8] only increases the MTF through deconvolution and doesn't correct chromatic aberrations. The other method [14] corrects the chromatic aberration according to the sharpest color, but the in-focus position still remains blurred, as shown in section 4.4. In this work, both methods are combined using the RGBW sensor.
As the correction of color aberrations requires relative sharpness information, the method developed in chapter 3 is used, which provides better relative sharpness information than the existing methods. A low cost and efficient implementation of the sharpness transport algorithm is also presented in section 4.5. The method developed in chapter 2 for space variant filtering is utilized here to extract the sharpness information at each pixel of the sharpest color according to the continuously varying relative sharpness between the sharpest and the blurred colors.

[Image panels: (a) image with axial CA, (b) original, (c) corrected, (d) original, (e) corrected]
Figure 4.10: Chromatic aberrations are corrected from the image that is captured with the lens exhibiting large chromatic aberrations.

Chapter 5 Optimal Lens and Sensor Characteristics for Depth and extended DOF using Chromatic Aberrations

An important aspect in the performance of depth estimation and extended depth of field imaging is the camera system properties. For a depth from defocus system, lens and sensor play a role in determining the accuracy of the depth estimation. The change in the defocus blur and the sensor resolution determine the axial resolution of the depth maps. For an extended DOF, the amount of chromatic aberrations defines the range of the extended DOF. In this chapter, the effect of lens and sensor parameters on the performance of depth estimation and extended DOF is studied. The depth from CA, which is the target of this work, is similar to a depth from defocus system. Therefore, most of the camera parameters have the same impact on DFD and on the depth from CA system. Only the chromatic focal shift and the properties of the color filter arrays are specific to the depth and extended DOF methods using CA.

5.1 Axial Resolution of Depth from Defocus

Depth estimation algorithms first estimate the amount of blur in each defocused image and then compute the relative depth estimate. Therefore, it is preferable to analyze the depth resolution of a single defocused image. The accuracy of depth can be defined in terms of the minimum detectable difference $\delta d$ between two distances. Therefore, we derive the relationship between the optical parameters and $\delta d$ to observe the effect of the optics on the depth resolution. Assuming a photographic lens where the object distance $d$ is much larger than the focal length $f$, i.e. $d \gg f$, the lens magnification is given as $m = f/d_f$. Hence, the blur disk diameter can be written as,

$$c(d) = \frac{f^2}{F\#} \left( \frac{1}{d_f} - \frac{1}{d} \right). \tag{5.1}$$

Taking the derivative of the blur with respect to the distance $d$ gives,

$$\frac{\delta c}{\delta d} = \frac{f^2}{F\#\, d^2}, \tag{5.2}$$

$$\delta d = \frac{F\#\, d^2\, \delta c}{f^2}, \tag{5.3}$$

where $\delta c$ is the change in the blur which is detectable in the presence of noise. The equation is very similar to the one derived by Blayvas et al. [4]. Equation 5.3 shows that $\delta d$ is directly proportional to the F-number and inversely proportional to the square of the focal length. A very noteworthy relationship is that between the depth resolution and the distance of the object: $\delta d$ increases with the square of the object distance. Therefore, a DFD system with conventional photographic lenses is unable to differentiate the distances of far objects.

The important parameter in equation 5.3 is $\delta c$, the minimum detectable blur difference in the presence of noise or, in other terms, the blur estimation error. From the depth resolution equation derived by Blayvas et al. [4], $\delta c$ is slightly greater than the maximum of the spatial resolutions of sensor and optics. This is the case for an image with high SNR. In a more practical case, the standard deviation of the blur measurement defines the parameter $\delta c$. More details of the camera parameters will be discussed in the next section.
The depth resolution given by equation 5.3 is very similar to the depth resolution of stereoscopy derived in [16] as

$$\delta d_{stereo} = \frac{d^2\, \delta p}{b f}, \tag{5.4}$$

where $\delta p$ is the error in the estimated disparity and $b$ is the distance between the optical axes of the two camera lenses (stereo base). Equations 5.3 and 5.4 show that the depth resolution degrades with the square of the object distance for both stereoscopy and DFD systems. Moreover, if the estimation errors $\delta p$ and $\delta c$ are the same for both systems, they provide a similar depth resolution if the stereo base is equal to the aperture diameter $A$ ($f/F\# = A$).

5.1.1 Depth Resolution of DFD using Two Images

In the previous section, we have studied the defocus behavior of a single image and its effect on the depth resolution. However, in most DFD methods, two defocused images are used to compute the relative depth. Let $c_n(d)$ and $c_f(d)$ define the blur diameters of the near and far focused images,

$$c_n(d) = \frac{f^2}{F\#} \left( \frac{1}{d_n} - \frac{1}{d} \right), \tag{5.5}$$

$$c_f(d) = \frac{f^2}{F\#} \left( \frac{1}{d_f} - \frac{1}{d} \right). \tag{5.6}$$

The relative depth is defined as,

$$C(d) = \frac{c_n(d) - c_f(d)}{c_n(d) + c_f(d)}. \tag{5.7}$$

The relative depth is only unique for distances between the focus positions $d_n$ and $d_f$ of the two images. Since the denominator is only used to normalize the difference, we consider only the numerator for the analysis of the depth resolution. From equations 5.5 and 5.6, with the blur diameters taken as magnitudes, the numerator is given up to its sign as,

$$c_n(d) - c_f(d) = \frac{f^2}{F\#} \left( \frac{2}{d} - \frac{1}{d_n} - \frac{1}{d_f} \right), \qquad d_n \le d \le d_f. \tag{5.8}$$

Taking the derivative with respect to the distance, substituting $C$ for $c_n - c_f$ and considering magnitudes only, we get,

$$\frac{\delta C}{\delta d} = \frac{2 f^2}{F\#\, d^2}, \tag{5.9}$$

$$\delta d = \frac{\delta C\, F\#\, d^2}{2 f^2}. \tag{5.10}$$

Equation 5.10 shows that the relative depth has half the resolution of a single image, i.e. $\delta C = 2\delta c$. This follows from the addition of the noise sources of the two images in the relative depth estimation.
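Equations 5.3, 5.4 and 5.10 can be compared numerically with a short sketch. The function and parameter names are illustrative; all lengths are assumed to be in millimeters, and the example values (4 mm focal length, F-number 2.4, blur error of 0.4 µm) are chosen only for illustration.

```python
def dfd_depth_resolution(d, f, f_number, delta_c):
    """Equation 5.3: minimum detectable depth difference of a single defocused image."""
    return f_number * d ** 2 * delta_c / f ** 2

def stereo_depth_resolution(d, f, stereo_base, delta_p):
    """Equation 5.4: depth resolution of a stereo system."""
    return d ** 2 * delta_p / (stereo_base * f)

def two_image_depth_resolution(d, f, f_number, delta_C):
    """Equation 5.10: depth resolution of the two-image difference of blur diameters."""
    return delta_C * f_number * d ** 2 / (2.0 * f ** 2)

f, f_number, delta_c, d = 4.0, 2.4, 0.0004, 1000.0
single = dfd_depth_resolution(d, f, f_number, delta_c)
# Stereo base equal to the aperture diameter A = f / F#, same estimation error.
stereo = stereo_depth_resolution(d, f, f / f_number, delta_c)
# Two-image case with delta_C = 2 * delta_c.
relative = two_image_depth_resolution(d, f, f_number, 2.0 * delta_c)
```

For these values all three expressions give about 60 mm at a distance of 1 m, and doubling the distance increases each of them by a factor of four.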
The major advantage of using two images is that the relative depth estimation becomes invariant to local image features.

5.2 Optimal Parameters for Depth Estimation

In the previous section, we derived the axial resolution of a depth from defocus system. The equation helps in selecting the lens parameters that give the required depth resolution. In the following subsections, we analyze the lens and sensor parameters that affect the depth performance. Since the goal of this work is to estimate depth by deliberately introducing chromatic aberration into a lens, which also degrades the image quality, this type of lens is only feasible for low-cost imaging systems. For that reason, we discuss the parameters relevant to low-cost imaging optics.

5.2.1 Focal Length and F-number

For a depth resolution $\delta d$ required by the application, the focal length and F-number may be selected during the lens design or selection process. Equation 5.3 shows that a lower F-number and a larger focal length give a finer depth resolution. The design of a low F-number lens is more complex and also makes the optics expensive. Therefore, the typical F-number of low-cost imaging lenses, which is around 2.4, may be selected. We have more freedom in choosing the optimum focal length to fulfill the desired depth accuracy requirements. Figure 5.1 shows the depth performance for different focal lengths with $F\# = 2.4$ and $\delta c = 0.4\ \mu$m. Note that doubling the focal length gives a four times finer depth resolution. Equation 5.10 helps in selecting the minimum focal length that is required to obtain the desired depth resolution $\delta d$ at a certain distance $d$:

$$f = \sqrt{\frac{\delta C\, d^2\, F\#}{2\,\delta d}}. \qquad (5.11)$$

[Figure 5.1: Depth resolution versus object distance for different focal lengths of lens. Curves: f = 2 mm, f = 3 mm, f = 4 mm; x-axis: squared object distance [m^2]; y-axis: depth error [m].]
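Equation 5.11 can be evaluated directly when a target depth resolution is given. The following minimal sketch is an illustration only; the function name and the numbers (a relative-depth error of 0.8 μm, F# = 2.4, and a required resolution of 6 cm at 1 m) are assumptions chosen for the example.

```python
import math


def min_focal_length(delta_C, d, f_number, delta_d):
    """Minimum focal length for a required depth resolution (equation 5.11).

    All lengths are in metres. delta_C is the error of the relative depth
    numerator, delta_d the required depth resolution at distance d.
    """
    return math.sqrt(delta_C * d**2 * f_number / (2.0 * delta_d))


# Illustrative (assumed) requirement: resolve 6 cm at 1 m with F# = 2.4.
f_min = min_focal_length(delta_C=0.8e-6, d=1.0, f_number=2.4, delta_d=0.06)
print(f"minimum focal length: {f_min * 1e3:.2f} mm")
```

Because the focal length enters equation 5.10 quadratically, halving the allowed depth error only increases the required focal length by a factor of the square root of two.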
5.2.2 Chromatic Focal Shift

If the desired depth range is given, the chromatic focal shift can be estimated for a given focal length. The optimum focal length for the desired depth resolution is calculated from equation 5.11 and represents the green color. Assuming the sensor is positioned to focus the green color, the sensor distance $s_g$ is computed using the thin lens formula. Now let $d_b$ and $d_r$ be the focus positions of the blue and red color images, which represent the near and far focused images respectively. We can compute the focal lengths $f_b$ and $f_r$ of the blue and red color images using the thin lens formula,

$$\frac{1}{f_b} = \frac{1}{d_b} + \frac{1}{s_g}, \qquad (5.12)$$

$$\frac{1}{f_r} = \frac{1}{d_r} + \frac{1}{s_g}. \qquad (5.13)$$

Hence, $\Delta f = f_r - f_b$ is the chromatic focal shift that must be introduced in the lens of a depth from CA system to obtain the desired depth resolution in a given distance range.

[Figure 5.2: Normalized depth parameter versus object distance for different sensor resolutions. Curves: pixel pitch pp = 1.12, 2.24, 4.48, 8.96; x-axis: distance [mm] (log scale); y-axis: blur measure ratios.]

5.2.3 Sensor Resolution

The resolution of a sensor is the smallest change in an image that it can distinguish. As a result of the spatial sampling of the image, the defocus blur is also sampled according to the sensor resolution $\Delta x$. In equation 5.3, the parameter $\delta c$ is affected by the sensor resolution. Since the blur diameter changes with distance, finer sampling increases the depth resolution. If the pixel size is increased by some factor, the depth resolution decreases by the same factor. Figure 5.2 shows the computed normalized ratios between the blur measures of two defocused images for different pixel resolutions. The plots are generated for a focal length of 4 mm and an F-number of 2.4. As can be seen, the larger the pixel size, the smaller the variation in the depth curve. Therefore, the smallest possible pixel size is best for fine depth resolution.
Since there is also a limit on the spatial resolution of the lens, the optimum pixel resolution is equal to the lens spatial resolution given by the Rayleigh criterion, i.e. $\Delta x = 1.22\,\lambda F\#$.

5.2.4 Spectral Response of Color Filter Arrays

Conventional color image sensors capture color information by sampling the visible light spectrum with color filters. Usually, sensors capture the three primary colors red, green and blue. Since practical filters are smooth and a sharp filter transition is not realizable, a broad range of wavelengths contributes to each color. Another reason for a broadband color spectral response is to match the chromatic response of the human eye. As an example, the spectral response of commercially available Kodak Wratten filters is shown in figure 4.4a. To study the effect of the spectral response on the depth performance, we first observe the behavior of the point spread function (PSF) in the presence of chromatic aberrations. The image irradiance $E$ of a point source at the sensor position $x$ and $y$ can be written as

$$E(x, y, \lambda) = R(\lambda)\,\mathrm{circ}\!\left(\frac{r}{c(\lambda)}\right), \qquad (5.14)$$

where $R(\lambda)$ is the object reflectance spectrum and $\mathrm{circ}$ is a circular function which defines the blur with the diameter $c$. With color filter sampling, the PSF is given as

$$P(x, y) = \sum_{\lambda} E(x, y, \lambda)\,S(\lambda), \qquad (5.15)$$

$$P(x, y) = \sum_{\lambda} S(\lambda)\,R(\lambda)\,\mathrm{circ}\!\left(\frac{r}{c(\lambda)}\right), \qquad (5.16)$$

where $S$ is the spectral response of the color filters. Hence, the PSF is a summation of concentric discs weighted with the object and color filter spectra. For depth estimation with chromatic aberrations, the change in the PSF must be independent of the object reflectance spectrum. This is only possible if the object has equal reflectance for each wavelength covered by the spectral response of a color filter. In reality, this may only be valid for a uniform broadband object reflectance spectrum.
On the other hand, if an object has a narrowband reflectance spectrum, the PSF varies with wavelength, which results in an error in the depth estimation. The wavelength dependency of the depth estimation can be minimized by choosing a very narrowband color filter spectrum. In this case, the depth of any colored object which has some amount of reflectance in the red, green and blue colors can be estimated as accurately as with a depth from defocus method. Parkkinen et al. [32] have shown that most natural scenes have broadband object reflectance spectra. Therefore, the depth from CA method can give accurate depth estimates with a narrowband color spectral response for most natural scenes. For an RGB color sensor with a narrowband color filter response, we get a tradeoff between accurate depth estimation and accurate reproduction of the colors of the visible spectrum. This tradeoff can be avoided through multispectral imaging. Yasuma et al. [43] have proposed a generalized assorted pixel camera to control the spectrum after the image is captured. This type of camera, with a lens exhibiting chromatic aberrations, can be used to generate narrowband images of at least two colors after capturing the image. These narrowband images can then be used to estimate the depth without the impact of the spectral response of the color filters.

5.3 Optimal Parameters for Extended DOF

The extended DOF method discussed in the previous chapter restores the intensity image to enhance the MTF and also corrects the chromatic aberrations according to the sharpest color at each location. The amount of chromatic aberration must be chosen such that the far DOF limit of the blue image is greater than or equal to the near DOF limit of the green image, and the near DOF limit of the red image is less than or equal to the far DOF limit of the green image. In this way, we make sure that the blur diameter is less than the circle of confusion over the complete extended DOF. We formulate these conditions using the depth of field formulas.
If the focus position $d_i$ is much larger than the focal length $f_i$, the far and near limits $DOF_f$ and $DOF_n$ of the DOF are given as

$$DOF_f^i \approx \frac{d_i f_i^2}{f_i^2 - c\, d_i F\#}, \qquad (5.17)$$

$$DOF_n^i \approx \frac{d_i f_i^2}{f_i^2 + c\, d_i F\#}, \qquad (5.18)$$

where $c$ is the maximum blur diameter which is perceived as a point by the human eye and $i$ represents the color. The circle of confusion $c$ is directly related to the sensor image format and can be given according to the Zeiss formula as $c = h/1500$, where $h$ is the diagonal height of the sensor. We consider two use cases to compute the optimal amount of chromatic aberration for extended DOF. In the first case, we want a DOF from a certain distance to infinity and we want to find the optimum focus position of each color image. In the other case, the focus position of the lens for one color (mostly green) is given and we have to find the focus positions of the other two colors. For the selection of the focus positions, we impose the constraint that the DOFs of two adjacent focused color images overlap with each other.

First Use Case: If the red image is focused at the hyperfocal distance $H$, $DOF_f^r$ is infinity and $DOF_n^r$ is half of the hyperfocal distance. The hyperfocal distance is given as

$$H \approx \frac{f^2}{c\,F\#}. \qquad (5.19)$$

Now taking $DOF_f^g = DOF_n^r = H/2$, we compute the focal length of the green image using equation 3.2 as

$$\frac{1}{f_g} = \frac{c}{A s_r} + \frac{1}{s_r} + \frac{1}{DOF_f^g}, \qquad (5.20)$$

where the sensor distance is computed according to the focus position of the red image as

$$\frac{1}{s_r} = \frac{1}{f_r} - \frac{1}{H}. \qquad (5.21)$$

The focus position $d_g$ of the green image for the focal length $f_g$ is computed using the thin lens formula 3.1. The near DOF limit $DOF_n^g$ of the green image can then be computed using equation 5.18. Similarly, the focal length $f_b$ and focus position $d_b$ of the blue color image can be computed using equations 3.1, 5.18 and 5.20. The chromatic focal shift (chromatic aberration) is then given as $\Delta f = f_r - f_b$.
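The steps of the first use case can be chained into a short computation. The sketch below is one possible reading of equations 5.18 to 5.21 and not a definitive implementation: it assumes that the aperture diameter is the red focal length divided by the F-number and is the same for all colors, that the circle of confusion follows the Zeiss formula, and that all lengths are in millimetres. The function and variable names are hypothetical.

```python
def first_use_case(f_r, f_number, h):
    """Focal lengths of green and blue for a red image focused at the hyperfocal distance.

    f_r: focal length for red, h: sensor diagonal (all lengths in mm).
    """
    c = h / 1500.0                        # circle of confusion (Zeiss formula)
    A = f_r / f_number                    # aperture diameter (assumed equal for all colors)
    H = f_r**2 / (c * f_number)           # hyperfocal distance, eq. 5.19
    s = 1.0 / (1.0 / f_r - 1.0 / H)       # sensor distance for red focused at H, eq. 5.21

    def focal_length_for_far_limit(far_limit):
        # eq. 5.20: focal length whose far DOF limit lies at far_limit
        return 1.0 / (c / (A * s) + 1.0 / s + 1.0 / far_limit)

    def focus_position(f):
        # thin lens formula with the fixed sensor distance s
        return 1.0 / (1.0 / f - 1.0 / s)

    def near_limit(d, f):
        # eq. 5.18: near DOF limit for focus position d and focal length f
        return d * f**2 / (f**2 + c * d * f_number)

    f_g = focal_length_for_far_limit(H / 2.0)        # green far limit = red near limit
    d_g = focus_position(f_g)
    f_b = focal_length_for_far_limit(near_limit(d_g, f_g))  # blue far limit = green near limit
    d_b = focus_position(f_b)
    return {"H": H, "f_g": f_g, "d_g": d_g, "f_b": f_b, "d_b": d_b, "delta_f": f_r - f_b}


result = first_use_case(f_r=4.0, f_number=2.4, h=6.0)
print(f"f_g = {result['f_g']:.3f} mm, f_b = {result['f_b']:.3f} mm, "
      f"delta_f = {result['delta_f'] * 1e3:.0f} um")
```

For a red focal length of 4 mm, an F-number of 2.4 and a sensor diagonal of 6 mm, this sketch yields a green focal length of about 3.981 mm, a blue focal length of about 3.962 mm and a chromatic focal shift of about 38 μm.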
If a lens contains this amount of chromatic aberration for the given parameters, it provides an extended DOF ranging from $DOF_n^b$ to infinity. As an example, let us assume a lens with $f = 4$ mm and $F\# = 2.4$, and a sensor with diagonal height $h = 6$ mm. Following the above criteria and equations, we get $f_b = 3.962$ mm, $f_g = 3.981$ mm and $f_r = 4$ mm, which results in $\Delta f = 38\ \mu$m. The depth of field of a lens without chromatic aberrations is from $DOF_n^r = 83$ cm to infinity (assuming the lens focuses at the hyperfocal distance). With chromatic aberrations and the algorithms presented in chapter 4, the DOF is from $DOF_n^b = 33.3$ cm to infinity.

Second Use Case: Let us assume the focus position $d_g$ of the green image is given. We compute the near and far DOF limits $DOF_n^g$ and $DOF_f^g$ using equations 5.17 and 5.18. The focal lengths of the red and blue images can be computed using equation 3.2 as

$$\frac{1}{f_r} = -\frac{c}{A s_g} + \frac{1}{s_g} + \frac{1}{DOF_f^g}, \qquad (5.22)$$

$$\frac{1}{f_b} = \frac{c}{A s_g} + \frac{1}{s_g} + \frac{1}{DOF_n^g}, \qquad (5.23)$$

where $s_g$ is computed using the thin lens formula for $d_g$ and the given focal length $f_g$. The chromatic focal shift is then $\Delta f = f_r - f_b$. Figure 5.3 shows the blur diameter versus object distance for the different colors, which are focused at different positions due to chromatic aberrations. The blur diameter is computed for the lens and sensor parameters used in the example of the first use case. As the plot shows, the blur diameter of at least one color image is less than the circle of confusion over a large distance range of 33.3 cm to infinity.

[Figure 5.3: Blur diameter is less than the circle of confusion for a complete range of depth of field. Curves: blue, green, red, with the circle of confusion marked; x-axis: object distance [mm] (log scale); y-axis: blur diameter [mm].]

5.4 Conclusion

In this chapter, the effect of optical and sensor parameters on depth and extended DOF imaging is studied.
The relationship between the camera parameters and the axial depth resolution is derived to facilitate the choice of optimal lens and sensor design parameters. It is found that the most critical parameter for fine depth resolution is the focal length, especially for low-cost imaging applications. Based on the derived relationship, criteria are defined for choosing the optimal lens and sensor parameters such as focal length, chromatic focal shift and sensor resolution. It is shown that a narrowband spectral response of the color filter array is optimal for accurate depth estimation. Section 5.3 describes different ways of computing the optimal amount of chromatic aberration for extended DOF. By introducing this amount of chromatic aberration, one makes sure that the blur is imperceptible over the complete range of the extended DOF.

Chapter 6 Conclusion and Outlook

The work in this thesis has provided a thorough analysis of depth and extended DOF imaging using axial chromatic aberrations. In addition, a digital camera simulator is presented to efficiently simulate the camera optics.

6.1 Summary

In chapter 2, a digital camera simulator is presented. The simulator models the complete digital camera processing chain: optics, sensor and digital processing. The main contribution of the simulator is in the optical simulation. It allows a user to simulate conventional and unconventional optics with correct modeling of occlusions. The blur induced by the optics is generated for a sampled 3D space with commercially available lens design tools. Missing blur information is then approximated at each pixel of the image through PCA based interpolation. For 3D scenes, the true depth map is used to blur the image according to each pixel's depth and location. It is shown that the two methods of filtering mentioned in the literature, scattering and gathering, can be efficiently implemented in the Fourier domain using PCA based filtering.
An efficient algorithm for space variant filtering using PCA makes the simulation time substantially smaller. Although the aim of the simulator is to simulate cameras, the low-cost method of space variant filtering can also be used to add a depth of field effect to computer generated scenes in real time, which is very useful and in demand in gaming applications. The sensor simulation includes noise addition, sampling and color filter array effects. The digital part implements the traditional camera post processing steps.

Chapter 3 analyzes a method similar to depth from defocus for depth imaging. Instead of capturing multiple defocused images, axial chromatic aberrations are used to capture the defocus information in multiple colors with a single shot. The previous related literature has only shown the feasibility of this approach for limited experimental setups. Therefore, a thorough analysis is performed to develop a method which works well under different imaging conditions. The existing focus/blur measures were evaluated, and it is shown that these measures are contrast dependent, which makes them infeasible for a depth from CA system. A new blur measure based on normalized gradients is proposed which is independent of the local image contrast. Absolute depth is inferred from the blur measures by taking normalized ratios and applying a calibration procedure. The depth error analysis has shown that DFD and depth from CA systems share some common challenges, for example low texture areas, sensor noise and field varying blur. Since the blur varies across the field of view due to lens aberrations, astigmatism and manufacturing tolerances, a simple calibration procedure is proposed to correct the field varying depth. The test image used for the calibration setup contains sagittal and tangential edge orientations at different field positions. The analysis shows that the behavior of the depth change is similar for different lenses.
Therefore, once this behavior has been measured or obtained from the lens design data, a single image is sufficient to determine the field varying depth error and its correction. The main advantage of the depth from CA method over DFD is that it is a single shot depth imaging system, which avoids misregistration problems and also varying image magnifications, because the lateral color shift can be reduced during the lens design process. The major challenge for the depth from CA method is a narrowband object reflectance spectrum. In this case, either the color information is missing or the blur is different compared to the broadband spectrum case. Since the depth from defocus method only measures depth in textured regions, low-cost implementations of two existing methods are used to create dense depth maps. One method is based on image segmentation and fills each segment with the median depth value of that segment. The other solution uses an optimization method to propagate the depth to neighboring regions of the same intensity. As the optimization problem is computationally complex, it is proposed to propagate the depth at very low resolution, followed by joint bilateral upsampling.

In chapter 4, the method of obtaining an extended depth of field image using chromatic aberrations is described. Since some work has already been done on these methods, solutions to some existing challenges are presented. One of the existing methods [8] only increases the MTF through deconvolution and does not correct chromatic aberrations. The other method [14] corrects the chromatic aberration according to the sharpest color, but the in-focus position still remains blurred, as shown in section 4.4. In this work, both methods are combined using the RGBW sensor. As the correction of color aberrations requires relative sharpness information, the method developed in chapter 3 is used, which provides better relative sharpness information than the existing methods.
A low-cost and efficient implementation of the sharpness transport algorithm is also presented in section 4.5. The method developed in chapter 2 for space variant filtering is utilized here to extract the sharpness information at each pixel of the sharpest color according to the continuously varying relative sharpness between the sharpest and the blurred colors.

In chapter 5, the effect of optical and sensor parameters on depth and extended DOF imaging is studied. The relationship between the camera parameters and the axial depth resolution is derived to facilitate the choice of optimal lens and sensor design parameters. It is found that the most critical parameter for fine depth resolution is the focal length, especially for low-cost imaging applications. Based on the derived relationship, criteria are defined for choosing the optimal lens and sensor parameters such as focal length, chromatic focal shift and sensor resolution. It is shown that a narrowband spectral response of the color filter array is optimal for accurate depth estimation and sharper extended DOF images. The derived formulas provide the optimum amount of chromatic aberration for extended DOF. Over the complete distance range of the extended DOF, the blur diameter is less than the circle of confusion for at least one of the colors.

6.2 Conclusion

The thesis has thoroughly studied a computational imaging method to compute depth and extend the DOF using chromatic aberration. In addition, a complete simulator is presented to simulate the camera system. The work has provided an efficient solution for spatially varying optical simulation with accurate modeling of occlusions. The simulator is very useful for simulating the optics of computational cameras, which behave differently at occlusion boundaries compared to traditional optics. Moreover, the optical simulation is useful for simulating the depth of field effect in real time for computer generated scenes, e.g. in gaming applications.
The proposed blur measure and its normalized ratios given in chapter 3 provide depth estimation for natural scenes. The analysis of the depth errors shows the strong dependency of depth from CA on the object reflectance spectra. However, it is shown in chapter 5 that a narrowband color filter response can reduce this dependency to a negligible amount. The proposed depth estimation method could be a cheap solution for a human machine interface, as it requires only one shot (one camera) and does not suffer from misregistration and magnification problems. The extended DOF method using chromatic aberrations and an RGBW sensor, described in chapter 4, gives sharper images without significant color artifacts. The use of panchromatic pixels provides low noise images, which results in low noise gain in the image restoration. At the same time, the low-cost color correction method in combination with the proposed depth estimation method (chapter 3) gives high quality images. The derived formulas and criteria in chapter 5 help in selecting optimum camera parameters for depth and extended DOF.

6.3 Outlook

The thorough analysis in this work has provided solutions to some problems of the existing methods. However, there are still possible extensions of the work that would make it more robust and practical under different imaging conditions. The optical simulator uses multiple image planes to simulate the occlusions. Although this is sufficient to simulate a simple scene with foreground and background objects in order to analyze the occlusion problem, it would be beneficial to develop a simpler approach that extends this method to more complex scenes. The blur measure proposed for depth estimation is exact for well defined edges. For textured regions it fails to provide an accurate measure; therefore, a blur measure that is robust to different types of scene content could substantially improve the depth results and make the method practical for many applications.
Such a texture invariant blur measure could also be useful for the extended DOF algorithm to completely remove the color bleeding artifacts which appear due to chromatic aberrations.

Bibliography

[1] European Machine Vision Association et al.: EMVA Standard 1288, Standard for Characterization of Image Sensors and Cameras. 2010
[2] Bae, S.; Durand, F.: Defocus magnification. In: Computer Graphics Forum Vol. 26, Wiley Online Library, 2007, pp. 571-579
[3] Bando, Y.; Chen, B.Y.; Nishita, T.: Extracting depth and matte using a color-filtered aperture. In: ACM Transactions on Graphics (TOG) Vol. 27, ACM, 2008, p. 134
[4] Blayvas, I.; Kimmel, R.; Rivlin, E.: Role of optics in the accuracy of depth-from-defocus systems. In: JOSA A 24 (2007), No. 4, pp. 967-972
[5] Chakrabarti, A.; Zickler, T.: Depth and deblurring from a spectrally-varying depth-of-field. In: Proc. ECCV, 2012
[6] Chen, J.; Venkataraman, K.; Bakin, D.; Rodricks, B.; Gravelle, R.; Rao, P.; Ni, Y.: Digital camera imaging system simulation. In: Electron Devices, IEEE Transactions on 56 (2009), No. 11, pp. 2496-2505
[7] Chung, S.W.; Kim, B.K.; Song, W.J.: Detecting and eliminating chromatic aberration in digital images. In: Image Processing (ICIP), 2009 16th IEEE International Conference on, IEEE, 2009, pp. 3905-3908
[8] Cossairt, O.; Nayar, S.: Spectral focal sweep: Extended depth of field from chromatic aberrations. In: Computational Photography (ICCP), 2010 IEEE International Conference on, IEEE, 2010, pp. 1-8
[9] Dowski, E.R.; Cathey, W.T.: Extended depth of field through wavefront coding. In: Applied Optics 34 (1995), No. 11, pp. 1859-1866
[10] Eisemann, E.; Durand, F.: Flash photography enhancement via intrinsic relighting. In: ACM Transactions on Graphics (TOG) Vol. 23, ACM, 2004, pp. 673-678
[11] Farrell, J.E.; Xiao, F.; Catrysse, P.; Wandell, B.A.: A simulation tool for evaluating digital camera image quality. In: Proceedings of the SPIE Electronic Imaging Conference Vol. 5294, 2003, p. 124
[12] Garcia, J.; Sánchez, J.M.; Orriols, X.; Binefa, X.: Chromatic aberration and depth extraction. In: Pattern Recognition, 2000. Proceedings. 15th International Conference on, Vol. 1, IEEE, 2000, pp. 762-765
[13] Geusebroek, J.M.; Cornelissen, F.; Smeulders, A.W.M.; Geerts, H.: Robust autofocusing in microscopy. In: Cytometry 39 (2000), No. 1, pp. 1-9
[14] Guichard, F.; Nguyen, H.P.; Tessières, R.; Pyanet, M.; Tarchouna, I.; Cao, F.: Extended depth-of-field using sharpness transport across color channels. In: Proc. SPIE Vol. 7250, 2009, p. 72500N
[15] Häusler, G.: A method to increase the depth of focus by two step image processing. In: Optics Communications 6 (1972), No. 1, pp. 38-42
[16] Jaehne, B.: Digital Image Processing. Springer, 2005
[17] Kay, A.; Mather, J.; Walton, H.: Extended depth of field by colored apodization. In: Optics Letters 36 (2011), No. 23, pp. 4614-4616
[18] Kim, S.; Lee, E.; Hayes, M.H.; Paik, J.: Multifocusing and Depth Estimation Using a Color Shift Model-Based Computational Camera. In: Image Processing, IEEE Transactions on 21 (2012), No. 9, pp. 4152-4166
[19] Konevsky, O.: Method of Correction of Longitudinal Chromatic Aberrations. In: Graphicon (2008)
[20] Kopf, J.; Cohen, M.F.; Lischinski, D.; Uyttendaele, M.: Joint bilateral upsampling. In: ACM Transactions on Graphics 26 (2007), No. 3, p. 96
[21] Kosloff, T.J.; Tao, M.W.; Barsky, B.A.: Depth of field postprocessing for layered scenes using constant-time rectangle spreading. In: Proceedings of Graphics Interface 2009, Canadian Information Processing Society, 2009, pp. 39-46
[22] Krotkov, E.: Focusing. In: International Journal of Computer Vision 1 (1988), No. 3, pp. 223-237
[23] Kumar, M.; Morales, E.O.; Adams, J.E.; Hao, W.: New digital camera sensor architecture for low light imaging. In: Image Processing (ICIP), 2009 16th IEEE International Conference on, IEEE, 2009, pp. 2681-2684
[24] Levin, A.; Lischinski, D.; Weiss, Y.: Colorization using optimization. In: ACM Transactions on Graphics (TOG) Vol. 23, ACM, 2004, pp. 689-694
[25] Lim, J.; Kang, J.; Ok, H.: Robust local restoration of space-variant blur image. In: Electronic Imaging 2008, International Society for Optics and Photonics, 2008, pp. 68170S-68170S
[26] Maeda, P.; Catrysse, P.B.; Wandell, B.A.: Integrating lens design with digital camera simulation. In: SPIE Proceedings, SPIE Electronic Imaging, San Jose, CA 5678 (2005), pp. 48-58
[27] Molesini, G.; Pedrini, G.; Poggi, P.; Quercioli, F.: Focus-wavelength encoded optical profilometer. In: Optics Communications 49 (1984), No. 4, pp. 229-233
[28] Nakamura, J.: Image sensors and signal processing for digital still cameras. CRC, 2006
[29] Nayar, S.K.; Nakagawa, Y.: Shape from focus: An effective approach for rough surfaces. In: Robotics and Automation, 1990. Proceedings., 1990 IEEE International Conference on, IEEE, 1990, pp. 218-225
[30] Ng, R.; Levoy, M.; Brédif, M.; Duval, G.; Horowitz, M.; Hanrahan, P.: Light field photography with a hand-held plenoptic camera. In: Computer Science Technical Report CSTR 2 (2005)
[31] Ojeda-Castañeda, J.; Ramos, R.; Noyola-Isgleas, A.: High focal depth by apodization and digital restoration. In: Applied Optics 27 (1988), No. 12, pp. 2583-2586
[32] Parkkinen, J.P.S.; Hallikainen, J.; Jaaskelainen, T.: Characteristic spectra of Munsell colors. In: JOSA A 6 (1989), No. 2, pp. 318-322
[33] Schechner, Y.Y.; Kiryati, N.: The optimal axial interval in estimating depth from defocus. In: Computer Vision, 1999. The Proceedings of the Seventh IEEE International Conference on, Vol. 2, IEEE, 1999, pp. 843-848
[34] Shapiro, L.; Stockman, G.C.: Computer Vision. 2001
[35] Shen, C.H.; Chen, H.H.: Robust focus measure for low-contrast images. In: Consumer Electronics, 2006. ICCE'06. 2006 Digest of Technical Papers. International Conference on, IEEE, 2006, pp. 69-70
[36] Sherif, S.S.; Dowski, E.R.; Cathey, W.T.: Effect of detector noise in incoherent hybrid imaging systems. In: Optics Letters 30 (2005), No. 19, pp. 2566-2568
[37] Smith, L.I.: A tutorial on principal components analysis. In: Cornell University, USA 51 (2002), p. 52
[38] Subbarao, M.; Tyan, J.-K.: Noise sensitivity analysis of depth-from-defocus by a spatial-domain approach. In: Proc. SPIE Vol. 3174, Citeseer, 1994, pp. 174-187
[39] Tiziani, H.J.; Uhde, H.-M.: Three-dimensional image sensing by chromatic confocal microscopy. In: Applied Optics 33 (1994), No. 10, pp. 1838-1843
[40] Wang, S.; Hou, T.; Border, J.; Qin, H.; Miller, R.: High-quality image deblurring with panchromatic pixels. In: ACM Transactions on Graphics (TOG) 31 (2012), No. 5, p. 120
[41] Watanabe, M.; Nayar, S.K.: Telecentric optics for focus analysis. In: Pattern Analysis and Machine Intelligence, IEEE Transactions on 19 (1997), No. 12, pp. 1360-1365
[42] Watanabe, M.; Nayar, S.K.: Rational filters for passive depth from defocus. In: International Journal of Computer Vision 27 (1998), No. 3, pp. 203-225
[43] Yasuma, F.; Mitsunaga, T.; Iso, D.; Nayar, S.K.: Generalized assorted pixel camera: postcapture control of resolution, dynamic range, and spectrum. In: Image Processing, IEEE Transactions on 19 (2010), No. 9, pp. 2241-2253
[44] Zalevsky, Z.; Ben-Yaish, S.: Extended depth of focus imaging with birefringent plate. In: Optics Express 15 (2007), No. 12, pp. 7202-7210
[45] Zhou, C.; Cossairt, O.; Nayar, S.: Depth from diffusion. In: Computer Vision and Pattern Recognition (CVPR), 2010 IEEE Conference on, IEEE, 2010, pp. 1110-1117

List of Publications

[1] Atif, Muhammad; Jähne, Bernd: A Space-Variant (3D) Image Simulation Tool for Computational Cameras. In: International Conference on Computational Photography (ICCP), 2010. Poster
[2] Atif, Muhammad; Jähne, Bernd: Optimal Depth Estimation from a Single Image by Computational Imaging using Chromatic Aberrations. In: Puente León, Fernando (Ed.); Heizmann, Michael (Ed.): Forum Bildverarbeitung. Regensburg: KIT Scientific Publishing, November 2012, pp. 23-34
[3] Atif, Muhammad; Jähne, Bernd: Optimal Depth Estimation from a Single Image by Computational Imaging using Chromatic Aberrations. In: tm-Technisches Messen (2013). To be published
[4] Atif, Muhammad; Siddiqui, Muhammad: Method and Optical System for Determining a Depth Map of an Image. April 18, 2012. European Patent Application EP12002701
[5] Atif, Muhammad; Siddiqui, Muhammad; Unruh, Christian; Kamm, Markus: Infrared Imaging System, and Method of Operating. November 8, 2012. US Patent 20,120,281,081
[6] Atif, Muhammad; Siddiqui, Muhammad; Unruh, Christian; Kamm, Markus et al.: Image System Using a Lens Unit With Longitudinal Chromatic Aberations and Method of Operating. July 20, 2012. WO Patent 2,012,095,322
