Dissertation
submitted to the
Combined Faculties for the Natural Sciences and for Mathematics
of the
Ruperto-Carola University of Heidelberg, Germany
for the degree of
Doctor of Natural Sciences
Put forward by
M.Sc. Muhammad Atif
Born in: Sialkot, Pakistan
Oral examination: 14-10-2013
Optimal Depth Estimation and Extended Depth of
Field from Single Images by Computational Imaging
using Chromatic Aberrations
Advisors: Prof. Dr. Bernd Jähne
Prof. Dr. Karl-Heinz Brenner
Abstract
This thesis presents a thorough analysis of a computational imaging approach for optimal depth estimation and an extended depth of field from a single image using axial chromatic aberrations. To assist the camera design process, a digital camera simulator is developed which can efficiently simulate different kinds of lenses for a 3D scene. The main contributions to the simulator are a fast implementation of space variant filtering and an accurate simulation of optical blur at occlusion boundaries. The simulator also includes sensor modeling and digital post-processing to facilitate a co-design of optics and digital processing algorithms.
To estimate the depth from color images, which are defocused by different amounts due to axial chromatic aberrations, a low cost algorithm is developed. Because the contrast varies across the colors, a blur measure that is independent of the local contrast is proposed. The normalized ratios between the blur measures of all three colors (red, green and blue) are used to estimate the depth over a larger distance range. An analysis of the depth errors is performed, which shows the limitations of depth from chromatic aberrations, especially for narrowband object spectra. Since the blur, and hence the estimated depth, changes over the field, a simple calibration procedure is developed to correct this field varying behavior. A prototype lens with an optimal amount of axial chromatic aberration is designed for a focal length of 4 mm and an F-number of 2.4. Real captured and synthetic images show depth measurements with a root mean square error of 10% over the distance range of 30 cm to 2 m.
Taking advantage of the chromatic aberrations and the estimated depth, a method is proposed to extend the depth of field of the captured image. An image sensor with white (W) pixels in addition to red, green and blue (RGB) pixels, combined with a lens exhibiting axial chromatic aberrations, is used to overcome the limitations of previous methods. The proposed method first restores the white image, which has a depth invariant point spread function, and then transfers the sharpness information of the sharpest color or the white image to the blurred colors. Due to the broadband color filter responses, the blur of each RGB color at its focus position is larger in the presence of chromatic aberrations than with a lens corrected for chromatic aberrations. The restored white image therefore helps to obtain a sharper image at these positions, and also for objects where the sharpest color information is missing. An efficient implementation of the proposed algorithm achieves better image quality with low computational complexity.
Finally, the performance of the depth estimation and the extended depth of field is studied for different camera parameters. Criteria are defined to select optimal lens and sensor parameters to achieve the desired results with the proposed digital post-processing algorithms.
Digital Camera Simulator, Depth Estimation, Extended Depth of Field, Computational Photography, Chromatic Aberrations
Zusammenfassung
This dissertation describes, in a thorough analysis, a computational imaging approach for estimating depth and an extended depth of field from a single image using axial chromatic aberration. A digital camera simulator was developed to support the design process of the camera system with respect to the efficient optical simulation of a 3D scene for different lens systems. The most important contributions to the simulator are the fast implementation of spatially varying filters and the accurate simulation of optical blur at occlusion boundaries. The simulator also includes sensor modeling and digital post-processing in order to enable a co-design of the optics with the digital processing algorithms.
A low cost (lean) algorithm is developed to estimate the depth from color images that are defocused to different degrees by axial chromatic aberrations. Since the contrast varies across the colors, a determination of the blur that is independent of the local contrast is proposed. A normalized blur measure between all three colors (red, green and blue) is used to estimate the depth over a larger distance range. An analysis of erroneous depth values is performed, which reveals the limitations of depth estimation via the axial chromatic aberration, especially for narrowband object spectra. Since the blur, and with it the depth, changes over the field, a simple calibration is developed to correct the field dependent behavior of the depth estimation. A prototype lens with a focal length of 4 mm and an F-number of 2.4 is designed, which possesses an optimal amount of axial chromatic aberration. Captured and computer-generated images show a deviation of 10% in the depth determination over a distance range of 30 cm to 2 m.
A method for extending the depth of field of an image using the axial chromatic aberration and the estimated depth is proposed. To overcome the limitations of existing methods, an image sensor with white pixels in addition to the red, green and blue (RGB) pixels is used together with a lens with a corresponding axial chromatic aberration. The proposed method first restores the white image with a depth invariant point spread function and then transfers the sharpness information of the sharpest color or of the white image to the blurred colors. Due to the broadband characteristics of the color filters, the blur of each RGB color at its focus position is larger for optics with axial chromatic aberration than for achromatic optics. Here, the restored white image helps to obtain a sharper color image for these positions. This also holds for objects in which the sharpest color information is missing. An efficient implementation of the proposed algorithm achieves better image quality at low computational cost.
Finally, the performance of the depth estimation and the extended depth of field is investigated for different camera parameters. The criteria are defined such that optimal lens and sensor parameters are selected in order to obtain the desired results with the proposed digital algorithms.
Digital Camera Simulator (opto-digital simulator), Depth Estimation, Extended Depth of Field, Computational Photography, Chromatic Aberrations
Contents

1 Introduction
  1.1 Motivation
  1.2 Overview
    1.2.1 Simulation of Physical Image Formation
    1.2.2 Depth From Chromatic Aberrations
    1.2.3 Extended Depth of Field Using Chromatic Aberrations
    1.2.4 Optimal Camera Parameters Selection
  1.3 Outline
  1.4 Contribution
2 Photo-realistic Simulation of Physical Image Formation Process
  2.1 Introduction
  2.2 Physical Image Formation
  2.3 Motivation
  2.4 Related Work
  2.5 Simulating Spatially Varying Lens Blur
    2.5.1 Simulating Optics for a 2D Object Plane
      2.5.1.1 Fast Approximation of Space Variant Convolution
      2.5.1.2 Computing Basis Point Spread Functions
    2.5.2 Simulating Optics for a 3D Object Space
      2.5.2.1 Simulating Partial Occlusions
  2.6 Optical Simulation
    2.6.1 Accuracy and Computational Complexity
      2.6.1.1 Rotational Symmetric Blur
      2.6.1.2 Real Lens Simulations
  2.7 Sensor Simulation
    2.7.1 Noise Sources in Image Sensors
      2.7.1.1 Temporal Noise
      2.7.1.2 Fixed Pattern Noise
    2.7.2 Noise Model
    2.7.3 Sensor MTF and Sampling
    2.7.4 Color Filter Array
  2.8 Digital Simulation
  2.9 Conclusion and Outlook
3 Depth From Chromatic Aberrations
  3.1 Overview of Depth Estimation Methods
    3.1.1 Passive Depth Estimation
      3.1.1.1 Stereoscopy
      3.1.1.2 Depth from Focus/Defocus
    3.1.2 Active Depth Estimation
      3.1.2.1 Depth from Time of Flight
      3.1.2.2 Depth from Active Stereoscopy
    3.1.3 Depth Estimation by Computational Imaging
  3.2 Depth From Defocus
  3.3 Depth From Axial Chromatic Aberration
    3.3.1 Axial Chromatic Aberrations
    3.3.2 Depth Estimation
      3.3.2.1 Blur Measures Methods
      3.3.2.2 Comparison of Blur Measures
      3.3.2.3 Contrast Independent Blur Measure
      3.3.2.4 Depth Estimation from Blur Measures
    3.3.3 Analysis of Depth Errors
      3.3.3.1 Practical Issues in DFCA
      3.3.3.2 Theoretical Issues in DFCA
    3.3.4 Field Dependent Depth Correction
  3.4 Dense Depth Map
    3.4.1 Dense Depth Map by Segmentation
    3.4.2 Dense Depth Map by Optimization
  3.5 Results and Discussion
  3.6 Conclusion and Outlook
4 Extended Depth of Field from Chromatic Aberrations
  4.1 Introduction
  4.2 Related Work
  4.3 Extended DOF Using Axial Chromatic Aberration
    4.3.1 Depth Dependent Deconvolution
    4.3.2 Sharpness Transport Across Color Channels
    4.3.3 Spectral Focal Sweep
    4.3.4 Eliminating Color Difference
  4.4 RGBW Sensor with Chromatic Aberrations
  4.5 Low Cost and Efficient Implementation of Proposed Algorithm
    4.5.1 Relative Blur Estimate
    4.5.2 Adaptive High-Pass Filtering
    4.5.3 Deconvolution
    4.5.4 Contrast Dependent Sharpness Transport
  4.6 Results and Discussion
  4.7 Physical and Practical Limitations
    4.7.1 Narrowband Object Reflectance Spectra
    4.7.2 Loss of Contrast at Higher Frequencies
  4.8 Conclusion and Outlook
5 Optimal Lens and Sensor Characteristics for Depth and Extended DOF using Chromatic Aberrations
  5.1 Axial Resolution of Depth from Defocus
    5.1.1 Depth Resolution of DFD using Two Images
  5.2 Optimal Parameters for Depth Estimation
    5.2.1 Focal Length and F-number
    5.2.2 Chromatic Focal Shift
    5.2.3 Sensor Resolution
    5.2.4 Spectral Response of Color Filter Arrays
  5.3 Optimal Parameters for Extended DOF
  5.4 Conclusion
6 Conclusion and Outlook
  6.1 Summary
  6.2 Conclusion
  6.3 Outlook
Bibliography
List of Publications
List of Figures

2.1 Processing flow of the digital camera simulation.
2.2 Physical image formation process.
2.3 A larger aperture shows a narrower depth of field as compared to a small aperture.
2.4 A color filter array (Bayer pattern) filters the light before it hits the photo sensors.
2.5 The right image shows the depth dependent blur applied to the input image according to the gathering method. The dark and the bright shadows can be seen around the boundaries. The input image (left) and its corresponding depth map (middle) are also shown in the figure.
2.6 The process flow of the proposed scattering method. P_i and W_i represent the basis functions (eigenPSFs) and the weights, respectively. The symbol '*' represents the convolution.
2.7 The left and right images are blurred according to the gathering and scattering approach, respectively. The scattering result is visually better than gathering as there is a smooth blur around object boundaries.
2.8 Foreground image along with its depth map.
2.9 Background image along with its depth map.
2.10 The plots show the results of the gathering and scattering methods for the one dimensional case. A correct blur is applied on the boundary with the information of the background signal.
2.11 The image and its two-layer depth map.
2.12 The image simulated with the gathering (left) and scattering, with (middle) and without (right) background blending, methods.
2.13 The image is blurred with the Gaussian filter. The standard deviation of the filter changes with the depth.
2.14 SSIM value computed between the reference image (exhaustive blurring) and PCA based filtering with 5 (left) and 3 (right) basis PSFs.
2.15 Image (right) shows the result of distortions and relative illumination applied to an ideal image (left).
2.16 Simulated real lens image (right) along with the original image (left).
2.17 Cropped region shows the input image (left) and the result of the optical simulations (right). The effect of optical blur and chromatic aberrations is visible with the optics simulation.
2.18 The combined MTF of sensor detector footprint and sampling is shown for different pixel sizes.
2.19 Spectral sensitivity of color filter arrays.
2.20 Digital image post processing chain.
3.1 Image formation by a thin lens approximation.
3.2 Dispersion of white light into monochromatic light after passing through a prism.
3.3 Refractive index of glass (BK7) and plastic (polycarbonate) materials.
3.4 Blur diameter for RGB colors which focus at different distances due to chromatic aberrations.
3.5 The image with and without chromatic aberrations.
3.6 A step edge blurred with Gaussian blur of varying standard deviation.
3.7 Normalized blur values of different types of blur measures.
3.8 Normalized blur values of different blur measure methods plotted with the RMSE for low CNR.
3.9 Flow diagram of the depth from chromatic aberrations algorithm.
3.10 Ratios of blur measures with different combinations of colors.
3.11 Two images captured under strong red light (top left) and white light (bottom left) illumination. The images with white color correction are shown on the right.
3.12 An example of color filters with broadband spectral responses and the reflectance spectrum of an object with strong red content.
3.13 Edge profile of color channels for red and white light illumination.
3.14 Estimated depth for smooth edges is removed by applying the condition given in equation 3.17.
3.15 Left: Sagittal and tangential orientations are shown as black and gray edges, respectively. Right: A PSF which produces different blur for different orientations of the edges.
3.16 Field dependent depth correction of the estimated depth map from the DFCA algorithm.
3.17 Relative depth measured for different lenses at different image locations.
3.18 Field dependent depth correction of the estimated depth map from the DFCA algorithm using only one image for the calibration.
3.19 Dense depth map generated by the optimization based method followed by joint bilateral upsampling.
3.20 The dense depth map generated by the segmentation and optimization based methods.
3.21 (a) Simulated image with chromatic aberrations, (b) ground truth depth map used for testing the DFCA algorithm, (c) depth map generated with the algorithm described in section 3.3.2 (depth is estimated only at edges, and given in mm).
3.22 RMSE between true depth and estimated depth at different distances and differences of contrast between colors.
3.23 Depth estimation from axial chromatic aberrations for real captured scenes. First row: input images captured with a lens having large chromatic aberrations. Second row: raw depth estimation using the algorithm proposed in this work. Third row: dense depth after propagating the raw depth to the surroundings.
3.24 Depth estimated with the DFCA camera for differently colored objects. (a) Original image, (b) DFCA depth at edges only, and (c) DFCA depth after propagating to neighboring regions. For comparison the depth from a ToF camera (d) is also shown. All depth maps are shown in cm.
4.1 Blur diameter versus distance for different focal lengths and lens focus positions.
4.2 Left: Blur diameter for RGB colors in case of axial chromatic aberrations. Right: Minimum blur diameter among all colors at each distance.
4.3 MTF for a spatial frequency of 90 line pairs per millimeter [lp/mm] versus distance.
4.4 (a) Spectral response of the color filter array, (b) chromatic focal shift in the visible range of the spectrum and (c) the blur diameter of red, green and blue colors calculated by averaging the blur diameter of all wavelengths according to the color filter array spectral response.
4.5 Processing flow of the extended DOF algorithm using the RGBW sensor.
4.6 Weights versus local edge contrast to reduce the strength of sharpness transport at low contrast levels.
4.7 Simulation of optical and sensor properties.
4.8 Simulation results of a lens without (a,c,e) and with chromatic aberrations (b,d,f).
4.9 Simulation results of a lens without chromatic aberrations (a,c,e) representing a conventional lens. The images (b), (d) and (f) show the extended DOF result generated through the proposed extended DOF algorithm.
4.10 Chromatic aberrations are corrected from the image that is captured with the lens exhibiting large chromatic aberrations.
5.1 Depth resolution versus object distance for different focal lengths of the lens.
5.2 Normalized depth parameter versus object distance for different sensor resolutions.
5.3 Blur diameter is less than the circle of confusion for the complete range of depth of field.
List of Tables

2.1 Execution time for exhaustive filtering and PCA based filtering.
2.2 Execution time for PCA based scattering filtering.
3.1 Absolute depth estimates from the DFCA camera for the images shown in figure 3.24. The error with respect to the depth from ToF is also given in percentage.
Chapter 1
Introduction
1.1 Motivation
Traditional cameras capture the visible light spectrum onto a photographic film. Usually, a lens is used to focus a portion of the incoming light from a scene onto the film. Digital cameras use an electronic sensor to capture and store the light information in digital format. Although traditional digital cameras have been in use for a long time, some aspects of photography remain limited, for example the dynamic range, the loss of 3D information, and the fixed focus and depth of field. To overcome the limitations of traditional cameras, novel and unconventional imaging devices are designed to produce enhanced and meaningful images beyond what traditional cameras can deliver. This has led to a new emerging field called "computational photography".
Computational photography is able to provide features that are highly desirable and beyond conventional imaging. In most cases, however, the camera design is too complex, which makes the camera expensive. Additionally, there is always a tradeoff between conventional imaging quality and the new features added through low cost computational photography. These challenges directly motivate the investigation of low cost computational imaging methods that can provide additional features while still producing acceptable imaging quality.
The field of computational photography requires a modification of the conventional camera design; therefore, it is always desirable to evaluate the camera system performance during the design phase. This helps in a co-design of optics, sensor and digital processing. Hence, a digital camera simulator is required to design and evaluate the computational camera.
In this work, a low cost computational photography method is investigated for two applications: depth imaging and extended depth of field (DOF) imaging. Specifically, enhanced axial chromatic aberrations are introduced in the lens to estimate the depth from color defocus, and an extended DOF image is produced by correcting the chromatic aberrations (CA) according to the focused color. Another aim of this work is to develop a digital camera simulator which can efficiently simulate optics for a 3D scene.
1.2 Overview
Computational photography may be categorized into two groups: one where multiple conventionally photographed images, taken with different camera settings, are fused together to extract the desired features, and another where the camera optics or sensor is no longer conventional, but modified to achieve the desired functionality in combination with digital processing. Examples of the first group are extended DOF through focus stacking and high dynamic range imaging through exposure bracketing. Examples of the second group are wavefront coding, coded aperture photography, the light field camera, etc.
The work in this thesis belongs to the second group, as a conventional lens is designed to deliberately contain enhanced axial chromatic aberrations, followed by digital processing to retrieve the depth and an extended DOF image. A detailed overview of the current thesis and its related work is given in the next sections.
1.2.1 Simulation of Physical Image Formation
A few image simulation tools have already been developed to simulate different modules of a camera system. The most comprehensive simulator is the Image Systems Evaluation Toolkit (ISET) [11]. ISET is a software package that simulates the capture and the processing of a scene. It allows users to control the physical characteristics of a scene and to simulate the optics, sensor and image processing pipeline. One of the limitations of ISET is the simulation of the optics, where only a single image plane is simulated, which means that occlusions are not simulated.
With the growth of the field of computational photography, more complex lenses are designed, like wavefront coded lenses [9]. To simulate these kinds of complex lenses along with traditional lenses, an efficient and flexible optical simulation module is required. In [26] and [6], the lens is integrated into the digital simulation with a spatially varying point spread function (PSF). In these methods, each pixel of the image is convolved with its own PSF, or the image is divided into small regions where the PSF is considered to be constant. These methods produce more realistic lens blur for a 2D image plane, but they are computationally expensive for larger images. Moreover, the current methods only simulate one plane of a scene, and none of them considers simulating the lens in a 3D space.
In this work, a simulation of the physical image formation process is presented that covers the optics, sensor and digital processing modules. The main focus of the simulator is the simulation of the optics, due to the limitations of the state of the art methods. Generally, the optical transfer function of a lens varies spatially across the field of view and also longitudinally for objects at varying distances, which makes the simulation computationally complex. An algorithm is presented in this work to simulate the lens blur, which achieves a substantial reduction in computational complexity without significantly sacrificing accuracy. Moreover, occlusions are also modeled quite accurately for synthetic scenes.
1.2.2 Depth From Chromatic Aberrations
Depth estimation refers to algorithms which aim to estimate the distance of the objects in a scene from the camera. The distance and geometry information is lost in conventional imaging methods, because a set of rays from a scene is projected onto a 2D plane instead of a 3D space.
There are many different approaches to estimate the lost depth information. These techniques can be categorized mainly into two classes, active and passive methods. Examples of these methods are depth from defocus, stereoscopy, depth from time of flight, etc. Recently, many computational imaging approaches have been presented to estimate the depth. Ng et al. [30] presented the method of light field capture through a plenoptic camera, which also yields the depth information. Another method for depth estimation was proposed by Zhou et al. [45], where a diffuser is inserted between the scene and the camera to code the depth information in the defocus blur. The method is similar to depth from defocus, but a higher depth accuracy can be achieved with a smaller aperture.
Several other methods based on aperture coding have also been proposed to estimate the depth. Bando et al. [3] suggested the use of a color filtered aperture to shift the color images, and hence to estimate the depth from the disparity between the color images. One of the limitations of these methods is the loss of light. Chakrabarti et al. [5] modified the aperture to generate a varying depth of field for different colors and infer the depth from the blur difference between the color images.
This thesis also investigates a computational imaging method using axial chromatic aberrations to estimate the depth. The idea of estimating depth information using chromatic aberrations was first proposed by Molesini [27] in 1984. In 1994, Tiziani and Uhde [39] proposed a chromatic confocal microscope for 3D image sensing. For conventional photography, Garcia et al. [12] proposed to use chromatic aberrations for depth estimation and autofocusing. Although the method of extracting depth using CA is quite old, there is still no study of this method for depth estimation of natural scenes. Moreover, all previous methods have only shown the potential of the method through limited experiments.
In the present thesis, the method of depth from CA is investigated comprehensively. An algorithm is proposed to estimate the depth of natural scenes. A calibration procedure is developed to compensate the field varying depth estimation, which mainly occurs due to optical aberrations and manufacturing tolerances. Finally, the limitations of the approach are discussed.
1.2.3 Extended Depth of Field Using Chromatic Aberrations
Depth of field represents the distance range in a scene for which the captured image is considered to be in focus, i.e. the amount of blur introduced by the optics is not perceivable under normal viewing conditions. In some cases, it is desirable to have a larger depth of field. In other cases, a narrow depth of field is helpful in emphasizing the desired object in a scene.
One of the earliest approaches, proposed by Häusler [15], is to make the blur of the object invariant to depth, and to restore the sharp image through digital processing. He obtained the depth invariant blur by moving the object along the optical axis during the exposure time of the camera.
A very common approach to achieving an extended DOF is focus stacking. In this method, multiple images are taken with different focus distances and combined digitally to extend the DOF. For each image position, the sharpest pixel among the multiple images is selected.
A well known method in the field of computational photography, proposed by Dowski et al. [9], is wavefront coding. Here, the pupil function is modified through phase modulation by inserting a non-absorbing optical element like a cubic-phase or cosine form-phase mask. It is then possible to obtain a PSF which is insensitive to defocus. In a second step, standard deconvolution with only one digital filter is used to restore the image. The advantage of this method is that there is no light loss, but it suffers from a loss of SNR, as the PSF is spread over a larger area to make it insensitive to defocus. A depth invariant PSF can also be achieved through polarization separation, by placing a birefringent plate between the lens and the sensor [44]. The plate is designed such that the two polarization states contain the in-focus far and in-focus near field information of a scene and are superimposed on each other to form the image. In this case, digital restoration is also required. All these methods have a complex optical design or complex post-processing, and moreover, they suffer from a loss of SNR.
Some alternative approaches are based on color separation and are more closely related to the current thesis work. Guichard et al. [14] proposed to utilize the chromatic aberrations to capture color images with different focus positions. As different colors appear sharp at different distances, digital processing transfers the sharpness information of the sharpest color to the other colors to create an extended DOF image. In other words, the proposed sharpness transport technique takes advantage of the spectral information redundancy inherent in images to recover information that has been lost due to chromatic blurring effects. The advantage of this method is the use of conventional optics without any light loss or degradation of SNR. However, the digital processing required to remove all chromatic aberrations is quite challenging. The method trades off the extension of the DOF against the loss of high chrominance frequencies. An alternative solution is proposed by Kay et al. [17], who use a color aperture which stops the blue light to create a larger depth of field for the blue color image. The other colors are then made sharper using the sharpness information of the blue color image. The methods proposed by Bando [3] and Kim [18] code the disparity information in the color images through a color filter aperture. The extended DOF image is then produced through deconvolution based on the estimated depth.
If a lens which exhibits axial chromatic aberrations is used with a black and white (B&W) sensor, it produces a depth invariant PSF, as shown by Cossairt et al. [8]. For color images, it is shown that the luminance of the image has a depth invariant PSF.
In this work, the methods of [14] and [8] are combined using a sensor that captures monochromatic light along with the RGB colors. Moreover, on the digital side, robust algorithms are proposed to produce high quality color images.
1.2.4 Optimal Camera Parameters Selection
A depth from chromatic aberrations system estimates the depth from the relative defocus blur between two defocused color images. Since the amount of defocus blur depends on the lens parameters, the selection of an optimal lens for the required depth performance is important. In the past, only a few papers have addressed this issue. Blayvas et al. [4] derived a formula relating the depth resolution to the camera parameters. Schechner et al. [33] computed the optimal axial interval between two defocused images to estimate the depth reliably. Subbarao [38] analyzed the noise sensitivity of the depth from defocus method for a specific spatial domain approach.
In this work, the relationship between the camera parameters and the performance of depth estimation and extended DOF imaging is analyzed. An equation is derived relating the lens parameters to the depth resolution. The effect of the sensor color filter arrays is studied specifically for the depth from CA method. Criteria are given for the selection of optimal camera parameters for the desired depth requirements. Moreover, the optimal amount of chromatic aberrations required for a certain focal length to obtain an extended DOF image is also formulated.
1.3 Outline
The presented work in this thesis has four main goals.
• Develop a digital camera simulator that can accurately simulate the optics in a 3D space with low computational complexity.
• Develop a low cost algorithm to estimate the depth of natural scenes using the axial CA. Study the limitations of the system under different imaging conditions.
• Establish a camera system for extended DOF imaging using axial CA. Develop an algorithm that can efficiently reduce the color artifacts which appear due to CA.
• Develop criteria for selecting the optimal camera parameters for depth and extended DOF imaging using CA.
The content of this work is organized as follows:
Chapter 2 A digital camera simulator is described in this chapter. The simulator models the optics, sensor and digital processing steps. An algorithm is presented which simulates the optics in a 3D space. A simple and effective method is proposed which blends multiple layered images to simulate occlusions. The accuracy and computational time are compared with an exhaustive filtering approach. Sensor characteristics such as noise, color filter response and sensor MTF are also modeled in the sensor simulation to make a complete digital camera simulator.
Chapter 3 This chapter discusses the depth from CA method, and starts with an overview of state of the art depth imaging methods. The basics of the depth from defocus method are discussed in detail, as it is the basis of depth from CA. Different blur measures are compared, and a new blur measure is proposed for depth from CA which is independent of the varying image contrast. Theoretical and practical limitations of the depth from CA method are also discussed. For low cost imaging, where the field dependent aberrations are relatively large, a calibration method is proposed to estimate the correct depth at any field position. As the defocus blur, and hence the depth, can be measured only in textured areas, some state of the art methods are used to generate dense depth maps. Some low cost solutions are described to reduce the computational complexity of these methods.
Chapter 4 In this chapter, a combination of two state of the art methods for extended DOF is proposed by using an RGB sensor with an extra panchromatic pixel. Moreover, an efficient and low cost algorithm is implemented to correct color bleeding artifacts.
Chapter 5 The effects of the optical and sensor parameters on the depth and extended DOF performance are studied here. Based on the derived relationships, criteria are discussed to select the optimal camera parameters.
Chapter 6 A detailed summary of the thesis work is given in this chapter. Moreover, some tasks are described as future work to make the outcome of this thesis more robust.
1.4 Contribution
The novel contributions of this thesis are as follows:
• A low cost, spatially varying simulation of the lens defocus blur.
• A simple and efficient solution for simulating the blur at occlusion boundaries.
• A local contrast independent blur measure and normalized ratios between the blur measures to estimate the depth from defocused color images. An analysis of the limitations of depth from CA under different imaging conditions.
• A simple calibration procedure to correct the field varying behavior of the estimated depth.
• A combination of two state of the art methods using an RGBW sensor to produce an extended DOF image. The proposed method reduces the shortcomings of the original methods.
• An analysis of the effect of the lens and sensor parameters on the depth performance, based on which criteria are defined to select optimal camera parameters.
Chapter 2
Photo-realistic Simulation of Physical Image Formation Process
2.1 Introduction
To overcome the limitations of traditional cameras, novel and unconventional imaging devices are designed to produce enhanced and meaningful images beyond what traditional cameras can deliver. This has led to a new emerging field called "computational photography". However, with the progress in the field of computational photography, it is becoming difficult to evaluate the camera system performance during the design phase. In this chapter, an image simulator is presented that can be used to evaluate the performance of computational cameras.
Figure 2.1: Processing flow of the digital camera simulation.
The simulator deals with the spatially varying blur in 3D space to overcome the main challenge of an accurate and fast simulation of complex optical designs, especially in the field of computational photography. The simulation tool also includes effects of the sensor, like noise, sensor MTF and color filter array (CFA) sampling. Post-processing algorithms relevant to specific cameras, e.g. deconvolution for a wavefront coding system or depth estimation from image defocus, are also incorporated into the post-processing chain.
The main blocks of the simulator are shown in figure 2.1. Lens data is retrieved from an optical design software and fed into the optical processing block, where lens distortions, relative illumination and optical blur are simulated for a given all-in-focus input image. Sensor noise, sensor MTF and image sampling to the sensor resolution, according to the color filter array, are performed in the sensor simulation block. Finally, the desired functionalities of the camera are tested through specific post-processing algorithms.
Figure 2.2: Physical image formation process.
2.2 Physical Image Formation
The physical image formation process consists of two parts: the geometry of the image formation and the physics of light. The former determines the position and the latter the brightness of the projection of light onto the image plane. A simple model is shown in figure 2.2, where a scene is illuminated by a light source, the objects in the scene reflect the light towards the camera, and the lens in the camera focuses the light onto the image sensor, which captures the light information and converts it into a digital image.
Figure 2.3: A larger aperture shows a narrower depth of field as compared to a small aperture.
In cameras, an aperture is used to control the amount of light acquired. If the aperture opening is as small as a pinhole, a completely sharp image is formed. In practice, however, a wide aperture is used to capture more light, but in this case not all of the rays of light focus on the image plane, because conventional camera lenses are designed to focus accurately at only one distance. Therefore, objects closer or farther than the focused distance appear blurred, and a point in the scene spreads in the image plane, forming a circle of confusion (CoC). The CoC defines the depth of field (DOF), the range of object distances that appear acceptably sharp in the image. For a smaller aperture, the DOF is larger compared to a wider aperture, because the wider aperture collects rays bent at larger angles. The DOF effect for two different aperture sizes is shown in figure 2.3.
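To make the relation between aperture, focus distance and blur concrete, the following sketch computes the geometric blur diameter from the paraxial thin-lens model; the helper name and the example values are assumptions for illustration only and are not part of the simulator.

```python
import numpy as np

def blur_diameter(obj_dist, focus_dist, focal_length, f_number):
    """Geometric blur (circle of confusion) diameter on the sensor, in the
    same units as focal_length, using the paraxial thin-lens model."""
    aperture = focal_length / f_number                                   # aperture diameter
    v_focus = focal_length * focus_dist / (focus_dist - focal_length)    # image distance of the focused plane
    v_obj = focal_length * obj_dist / (obj_dist - focal_length)          # image distance of the object
    return aperture * np.abs(v_obj - v_focus) / v_obj                    # similar-triangle blur spot

# Example (assumed values): f = 4 mm, F/2.4 lens focused at 1 m, object at 0.4 m
print(blur_diameter(400.0, 1000.0, 4.0, 2.4))   # roughly 0.01 mm
```

A point inside the DOF is one for which this diameter stays below the acceptable circle of confusion for the chosen viewing conditions.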
A digital image sensor consisting of an array of pixels captures photons (light) and converts them into an electrical signal. There are two kinds of image sensors, CMOS and CCD, and the difference between the two lies only in the readout process. Figure 2.4 shows a CMOS sensor consisting of an array of pixels. In addition, a color filter array (CFA) is needed in front of the photo sensors to differentiate between colors. The most popular CFA is the Bayer pattern (shown in figure 2.4), which captures the three primary colors red, green and blue (RGB). In the process of image capture, noise is also added, mainly due to the random arrival nature of photons and fluctuations in the readout of the electrical signal.
Figure 2.4: A color filter array (Bayer pattern) filters the light before it hits the photo sensors.
2.3 Motivation
Traditionally, each element of a camera system is designed and tested individually. Therefore, before manufacturing, it is quite difficult to evaluate the performance of each module after integrating them into a complete imaging system. If all modules of a system behave linearly, then the performance is predictable. However, for non-linear and highly adaptive algorithms, the final image quality can hardly be predicted. Hence, evaluating the performance of an individual module separately does not guarantee optimal system performance.
In most computational photography applications, the image is optically coded and then computationally decoded. This results in unconventional imaging properties which are very difficult to predict before manufacturing the lens. Lens designers widely use optical design programs (e.g. Zemax and CODE V) to design and analyze optical systems. The software allows the user to analyze the performance of a lens by different means, e.g. the modulation transfer function, distortion, chromatic focal shift, etc. However, the final image cannot be visualized, and especially the behavior of the optical blur at occlusion boundaries cannot be predicted before manufacturing the lens.
2.4 Related Work
In the past, some work has been done on designing image simulation tools which allow simulating the different modules of a camera system. One of these simulation tools is the Image Systems Evaluation Toolkit (ISET) [11]. ISET is a software package that simulates the capture and the processing of a scene. It allows users to control the physical characteristics of a scene and to simulate the optics, sensor and image processing pipeline. One of the limitations of ISET is the simulation of the optics, where only a single image plane is simulated, which means that occlusions are not simulated.
With the growth of the field of computational photography, more complex lenses are designed, like wavefront coded (WFC) lenses [9]. To simulate these kinds of complex lenses along with traditional lenses, an efficient and flexible optical simulation module is required. In [26] and [6], the lens is integrated into the digital simulation with a spatially varying PSF. In these methods, each pixel of the image is convolved with its own PSF, or the image is divided into smaller regions where the PSF is considered to be constant. These methods produce more realistic lens blur for a 2D image plane, but they are computationally expensive for larger images. Moreover, the current methods only simulate one plane of a scene, and none of them considers simulating the lens in 3D space.
In this chapter, the simulation of the physical image formation process is presented. Generally, the optical transfer function of a lens varies spatially across the field of view and also longitudinally for objects at varying distances, which makes the simulation computationally complex. An algorithm is presented in this chapter which achieves a substantial reduction in computational complexity without significantly sacrificing accuracy in simulating the lens blur. Moreover, occlusions are also modeled quite accurately. The chapter starts with the algorithm for spatially varying lens blur in section 2.5. The work flow of the complete optical simulation, including distortion and relative illumination, is presented in section 2.6. Finally, the sensor and the digital simulation are discussed in sections 2.7 and 2.8, respectively.
2.5 Simulating Spatially Varying Lens Blur
The response of an optical system to a point source is described by the point spread function (PSF). Optical design programs provide the PSF of a lens, which models different optical effects, e.g. defocus blur, chromatic aberrations and vignetting. In general, the PSF varies for each point in space due to optical aberrations. Therefore, calculating the PSF for each location and using it for the simulation is not a practical approach for a large number of pixels. To reduce the computational complexity, PSFs can be sampled at different points in space and modeled by a weighted summation of basis point spread functions. Missing PSFs can then be approximated through interpolation of the weights. In the following sections, the algorithms are discussed in detail for 2D plane and 3D space variant filtering.
2.5.1 Simulating Optics for a 2D Object Plane
The output of a lens can be represented by a convolution of a pinhole image $I_{ideal}(x, y)$ and a PSF $P(x, y)$,
$$I(x, y) = \int\!\!\!\int_{-\infty}^{\infty} I_{ideal}(u, v)\, P(x - u, y - v)\, du\, dv, \qquad (2.1)$$
where $x, y$ are the output image coordinates and $u, v$ the source image coordinates. For a spatially varying PSF, the output is described as a space variant convolution integral,
$$I(x, y) = \int\!\!\!\int_{-\infty}^{\infty} I_{ideal}(u, v)\, P(x, y, u, v)\, du\, dv. \qquad (2.2)$$
If all objects in a scene are considered to be in a single plane, the PSF varies only in the $x$ and $y$ directions inside that plane. To make it more realistic, we need the simulation of an image for a 3D space. Before discussing the details of the 3D simulation, the method that is used in this work for applying the space variant blur is discussed.
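As a point of reference for the fast approximation that follows, a direct discretization of equation 2.2, where every output pixel gathers its neighborhood with its own PSF, might look like the sketch below; the helper name and interface are illustrative assumptions, and the loop structure makes clear why the exhaustive approach is impractical for large images.

```python
import numpy as np

def space_variant_blur_exhaustive(img, psf_at):
    """Direct evaluation of the space variant convolution (equation 2.2).

    img    -- 2D float array (the ideal pinhole image)
    psf_at -- callable (x, y) -> 2D PSF kernel (odd size, sums to 1) at that pixel
    """
    h, w = img.shape
    out = np.zeros_like(img)
    for y in range(h):
        for x in range(w):
            k = psf_at(x, y)
            r = k.shape[0] // 2
            # clip the gathered neighborhood at the image borders
            y0, y1 = max(y - r, 0), min(y + r + 1, h)
            x0, x1 = max(x - r, 0), min(x + r + 1, w)
            patch = img[y0:y1, x0:x1]
            kk = k[r - (y - y0): r + (y1 - y), r - (x - x0): r + (x1 - x)]
            out[y, x] = np.sum(patch * kk)
    return out
```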
2.5.1.1 Fast Approximation of Space Variant Convolution
For a large number of pixels, the space variant convolution is computationally complex, mainly for two reasons: first, a large number of PSFs has to be generated and stored in memory, and second, processing each pixel with its own PSF takes time.
The simplest method that can be used for complexity (in other terms, data dimension) reduction is to divide the image into different sections and to assume a constant PSF inside each section. However, the minimum number of sections required for fast processing and acceptable blur accuracy strongly depends on how the PSF varies across the image.
An attractive method for reducing the dimensions of the data (in our case, the PSFs) is to project the data to lower dimensions while preserving as much information as possible. This is exactly what is performed in principal component analysis (PCA), where a set of basis functions is computed that minimizes the squared error in the reconstruction of the original data. Hence, the spatially varying PSF can be represented by the weighted summation of basis PSFs as
$$P(x, y, u, v) = \sum_{i=1}^{N} w_i(x, y)\, p_i(u, v), \qquad (2.3)$$
where $p_i$ are the basis PSFs, $w_i$ are the corresponding weights computed through PCA, and $N$ is the number of basis PSFs that contribute most to the reconstruction of the actual PSF. Now equation 2.2 can be written as
$$I(x, y) = \int\!\!\!\int_{-\infty}^{\infty} I_{ideal}(u, v) \sum_{i=1}^{N} w_i(x, y)\, p_i(x - u, y - v)\, du\, dv, \qquad (2.4)$$
$$I(x, y) = \sum_{i=1}^{N} w_i(x, y) \int\!\!\!\int_{-\infty}^{\infty} I_{ideal}(u, v)\, p_i(x - u, y - v)\, du\, dv. \qquad (2.5)$$
Equation 2.5 represents the space variant convolution computed as $N$ conventional convolutions that are summed up according to the weights of each location in space.
If only a subset of the basis PSFs is used to reconstruct the original PSFs, the energy of the PSFs is not preserved. As a result, some overshoots and undershoots may occur in the final image. However, these can be corrected by generating normalization values through blurring an image containing all pixel values equal to one with the same basis PSFs as are used to blur the image,
$$W_n(x, y) = \sum_{i=1}^{N} w_i(x, y) \int\!\!\!\int_{-\infty}^{\infty} p_i(x - u, y - v)\, du\, dv, \qquad (2.6)$$
$$I(x, y) = \frac{I(x, y)}{W_n(x, y)}. \qquad (2.7)$$
Although PCA significantly reduces the data dimensionality and hence the computational complexity, a large number of PSFs still has to be generated and processed for the PCA computation. One solution to reduce the number of PSFs is to sample them at different points in space and then approximate the missing PSFs through interpolation. Since PCA represents the PSF as a linear combination of basis PSFs, it is simpler to approximate the missing PSFs through interpolation of the weights $w$, corresponding to the basis PSFs, over the entire image plane. The accuracy of the PCA based reconstruction of the actual PSFs with and without sampled data will be given in section 2.6.1.
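A minimal sketch of equations 2.5 to 2.7 with NumPy/SciPy could look as follows: the image is convolved once with each basis PSF, the results are combined with the per-pixel weight maps, and the output is divided by a weight image blurred in the same way. The function and variable names are illustrative assumptions rather than the simulator's actual interface.

```python
import numpy as np
from scipy.signal import fftconvolve

def pca_space_variant_blur(img, basis_psfs, weight_maps, eps=1e-8):
    """Gathering-style space variant blur via N conventional convolutions
    (equation 2.5) with energy normalization (equations 2.6 and 2.7).

    img         -- 2D float array, the ideal (all-in-focus) image
    basis_psfs  -- list of N 2D kernels p_i (the eigenPSFs)
    weight_maps -- list of N 2D arrays w_i, same shape as img
    """
    out = np.zeros_like(img)
    norm = np.zeros_like(img)
    ones = np.ones_like(img)
    for p_i, w_i in zip(basis_psfs, weight_maps):
        out += w_i * fftconvolve(img, p_i, mode="same")    # equation 2.5
        norm += w_i * fftconvolve(ones, p_i, mode="same")  # equation 2.6
    return out / (norm + eps)                              # equation 2.7
```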
2.5.1.2 Computing Basis Point Spread Functions
Principal component analysis is a standard method to compute the basis functions, called principal components. The eigenvectors of the covariance matrix of the source PSFs represent the basis PSFs (hereafter the basis functions for the PSFs are called basis PSFs). One has to consider how many basis PSFs are sufficient to reconstruct the source PSFs without significant error. This can easily be determined from the eigenvalues, which represent the distribution of the source PSF energy among the eigenvectors. More details about PCA may be found in [37].
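A sketch of this step, assuming the sampled PSFs are stacked into equally sized kernels, might compute the basis PSFs and their weights with an SVD, which is equivalent to the eigendecomposition of the covariance matrix mentioned above; all names are illustrative.

```python
import numpy as np

def compute_basis_psfs(psf_stack, energy=0.99):
    """Compute basis PSFs (eigenPSFs) and per-sample weights from sampled PSFs.

    psf_stack -- array of shape (M, k, k): M sampled PSFs of size k x k
    energy    -- fraction of the PSF variance the retained basis must preserve
    Returns (basis_psfs, weights, mean_psf) with basis_psfs of shape (N, k, k).
    """
    m, k, _ = psf_stack.shape
    X = psf_stack.reshape(m, k * k)
    mean = X.mean(axis=0)
    Xc = X - mean
    # right singular vectors of the centered data = eigenvectors of the covariance
    _, s, vt = np.linalg.svd(Xc, full_matrices=False)
    var = s ** 2
    n = int(np.searchsorted(np.cumsum(var) / var.sum(), energy)) + 1
    basis = vt[:n]                 # (N, k*k) basis PSFs as row vectors
    weights = Xc @ basis.T         # (M, N) coefficients for each sampled PSF
    return basis.reshape(-1, k, k), weights, mean.reshape(k, k)
```

The weights returned for the sampled positions would then be interpolated over the image plane to approximate the missing PSFs, as described in section 2.5.1.1.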
2.5.2 Simulating Optics for a 3D Object Space
Simulating the optics for a single object plane is rather simple, but one cannot analyze the effects of occlusion or the perception of depth of field. In this section, the extension of the 2D simulator to a complete 3D simulation of the lens is presented.
Figure 2.5: The right image shows the depth dependent blur applied to the input image according to the gathering method. The dark and the bright shadows can be seen around the boundaries. The input image (left) and its corresponding depth map (middle) are also shown in the figure.
Equation 2.2 describes the space variant convolution, which can be considered as gathering the light from the surrounding pixels according to the PSF. For a 3D space, the space variant filtering is described as
$$I(x, y) = \int\!\!\!\int_{-\infty}^{\infty} I_{ideal}(u, v)\, P(x, y, z, u, v)\, du\, dv. \qquad (2.8)$$
Now the PSF also depends on a third variable $z$, which represents the depth value. The depth information can be obtained from the depth map associated with the image. Figure 2.5 shows the image after applying the depth dependent blur according to equation 2.8. The input image and its depth map are also shown in the figure. The result does not look very good, as we can see dark and bright shadows around the object boundaries. The artifacts are stronger when the blur amount changes rapidly from one object to the other, e.g. around the boundaries of focused objects in front of defocused objects.
In real imaging, the light from a point is scattered to the surrounding pixels according to the PSF of that point. This phenomenon is different from the conventional space variant filtering and can be described as
$$I(x, y) = \int\!\!\!\int_{-\infty}^{\infty} I_{ideal}(u, v)\, P(u, v, x - u, y - v, z)\, du\, dv. \qquad (2.9)$$
In contrast to the gathering method (equation 2.8), in the scattering method (equation 2.9) the PSF of each neighboring pixel contributes to the output of the center pixel. For this reason, equation 2.9 is computationally more complex than equation 2.8. The scattering method was presented by Kosloff [21] to simulate the depth of field effect in images. However, only specific types of PSFs are used for blurring, and special filtering algorithms are used to speed up the processing.
A minor modification of the PCA based method for space variant filtering discussed in section 2.5.1.1 implements the scattering method. If the image is multiplied with the weights of each basis PSF before convolving the image with the basis PSF, we obtain the solution of the scattering equation 2.9. The scattering method using PCA based filtering is described as
$$P(x, y, z, u, v) = \sum_{i=1}^{N} w_i(x - u, y - v, z)\, p_i(u, v),$$
$$I(x, y) = \sum_{i=1}^{N} \int\!\!\!\int_{-\infty}^{\infty} w_i(u, v, z)\, I_{ideal}(u, v)\, p_i(x - u, y - v)\, du\, dv. \qquad (2.10)$$
Since there is only a single depth value $z$ for each location $u, v$, the weights $w$ are first multiplied with the image $I_{ideal}$, followed by the convolution with the basis PSFs. This method also results in overshoots and undershoots of the intensities at the boundaries of rapidly changing depth. This problem can be solved by the same normalization approach discussed in section 2.5.1.1: the image is divided by the normalization values, generated according to equation 2.10. Figure 2.6 shows the process flow diagram of the proposed scattering method. The symbol '*' represents the convolution operation.
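Under the same illustrative naming as the earlier sketch, the scattering variant of equation 2.10 could be written as follows: the image is multiplied by the per-pixel weight maps first and convolved afterwards, and the normalization of section 2.5.1.1 is applied to the result.

```python
import numpy as np
from scipy.signal import fftconvolve

def pca_scattering_blur(img, basis_psfs, weight_maps, eps=1e-8):
    """Scattering-style space variant blur (equation 2.10): each pixel is
    weighted first and then spread by the basis PSFs; the result is
    normalized as in section 2.5.1.1.

    weight_maps -- list of 2D arrays w_i(u, v, z), evaluated at each pixel's depth
    """
    out = np.zeros_like(img)
    norm = np.zeros_like(img)
    for p_i, w_i in zip(basis_psfs, weight_maps):
        out += fftconvolve(w_i * img, p_i, mode="same")   # weight, then spread
        norm += fftconvolve(w_i, p_i, mode="same")        # same for an all-ones image
    return out / (norm + eps)
```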
Figure 2.7 shows the image blurred with the scattering method. The final image looks perceptually much better compared to the gathering method, but it does not solve the problem completely and the boundaries are not accurately blurred. To make accurate simulations of the boundaries, we need to handle the occlusions, which will be discussed in the next section.
Figure 2.6: The process flow of the proposed scattering method. P_i and W_i represent the basis functions (eigenPSFs) and the weights, respectively. The symbol '*' represents the convolution.
2.5.2.1 Simulating Partial Occlusions
All light rays in the field of view of the lens contribute to the final image formation. For a wide aperture lens, the image also captures some light coming from the occluded regions. The occluded regions are the background object regions which are hidden by the foreground object. However, the information of the occluded regions is missing in the input image, which is considered an all-in-focus image, because it is assumed to be captured either with a narrow field of view or with a pinhole camera. Without any information about the occluded regions, it is almost impossible to blur the objects accurately.
To acquire the information of the occluded regions, let us assume that it is possible to capture two images along with their depths, such that the first image is a normal pinhole image with occluded regions, and the second image is captured for the same scene but without any occlusions by removing the foreground objects. Figures 2.8 and 2.9 show an example of such a pair of images and their depth maps. There are no occluded regions in the background image.
The blur is first applied to both images according to their depth maps. The scattering method, as discussed in the previous section, is used to blur the images. After blurring, both images must be appropriately blended to
Figure 2.7: The left and right images are blurred according to gathering and
scattering approach, respectively. The scattering result is visually better than
gathering as there is a smooth blur around object boundaries.
Figure 2.8: Foreground image along with its depth map.
get a final image with a correct simulation of occlusions (blur at the object boundaries). The foreground image needs the normalization step as discussed before, but the background image does not, as there are no overshoots and undershoots due to a continuous change in depth. The normalization values are very valuable here, as they provide the weighting factor to blend the foreground and background images. The two images can be blended as
\[
I(x, y) = I_{fg}(x, y) + I_{bg}(x, y)\,(1 - W_n(x, y)), \tag{2.11}
\]
where I_fg and I_bg are the foreground and background images blurred according to the scattering method, and W_n are the normalization weights generated by blurring an all white image according to the depth map associated with I_fg.
Figure 2.9: Background image along with its depth map.
Blending the images in this way works accurately, as the normalization weights represent the amount of light contributed by the occluded objects to the foreground regions.
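A minimal sketch of the blending in equation 2.11, reusing the normalization weights returned by the scattering routine sketched above (function and variable names are illustrative):

```python
import numpy as np

def blend_occlusion(fg_blurred, bg_blurred, w_norm):
    """Blend blurred foreground and background layers (equation 2.11).

    fg_blurred : foreground image blurred and already normalized
    bg_blurred : background image blurred with the scattering method
    w_norm     : normalization weights from blurring an all-white image
                 with the foreground depth map (clipped to [0, 1])
    """
    w = np.clip(w_norm, 0.0, 1.0)
    return fg_blurred + bg_blurred * (1.0 - w)
```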
To better understand the gathering, scattering and occlusion handling methods, a one dimensional signal is blurred with each of them. Figure 2.10(a) shows the signal (black) blurred according to the gathering (blue) and scattering (red) methods. Pixels 41 to 50 and 51 to 60 are considered as foreground and background signal, respectively. A Gaussian blur of σ = 3 and σ = 0.5 is used to blur the foreground and background, respectively. Neither result shows the correct blurred signal, as the edge must be blurred completely according to the foreground blur. Figure 2.10(b) shows the scattering method blended with the background signal. It is assumed that the background signal is available at all pixel locations from 41 to 60. The final blurred signal (blue) has a smooth blur according to the foreground blur.
The partial occlusions are simulated for a computer generated image and its depth map as shown in figure 2.11. Gaussian PSFs with a standard deviation changing with the depth are applied to blur the image. The focus position is set to the background. Figure 2.12 shows the comparison between the gathering and scattering methods with and without occlusion handling. As the results show, both the gathering and scattering methods produce an unrealistic blurring of the pencil (the foreground object), and its boundaries appear sharper. On the other hand, the scattering method with background blending accurately simulates the occlusion, and the text occluded by the pencil is visible through the blurred boundaries.
(a) A signal blurred according to the gathering and scattering methods with a Gaussian blur of σ = 3 and σ = 0.5 from pixel locations 41 to 50 and 51 to 60, respectively.
(b) A foreground signal blended with the background signal (pixels 41 to 50 and 51 to 60 are considered as foreground and background signal, respectively).
Figure 2.10: The plots show the results of the gathering and scattering methods for the one dimensional case. A correct blur is applied at the boundary when the information of the background signal is used.
2.6 Optical Simulation
Optical system design tools can generate the PSF data for different types of lenses, at any field point across the field of view and at different distances from the camera. The PSF describes the defocus blur, chromatic aberrations, astigmatism and spherical aberration. However, it does not account for the effect of lens distortions and relative illumination, but the amount of local image distortion and the relative illumination can be obtained separately from the optical design tools. Distortion is a deviation from the rectilinear projection and is mainly of two kinds, barrel and pincushion. Relative illumination only decreases the intensity of the image.
The data provided by the design tool is used to simulate the lens. Distortions are simulated through sampling grid mapping, and the relative illumination is simulated by scaling the intensity at each spatial position. In the next step, the image is blurred with the PSF. The space variant blurring method discussed in the previous section is used to apply the blur. The following steps summarize the simulation process:
∙ PSFs are generated for a sampled space through the optical design tool. Distortion and relative illumination data are also calculated.
∙ Distortions are simulated through re-sampling of the image grid.
∙ Relative illumination is applied to the image by scaling the intensity at each image position.
∙ Basis PSFs and corresponding weights are computed using PCA. Only the most significant PSFs are selected with the help of the eigenvalues.
∙ Weights are interpolated over the image domain to produce the effect of a spatially varying PSF at each location.
Figure 2.11: The image and its two-layer depth map.
Figure 2.12: The image simulated with the gathering method (left) and the scattering method with (middle) and without (right) background blending.
∙ The image is blurred according to the scattering method, discussed in section 2.5.2.
2.6.1 Accuracy and Computational Complexity
In this section, the accuracy and the speed optimization of the algorithm for different applications are discussed. The blur is assumed to change across the field of view for a 3D scene and also along the optical axis. In some applications, only adding the lens blur to the image is sufficient, e.g. simulating the depth of field in computer generated images, especially in games. For lens designers, complete optical simulations of a real lens design are required. Most optical systems use lenses which exhibit radial symmetry, i.e. rotating the system about the optical axis does not alter its behavior. The radial symmetry of a lens can be utilized to reduce the computational complexity. On the other hand, if only the depth of field effect is to be simulated, the blur can also be considered rotationally symmetric, which further reduces the computational time. The implementation details of the algorithm for a radially symmetric system, with rotationally and non-rotationally symmetric blur, are discussed now.
2.6.1.1 Rotational Symmetric Blur
Most applications in the field of computer graphics only need the simulation of a depth of field effect in the image. In that case, the system can be considered radially symmetric, and the blur as rotationally symmetric. However, the blur can be of any type, with different shapes of the aperture, and can vary across the field of view.
For an example case, let us consider a Gaussian blur which varies across the field of view and along the optical axis, according to the depth value of each pixel. The image shown in figure 2.8 is used to simulate the depth of field effect. The standard deviation σ of the Gaussian filter has a direct relationship with the geometrical blur diameter of the lens. Hence, it is computed directly from the depth map as
\[
\sigma = k \left( \frac{1}{depth} - \frac{1}{d_o} \right), \tag{2.12}
\]
where d_o is the focus position and k is a constant depending on the lens parameters.
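A minimal sketch of deriving such a σ map from a depth map, following equation 2.12; the constant k and the focus distance d_o below are placeholder values, not data from the thesis:

```python
import numpy as np

def sigma_from_depth(depth_map_mm, d_o_mm=1000.0, k=2.0, sigma_max=15.0):
    """Per-pixel Gaussian sigma from a depth map (equation 2.12).

    depth_map_mm : (H, W) scene depth in millimetres
    d_o_mm       : focus distance in millimetres (assumed value)
    k            : lens dependent constant (assumed value)
    """
    # absolute value, since defocus on either side of focus blurs the image
    sigma = k * np.abs(1.0 / depth_map_mm - 1.0 / d_o_mm)
    return np.clip(sigma, 0.0, sigma_max)
```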
(a) Image blurred according to the gathering method using the 5 most significant basis PSFs. (b) Standard deviation σ of the Gaussian filter at each pixel location.
Figure 2.13: The image is blurred with the Gaussian filter. The standard deviation of the filter changes with the depth.
Figure 2.13a shows the simulated result with the gathering method using PCA based filtering (equation 2.8). The standard deviation of the Gaussian filter at each pixel location is shown in figure 2.13b. For the PCA based filtering, Gaussian filters of size 31x31 are created for 100 equally spaced sigma values, and basis PSFs are computed from them using PCA. The weights of the basis PSFs, which are only available for the sampled sigma values, are interpolated over the entire image. In the current example, although the PSF varies with respect to depth, it is only a function of the sigma value. For this reason, only one dimensional interpolation is used here.
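A minimal sketch of this offline PCA step under the stated assumptions (31x31 Gaussian kernels for 100 sampled sigma values), using a plain SVD; the number of retained basis PSFs and the mean subtraction are assumed implementation choices:

```python
import numpy as np

def gaussian_kernel(sigma, size=31):
    """Normalized 2D Gaussian kernel."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    k = np.exp(-(xx**2 + yy**2) / (2.0 * sigma**2))
    return k / k.sum()

def pca_basis_psfs(sigmas, size=31, n_basis=5):
    """Basis PSFs (eigenPSFs), per-sigma weights and mean PSF via SVD.

    Each sampled PSF is approximately
        mean_psf + sum_i weights[j, i] * basis[i].
    """
    stack = np.stack([gaussian_kernel(s, size).ravel() for s in sigmas])  # (S, size*size)
    mean = stack.mean(axis=0)
    u, s, vt = np.linalg.svd(stack - mean, full_matrices=False)
    basis = vt[:n_basis].reshape(n_basis, size, size)
    weights = (stack - mean) @ vt[:n_basis].T                              # (S, n_basis)
    return basis, weights, mean.reshape(size, size)

# usage sketch: 100 equally spaced sigma values, as described in the text
# basis, w, mean_psf = pca_basis_psfs(np.linspace(0.1, 5.0, 100))
```

The per-pixel weights are then obtained by interpolating `w` over the sigma (i.e. depth) values of the image, as described above.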
To estimate the accuracy of the PCA based filtering, a reference image is created by filtering each pixel with its own PSF. The execution times of the different algorithms are compared in table 2.1 for an image size of 1600x1200. All simulations are performed in Matlab on a 64 bit Windows system with an Intel Core i5-2400 CPU (clock speed 3.1 GHz) and 4 GB RAM. The PCA based method is much faster than the exhaustive filtering, without losing any significant accuracy.
                        Number of    Execution time (s)          SSIM
                        basis PSFs   Spatial    Frequency
                                     domain     domain
Exhaustive filtering        -          825          -              -
PCA based filtering         5           44         2.7           0.999
PCA based filtering         3           28         3.2           0.994

Table 2.1: Execution times for exhaustive filtering and PCA based filtering.
Figure 2.14 shows the local SSIM value for PCA based filtering using 5 and 3 basis PSFs out of 100. As we can see, the SSIM value is very high in both cases and the human eye will not perceive a significant difference. However, if a PSF type other than Gaussian is used for blurring, the SSIM value will differ for the same number of basis PSFs.
Figure 2.14: SSIM value computed between the reference image (exhaustive blurring) and PCA based filtering with 5 (left) and 3 (right) basis PSFs.
The execution times for the PCA based scattering method are also computed for the same example and are shown in table 2.2. The occlusion processing with the background image takes twice as much time as without occlusion processing, but it is still significantly faster than the exhaustive blurring algorithm.
Occlusion processing   Number of    Execution time (s)
                       basis PSFs   Spatial    Frequency
                                    domain     domain
Yes                        5           85         7.5
No                         5           42         3.7

Table 2.2: Execution times for PCA based scattering filtering.
2.6.1.2 Real Lens Simulations
Generally, conventional lenses exhibit radial symmetry, and the PSFs along any one radial line can be used to blur the complete image. In this case, only one dimensional interpolation is employed to approximate the PCA weights over the complete image field. However, the PSF must be rotated to get the appropriate blur at each pixel, which is a very time consuming process.
As the depth changes for each pixel, PSFs are generated at sampled 3D Cartesian coordinate points through the optical design software. The sampling points in the x and y plane are selected according to the field varying behavior of the modulation transfer function (MTF), whereas in the z direction the sampling points are selected according to the change in the blur spot size. The information about the field varying MTF and the depth varying spot size can be obtained from the optical design software.
After generating the PSFs at the sampled locations, basis PSFs and corresponding weights are computed through PCA. To approximate the missing weights, interpolation must be performed in three dimensions, which makes the algorithm slower compared to the rotationally symmetric case (the example discussed before).
Before applying the blur, distortions and relative illumination are applied to the image separately. Relative illumination decreases the intensity locally at each pixel location, and can be simulated by scaling the data based
Figure 2.15: Image (right) shows the result of distortions and relative illumination applied to an ideal image (left).
on spatial positions. Distortions are mainly of two types, barrel and pincushion distortion. In barrel distortion, the image magnification decreases towards the image borders, whereas in pincushion distortion it increases. We can summarize distortion as a local non uniform stretching or shrinking of the image. The amount of distortion induced in the image is not only a function of the radial image pixel coordinates, but it also depends on the scene depth. Therefore, distortion values are computed for different distances at sampled points on a radial line. Interpolation is utilized to acquire the distortion value at each pixel location according to its depth, where the depth map provides the depth of each pixel. Finally, the distortions are simulated through mapping the sampling grid to new coordinates, followed by resampling to a rectangular grid.
Figure 2.15 shows an example of distortions and relative illumination applied to an image. It can be seen that the corners of the image are darker and that the barrel distortions are stronger than the pincushion distortions.
A lens is designed in the Zemax software with a focal length of 4 mm, an F-number of 2.4 and a horizontal field of view of 60°. At the outermost field points, the relative illumination is 50%. Distortions are less than 2%. Lateral chromatic aberrations are well corrected, whereas longitudinal chromatic aberrations are deliberately enhanced for the application use case.
PSFs are generated for the sampled grid points through an automated Matlab program. The numbers of sampling points selected in the x, y and z directions are 9, 7 and 10, respectively. In total, 630 PSFs are generated for the complete
Figure 2.16: Simulated real lens image (right) along with original image (left).
3D space. Basis PSFs and corresponding weights are computed for these PSFs, and the 50 most significant basis PSFs are used for the simulations. Up to this step, everything can be performed offline, and only once for each lens design.
The optical simulation starts by simulating the distortions for both the image and its depth map. Then the relative illumination is applied to the image only. Finally, the image is blurred according to the PCA based scattering method as discussed in section 2.5.2. Figure 2.16 shows the simulated real lens image. The vignetting (relative illumination) effect is visible at the corners, which are relatively darker. Cropped regions are shown in figure 2.17 at full resolution. The effect of optical blur and chromatic aberration is visible in the simulated output.
2.7 Sensor Simulation
In digital cameras, sensors capture the incident light and convert it into a digital signal. Image sensors are mainly of two types, charge coupled device (CCD) and CMOS sensors. Due to lower power consumption and lower cost, CMOS sensors are widely used commercially. Besides capturing the image, the sensor also adds noise to it. In the following sections, the noise model used to simulate the image sensor noise characteristics is discussed. Moreover, the sensor MTF is applied to the image before downsampling it to the sensor resolution according to the color filter array (CFA) pattern. Images with a resolution higher than the sensor resolution are used to simulate the effect of aliasing.
Figure 2.17: Cropped region shows the input image (left), and the result of
optical simulations (right). The effect of optical blur and chromatic aberrations
is visible with optics simulation.
2.7.1 Noise Sources in Image Sensors
In CMOS image sensors, a photo sensor captures the incident light consisting of photons and converts it into an electrical signal. During this process, due to the random nature of photon arrival and fluctuations in the electrical signal, noise is added to the captured image. The noise appears as a random variation in the intensity of the image. It plays an important role in defining the dynamic range and responsivity of the image sensor. Noise is described either by the variance or by the standard deviation in rms units. If there is more than one noise source, their variances add up to give the total amount of noise.
Noise in image sensors is typically divided into two categories, temporal noise and fixed pattern noise.
2.7.1.1 Temporal Noise
The most dominant random noise in sensors is shot noise, which appears due to random fluctuations of charge units (electrons). This noise follows basic laws of physics and is the same for all types of sensors. Shot noise is statistically described by the Poisson distribution. However, when the number of electrons is large, the Poisson distribution approaches a Gaussian distribution. Other significant temporal noise sources are readout noise and dark current noise.
2.7.1.2 Fixed Pattern Noise
Fixed pattern noise (FPN) appears due to manufacturing mismatches in the active transistors. FPN remains constant in time for each pixel but varies spatially from pixel to pixel. It can appear as dark signal non-uniformity (DSNU), which is present even without any illumination, and as photo response non-uniformity (PRNU), which is signal dependent.
2.7.2 Noise Model
To model the noise in our simulation, the linear noise model described in the EMVA1288 standard [1] and in [16] is followed. The model is briefly described here.
The photons hitting the light sensitive area are converted into electrons, amplified, and converted into the digital signal I by an analog to digital converter (ADC). The whole process is assumed to be linear and can be described by the overall system gain K. The mean signal μ_I is then given as
\[
\mu_I = \mu_{I_{dark}} + K \mu_e, \tag{2.13}
\]
where μ_I_dark is the mean dark signal without light and μ_e is the average number of captured electrons.
If the complete camera is considered as a black box, it is sufficient to consider only three noise sources: shot noise, readout noise (which also includes amplifier noise) and quantization noise introduced by the ADC. These noise sources are represented by their variances σ_e², σ_d² and σ_q², respectively. All noise sources add linearly to give the total temporal noise σ_I² of the digital signal. According to the laws of error propagation, it is given as
\[
\sigma_I^2 = K^2(\sigma_d^2 + \sigma_e^2) + \sigma_q^2. \tag{2.14}
\]
Using equation 2.13 and the relationship σ_e² = μ_e, the noise can be related to the mean digital signal as
\[
\sigma_I^2 = \underbrace{K^2\sigma_d^2 + \sigma_q^2}_{\text{offset}} + \underbrace{K}_{\text{slope}}\,(\mu_I - \mu_{I_{dark}}). \tag{2.15}
\]
As there is a linear relationship between the noise variance and the mean signal value, the overall system gain K can be computed from the slope, and the dark noise variance from the offset. The methods described in EMVA1288, section 6.6, are employed to measure the gain K and the dark noise variance σ_d², and these values are used to simulate a sensor.
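A minimal sketch of applying this linear noise model to a simulated photo-electron image, assuming measured values for the system gain, dark noise and dark signal (the numbers below are placeholders, not calibration results from the thesis):

```python
import numpy as np

def simulate_sensor_noise(mu_e, K=0.5, sigma_d=8.0, mu_dark=64.0,
                          bits=10, rng=None):
    """Apply shot, dark/readout and quantization noise (equations 2.13-2.15).

    mu_e    : (H, W) mean number of photo-electrons per pixel (noise free)
    K       : overall system gain [DN / electron]   (assumed value)
    sigma_d : dark + readout noise std in electrons (assumed value)
    mu_dark : mean dark signal in digital numbers   (assumed value)
    """
    rng = np.random.default_rng() if rng is None else rng
    electrons = rng.poisson(mu_e)                                  # shot noise, var = mu_e
    electrons = electrons + rng.normal(0.0, sigma_d, mu_e.shape)   # dark/readout noise
    signal = mu_dark + K * electrons                               # linear camera model (eq. 2.13)
    signal = np.round(signal)                                      # quantization by the ADC
    return np.clip(signal, 0, 2**bits - 1)
```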
2.7.3 Sensor MTF and Sampling
A sensor consists of an array of pixels arranged in a matrix structure. Therefore, the incident light is sampled according to the Nyquist sampling theorem. The actual sensor MTF depends on the shape and size of a pixel. If we consider a pixel of size pp, then all spatial frequencies above the Nyquist frequency
\[
f_{s} = \frac{1}{2\,pp}
\]
cannot be resolved, which results in aliasing. For a rectangular pixel, the sensor MTF is the Fourier transform of a two dimensional rectangular window, which is a 2D sinc function. Figure 2.18 shows the one dimensional MTF for different pixel sizes pp, where Δl represents the spatial resolution of the lens. As we see, smaller pixels resolve more details than large pixels.
Figure 2.18: The combined MTF of the sensor detector footprint and sampling is shown for different pixel sizes (pp = Δl, 2Δl, 4Δl).
For our simulations, the input scene is rendered with at least twice the sensor resolution. After applying the sensor MTF in the frequency domain, the image is sampled with the nearest neighbor method down to the actual sensor resolution. In this way, aliasing effects can be simulated.
2.7.4 Color Filter Array
Photo sensors do not differentiate between the wavelengths of light (colors). Therefore, bandpass color filters must be used to capture color information. Mostly, sensors capture the three primary colors red, green and blue. As an example, the spectral responses of commercially available Kodak color filters are shown in figure 2.19.
To correctly simulate the effect of the color filter array, the simulations are performed with hyperspectral images. After applying the optical blur, sensor noise and sensor MTF, the hyperspectral images are summed up according to the color filter array responses. The hyperspectral simulation is very important for a correct simulation of axial chromatic aberrations, because the blur depends on the color of the objects in a scene.
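A minimal sketch of this spectral integration and of the Bayer sampling mentioned below, assuming a hyperspectral cube and filter response curves sampled on the same wavelength grid (all names are placeholders):

```python
import numpy as np

def hyperspectral_to_rgb(hsi, r, g, b):
    """Integrate a hyperspectral cube into RGB with given filter responses.

    hsi     : (H, W, L) hyperspectral image with L spectral bands
    r, g, b : (L,) spectral responses of the color filters
    """
    responses = np.stack([r, g, b], axis=1)          # (L, 3)
    rgb = hsi @ responses                            # weighted sum over the bands
    return rgb / responses.sum(axis=0)               # normalize per channel

def bayer_mosaic(rgb):
    """Sample an RGB image to a Bayer (RGGB) mosaic, one color per pixel."""
    h, w, _ = rgb.shape
    raw = np.zeros((h, w))
    raw[0::2, 0::2] = rgb[0::2, 0::2, 0]   # R
    raw[0::2, 1::2] = rgb[0::2, 1::2, 1]   # G
    raw[1::2, 0::2] = rgb[1::2, 0::2, 1]   # G
    raw[1::2, 1::2] = rgb[1::2, 1::2, 2]   # B
    return raw
```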
Figure 2.19: Spectral sensitivity (quantum efficiency) of the color filter arrays R(λ), G(λ) and B(λ) over the wavelength.
Generally, sensors also perform spatial sampling of the colors and capture only one color at each pixel. Specific pixel patterns are used to sample each primary color red, green and blue in a local neighborhood. The most common pattern is the "Bayer" pattern shown in figure 2.4. This pattern has twice as many green pixels as red or blue pixels.
2.8 Digital Simulation
The digital simulation implements the standard color image post-processing algorithms mostly used in conventional cameras. This includes algorithms such as black level subtraction, white balancing, demosaicing, color correction and a gamma curve [28]. Figure 2.20 shows the color processing chain of the simulator.
Figure 2.20: Digital image post processing chain: Bayer image, black level subtraction, white balance, demosaicing, color correction, gamma, RGB image.
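A minimal sketch of such a chain applied to an already demosaiced RGB image, assuming simple per-channel white balance gains and a 3x3 color correction matrix (all parameter values below are placeholders):

```python
import numpy as np

def process_rgb(raw_rgb, black_level=64, wb_gains=(1.8, 1.0, 1.5),
                ccm=np.eye(3), gamma=2.2, white=1023.0):
    """Simple post-processing chain: black level subtraction, white balance,
    color correction and gamma (demosaicing is assumed to be done already).

    raw_rgb : (H, W, 3) linear sensor image in digital numbers
    """
    img = np.clip(raw_rgb.astype(float) - black_level, 0, None)
    img /= (white - black_level)                     # normalize to [0, 1]
    img *= np.asarray(wb_gains)                      # white balance
    img = np.clip(img @ np.asarray(ccm).T, 0, 1)     # color correction matrix
    return img ** (1.0 / gamma)                      # gamma curve
```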
Algorithms related to a specific camera design or application are also integrated into the processing chain at the appropriate position. For example, a deconvolution algorithm to restore the image is incorporated into the chain to retrieve an extended depth of field image. For a lens exhibiting longitudinal chromatic aberration, the depth estimation algorithm and the algorithm to generate the extended DOF image are implemented.
The details of a few post processing algorithms will be discussed in the next chapters.
2.9 Conclusion and Outlook
In this chapter a digital camera simulator is presented. The simulator models the complete digital camera processing chain, i.e. optics, sensor and digital processing. The main contribution of the simulator is the optical simulation. It allows a user to simulate conventional and unconventional optics with a correct modeling of occlusions. The blur induced by the optics is generated for a sampled 3D space with commercially available lens design tools. The missing blur information is then approximated at each pixel of the image through PCA based interpolation. For 3D scenes, the true depth map is used to blur the image according to each pixel's depth and location. It is shown that the two filtering methods mentioned in the literature, scattering and gathering, can be efficiently implemented in the Fourier domain using PCA based filtering. The efficient algorithm for space variant filtering using PCA makes the simulation time substantially smaller. Although the aim of the simulator is to simulate cameras, the low cost method of space variant filtering can also be used to add a depth of field effect in real time to computer generated scenes, which is very useful and in demand in gaming applications. The sensor simulation includes noise addition, sampling and color filter array effects. The digital part implements the traditional camera post processing steps.
Chapter 3
Depth From Chromatic Aberrations
In this chapter a computational imaging technique is discussed to estimate the depth from a single image using a conventional camera. The optics is designed to introduce a significant amount of axial chromatic aberrations, and the depth is estimated digitally through a post-capture algorithm. Due to axial chromatic aberrations (ACA), different colors are focused at different distances along the lens axis. Hence, the relative blur between two color images helps in estimating the depth. The basic principle is similar to the well known depth from defocus (DFD) method.
The chapter starts with an overview of different depth estimation methods in section 3.1. Then the depth from defocus method is discussed in section 3.2, followed by the proposed method of depth from chromatic aberrations (DFCA) in section 3.3. Section 3.4 discusses the interpolation methods to create a dense depth map. Finally, the results are discussed and analyzed in the last section.
3.1 Overview of Depth Estimation Methods
Depth estimation refers to algorithms which aim to estimate the distance of objects in a scene from the camera. The distance and geometry information is lost in conventional imaging, because the set of rays from a scene is projected onto a 2D plane instead of a 3D space.
There are many approaches which try to recover the lost depth information in different ways. These techniques can be mainly categorized into two classes, active and passive methods.
3.1.1 Passive Depth Estimation
In passive methods, the depth is estimated from images captured with different camera settings under natural lighting conditions. The basic methods of passive depth imaging are based on triangulation: if a point in a scene is observed from different views by changing the camera position, the depth can be estimated.
3.1.1.1 Stereoscopy
The stereoscopic method is based on the principle of triangulation. Triangulation is the process of determining the distance of a point by measuring the angles to it from known points at the ends of a baseline. In stereoscopy, this principle is applied by using two cameras separated by a fixed baseline to image a specific point. As a result, the point is shifted in the imaging plane; this shift, known as the disparity, determines the depth. The depth estimation algorithms identify corresponding regions in the two stereo images to determine the disparity. However, this matching problem is computationally expensive and its accuracy depends on local image features. Depth from stereoscopy does not always provide dense depth maps: if there are homogeneous regions in the image, the disparity cannot be computed, and interpolation techniques are required to produce dense depth maps.
3.1.1.2 Depth from Focus/Defocus
The limited depth of field of an optical system helps in determining the depth. In the depth from focus (DFF) method, multiple images are captured with different focus settings, by moving either the lens or the sensor. As a result, objects at different distances are in focus in different images. The depth estimation algorithm computes the sharpness of all images at each pixel location, and determines the depth according to the image in which the pixel is sharpest.
The main disadvantage of this method is the need to capture multiple images. It has been shown that for a reasonable depth accuracy, ten or more images are required. Another drawback is that the depth can only be determined in regions where some local features exist. For homogeneous regions, interpolation is required to compute a dense depth map.
In contrast to DFF, in the depth from defocus (DFD) method only two images are used to estimate the depth. The amount of defocus blur changes continuously with the distance of the object. The measured value of the blur in the image directly gives the depth information. Similar to other passive depth methods, DFD only provides the depth at textures or edges in the image. To generate a dense depth map, interpolation is required.
3.1.2 Active Depth Estimation
Active methods project some form of energy onto the scene, and the sensor determines the depth by processing the returned energy. These methods are more accurate and also help in providing ground truth data.
In ultrasound imaging, depth is perceived through the transmission of sound waves and the interpretation of the intensities of the reflected echoes. However, in most cases infrared (IR) or incandescent light is used for the active illumination of a scene. Today, IR light is mostly used for depth imaging, for example in the Kinect or in time of flight cameras. The advantage is the invisibility of IR light to the human eye. However, the disadvantage is that traditional imaging sensors cannot be used and specific sensor elements are required to capture IR light.
3.1.2.1 Depth from Time of Flight
An active light source projects light onto the scene, and the camera measures the time delay for a light beam to travel a certain distance. The maximum measurable distance depends on the frequency of the projected light pulses. The method provides very accurate depth results, but it requires a specific hardware device along with an active illumination source, i.e. more power consumption.
3.1.2.2 Depth from Active Stereoscopy
The correspondence matching problem can be reduced by replacing one camera in the stereo system with an active light source, which projects a specific light pattern onto the scene. The depth is computed from the distortion of the projected light pattern. The accuracy of the depth increases compared to the passive method. To further reduce the ambiguities in the matching problem, more than one pattern is projected through time or color multiplexing.
3.1.3 Depth Estimation by Computational Imaging
Computational photography is a recent development in the field of imaging to acquire image features which are not possible through conventional imaging. It modifies the optics and/or the sensor to code the image in a specific way, so that digital post processing can recover the conventional image along with other useful features, e.g. a depth map, extended depth of field, high dynamic range, super resolution, etc.
Ng et al. [30] presented a method of light field capture through a plenoptic camera, which also allows obtaining depth information. Another method for depth estimation is proposed by Zhou et al. [45], where a diffuser is inserted between the scene and the camera to code the depth information into the defocus blur. The method is similar to depth from defocus, however higher depth accuracy can be achieved with a smaller aperture.
Several other methods based on aperture coding have also been proposed to estimate the depth. Bando et al. [3] suggested the use of a color filtered aperture to shift the color images and estimate the depth from the disparity between the color images. One of the limitations of these methods is the loss of light. Chakrabarti et al. [5] modified the aperture to generate a varying depth of field for different colors and infer the depth from the blur difference between the color images.
This thesis also investigates a computational imaging method using axial chromatic aberrations to estimate the depth. The following sections describe the details of the method and discuss the pros and cons of this approach for depth estimation. Before that, a detailed description of the depth from defocus method is given, which is the basis of depth from axial chromatic aberration.
3.2 Depth From Defocus
The limited depth of field of imaging optics helps in determining the distance from the defocus blur. In section 2.2, the physical image formation process is described. In optics, if the thickness of a lens is much smaller than its focal length, it can be considered as a thin lens represented by a single principal plane. For a thin lens, in the paraxial ray approximation, the relationship between the distance d of the object to the lens and the distance d_i of the image sensor to the lens is given as
\[
\frac{1}{f} = \frac{1}{d_i} + \frac{1}{d}, \tag{3.1}
\]
where f is the lens focal length. These quantities are illustrated in figure 3.1.
Figure 3.1: Image formation by the thin lens approximation (object at distance d, aperture diameter A, sensor at distance d_i, blur diameter b).
In real aperture photography, only the objects at the focus position of the lens appear sharp at the image plane. All other objects, closer or farther than the focus position, appear blurred. The amount of blur b depends on the distance of the object from the focus position and relates to the camera parameters as
\[
b = A\, d_i \left( \frac{1}{f} - \frac{1}{d_i} - \frac{1}{d} \right), \tag{3.2}
\]
where A is the aperture diameter and d is the object distance from the lens. A blur measurement in the image directly gives the distance information using equation 3.2. Although the equation only describes the geometrical blur, it helps to approximately model the behavior of the optics.
However, an ambiguity appears in distinguishing objects on the two sides of the camera's focus position, because they are blurred by a similar amount. One can eliminate this ambiguity by focusing the lens at the nearest distance of the desired depth range. Another challenge is the measurement of an accurate blur with varying local image features. For example, it is difficult to distinguish between a defocused region of a sharp edge and a focused region of a smoothly varying edge.
To eliminate the ambiguities, two differently defocused images are used in the DFD method. The images are captured by changing camera parameters such as the aperture size or the lens/sensor position. The depth estimation algorithm then measures the relative blur between the two defocused images to determine the distance. This solves the problem of the blur measure depending on local image features. The other ambiguity, distinguishing closer from farther objects, can only be solved by changing the lens/sensor position instead of the aperture size when capturing the differently defocused images.
Two images with different focus positions allow inferring the depth without any ambiguities, but this adds another problem. Different focus positions result in different image magnifications, which leads to a misalignment of image features. For a thin lens, the amount of magnification M depends on the focal length and the object distance,
\[
M = \frac{f}{f - d}. \tag{3.3}
\]
However, the object distance is the same for both defocused images, therefore only the focal length affects the magnification difference between the two images. To avoid the problem of magnification, most authors have suggested to change the aperture size to capture the two defocused images.
For images captured with different focus settings, the images must be registered before computing the relative blur. One possible solution is to register the images through digital processing, for example warping. The amount of magnification may be computed through a calibration procedure or estimated through local image analysis. However, the image registration is not always accurate and also adds complexity to the DFD method. Watanabe et al. have proposed to make the optics telecentric, where the magnification of the image is independent of the lens focus position [41]. They have shown that any conventional optics can be made telecentric by adding an aperture. In this case, the effective and nominal F-numbers are the same. This property gives a similar magnification for the differently defocused images. However, the main disadvantage of this approach is the loss of light.
3.3 Depth From Axial Chromatic Aberration
In the present thesis, the goal is to capture differently defocused images with a single shot. As traditional image sensors capture the three primary colors RGB, it is possible to defocus each color by a different amount. Fortunately, lenses automatically provide differently focused images for different colors, because lens refraction is color dependent. In the next sections, first the behavior of the lens for different colors is discussed, followed by the details of the algorithm to extract the depth from the color channels.
Figure 3.2: Dispersion of white light into monochromatic light after passing through a prism.
3.3.1 Axial Chromatic Aberrations
The focal length of a thin lens is given as
\[
\frac{1}{f(\lambda)} \approx (n(\lambda) - 1)\left( \frac{1}{R_1} - \frac{1}{R_2} \right), \tag{3.4}
\]
where R_1 and R_2 are the radii of curvature of the two surfaces of the lens and n is the refractive index of the lens material, which depends on the wavelength of light λ. Equation 3.4 shows the dependency of the focal length on the wavelength, which arises from the dispersion property of the lens material.
Figure 3.2 shows white light passing through a prism and being dispersed into different colors. The amount of dispersion decreases from lower to higher wavelengths. The empirical relationship between the refractive index and the wavelength of light is described by Cauchy's equation, which is quite accurate for the visible range of light. Figure 3.3 shows the behavior of the refractive
Figure 3.3: Refractive index of glass (BK7) and plastic (polycarbonate) materials over the wavelength.
index for BK7 glass and polycarbonate plastic. The variation of the refractive index is larger for the plastic material than for the glass, which is the reason for the larger chromatic aberrations in cheap optics made of plastic.
Since the focal length changes with the wavelength, the blur diameter given by equation 3.2 also depends on the wavelength. For the three primary colors RGB (λ_r, λ_g, λ_b), the blur diameters are given as
\[
b(\lambda_r) = A\, d_i \left( \frac{1}{f(\lambda_r)} - \frac{1}{d_i} - \frac{1}{d} \right), \tag{3.5}
\]
\[
b(\lambda_g) = A\, d_i \left( \frac{1}{f(\lambda_g)} - \frac{1}{d_i} - \frac{1}{d} \right), \tag{3.6}
\]
\[
b(\lambda_b) = A\, d_i \left( \frac{1}{f(\lambda_b)} - \frac{1}{d_i} - \frac{1}{d} \right). \tag{3.7}
\]
Blur diameters for the different colors are plotted versus the object distance in figure 3.4 for arbitrary lens parameters. The relative blur between the colors directly provides the relative depth information. As a result of chromatic aberrations, it is possible to capture three differently defocused color images in a single shot with an RGB image sensor.
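A minimal sketch of equations 3.5 to 3.7, computing the geometrical blur diameter of the R, G and B channels over the object distance; the per-color focal lengths, the F-number and the focus distance are placeholder values, not the prototype lens data:

```python
import numpy as np

def blur_diameters(d_mm, f_rgb_mm=(4.02, 4.00, 3.98), fnum=2.4, d_focus_mm=700.0):
    """Geometrical blur diameter per color channel (equations 3.5-3.7).

    d_mm       : object distances in mm (array)
    f_rgb_mm   : focal lengths for (R, G, B) in mm (assumed values)
    fnum       : F-number, aperture A = f / fnum
    d_focus_mm : distance at which the sensor is focused for green (assumed)
    """
    f_g = f_rgb_mm[1]
    d_i = 1.0 / (1.0 / f_g - 1.0 / d_focus_mm)   # sensor distance from the thin lens equation 3.1
    A = f_g / fnum
    d = np.asarray(d_mm, dtype=float)
    return {c: np.abs(A * d_i * (1.0 / f - 1.0 / d_i - 1.0 / d))
            for c, f in zip("rgb", f_rgb_mm)}

# usage sketch:
# b = blur_diameters(np.logspace(2, 4, 200))   # 0.1 m to 10 m
```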
The image quality is degraded by chromatic aberrations, and color bleeding artifacts appear in the image. Therefore, chromatic aberrations are usually
Figure 3.4: Blur diameter of the RGB colors, which focus at different distances due to chromatic aberrations.
corrected in the lenses by designing a compound lens from a combination of multiple materials. There are also digital processing algorithms which correct chromatic aberrations. The details of the image quality will be discussed in chapter 4.
3.3.2 Depth Estimation
Depth estimation algorithms estimate and compare the blur of each defocused image to compute the relative depth. In a DFCA system, the blur measurement is more challenging due to the varying local image content in each color image. In the next section, different blur measures are discussed and compared in order to select the best blur measure for the DFCA system.
3.3.2.1 Blur Measure Methods
Many different types of blur measures have been proposed in the literature. They can be mainly categorized into statistical, derivative and energy based blur measures. Statistical operators compute the standard deviation, variance, central moments, etc. to estimate the amount of blur. Derivative based blur measures estimate the blur from the image contrast, the first derivative or the second derivative of the image, combined with smoothing operations.
Figure 3.5: The image with and without chromatic aberrations.
The energy of the signal is also a basis for computing the blur, e.g. through the discrete cosine transform (DCT) or the discrete Fourier transform (DFT). Many other operators have been proposed, but most of them are based on these fundamental blur measures. In this work, different blur measures are analyzed according to their ability to distinguish a minimum amount of blur in the presence of noise. Some blur measures are discussed here.
Sum of Squared Gradients: Defocus blur is a kind of low pass filter which suppresses the higher spatial frequencies in the image. Therefore, it is desirable to estimate the amount of blur with a function which responds to high frequencies. Derivative operators provide this functionality.
The first derivative of the image can be computed with gradient filters. The magnitude of the gradient vector provides information about the blur. Since gradient filters enhance high frequencies and hence the noise, a smoothing operation is combined with them to reduce the effect of noise. The most commonly used blur measure of this kind is the Tenengrad operator proposed by Krotkov [22], given as
\[
BM = \sum_{x=1}^{M} \sum_{y=1}^{N} \left( g_x^2 + g_y^2 \right), \tag{3.8}
\]
where g_x and g_y are the gradients of the image I in the x and y directions. The well known Sobel operator is used to compute the gradients in the x and y directions. Summing up the neighboring gradients provides an additional smoothing.
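A minimal sketch of the Tenengrad measure with Sobel gradients and a local summation window (the window size is an assumed parameter):

```python
import numpy as np
from scipy import ndimage

def tenengrad(img, window=9):
    """Sum of squared Sobel gradients in a local window (equation 3.8)."""
    gx = ndimage.sobel(img.astype(float), axis=1)
    gy = ndimage.sobel(img.astype(float), axis=0)
    grad_sq = gx**2 + gy**2
    # local sum over an M x N neighborhood (uniform_filter gives the mean,
    # so multiply by the window area to obtain the sum)
    return ndimage.uniform_filter(grad_sq, size=window) * window**2
```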
Gaussian Derivative: The smoothing operation can be combined with the gradient filter to obtain a more robust blur measure. Geusebroek et al. [13] have proposed to combine the Gaussian filter with the gradient filter to form a Gaussian derivative filter, which is more robust against noise. The 2D Gaussian derivative filter in the x direction is given as
\[
G_x = -\frac{x}{2\pi\sigma^4}\, e^{-\frac{x^2 + y^2}{2\sigma^2}}. \tag{3.9}
\]
Similarly, G_y is defined in the y direction. The value of σ defines the amount of smoothing.
Sum of Modified Laplacian: The second derivative of the image provides more sensitivity to blur. The Laplacian operator provides the second derivative of the image and the blur measure is defined as
\[
BM = \sum_{x=1}^{M} \sum_{y=1}^{N} (g_{xx} + g_{yy})^2, \tag{3.10}
\]
where g_{xx} and g_{yy} are the second derivatives of the image I in the x and y directions, respectively. Nayar et al. [29] have noted that the second derivatives in the two directions can cancel each other due to opposite signs. Therefore, they have modified the Laplacian operator by adding the absolute values of the two directional second derivatives,
\[
BM = \sum_{x=1}^{M} \sum_{y=1}^{N} \left( |g_{xx}| + |g_{yy}| \right). \tag{3.11}
\]
Energy of Discrete Cosine Transform (DCT) Coefficients: A local estimate of the energy provides information about the blur, because the energy of the image decreases as the amount of blur increases. Shen et al. [35] suggested using the DCT coefficients to estimate the energy. The blur estimate is defined as the ratio between the low frequency coefficients and the high frequency coefficients. As shown by the authors, this blur measure is more robust against noise.
3.3.2.2 Comparison of Blur Measures
The blur measure methods must fulfill the following properties:
∙ independence of image content,
∙ monotonic behavior with respect to the blur,
∙ large variation for a minimum change in blur,
∙ robustness against noise,
∙ low computational complexity.
To compare the blur measures, an ideal step edge is blurred with Gaussian filters of sigma values varying from 0.1 to 5 with equal spacing, as shown in figure 3.6. Figure 3.7 shows the estimated amount of blur of the different methods: Gaussian derivative, sum of squared gradients, sum of modified Laplacian and DCT ratio. The sum of modified Laplacian and the sum of squared gradients have a steeper response for lower sigma values; however, for larger sigma values they vary less, which results in a higher sensitivity to noise. The Gaussian derivative and the DCT ratio provide a more linear response over the full range of sigma values.
Figure 3.6: A step edge blurred with a Gaussian blur of varying standard deviation (the edge height defines the contrast and the edge range the width of the transition).
The most important aspect of a blur measure method is its accuracy in the presence of strong noise. For the comparison of the blur measures under noise, an ideal
Figure 3.7: Normalized blur values of the different types of blur measures (Gaussian derivative, Tenengrad, Laplacian, DCT ratio) over the standard deviation of the Gaussian blur.
step edge is defined with a low contrast of 50 gray values, ranging from 900 to 950 for a 10 bit digital sensor. Since the sensor noise is intensity dependent, a higher mean value with a low contrast results in higher noise (a lower contrast to noise ratio (CNR)). In the next step, Gaussian blurs of varying sigma values are applied to the edge, followed by noise addition according to the noise model discussed in section 2.7. Figure 3.8 shows the root mean square error (RMSE) of the different blur methods discussed before. The solid line shows the blur measure value for the noise free case. As can be seen, the Tenengrad and Laplacian methods have a large RMSE at small sigma values. The Gaussian derivative and DCT methods are less sensitive to noise, and their mean value is also equal to the noise free case, whereas the Tenengrad and Laplacian methods show a shifted mean value in the presence of noise. This shift of the mean value makes it impossible to retrieve the correct blur value through post processing smoothing operations.
Although the Gaussian derivative and DCT based methods provide a robust blur estimate in the presence of noise, the spatial resolution of the blur estimate becomes smaller, as these operators work on a relatively large neighborhood. On the other hand, the Tenengrad and Laplacian operators provide a higher spatial resolution due to their small operator range, but there is an offset in the blur estimate. However, the depth from defocus algorithms estimate the relative blur between two images, therefore the offset in the blur estimates of the individual images does not affect the relative blur estimate. For that reason, the Tenengrad operator, with a small modification for contrast invariance, is used in this work to estimate the depth.
Figure 3.8: Normalized blur values of the different blur measure methods, (a) Gaussian derivative, (b) DCT ratio, (c) Tenengrad operator, (d) sum of modified Laplacian, plotted with the RMSE for a low CNR.
3.3.2.3 Contrast Independent Blur Measure
As the goal of this work is to estimate the depth from the color channels, the blur measure must be independent of the varying contrast of the color images. The first choice is to normalize the image content to an equal level. This can be done by scaling each edge between its maximum and minimum value, i.e. by the local contrast. If the local contrast is computed with a window size equal to twice the edge range, the normalization of the complete edge is consistent. The normalization process is defined as
\[
I_{min} = \min_{x=1}^{M} \min_{y=1}^{N} I(x, y), \tag{3.12}
\]
\[
I_{max} = \max_{x=1}^{M} \max_{y=1}^{N} I(x, y), \tag{3.13}
\]
\[
I_N(x, y) = \frac{I(x, y) - I_{min}}{I_{max} - I_{min}}, \tag{3.14}
\]
where I(x, y) is the image and I_N(x, y) is the normalized pixel. The difference between the maximum and minimum value (I_max − I_min) defines the contrast of the edge. This contrast measurement is very sensitive to noise, which can be improved by using p-quantile filters instead of the max-min operator.
Since gradient operators are bandpass filters and remove the DC value, the computed gradients can be normalized with the local contrast instead of normalizing the image. For gradient based operators, the final blur measure is then given as
\[
BM(x, y) = \sum_{x=1}^{M} \sum_{y=1}^{N} \frac{g_x^2 + g_y^2}{(I_{max} - I_{min})^2}. \tag{3.15}
\]
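A minimal sketch of this contrast normalized blur measure, reusing the Tenengrad idea sketched after equation 3.8; the quantile based contrast and the window size are assumed implementation choices:

```python
import numpy as np
from scipy import ndimage

def contrast_normalized_bm(img, window=9, q=5, eps=1e-6):
    """Contrast independent blur measure (equation 3.15).

    Local squared Sobel gradients are summed in a window and divided by
    the squared local contrast, estimated with p-quantile filters to be
    more robust against noise than a plain max-min operator.
    """
    img = img.astype(float)
    gx = ndimage.sobel(img, axis=1)
    gy = ndimage.sobel(img, axis=0)
    grad = ndimage.uniform_filter(gx**2 + gy**2, size=window) * window**2
    i_min = ndimage.percentile_filter(img, q, size=window)
    i_max = ndimage.percentile_filter(img, 100 - q, size=window)
    contrast = i_max - i_min
    return grad / (contrast**2 + eps)
```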
3.3.2.4 Depth Estimation from Blur Measures
Figure 3.9 shows the flow diagram of the DFCA algorithm. Normalized gradients are computed for each color channel and multiplied with an edge map to consider only the sharp edges of the image. Then the ratio of the three blur measures is taken to estimate the relative depth.
Conventional color sensors capture three colors: red, green and blue. We therefore have three defocused images for depth estimation, which makes it possible to estimate the depth over a larger distance range than in a DFD system, where normally only two images are used. Here, it is proposed to take the normalized ratio of all three colors to get a single depth map for a broader range in the following way:
\[
C(depth) = \frac{BM_r^2 - (BM_b \times BM_g)}{BM_r^2 + (BM_b \times BM_g)}, \tag{3.16}
\]
Figure 3.9: Flow diagram of the depth from chromatic aberrations algorithm. For each color channel, the gradients are normalized by the local contrast, masked with the edge map and averaged; the normalized ratio of the three blur measures is mapped through the calibration curve and interpolated to a dense depth map.
where BM_c are the blur measures of the color images, c = {r, g, b}, and C is the calibration curve used to estimate the absolute depth. Figure 3.10 shows different combinations of blur measure ratios for distances from the focus position of blue to the focus position of red. The combined ratio of all three colors is the best in terms of a steep slope over a large distance range.
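A minimal sketch of the normalized three-color ratio of equation 3.16, combined with a calibration lookup; the calibration arrays are placeholders that would come from the calibration procedure described later:

```python
import numpy as np

def dfca_ratio(bm_r, bm_g, bm_b, eps=1e-8):
    """Normalized blur measure ratio of the three colors (equation 3.16)."""
    num = bm_r**2 - bm_b * bm_g
    den = bm_r**2 + bm_b * bm_g
    return num / (den + eps)

def ratio_to_depth(ratio, calib_ratio, calib_depth_mm):
    """Map the ratio to absolute depth via a monotonic calibration curve.

    calib_ratio    : measured ratio values at known distances (ascending)
    calib_depth_mm : corresponding distances in mm
    """
    return np.interp(ratio, calib_ratio, calib_depth_mm)
```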
3.3.3 Analysis of Depth Errors

3.3.3.1 Practical Issues in DFCA

The effect of noise on the accuracy of the depth estimation has been discussed in the section on blur measures. There are some other practical issues which affect the accuracy of the depth estimation.

Figure 3.10: Ratios of the blur measures for different combinations of colors, plotted over the object distance.
Window Size of Blur Measure: The different blur operators work in a local neighborhood (window) to compute the amount of blur. The window size is kept fixed to keep the computational complexity minimal. Since the depth changes across the image, it is desirable to use the minimum possible window size for a high resolution depth map, but this results in a higher depth error. Another problem arises due to the different blur sizes in the two defocused images: if a fixed window size is used, one of the images will contain spurious data at the borders of the edges.
Moreover, two edges at different distances may be separated by fewer pixels than the window size. In this case the depth estimate is not reliable. The window size could be made adaptive at the cost of computational complexity.
Texture Dependent Blur Measure: Most blur measures depend on the texture of the scene. Since the MTF difference varies with the spatial frequency content of the image and the blur measure operators behave as broadband filters, textures containing different frequencies result in different depth values. Watanabe et al. [42] have designed rational filters to make the operators texture invariant. They have modeled the filter for a box type defocus blur. For a real lens design, it is hard to design texture invariant operators because the change of the MTF difference is not linear.
3.3.3.2 Theoretical Issues in DFCA

Similar to other depth estimation methods, there are some theoretical problems which affect the performance of the DFCA method. These issues are discussed in the following.
Figure 3.11: Two images captured under strong red light (top left) and white
light (bottom left) illumination. The images with white color correction are
shown here (right).
Narrowband Object Spectrum: The depth from chromatic aberrations method estimates the depth from the color images. However, if the object does not contain colors over the complete human visible spectrum, it is hard to estimate the depth. This puts the constraint on the object's reflectance spectrum to be broadband. Fortunately, natural scenes have broadband object spectra, as studied in [32].
The lighting or illumination of a scene also changes the observed reflectance spectrum. In the case of a narrowband object spectrum, the depth estimate is not accurate due to the broadband spectral responses of the color filter arrays. Figure 3.12 shows the broadband
color filter responses and a narrowband object reflectance spectrum with a dominant red content. In this case, the amount of blur in the green channel of the image is very similar to that of the red channel, because a major part of the green signal comes from the longer wavelengths.
The effect is analyzed by capturing a black and white object with a lens exhibiting strong chromatic aberrations. In the first case, the object is illuminated with white light, and in the second case strong red light is projected onto the object. Figure 3.11 shows the captured images of both cases, along with the white balanced images. The edge profiles are plotted in figure 3.13 for both cases, after white balancing. Note that in the case of white illumination, the green edge has a steeper transition (a sharper edge) than the red edge. On the other hand, the amount of blur in the red and green channels is similar under red light illumination.
Figure 3.12: An example of color filters with broadband spectral responses and the reflectance spectrum of an object with strong red content.
One solution to avoid this problem is to use color filters with narrowband spectral responses. However, this can make the correct color reproduction of a scene difficult. Another solution is to reject or correct the depth values through a confidence measure based on the ratio of the colors: larger ratios indicate a narrowband object spectrum and vice versa.
Field Varying Blur: The optical blur, described by the point spread function, varies from the center of the image to the outer regions due to field
Figure 3.13: Edge profiles of the red and green color channels for red and white light illumination.
dependent optical aberrations. The main causes of these aberrations are coma and astigmatism. As a result of the field dependent blur, the estimated depth also changes across the field due to the different blur ratios. A calibration method to correct the field dependent depth errors is described in section 3.3.4.
Varying Magnification: In the DFCA method, the different colors may have different magnifications due to lateral chromatic aberrations. Therefore, the images must be aligned before computing the relative blur. However, it is possible during the lens design to minimize the lateral chromatic aberrations to such an extent that they do not affect the depth estimation. In the next chapter, a lens design is discussed where it is shown that the lateral CA is reduced during the lens design optimization process.
Smooth Edges: The DFCA method (like DFD) assumes a sharp edge, blurred according to the defocus blur, for the depth estimation. In some cases, the edge is not a step edge, and the depth estimate is not correct. This can happen due to shadows or motion blur. A simple confidence measure can reject these kinds of edges from the estimated depth map. As we see in figure 3.4, there is at least one sharp color at each distance, which is the basic requirement of the depth estimation. Based on this requirement, we can reject all depth estimates for which all three colors have a large blur, i.e. greater than the maximum blur of all colors. We write this condition as
\[
C(depth) = \text{unknown} \quad \text{if} \quad (BM_r > B_m)\ \&\ (BM_g > B_m)\ \&\ (BM_b > B_m), \tag{3.17}
\]
59
where ๐ต๐‘š is the maximum blur of the all lenses, which can be calculated
with the lens and sensor information. Figure 3.14 shows a relative depth
estimate with and without this condition. As we see, most of the wrong depth
estimates are rejected with the help of this condition. Normal edges become
thinner but depth values are preserved at the center of edges.
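The rejection rule of equation 3.17 reduces to a per-pixel comparison of the three blur measures against a single threshold. A minimal sketch in Python/NumPy, assuming blur-measure maps bm_r, bm_g, bm_b and a precomputed maximum blur b_max (hypothetical names, not from the thesis), could look as follows:

```python
import numpy as np

def reject_smooth_edges(depth, bm_r, bm_g, bm_b, b_max):
    """Mark depth estimates as unknown where all three colors are
    blurred more than the maximum blur of the lens (equation 3.17)."""
    # A pixel is unreliable if no color channel is reasonably sharp.
    unreliable = (bm_r > b_max) & (bm_g > b_max) & (bm_b > b_max)
    depth = depth.astype(float).copy()
    depth[unreliable] = np.nan  # "unknown" depth
    return depth
```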
(a) Smooth edges give wrong depth.
(b) Smooth edges removed.
Figure 3.14: Estimated depth for smooth edges is removed by applying the
condition given in equation 3.17.
3.3.4 Field Dependent Depth Correction

Since the optical blur varies across the field of view, the estimated depth also varies, depending on the relative change in blur. In addition, the blur changes with the orientation of the image features. Figure 3.15 shows an image with edges pointing towards the center of the image (black edges), called the sagittal orientation, and edges perpendicular to the sagittal ones (gray edges), called the tangential orientation. These two perpendicular planes, sagittal and tangential, have different foci due to astigmatism in the optics, which results in different blur for different edge orientations. The effect of astigmatism increases from the center of the image towards the outer field positions. A PSF which produces this kind of effect is shown in figure 3.15. As can be seen, the PSF is not rotationally symmetric, which results in different blur amounts in perpendicular directions.
As a result of these field varying aberrations, the computed blur ratios are a function of three variables: the distance, the orientation of the edge and the location in the image.

Figure 3.15: Left: Sagittal and tangential orientations are shown as black and gray edges respectively. Right: A PSF which produces different blur for different orientations of the edges.

Since the optical aberrations are rotationally symmetric, it is assumed that the blur measure changes only with the image height:

$$C(depth, \theta, R) = \frac{BM_r^2 - (BM_b \times BM_g)}{BM_r^2 + (BM_b \times BM_g)}, \tag{3.18}$$

Here, $\theta$ is the orientation of the edge in the image and $R$ is the distance of a pixel from the center of the image. The equation shows that the calibration curve is a three dimensional lookup table (LUT) defined over distance, edge orientation and image height.
To interpolate between the LUT values, we need to create the LUT indices from the estimated depth and the image features. The image gradients provide the information about the sagittal and tangential orientation of an edge as

$$\theta = \tan^{-1}\!\left(\frac{y}{x}\right) - \tan^{-1}\!\left(\frac{g_y}{g_x}\right) - 90^{\circ}, \tag{3.19}$$

where $x$ and $y$ are the image Cartesian coordinates and $g_x$ and $g_y$ are the image gradients in the horizontal and vertical directions respectively. The computed value $\theta$ represents the edge orientation in degrees, with $\theta = 0$ representing the sagittal and $\theta = 90$ the tangential orientation.
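As an illustration, the orientation index of equation 3.19 can be computed per pixel from the image gradients. The NumPy sketch below uses the image center as the coordinate origin; the final folding of the angle into the range [0°, 90°] is my own normalization, not prescribed by the thesis:

```python
import numpy as np

def edge_orientation(img):
    """Per-pixel edge orientation relative to the sagittal direction
    (equation 3.19), in degrees: 0 = sagittal, 90 = tangential."""
    h, w = img.shape
    y, x = np.mgrid[0:h, 0:w].astype(float)
    x -= (w - 1) / 2.0            # shift coordinates so the image center is (0, 0)
    y -= (h - 1) / 2.0
    gy, gx = np.gradient(img.astype(float))
    radial = np.degrees(np.arctan2(y, x))      # direction from center to pixel
    grad_dir = np.degrees(np.arctan2(gy, gx))  # direction of the image gradient
    theta = radial - grad_dir - 90.0
    theta = np.abs((theta + 180.0) % 360.0 - 180.0)   # fold into [0, 180]
    return np.minimum(theta, 180.0 - theta)           # fold into [0, 90]
```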
In the calibration process, test images are captured for different values of $\theta$, $R$ and distance. Using these test images and the DFCA algorithm, a three dimensional LUT is created for the relative blur measures $C$ for known values of $\theta$, $R$ and distance.

(a) Depth map estimated with DFCA (b) Field dependent depth correction
Figure 3.16: Field dependent depth correction of the estimated depth map from the DFCA algorithm.

To verify the performance of the algorithm, a target image is positioned at a distance of 60 cm from the camera. Figure 3.16a shows the depth computed with the DFCA algorithm. The depth map shows inconsistent depth values at different image positions and edge orientations, whereas figure 3.16b shows the depth map after the field and edge orientation dependent depth correction. The corrected depth values are more consistent across the field of view.

The above calibration process is cumbersome, as multiple images are required at different distances to create the calibration LUT. To simplify this process, the behavior of the relative depth was measured for different lenses. Figure 3.17 shows the change of relative depth versus distance, measured at different image locations for multiple lenses. The results show that a single calibration image is sufficient to predict the depth behavior at any image location. Therefore, if a set of measured depth curves is available, one of them can be selected on the basis of a single captured image at any distance (preferably at the focus position of the green image, 40 cm in figure 3.17). This makes the process very simple, as only one calibration image is used to calibrate all distances in the camera's field of view.
Figure 3.18a shows a depth map computed for the object distance of 85
cm. It can be seen that the center of the image shows the correct mean depth,
whereas all other image positions measure wrong depth values. The mean depth value of the complete depth map is 53 cm, which deviates strongly from the central region with its mean depth value of 86 cm. Figure 3.18b shows the depth map after the field dependent correction. In this case, only one image is used to calibrate the DFCA method instead of the multiple images used for figure 3.16b. The mean depth value of the complete depth map, 83 cm, is much closer to the actual distance of the object.

Figure 3.17: Relative depth (normalized blur measure ratios versus actual depth) measured for different lenses at different image locations.
(a) Depth map estimated with DFCA (b) Field dependent depth correction
Figure 3.18: Field dependent depth correction of the estimated depth map
from DFCA algorithm using only one image for the calibration.
3.4 Dense Depth Map

All passive depth estimation methods compute the depth from image features. In most cases, a scene contains large homogeneous regions where no depth estimation is possible. However, some applications, such as object extraction and scene re-rendering, require a dense depth map. Therefore, to create a dense depth map, depth values from the neighboring regions must be interpolated.

Many methods have been proposed in the literature to compute dense depth maps. Many of them use image segmentation to assign a single smoothed depth value to a complete segment. Bae et al. [2] proposed to use the colorization-by-optimization method [24] to fill the depth maps, after refining the depth map with a cross bilateral filter.

In this work, two methods are investigated: segmentation and propagation using an optimization based method.
3.4.1 Dense Depth Map by Segmentation

Image segmentation methods divide the image into regions of similar features. After segmentation, one knows which pixels belong to which objects. Details of different segmentation techniques can be found in [34]. Since this work does not focus on image segmentation, it is assumed that the image can be segmented with any suitable algorithm. Hence, a segmented image is available together with the corresponding depth map at the image edges only. The task is then to compute a high resolution, smooth, dense depth map. The method utilized in this work includes the following steps (a small code sketch is given after the list):
1. Create the histogram of depth values for each image segment.
2. Smooth the histogram by applying a running average filter.
3. Search for the maximum peak of the histogram and select the corresponding depth value.
4. Assign the selected depth value to all pixels belonging to that segment.
5. Repeat steps 1 to 4 for all segments in the image.
6. Apply a cross bilateral filter [10] on the filled depth map. The weights of the range filter are taken from the intensity image.
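A minimal NumPy sketch of steps 1 to 5, assuming a label image segments and a sparse depth map with NaN at pixels without an estimate (all names hypothetical); the cross bilateral filtering of step 6 is omitted here:

```python
import numpy as np

def fill_depth_by_segments(depth, segments, n_bins=64, smooth=3):
    """Assign to each segment the depth at the peak of its smoothed
    depth histogram (steps 1-5 of the segmentation based method)."""
    filled = np.full_like(depth, np.nan, dtype=float)
    valid = ~np.isnan(depth)
    for label in np.unique(segments):
        mask = (segments == label) & valid
        if not np.any(mask):
            continue
        hist, edges = np.histogram(depth[mask], bins=n_bins)        # step 1
        hist = np.convolve(hist, np.ones(smooth) / smooth, 'same')  # step 2
        peak = np.argmax(hist)                                      # step 3
        peak_depth = 0.5 * (edges[peak] + edges[peak + 1])
        filled[segments == label] = peak_depth                      # step 4
    return filled
```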
Although the segmentation based method provides a high resolution dense depth map, its accuracy depends on the performance of the segmentation algorithm. Most automatic segmentation techniques fail to provide a good segmentation of the image.

[Flow diagram: intensity image and depth map, median based downsampling, depth fill using the optimization method, joint bilateral upsampling, dense depth map]
Figure 3.19: Dense depth map generated by the optimization based method followed by joint bilateral upsampling.
3.4.2 Dense Depth Map by Optimization

Levin et al. [24] proposed to fill a gray scale image with given colors using an optimization technique. The colors are provided by the user as scribbles. Bae et al. [2] applied the same method to fill depth maps, where the estimated depth values at edges or textured areas are treated as the scribbles. The optimization minimizes the difference between the estimated depth value and the weighted average of the neighboring depth values. The weights are assigned according to the intensity and color of the neighboring pixels, which imposes the constraint that neighboring pixels have similar depth values if their color and intensity are similar.

The optimization problem is solved through least squares optimization of a linear system of equations. The computational complexity of such an algorithm is very high for high resolution images, as the number of equations equals the number of pixels in the image.
(a) Original Image
(b) Depth from CA
(c) Segmentation Based
(d) Optimization Based
Figure 3.20: The dense depth map is generated by the segmentation and optimization based methods.
The optimization problem can instead be solved on a downsampled low resolution depth map, followed by joint bilateral upsampling, which reduces the computational complexity. The joint bilateral upsampling method was proposed by Kopf et al. [20] to produce high resolution outputs. In our case, the joint bilateral upsampling takes the spatial weights of the bilateral filter from the low resolution depth map, while the weights of the range filter are taken from the high resolution intensity image. The algorithm flow diagram is shown in figure 3.19.
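For illustration, a much-simplified, brute-force sketch of the upsampling idea (not Kopf et al.'s implementation) is given below; it assumes a low resolution depth map depth_lr, a high resolution intensity image guide, and an integer upsampling factor (all hypothetical names):

```python
import numpy as np

def joint_bilateral_upsample(depth_lr, guide, factor, radius=2,
                             sigma_s=1.0, sigma_r=0.1):
    """Upsample a low resolution depth map guided by a high resolution
    intensity image (simplified joint bilateral upsampling)."""
    H, W = guide.shape
    h_lr, w_lr = depth_lr.shape
    out = np.zeros((H, W), dtype=float)
    for y in range(H):
        for x in range(W):
            yl, xl = y / factor, x / factor          # position on low-res grid
            acc, norm = 0.0, 0.0
            for j in range(-radius, radius + 1):
                for i in range(-radius, radius + 1):
                    yy, xx = int(round(yl)) + j, int(round(xl)) + i
                    if 0 <= yy < h_lr and 0 <= xx < w_lr:
                        # Spatial weight from the low resolution grid distance.
                        ws = np.exp(-((yy - yl)**2 + (xx - xl)**2) / (2 * sigma_s**2))
                        # Range weight from the high resolution intensity image.
                        g_diff = guide[y, x] - guide[min(yy * factor, H - 1),
                                                     min(xx * factor, W - 1)]
                        wr = np.exp(-(g_diff**2) / (2 * sigma_r**2))
                        acc += ws * wr * depth_lr[yy, xx]
                        norm += ws * wr
            out[y, x] = (acc / norm if norm > 0
                         else depth_lr[min(int(yl), h_lr - 1), min(int(xl), w_lr - 1)])
    return out
```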
Figure 3.20 shows the results of the two approaches. The actual estimated depth values at strong edges are shown in figure 3.20b. The dense depth maps generated with the two algorithms are shown in figures 3.20c and 3.20d.
(a) Simulated image
(b) Actual depth
(c) Estimated depth
Figure 3.21: (a) Simulated image with chromatic aberrations, (b) ground truth depth map used for testing the DFCA algorithm, (c) depth map generated with the algorithm described in section 3.3.2 (depth is estimated only at edges and given in mm).
3.5 Results and Discussion

To verify the DFCA algorithm, a lens with F-number 2.4, focal length 4 mm and a chromatic focal shift of 50 µm is simulated to estimate the depth from 30 cm to 2 m. The synthetic image shown in figure 3.21a is blurred according to the true depth shown in figure 3.21b. The image is then converted to Bayer format after adding sensor noise. Figure 3.21c shows the depth map computed with the proposed algorithm. For the analysis of the depth accuracy, only the actually estimated depth values at edges are shown.
The root mean square (RMS) error between the actual and the estimated depth is computed for the detected depth regions. Figure 3.22 shows the RMS error at different distances and for different differences of the local contrast between the color images. The depth is estimated with an RMS error of 6-10% for a small focal length of 4 mm. The local contrast $H_i$, $i = \{r, g, b\}$, is defined as the height of an edge, and the difference of contrast is defined as $H_{diff} = \sqrt{\left|H_r^2 - H_b H_g\right|}$ according to equation 3.16. For edges where one of the color edges is missing due to similar foreground and background colors ($H_i \approx 0$), depth estimation is not possible. Otherwise, the proposed normalized blur measure works well for color edges, and the average RMS error over different color edges for the distance range of 40 cm to 2 m is mostly less than 100 mm.

Figure 3.22: RMSE between true depth and estimated depth at different distances and for different differences of contrast between colors.
Figure 3.23 shows the depth maps computed for real images captured with a lens having axial chromatic aberrations. The lens has a focal length of 2.66 mm and an F-number of 2.8. The chromatic focal shift is 50 µm in the wavelength range of 450 nm to 600 nm. The results show that the depth propagation works quite well for the detected objects, but large homogeneous regions are not filled correctly. However, for applications such as 2D to 3D image conversion or digital refocusing, homogeneous regions do not have any effect on the image.

The DFCA results are compared with depth maps generated by a time of flight (ToF) camera. This time a different lens is used, with a focal length of 4.4 mm and an F-number of 2.4. The chromatic focal shift is 66 µm in the wavelength range of 450 nm to 600 nm. Both cameras, DFCA and ToF, are placed at the same position to estimate the depth. The results are shown in figure 3.24 with color coding. The depth estimate of DFCA is close to the ToF camera at the object boundaries.
Figure 3.23: Depth estimation from axial chromatic aberrations for the real
captured scenes. First row: input images captured with a lens having large
chromatic aberrations. Second row: raw depth estimation using the algorithm
proposed in this work. Third row: dense depth after propagating the raw
depth to surroundings.
The depth propagation fills the local objects quite well, but the large homogeneous regions do not show the correct depth. This is one of the drawbacks of all passive depth estimation methods. The average depth value of each object, along with the average depth value from the ToF camera, is given in table 3.1. The error of DFCA with respect to the ToF camera is also given in percent. All objects which have well defined edges show the smallest depth error, whereas the error for low contrast edges and textured regions is comparatively larger.
Figure 3.24: Depth estimated with the DFCA camera for different colored objects. (a) Original image, (b) DFCA depth at edges only, and (c) DFCA depth after propagating to neighboring regions. For comparison, the depth from the ToF camera (d) is also shown. All depth maps are shown in cm.

3.6 Conclusion and Outlook

This chapter analyzes a method similar to depth from defocus for depth imaging. Instead of capturing multiple defocused images, axial chromatic aberrations are used to capture the defocused information in multiple colors with
a single shot. The previous related literature has only shown the feasibility of this approach for limited experimental setups. Therefore, a thorough analysis is performed to develop a method which works well under different imaging conditions. The existing focus/blur measures were evaluated, and it is shown that these measures are contrast dependent, which makes them infeasible for a DFCA system. A new blur measure based on normalized gradients is proposed which is independent of the local image contrast. The absolute depth is inferred from the blur measures by taking normalized ratios and applying a calibration procedure. The depth error analysis has shown that DFD and DFCA systems share common challenges, for example low texture areas, sensor noise and field varying blur. For the field varying depth correction, a simple calibration procedure is proposed.
Image   Object       DFCA Depth [cm]   ToF Depth [cm]   Error [%]
1       Cylinder 1          30               30             0
1       Cylinder 2          41               42             2.4
1       Cylinder 3          52               54             3.7
1       Cylinder 4          63               67             6
1       Cylinder 5          98               97             1
2       Green Ball          32               32             0
2       Red Ball            62               61             1.6
2       Leaves             106              100             5.7
3       Bird                31               36            14
3       Leaves              50               60            17
3       Pumpkin             90               99             9

Table 3.1: Absolute depth estimates from the DFCA camera for the images shown in figure 3.24. The error with respect to the ToF depth is also given in percent.
The test chart image used for the calibration contains sagittal and tangential edge orientations at different field positions. It was found that the behavior of the depth change is similar for different lenses. Therefore, if this behavior is measured once or obtained from the lens design data, a single image is sufficient to determine the field varying depth error and its correction.

The main advantage of the DFCA method over DFD is that it is a single shot depth imaging system, which avoids mis-registration problems and also varying image magnifications, because the lateral color shift can be reduced during the lens design process. The major challenge for the DFCA method is a narrowband object reflectance spectrum. In this case, either the color information is missing or the blur differs from the broadband spectrum case.

Since the depth from defocus method only measures the depth for textured regions, a low cost implementation of two existing methods is used to create dense depth maps. One method is based on image segmentation and fills each segment with the median depth value of that segment. The other solution uses an optimization method to propagate the depth to neighboring regions of similar intensity. Since the optimization problem is computationally complex, it is proposed to propagate the depth at a very low resolution followed by joint bilateral upsampling.
Chapter 4
Extended Depth of Field from Chromatic Aberrations

4.1 Introduction

The depth of field represents the distance range in a scene for which the captured image is considered to be in focus, i.e. the amount of blur introduced by the optics is not perceivable under normal viewing conditions. In some cases, it is desirable to have a large depth of field (DOF). In other cases, a narrow depth of field helps to emphasize the desired object in a scene.

The maximum blur diameter which is still seen as a point by the human eye is called the circle of confusion; it defines the range of the depth of field. Equation 3.2 is useful for determining the dependency of the depth of field on the lens parameters. In terms of the lens F-number $F\#$, the blur diameter is reformulated as

$$c(d) = \frac{f\, m_{d_o}}{F\#} \left(\frac{d - d_o}{d}\right), \tag{4.1}$$

where $m_{d_o}$ is the lens magnification for the lens focus position $d_o$, given by equation 3.3. The effect of the lens parameters can be better observed in the plot of blur diameter versus object distance shown in figure 4.1. As an example, let a circle of confusion of 0.004 mm define the DOF. For the distance range where the blur diameter is smaller than or equal to the circle of confusion, the blur is imperceptible and the details are within the DOF. The lens parameters which affect the DOF are the F-number, the focal length and the lens focus position. For a smaller F-number, a larger focal length and a closer focus position, the blur changes rapidly with distance; therefore, the DOF is smaller
in these cases and vice versa.

Figure 4.1: Blur diameter versus object distance (log scale) for different focal lengths and lens focus positions ($f$ = 4 mm, $d_o$ = 1 m; $f$ = 8 mm, $d_o$ = 1 m; $f$ = 8 mm, $d_o$ = 4.5 m). The circle of confusion of 0.004 mm marks the DOF.
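As a small numerical illustration of equation 4.1, the blur diameter can be evaluated over distance and compared with the circle of confusion; the thin-lens magnification $m_{d_o} = f/(d_o - f)$ used below is my assumption (the thesis defines it in equation 3.3):

```python
import numpy as np

def blur_diameter(d, d_o, f, f_number):
    """Geometric blur diameter from equation 4.1 (all lengths in mm)."""
    m = f / (d_o - f)                     # assumed thin-lens magnification at d_o
    return np.abs(f * m / f_number * (d - d_o) / d)

# Example: DOF check against a circle of confusion of 0.004 mm.
d = np.logspace(2, 4, 200)                # 100 mm .. 10 m
in_dof = blur_diameter(d, 1000.0, 4.0, 2.4) <= 0.004
```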
4.2 Related Work

As we observe from figure 4.1, the blur changes with the distance; hence it is difficult to restore a sharp image without depth information. Even if the depth is known, it is hard to recover the image in case of large blur, as the point spread function (PSF) has zeros in its frequency response. Therefore, many computational imaging approaches have emerged that extend the depth of field by joint optimization of the optical system and the digital post-processing. Some of the previous methods are discussed here briefly.

A common approach to achieve an extended DOF is focus stacking. In this method, multiple images are taken with different focus distances and combined digitally to extend the DOF. For each image position, the sharpest pixel among the multiple images is selected. The disadvantage of this approach is that it requires multiple images.
Most computational methods try to make the PSF invariant to distance. One of the earliest approaches, proposed by Häusler [15], is to make the blur of the object invariant to depth and to restore the sharp image through digital processing. He obtained the depth invariant blur by moving the object along the optical axis during the exposure time of the camera. A depth invariant PSF can also be achieved through apodization [31]. Apodization is typically obtained by a central or annular obstruction in the aperture of the system. The pupil is then shaped in such a way that its frequency response reaches zero at higher frequencies compared to the unaltered pupil. The major drawback is the reduced light transmission through the optical system, leading to a lower SNR.

A well known method in the field of computational photography, proposed by Dowski et al. [9], is wavefront coding. Its main advantage is that it does not suffer from limited light transmission. In wavefront coding, the pupil function is modified through phase modulation by inserting a non-absorbing optical element such as a cubic-phase or cosine form-phase mask. It is then possible to obtain a PSF which is insensitive to defocus. In a second step, standard deconvolution with only one digital filter is used to restore the image. Although there is no light loss, the method suffers from a loss of SNR, as the PSF is spread over a larger area to make it insensitive to defocus. A depth invariant PSF can also be achieved through polarization separation, by placing a birefringent plate between the lens and the sensor [44]. The plate is designed such that the two polarization states contain the in-focus far and in-focus near field information of a scene and are superimposed on each other to form the image. In this case, a digital restoration is required. The major drawback of these approaches is the complex optical design, as the phase mask has to be integrated into the lens design.
Some alternative approaches are based on color separation. Guichard et al. [14] proposed to utilize chromatic aberrations to capture color images with different focus positions. As different colors appear sharp at different distances, digital processing transfers the sharpness information of the sharpest color to the other colors. In this way, the resulting image is sharp over a broader distance range. The sharpness transportation technique takes advantage of the spectral information redundancy inherent in images to recover information that has been lost due to chromatic blurring. The advantage of this method is the use of conventional optics without any light loss or degradation of the SNR. However, the digital processing required to remove all chromatic aberrations is quite challenging. The method trades off the extension of the DOF against the loss of chrominance high frequencies. Kay et al. [17] proposed to use a color aperture which stops the blue light to obtain a larger depth of field for the blue color image. The other colors are then sharpened using the sharpness information of the blue color image. The methods proposed by Bando [3] and Kim [18] code the disparity information into the color images through a color filter aperture. The extended DOF image is produced through deconvolution based on the estimated depth.

If a lens exhibiting axial chromatic aberrations is used with a black and white (B&W) sensor, it produces a depth invariant PSF, as shown by Cossairt et al. [8]. For color images, it is shown that the luminance of the image has a depth invariant PSF. However, a deconvolution process is required to recover the sharp image.

In this work, the shortcomings of the methods proposed by Guichard et al. [14] and Cossairt et al. [8] are reduced by using a sensor that captures monochromatic (W) light along with the RGB colors. Moreover, on the digital side, robust algorithms are proposed to produce high quality color images.
4.3 Extended DOF Using Axial Chromatic Aberration

In the previous chapter, it was shown that multiple focused images can be captured with a single shot by the optimal introduction of axial chromatic aberrations in the lens. Therefore, a focus stacking method may be employed on the multiple color images to produce a larger DOF effect. However, this is not as simple as traditional focus stacking, because only one color is sharp at each image location. Therefore, instead of selecting the sharpest pixel, the other colors need to be sharpened according to the sharpness information of the sharpest color. Figure 4.2 shows the blur diameters of the RGB colors in case of axial chromatic aberrations, and the minimum blur diameter among all colors at each distance. If each color is made sharper according to the sharpest color, the depth of field becomes larger, as shown in the right plot.

In the following sections, a few existing methods are described which digitally restore the extended DOF image in case of axial chromatic aberrations.
Figure 4.2: Left: Blur diameter for RGB colors in case of axial chromatic
aberrations. Right: Minimum blur diameter among all colors at each distance.
4.3.1 Depth Dependent Deconvolution

We know from the previous chapter that the depth can be estimated if axial CA exists in the lens. Therefore, the simplest solution for restoring an extended DOF image is to deconvolve the image with a point spread function that varies according to the depth. However, in case of larger blur, the MTF of the optical system drops to zero at higher spatial frequencies and it becomes impossible to restore the spatial resolution. Figure 4.3 shows the MTF versus distance for an image resolution of 90 line pairs per millimeter. As can be seen, the MTF of the red and the blue color drops to zero for near and far distances respectively.

Lim et al. [25] described this approach and showed that deconvolving each color channel separately does not provide a good restoration. Hence, they proposed an alternative method in which the resolution of the luminance channel is enhanced by adding the high frequency information of the sharpest color to the luminance channel. Their results show the restoration of the higher spatial frequencies of the blurred colors, but overshoots and undershoots are also visible. Moreover, for color images, color bleeding artifacts cannot be reduced with this approach, because only the luminance channel is processed.
Figure 4.3: MTF for a spatial frequency of 90 line pairs per millimeter [lp/mm] versus distance.
4.3.2 Sharpness Transport Across Color Channels

Guichard et al. [14] proposed the method of sharpness transport across channels to produce an extended DOF using chromatic aberrations. The method includes the following steps:

• Estimate the sharpness of all color channels.
• Select the sharpest color at each image location. Due to axial CA, only one color is sharp at each image location.
• Copy the sharpness information from the sharpest color to the other colors.

The process of sharpness transport is written as

$$I_i^M = I_i + w \times HPF(C_{s,i}) * I_s, \tag{4.2}$$

where $I_i$ and $I_i^M$ are the unsharp color images before and after the sharpness transport. $I_s$ is the sharpest color, whose higher spatial frequencies are extracted by the high-pass filter $HPF$ and added to the unsharp colors after multiplication with the weight $w$. The strength of the high-pass filter $HPF$ depends on the relative sharpness $C_{s,i}$ between the sharpest and the blurred color.
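A minimal sketch of equation 4.2 in Python/NumPy, using a Gaussian low-pass filter to isolate the high frequencies of the sharpest color (the function names and the choice of scipy.ndimage.gaussian_filter are my own, not prescribed by the thesis):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def sharpness_transport(blurred, sharpest, w, sigma):
    """Equation 4.2: add weighted high frequencies of the sharpest
    color to a blurred color channel."""
    high_freq = sharpest - gaussian_filter(sharpest, sigma)  # HPF applied to I_s
    return blurred + w * high_freq
```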
Although the process seems very simple, in practice there are many difficulties in this restoration. The coefficients of the high-pass filter and the weights depend on the lens properties, which must be determined through a calibration of the lens. The weights depend not only on the lens properties but also on the contrast levels in the color channels. Moreover, simple gradient based relative sharpness measures are prone to errors due to varying spectral features in the color images, which results in color artifacts caused by over- or under-compensation.
4.3.3 Spectral Focal Sweep

Focal sweep methods produce a depth invariant blur by moving the sensor mechanically during the exposure time of the image, and restore the extended DOF image through digital processing. Cossairt and Nayar [8] showed that if a lens exhibiting axial chromatic aberrations is used in combination with a B&W sensor, the effect of a mechanical focal sweep can be achieved. They named this process the spectral focal sweep (SFS), because multiple color images, which are focused at multiple distances, are integrated on the B&W image sensor during the exposure time.

For the SFS camera, the most important requirement is a broadband object reflectance spectrum; the broader the spectrum, the wider the DOF. The study of Parkkinen et al. [32] on the reflectance spectra of Munsell colors shows that most real-world objects have broadband spectra. For the SFS camera it is shown that the PSF is depth invariant in most cases and that 95% of these colors produce a minimal deblurring error which does not introduce any significant artifacts.

Although the SFS camera is primarily proposed for B&W images, results for RGB images are also shown through deblurring of the image luminance. The images look sharper, but color bleeding effects are visible, which makes this method unsuitable for high quality imaging.
4.3.4 Eliminating Color Difference

Many methods have been developed in the past to correct chromatic aberrations through digital image processing. One of the most promising methods was suggested by Chung et al. [7]. Their method first analyzes the color difference signal at the edges and, after detecting the chromatic aberration, removes the color difference to correct the CA. The method is very similar to [19], suggested by Konvesky. Both methods consider the green channel as the reference channel for the correction.

In this work, these methods are extended by taking the sharpest channel as the reference channel for the color difference correction. As a result, the sharpest color information is always preserved and hence an extended DOF is achieved. The advantage of these approaches is that they only operate at edges, so the noise is not enhanced. Also, the correction process has no parameters which depend on an error prone image sharpness measure. However, the methods work best when only one edge exists in a local region. For multiple edges or higher spatial frequencies, these methods are unable to correct the axial CA, because the correction value cannot be computed correctly due to the loss of contrast at higher spatial frequencies.
4.4 RGBW Sensor with Chromatic Aberrations

All state of the art methods have some limitations or are computationally too complex for high quality image restoration. In this section, a method is suggested which produces better image quality without introducing significant color artifacts. Essentially, the best of two methods, SFS and sharpness transport, is obtained through the use of an RGBW sensor [23]. The RGBW sensor contains a white (monochromatic) pixel, which is analogous to a B&W image sensor, along with the RGB pixels. Hence, the full benefit of the SFS approach can be utilized in combination with the correction of color bleeding artifacts to produce high quality images with a wider DOF.

Figure 4.4a shows the color filter (CF) spectral response of the RGBW sensor. As the spectral response of the white pixels is broadband, i.e. more light is captured, the white pixel was originally introduced into the image sensor to produce less noisy images under low light conditions [23]. Wang et al. [40] have shown the use of an RGBW sensor to produce high quality deblurring of images.
Figure 4.4: a) Spectral response of the color filter array, b) chromatic focal shift in the visible range of the spectrum and c) the blur diameter of the red, green and blue colors, calculated by averaging the blur diameters of all wavelengths according to the color filter array spectral response.

Figure 4.4c shows the geometrical blur diameter for the color filter responses
and the chromatic focal shift shown in figures 4.4a and 4.4b, respectively. Two different CF responses are used to observe their influence on the blur. One CF response is broadband, which is the normal case for RGB cameras, and the other has a narrowband response. The blur diameter of each RGBW color is calculated by a weighted summation of the blur diameters of all wavelengths according to the color filter response. Note that the blur is not the same for the two different CF responses. The CF with the broadband response has a larger blur, especially at the best focus position. Thus, the sharpness transport cannot produce sharp images, even for the best focus positions, with a broadband CF response.

The blur diameter of the white channel is largely invariant over a large range of distances. Hence, the deconvolution of the white channel with its PSF helps to restore a sharp white image. Since the spectral response of the white pixels is broadband and has a higher sensitivity, it helps to restore the white image with a relatively high SNR.
There are different possible ways to combine the restored white channel with the color corrected color channels. In this work, the restored white channel is only used to extract the higher spatial frequencies to sharpen the RGB colors. The algorithm contains the following steps (a code sketch of the pixel-wise sharpening pass is given after the list):

• Restore the white channel through deconvolution with its depth invariant PSF. The restored white channel is considered as a fourth color image.
• Compute the sharpness of all four images.
• Divide the image into up to four regions, where in each region one color is sharp.
• For each pixel, compute the relative sharpness between the sharpest color and the other colors.
• Select three high-pass filters according to the relative sharpness measures.
• Extract the higher spatial frequencies of the sharpest color and copy them to the other colors.
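The NumPy sketch below outlines the pixel-wise sharpening pass, assuming precomputed per-channel blur measures bm and an already deconvolved white image w_restored (hypothetical names; the deconvolution and the blur measure of section 4.5.1 are not repeated here, and a single fixed Gaussian sigma replaces the adaptive filtering of section 4.5):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def edof_rgbw(rgb, w_restored, bm, sigma=2.0, w=1.0):
    """Transfer high frequencies from the per-pixel sharpest channel
    (R, G, B or restored W) to the blurred RGB channels."""
    channels = np.dstack([rgb, w_restored]).astype(float)   # H x W x 4 stack
    sharpest = np.argmax(bm, axis=-1)                       # per-pixel sharpest channel
    # High frequencies of every candidate channel (simple Gaussian HPF).
    high = channels - gaussian_filter(channels, (sigma, sigma, 0))
    out = rgb.astype(float).copy()
    for i in range(3):                                      # sharpen R, G, B
        for s in range(4):
            mask = (sharpest == s) & (s != i)
            out[..., i][mask] += w * high[..., s][mask]
    return out
```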
The proposed algorithm gives a larger depth of field than the individual SFS and sharpness transport methods. As can be seen in figure 4.4c, the restored white image is still blurred at the focus position of the blue color. Moreover, the sharpness transport can only restore the images to the best sharpness of each color, which is already blurred due to the chromatic aberrations. Therefore, the combination of the restored white image with the sharpness transport among all colors produces a sharper image for near as well as for farther distances.
4.5 Low Cost and Efficient Implementation of the Proposed Algorithm

The proposed algorithm requires the computation of the relative blur, the higher spatial frequencies and a deconvolution. Figure 4.5 shows the detailed flow diagram of the algorithm. Each computational step is discussed individually below.
4.5.1 Relative Blur Estimate

The relative blur measure is similar to the relative depth measurement discussed in section 3.3.2.4. Here, however, instead of measuring the depth, only an estimate of the relative blur between the colors is required. Since the different colors contain varying contrast levels, the sharpness is first measured according to equation 3.15, which is independent of the image contrast.
The measured sharpness values are converted to the standard deviation $\sigma$ of a Gaussian blur, because the lens blur can be approximated by a Gaussian. The $\sigma$ and sharpness values are directly related as

$$\sigma_j = k \times \frac{1}{BM_j}, \tag{4.3}$$

where $k$ is a constant factor and $j = \{R, G, B\}$. The standard deviation of the blur difference between the sharpest and a blurred image is given as

$$\sigma_{s,i} = \sqrt{\sigma_i^2 - \sigma_s^2}, \tag{4.4}$$

which is used to copy the sharpness information from the sharpest color to the blurred color.
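A small sketch of equations 4.3 and 4.4, assuming per-pixel blur measure arrays and a calibration constant k (hypothetical names):

```python
import numpy as np

def blur_sigma(bm, k=1.0):
    """Equation 4.3: convert a blur measure into a Gaussian sigma."""
    return k / np.maximum(bm, 1e-6)

def transfer_sigma(bm_blurred, bm_sharpest, k=1.0):
    """Equation 4.4: sigma of the blur difference between the sharpest
    and a blurred channel, used for the sharpness transport filter."""
    s_b = blur_sigma(bm_blurred, k)
    s_s = blur_sigma(bm_sharpest, k)
    return np.sqrt(np.maximum(s_b**2 - s_s**2, 0.0))
```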
4.5.2 Adaptive High-Pass Filtering

The sharpness transport method extracts the higher spatial frequencies by applying a high-pass filter and copies them to the other colors. The goal is to make the sharpness/MTF of the blurred color similar to that of the sharpest color. Therefore, the high-pass filter must be designed to extract the high frequencies which represent the difference between the MTFs of the two colors. For this reason, the blur difference $\sigma_{s,i}$ is used as the basis for selecting the high-pass filter characteristics.
[Flow diagram: W channel → deconvolution; R, G, B and restored W → sharpness measure → extract higher spatial frequencies of the sharpest color → add high frequencies to the blurred colors]
Figure 4.5: Processing flow of the extended DOF algorithm using the RGBW sensor.
Higher spatial frequencies are extracted using a Gaussian low-pass filter as

$$I_H = I - LPF_G(C) * I, \tag{4.5}$$

where $I_H$ are the high frequencies of the image $I$, and $LPF_G$ is the Gaussian filter depending on the relative blur. The two dimensional Gaussian filter is given by

$$G(x, y) = \frac{1}{2\pi\sigma^2}\, e^{-\frac{x^2 + y^2}{2\sigma^2}}, \tag{4.6}$$

where $x$ is the distance from the origin along the horizontal axis, $y$ is the distance from the origin along the vertical axis, and $\sigma$ is the standard deviation of the Gaussian distribution.
In the case of three colors, the image can be divided into three regions, where in each region one of the colors is sharp. Afterwards, the sharpness information is transferred to the blurred colors. Since the blur varies continuously with the depth, a single filter per sharp color is insufficient to completely remove the blur difference. To implement a continuously varying sharpness transport, the coefficients of the Gaussian filter may either be computed at run time for each relative blur value or stored in a lookup table for sampled values of the relative sharpness.

To reduce the computational complexity, the filtering is applied through interpolation of the Gaussian blur using principal component analysis (PCA). Gaussian filters are designed for sampled sigma values in the range $\sigma = 0$ to $\sigma = 8$ and are modeled as a weighted summation of basis filters computed through PCA. The missing Gaussian filters are then approximated through interpolation of the weights at run time. Details of the method are described in section 2.5.1.1. The PCA based method is much faster than other methods, as shown in section 2.1.
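As an illustration of the idea (not the exact chapter 2 implementation), the sketch below builds a bank of Gaussian kernels for sampled sigmas, computes a PCA basis, and approximates the kernel for an intermediate sigma by interpolating the PCA weights; kernel size, sigma sampling and the number of components are assumptions:

```python
import numpy as np

def gaussian_kernel(sigma, size=33):
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    g = np.exp(-(xx**2 + yy**2) / (2.0 * max(sigma, 1e-6)**2))
    return g / g.sum()

# Bank of kernels for sampled sigmas and its PCA basis.
sigmas = np.linspace(0.25, 8.0, 32)
bank = np.stack([gaussian_kernel(s).ravel() for s in sigmas])
mean = bank.mean(axis=0)
_, _, vt = np.linalg.svd(bank - mean, full_matrices=False)
basis = vt[:8]                                   # first 8 principal components
weights = (bank - mean) @ basis.T                # weights of each sampled kernel

def interpolated_kernel(sigma):
    """Approximate the Gaussian kernel for an arbitrary sigma by
    interpolating the PCA weights of the sampled sigmas."""
    w = np.array([np.interp(sigma, sigmas, weights[:, j])
                  for j in range(basis.shape[0])])
    return (mean + w @ basis).reshape(33, 33)
```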
4.5.3 Deconvolution

The image captured by the camera can be written as a convolution of the camera blur with the input image. Deconvolution is the process used to reverse the effect of this convolution. Similar to the SFS method discussed before, the blur of the white image is distance invariant over a specific range of distances. Hence, the deconvolution of the captured white image with its blur, i.e. the point spread function $P_w$ of the white image, restores the sharp image. The process may be written as

$$I_w^d = I_w * P_w^{-1}, \tag{4.7}$$

where $I_w^d$ is the deconvolved image and $P_w^{-1}$ is the inverse of the PSF.
Theoretically, the deconvolution process restores the exact image if the correct inverse PSF is used. In practice, however, it is impossible to exactly reverse the convolution because of the noise added in the capturing process. The restoration becomes worse in case of large blur, where the frequency response of the blur approaches zero at higher frequencies. In this case the restoration produces ringing artifacts around the edges due to over-enhancement of frequencies which were lost in the capturing process.

The frequency response of the PSF of the white image does not approach zero over a large distance range. Hence, the image can be restored without artifacts but with a lower SNR, as the noise is amplified in the restoration process. Since the white image is only used to extract high frequencies to sharpen the blurred colors, the noise enhancement does not have a large effect on the final image quality.
The loss in SNR depends on the strength of the inverse filter. As given by Sherif et al. [36], the SNR of the restored image with respect to the unrestored image is defined as

$$SNR_w^d = \frac{SNR_w}{\sqrt{\sum_x \sum_y \left(P_w^{-1}(x, y)\right)^2}}, \tag{4.8}$$

where $SNR_w^d$ is the SNR of the restored image.
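A minimal frequency domain sketch of the white channel restoration of equation 4.7; the small Wiener-style regularization term eps is my addition (not from the thesis) to avoid division by near-zero frequency responses:

```python
import numpy as np

def deconvolve_white(i_w, psf_w, eps=1e-2):
    """Restore the white channel by regularized inverse filtering with
    its depth invariant PSF. psf_w is assumed to have the same size as
    i_w and to be centered in the array."""
    H = np.fft.fft2(np.fft.ifftshift(psf_w))
    inv = np.conj(H) / (np.abs(H)**2 + eps)   # Wiener-style regularized inverse
    return np.real(np.fft.ifft2(np.fft.fft2(i_w) * inv))
```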
4.5.4 Contrast Dependent Sharpness Transport

The local contrast of the image varies between the different colors. In this case, the weights $w$ given in equation 4.2 must be adapted according to the contrast levels. The contrast of the image may be measured by taking the difference of the local minimum and maximum values, as shown in equations 3.12 and 3.13. The ratio of the contrasts of two colors is then used as a weighting factor during the transfer of the higher spatial frequencies. The weights are calculated as

$$w_1 = \frac{I_{max}^b - I_{min}^b}{I_{max}^s - I_{min}^s}, \tag{4.9}$$
where $I^s$ and $I^b$ represent the sharpest and the blurred color respectively. In case of very low contrast in the sharpest or the blurred color, the sharpness transport can produce color artifacts. The same happens when one of the colors is completely missing.
Figure 4.6: Weights versus local edge contrast to reduce the strength of sharpness transport at low contrast levels.
We introduce another weighting factor, which reduces the strength of the sharpness transfer in case of low contrast. The weights are defined as

$$w_2 = \left(1 - e^{-\frac{I_{max}^b - I_{min}^b}{\alpha}}\right) \times \left(1 - e^{-\frac{I_{max}^s - I_{min}^s}{\alpha}}\right). \tag{4.10}$$
Figure 4.6 shows the weights $w_2$ versus the edge contrast. All edges with a contrast higher than $\alpha$ get relatively high weights, which means the sharpness transport is not affected by this weighting factor. The total weight $w$ is the product of $w_1$ and $w_2$:

$$w = w_1 \times w_2. \tag{4.11}$$
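A small sketch of equations 4.9 to 4.11, assuming the local minima and maxima are computed with a sliding window (the window size and alpha are free parameters, not values from the thesis):

```python
import numpy as np
from scipy.ndimage import minimum_filter, maximum_filter

def transport_weights(i_sharp, i_blur, alpha=30.0, size=7):
    """Contrast dependent sharpness transport weights (eqs. 4.9-4.11)."""
    c_s = maximum_filter(i_sharp, size) - minimum_filter(i_sharp, size)
    c_b = maximum_filter(i_blur, size) - minimum_filter(i_blur, size)
    w1 = c_b / np.maximum(c_s, 1e-6)                  # contrast ratio (eq. 4.9)
    w2 = (1 - np.exp(-c_b / alpha)) * (1 - np.exp(-c_s / alpha))  # eq. 4.10
    return w1 * w2                                    # total weight (eq. 4.11)
```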
4.6 Results and Discussion

The algorithm is verified with synthetic and real captured images. Multispectral images are used to precisely simulate the optics and sensor properties. Point spread functions for 31 equally spaced wavelengths between 400 nm and 700 nm are generated and applied to the multispectral images. In the next step, the sensor MTF is applied and noise is added. Finally, the spectral images are averaged according to the color filter responses to produce the RGB and white (W) color images. Figure 4.7 shows the simulation flow diagram.
[Flow diagram: multispectral images → optical blur (a PSF for each spectral band) → conversion from multispectral to CFA colors → sensor MTF and noise → raw image]
Figure 4.7: Simulation of the optical and sensor properties.
For a verification of the algorithm performance, a lens with a focal length of 4 mm and an F-number of 2.4 is used with a sensor of 2.2 µm pixel size. For a lens exhibiting axial chromatic aberrations, the total chromatic focal shift over the visible wavelength range is 112 µm. The spectral sensitivity of the color filters is shown in figure 4.4a. The hyperspectral images are taken from the CAVE multispectral database, which contains 31 images for wavelengths ranging from 400 nm to 700 nm with equal spacing.

Figure 4.8 shows the results of a lens without chromatic aberrations along with the images of a lens which exhibits axial chromatic aberrations. Figure 4.9 shows the results of the extended DOF algorithm and of a conventional lens. The focus positions of the lenses are set such that objects at infinity are sharp. For that reason the extended DOF effect is more visible at near distances. The extended DOF images are much sharper and show no color artifacts, especially at 30 cm, whereas the conventional lens produces a blurred image at the same distance. Figure 4.10 shows the result of the extended DOF for a real captured image.
(a) 30 cm
(b) 30 cm
(c) In-focus
(d) In-focus
(e) 3 m
(f) 3 m
Figure 4.8: Simulation results of a lens without (a,c,e) and with chromatic
aberrations (b,d,f).
(a) 30 cm
(b) 30 cm
(c) In-focus
(d) In-focus
(e) 3 m
(f) 3 m
Figure 4.9: Simulation results of a lens without chromatic aberrations (a,c,e), representing a conventional lens. Images b, d and f show the extended DOF results generated with the proposed extended DOF algorithm.
4.7 Physical and Practical Limitations

The method discussed in the previous section is a simple way to restore an extended DOF image using chromatic aberrations. As discussed before, the algorithm works best for broadband object reflectance spectra. Another important factor which determines the performance of the sharpness transport is the accuracy of the sharpness measure; a wrong sharpness value leads to color artifacts in the final image. The physical and practical limitations of the system and their impact on the image quality are discussed here.

4.7.1 Narrowband Object Reflectance Spectra

A narrowband object spectrum results in a color image in which one of the RGB colors dominates the others. In case of chromatic aberrations, if the blue or red color is weak at near or far distances respectively, the sharpness information is completely lost. As a result, the sharpness transport method fails to produce sharper objects in these cases. The SFS method also suffers from deblurring artifacts, because the amount of blur differs from that of an object with a broadband spectrum.

4.7.2 Loss of Contrast at Higher Frequencies

The weights calculated for the sharpness transport are not optimal at higher spatial frequencies. If the spatial resolution of structures or textures in the image is finer than the amount of blur in any one of the colors, the contrast information is lost. As a result, the calculated $w$ does not represent the original contrast of the image. This limitation calls for a more sophisticated algorithm for the sharpness transport method.
4.8 Conclusion and Outlook

In this chapter, a method for obtaining an extended depth of field image using chromatic aberrations is described. Since some work already exists on such methods, solutions to some of the remaining challenges are presented. One of the existing methods [8] only increases the MTF through deconvolution and does not correct the chromatic aberrations. The other method [14] corrects the chromatic aberration according to the sharpest color, but the in-focus position still remains blurred, as shown in section 4.4. In this work, both methods are combined using an RGBW sensor. As the correction of the color aberrations requires relative sharpness information, the method developed in chapter 3 provides a better relative sharpness estimate than the existing methods. A low cost and efficient implementation of the sharpness transport algorithm is also presented in section 4.5. The method developed in chapter 2 for space variant filtering is utilized here to extract the sharpness information at each pixel of the sharpest color according to the continuously varying relative sharpness between the sharpest and the blurred colors.
(a) Image with axial CA (b) Original (c) Corrected (d) Original (e) Corrected
Figure 4.10: Chromatic aberrations are corrected in the image that was captured with the lens exhibiting large chromatic aberrations.
Chapter 5
Optimal Lens and Sensor Characteristics for Depth and Extended DOF using Chromatic Aberrations

An important aspect of the performance of depth estimation and extended depth of field imaging is the camera system properties. For a depth from defocus system, the lens and the sensor determine the accuracy of the depth estimation; the change of the defocus blur and the sensor resolution determine the axial resolution of the depth maps. For an extended DOF, the amount of chromatic aberrations defines the range of the extended DOF. In this chapter, the effect of the lens and sensor parameters on the performance of depth estimation and extended DOF is studied. The depth from CA, which is the target of this work, is similar to a depth from defocus system; therefore, most camera parameters have the same impact on DFD and depth from CA systems. Only the chromatic focal shift and the properties of the color filter arrays are specific to the depth and extended DOF methods using CA.
5.1 Axial Resolution of Depth from Defocus

Depth estimation algorithms first estimate the amount of blur in each defocused image and then compute the relative depth estimate. Therefore, it is instructive to first analyze the depth resolution of a single defocused image.
The accuracy of depth can be defined in terms of the minimum detectable difference $\delta d$ between two distances. Therefore, we derive the relationship between the optical parameters and $\delta d$ to observe the effect of the optics on the depth resolution. Assuming a photographic lens where the object distance $d$ is much larger than the focal length $f$, i.e. $d \gg f$, the lens magnification is given as $m = f/d_o$. Hence, the blur disk diameter can be written as

$$c(d) = \frac{f^2}{F\#}\left(\frac{1}{d_o} - \frac{1}{d}\right). \tag{5.1}$$

Taking the derivative of the blur with respect to the distance $d$ gives

$$\frac{\delta c}{\delta d} = \frac{f^2}{F\#\, d^2}, \tag{5.2}$$

$$\delta d = \frac{F\#\, d^2}{f^2}\, \delta c, \tag{5.3}$$

where $\delta c$ is the change in the blur which is detectable in the presence of noise. The equation is very similar to the one derived by Blayvas et al. [4]. Equation 5.3 shows that $\delta d$ is directly proportional to the F-number and inversely proportional to the square of the focal length. A very noteworthy relationship of the depth resolution is with the distance of the object: $\delta d$ increases with the square of the object distance. Therefore, a DFD system with conventional photographic lenses is unable to differentiate the distances of far objects.

The important parameter in equation 5.3 is $\delta c$, the minimum detectable blur difference in the presence of noise, or in other words, the blur estimation error. From the depth resolution equation derived by Blayvas et al. [4], $\delta c$ is slightly greater than the maximum of the spatial resolution of the sensor and the optics. This is the case for an image with a high SNR. For a more practical case, the standard deviation of the blur measurement defines the parameter $\delta c$. More details of the camera parameters are discussed in the next section.
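For illustration, equation 5.3 can be evaluated directly; the sketch below reproduces the kind of curves shown in figure 5.1, using the values $F\# = 2.4$ and $\delta c = 0.4$ µm given in section 5.2.1:

```python
import numpy as np

def depth_resolution(d, f, f_number, delta_c):
    """Minimum detectable depth difference from equation 5.3
    (all lengths in the same unit, e.g. mm)."""
    return f_number * d**2 / f**2 * delta_c

# Depth error versus distance for f = 2, 3, 4 mm, F# = 2.4,
# delta_c = 0.4 um (= 4e-4 mm), as in figure 5.1.
d = np.linspace(300.0, 2000.0, 50)           # 30 cm .. 2 m, in mm
curves = {f: depth_resolution(d, f, 2.4, 4e-4) for f in (2.0, 3.0, 4.0)}
```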
The depth resolution given by equation 5.3 is very similar to the depth resolution of stereoscopy derived in [16] as

$$\delta d_{stereo} = \frac{d^2}{l\, f}\, \delta p, \tag{5.4}$$

where $\delta p$ is the error in the estimated disparity and $l$ is the distance between the optical axes of the two camera lenses (the stereo base). Equations 5.3 and 5.4 show that the depth resolution decreases with the square of the object distance for both stereoscopy and DFD systems. Moreover, if the depth estimation errors $\delta c$ and $\delta p$ are the same for both systems, they provide a similar depth resolution if the stereo base is equal to the aperture diameter $A$ ($\frac{f}{F\#} = A$).

5.1.1 Depth Resolution of DFD using Two Images

In the previous section, we studied the defocus behavior of a single image and its effect on the depth resolution. However, in most DFD methods, two defocused images are used to compute the relative depth.
Let $c_n(d)$ and $c_f(d)$ denote the blur diameters of the near and far focused images:

$$c_n(d) = \frac{f^2}{F\#}\left(\frac{1}{d_n} - \frac{1}{d}\right), \tag{5.5}$$

$$c_f(d) = \frac{f^2}{F\#}\left(\frac{1}{d} - \frac{1}{d_f}\right). \tag{5.6}$$

The relative depth is defined as

$$C(d) = \frac{c_f(d) - c_n(d)}{c_f(d) + c_n(d)}. \tag{5.7}$$

The relative depth is only unique for distances between the focus positions $d_f$ and $d_n$ of the two images. Since the denominator only normalizes the difference, we consider only the numerator for the analysis of the depth resolution. From equations 5.5 and 5.6, the numerator is given as

$$c_f(d) - c_n(d) = \frac{f^2}{F\#}\left(\frac{2}{d} - \frac{1}{d_n} - \frac{1}{d_f}\right), \qquad d_n \leq d \leq d_f. \tag{5.8}$$

Taking the derivative with respect to the distance, and substituting $C$ for $c_f - c_n$, we get

$$\frac{\delta C}{\delta d} = \frac{2 f^2}{F\#\, d^2}, \tag{5.9}$$

$$\delta d = \frac{\delta C\, F\#\, d^2}{2 f^2}. \tag{5.10}$$

Equation 5.10 shows that the relative depth resolution is two times lower than that of a single image, i.e. $\delta C = 2\,\delta c$. This is expected, because the noise sources of the two images add up in the relative depth estimation. The major advantage of using two images is that the relative depth estimate becomes invariant to the local image features.
5.2 Optimal Parameters for Depth Estimation

In the previous section, we derived the axial resolution of a depth from defocus system. The equation helps in selecting the lens parameters to obtain a required depth resolution. In the following subsections, we analyze the different lens and sensor parameters which affect the depth performance. As the goal of this work is to estimate the depth by deliberately introducing chromatic aberration into a lens, which also degrades the image quality, this type of lens is only feasible for low cost imaging systems. For that reason, we discuss the parameters relevant to low cost imaging optics.

5.2.1 Focal Length and F-number

For a depth resolution $\delta d$ desired by the application requirements, the focal length and F-number may be selected during the lens design or selection process. Equation 5.3 shows that a lower F-number and a larger focal length give a finer depth resolution. The design of a low F-number lens is more complex and makes the optics expensive. Therefore, the typical F-number of low cost imaging lenses, which is around 2.4, may be selected.

There is more freedom in choosing the optimal focal length to fulfill the desired depth accuracy requirements. Figure 5.1 shows the depth performance for different focal lengths with $F\# = 2.4$ and $\delta c = 0.4$ µm. Note that doubling the focal length gives a four times finer depth resolution. Equation 5.10 helps in selecting the minimum focal length required to obtain the desired depth resolution $\delta d$ at a certain distance $d$:
$$f = \sqrt{\frac{\delta C\, d^2\, F\#}{2\, \delta d}}. \tag{5.11}$$
Figure 5.1: Depth resolution (depth error in m) versus squared object distance for different focal lengths of the lens ($f$ = 2, 3 and 4 mm).
5.2.2 Chromatic Focal Shift

If the desired depth range is given, the chromatic focal shift can be estimated for a given focal length. The optimal focal length for the desired depth resolution is calculated from equation 5.11; it represents the green color. Assuming the sensor is positioned to focus the green color, the sensor distance $d_i$ is computed using the thin lens formula. Now let $d_b$ and $d_r$ be the focus positions of the blue and the red color image, which represent the near and far focused images respectively. We can compute the focal lengths of the blue ($f_b$) and red ($f_r$) color image using the thin lens formula:
$$\frac{1}{f_b} = \frac{1}{d_i} + \frac{1}{d_b}, \tag{5.12}$$

$$\frac{1}{f_r} = \frac{1}{d_i} + \frac{1}{d_r}. \tag{5.13}$$
Hence, $\Delta f = f_r - f_b$ is the chromatic focal shift that must be introduced in the lens of a depth from CA system to obtain the desired depth resolution in the given distance range.
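A small sketch of this calculation, assuming the same thin lens formula is also used to obtain the sensor distance $d_i$ from the green focal length and focus position (all names and example values are my own):

```python
def chromatic_focal_shift(f_green, d_green, d_blue, d_red):
    """Focal shift between red and blue (eqs. 5.12 and 5.13);
    all distances and focal lengths in mm."""
    d_i = 1.0 / (1.0 / f_green - 1.0 / d_green)   # sensor distance for green focus
    f_b = 1.0 / (1.0 / d_i + 1.0 / d_blue)
    f_r = 1.0 / (1.0 / d_i + 1.0 / d_red)
    return f_r - f_b

# Example: green focused at 40 cm, blue at 30 cm (near), red at 2 m (far).
delta_f = chromatic_focal_shift(4.0, 400.0, 300.0, 2000.0)   # roughly 0.05 mm
```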
Figure 5.2: Normalized blur measure ratios versus object distance (log scale) for different sensor resolutions (pixel pitches pp = 1.12, 2.24, 4.48 and 8.96).
5.2.3
Sensor Resolution
The resolution of a sensor is the smallest change in an image that it can
distinguish. As a result of the spatial sampling of an image, the defocus blur is
also sampled according to the sensor resolution Δx. In equation 5.3, the parameter
δc is affected by the sensor resolution. Since the blur diameter changes with
distance, finer sampling increases the depth resolution; if the pixel size is
increased by some factor, the depth resolution decreases by the same factor.
Figure 5.2 shows the computed normalized ratios between the blur measures of
two defocused images for different pixel resolutions. The plots are generated
for a focal length of 4 mm and an F-number of 2.4. As can be seen, the larger
the pixel size, the smaller the variation in the depth curve. Therefore, the
lowest possible pixel size is best for finer depth resolution. Since there is
also a limit on the lens spatial resolution, the optimum pixel resolution equals
the lens spatial resolution given by the Rayleigh criterion, i.e. Δx = 1.22 λ F#.
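As a small numerical check of the Rayleigh criterion mentioned above, the pixel pitch matching the lens resolution can be evaluated as in the following sketch (the wavelength of 550 nm is an assumption for green light).

    def rayleigh_pixel_pitch(wavelength, f_number):
        """Optimal pixel pitch from the Rayleigh criterion: dx = 1.22 * lambda * F#."""
        return 1.22 * wavelength * f_number

    # Assumed example: lambda = 550 nm, F# = 2.4 -> pixel pitch of about 1.6 um
    print(f"{rayleigh_pixel_pitch(550e-9, 2.4) * 1e6:.2f} um")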
5.2.4 Spectral Response of Color Filter Arrays
Conventional color image sensors capture the color information through sampling of
the visible light spectrum with color filters. Usually, sensors capture three
primary colors: red, green and blue. Since practical filters are smooth and sharp
filter transitions are not realizable, a broad range of wavelengths contributes to
each color. Another reason for a broadband color spectral response is to match it
with the chromatic response of the human eye. As an example, the spectral response
of the commercially available Kodak Wratten filters is shown in figure 4.4a.
To study the effect of the spectral response on the depth performance, we first
observe the behavior of the point spread function (PSF) in the case of chromatic
aberrations. The image irradiance E of a point source at the sensor position x and
y can be written as
E(x, y, \lambda) = R(\lambda)\, \mathrm{circ}\!\left(\frac{r}{c(\lambda)}\right) ,    (5.14)
where R(λ) is the object reflectance spectrum and circ is a circular function
which defines the blur with diameter c. With color filter sampling, the PSF is
given as

P(x, y) = \sum_{\lambda} E(x, y, \lambda)\, S(\lambda) ,    (5.15)

P(x, y) = \sum_{\lambda} S(\lambda)\, R(\lambda)\, \mathrm{circ}\!\left(\frac{r}{c(\lambda)}\right) ,    (5.16)
where S is the spectral response of the color filters. Hence, the PSF is a
summation of concentric discs weighted with the object and color filter spectra.
For depth estimation with chromatic aberrations, the change in the PSF must be
independent of the object reflectance spectrum. This is only possible if the
object has an equal reflectance for each wavelength covered by a color filter's
spectral response. In reality this may only be valid for a uniform broadband
object reflectance spectrum. On the other hand, if an object has a narrowband
reflectance spectrum, the PSF varies according to wavelength, which results in an
error in the depth estimation.
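The following NumPy sketch illustrates equation 5.16 for a sampled spectrum. The linear growth of the blur diameter with wavelength, the Gaussian-shaped filter response and all numbers are illustrative assumptions and do not describe the prototype lens.

    import numpy as np

    def disc_psf(size, diameter_px):
        """Uniform disc ('circ') kernel with the given diameter in pixels, normalized to unit sum."""
        y, x = np.mgrid[:size, :size] - (size - 1) / 2.0
        r = np.hypot(x, y)
        psf = (r <= max(diameter_px, 1e-6) / 2.0).astype(float)
        return psf / psf.sum()

    def color_psf(wavelengths, reflectance, filter_response, blur_fn, size=33):
        """Equation 5.16: sum of concentric discs weighted by the object reflectance
        R(lambda) and the color filter response S(lambda)."""
        psf = np.zeros((size, size))
        for lam, R, S in zip(wavelengths, reflectance, filter_response):
            psf += S * R * disc_psf(size, blur_fn(lam))
        return psf / psf.sum()

    def blur_fn(lam):
        """Assumed linear growth of the blur diameter (in pixels) with wavelength."""
        return 2.0 + 20.0 * (lam - 400e-9) / 300e-9

    # Illustrative example: flat (broadband) object reflectance and an assumed
    # Gaussian-shaped red filter response over the visible range
    wavelengths = np.linspace(400e-9, 700e-9, 31)
    reflectance = np.ones_like(wavelengths)
    red_filter = np.exp(-0.5 * ((wavelengths - 600e-9) / 40e-9) ** 2)
    print(color_psf(wavelengths, reflectance, red_filter, blur_fn).shape)  # (33, 33)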
The wavelength dependency of the depth estimation can be minimized by choosing a
very narrowband color filter spectrum. In this case the depth of any colored
object which has some amount of reflectance for the red, green and blue colors can
be estimated as accurately as with a depth from defocus method. Parkkinen et al.
[32] have shown that most natural scenes have broadband object reflectance
spectra. Therefore, the depth from CA method with a narrowband color spectral
response can give accurate depth estimation for most natural scenes.
For an RGB color sensor with a narrowband color filter response, there is a
tradeoff between accurate depth estimation and an accurate reproduction of the
colors of the visible spectrum. This tradeoff can be avoided through multispectral
imaging. Yasuma et al. [43] have proposed a generalized assorted pixel camera to
control the spectrum after the image is captured. This type of camera with a lens
exhibiting chromatic aberrations can be used to generate narrowband images of at
least two colors after capturing the image. These narrowband images could then be
used to estimate the depth without the impact of the spectral response of the
color filters.
5.3 Optimal Parameters for Extended DOF
The method of extended DOF discussed in the previous chapter restores the
intensity image to enhance the MTF and also corrects the chromatic aberrations
according to the sharpest color at each location. The amount of chromatic
aberrations must be introduced such that the far DOF limit of the blue image is
greater than or equal to the near DOF limit of the green image, and the near DOF
limit of the red image is less than or equal to the far DOF limit of the green
image. In this way, we make sure that the blur diameter is less than the circle of
confusion over the complete extended DOF. We formulate these conditions using the
depth of field formulas.
If the focus position d_o is much larger than the focal length f_o, then the far
and near DOF limits DOF_f and DOF_n are given as

\mathrm{DOF}_{f_o} \approx \frac{d_o f_o^2}{f_o^2 - c\, d_o F_\#} ,    (5.17)

\mathrm{DOF}_{n_o} \approx \frac{d_o f_o^2}{f_o^2 + c\, d_o F_\#} ,    (5.18)
where c is the maximum blur diameter that is still perceived as a point by the
human eye and the subscript o denotes the color. The circle of confusion c is
directly related to the sensor image format and can be given according to the
Zeiss formula as c = h/1500, where h is the diagonal height of the sensor.
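The following sketch simply transcribes equations 5.17 and 5.18, the Zeiss rule c = h/1500 and the hyperfocal distance of equation 5.19 below into Python; the parameter values (f = 4 mm, F# = 2.4, h = 6 mm) are the ones used in the example of the first use case further down.

    def circle_of_confusion(sensor_diagonal):
        """Zeiss formula: c = h / 1500."""
        return sensor_diagonal / 1500.0

    def dof_limits(d_o, f_o, f_number, c):
        """Far and near DOF limits for focus distance d_o (equations 5.17 and 5.18)."""
        far = d_o * f_o ** 2 / (f_o ** 2 - c * d_o * f_number)
        near = d_o * f_o ** 2 / (f_o ** 2 + c * d_o * f_number)
        return far, near

    def hyperfocal_distance(f, f_number, c):
        """Hyperfocal distance (equation 5.19): H = f^2 / (c * F#)."""
        return f ** 2 / (c * f_number)

    # Example values (units: mm): f = 4 mm, F# = 2.4, sensor diagonal h = 6 mm
    c = circle_of_confusion(6.0)          # 0.004 mm
    H = hyperfocal_distance(4.0, 2.4, c)  # about 1667 mm
    print(c, H, dof_limits(H / 2.0, 4.0, 2.4, c))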
We consider two use cases to compute the optimal amount of chromatic aberrations
for extended DOF. In the first case, we want to obtain a DOF from a certain
distance to infinity and have to find the optimum focus positions of each color
image. In the second case, the focus position of the lens for one color (usually
green) is given and we have to find the focus positions of the other two colors.
For the selection of the focus positions, we impose the constraint that the DOFs
of two adjacently focused color images overlap with each other.
First Use Case: If the red image is focused at the hyperfocal distance H, then
DOF_{f_r} is infinite and DOF_{n_r} is half of the hyperfocal distance. The
hyperfocal distance is given as

H \approx \frac{f^2}{c\, F_\#} .    (5.19)
Now taking DOF_{f_g} = DOF_{n_r} = H/2, we compute the focal length of the green
image using equation 3.2 as

\frac{1}{f_g} = \frac{c}{A\, d_i} + \frac{1}{d_i} + \frac{1}{\mathrm{DOF}_{f_g}} ,    (5.20)
where the sensor distance is computed according to the focus position of the red
image as

\frac{1}{d_i} = \frac{1}{f_r} - \frac{1}{H} .    (5.21)
The focus position d_g of the green image for the focal length f_g is computed
using the thin lens formula 3.1. Now the near DOF limit DOF_{n_g} of the green
image can be computed using equation 5.18. Similarly, the focal length f_b and the
focus position d_b of the blue color image can be computed using equations 3.1,
5.18 and 5.20. The chromatic focal shift (chromatic aberrations) is then given as
Δf = f_r − f_b. If a lens contains this amount of chromatic aberrations for the
given parameters, it provides an extended DOF ranging from DOF_{n_b} to infinity.
As an example, let us assume a lens with f = 4 mm, F# = 2.4 and a sensor of
diagonal height h = 6 mm. Following the above criteria and equations, we get
f_b = 3.962 mm, f_g = 3.981 mm and f_r = 4 mm, which results in Δf = 38 μm. The
depth of field of a lens without chromatic aberrations extends from
DOF_{n_r} = 83 cm to infinity (assuming the lens focuses at the hyperfocal
distance). With chromatic aberrations and the use of the algorithms presented in
chapter 4, the DOF extends from DOF_{n_b} = 33.3 cm to infinity.
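To make the first use case reproducible, the sketch below chains equations 5.19 to 5.21, the thin lens formula and equation 5.18. Taking the aperture diameter as A = f/F# (an assumption, since A is defined via equation 3.2 in chapter 3), it reproduces approximately the values quoted above, i.e. f_g ≈ 3.981 mm, f_b ≈ 3.962 mm and Δf ≈ 38 μm.

    def extended_dof_first_case(f_r=4.0, f_number=2.4, sensor_diag=6.0):
        """Sketch of the first use case (all lengths in mm): red focused at the
        hyperfocal distance, green and blue chosen so that adjacent DOFs just overlap."""
        c = sensor_diag / 1500.0               # Zeiss circle of confusion
        A = f_r / f_number                     # aperture diameter (assumed as f / F#)
        H = f_r ** 2 / (c * f_number)          # equation 5.19
        d_i = 1.0 / (1.0 / f_r - 1.0 / H)      # equation 5.21, sensor distance

        def near_dof(d_o, f_o):                # equation 5.18
            return d_o * f_o ** 2 / (f_o ** 2 + c * d_o * f_number)

        def next_focal_length(dof_far):        # pattern of equations 5.20 and 5.23
            return 1.0 / (c / (A * d_i) + 1.0 / d_i + 1.0 / dof_far)

        f_g = next_focal_length(H / 2.0)       # green far DOF limit = red near DOF limit
        d_g = 1.0 / (1.0 / f_g - 1.0 / d_i)    # thin lens formula
        f_b = next_focal_length(near_dof(d_g, f_g))  # blue far limit = green near limit
        return f_g, f_b, f_r - f_b

    f_g, f_b, delta_f = extended_dof_first_case()
    print(f"f_g = {f_g:.3f} mm, f_b = {f_b:.3f} mm, delta_f = {delta_f * 1e3:.0f} um")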
Second Use Case: Let us assume the focus position d_g of the green image is given.
We compute the near and far DOF limits DOF_{f_g} and DOF_{n_g} using equations
5.17 and 5.18. The focal lengths of the red and blue images can be computed using
equation 3.2 as
\frac{1}{f_r} = -\frac{c}{A\, d_i} + \frac{1}{d_i} + \frac{1}{\mathrm{DOF}_{f_g}} ,    (5.22)

\frac{1}{f_b} = \frac{c}{A\, d_i} + \frac{1}{d_i} + \frac{1}{\mathrm{DOF}_{n_g}} ,    (5.23)
where d_i is computed using the thin lens formula for d_g and the given focal
length f_g. The chromatic focal shift is then Δf = f_r − f_b.
Figure 5.3 shows the blur diameter versus object distance for the different
colors, which are focused at different positions due to chromatic aberrations. The
blur diameter is computed for the lens and sensor parameters used in the example
of the first use case. As the plot shows, the blur diameter of at least one color
image is less than the circle of confusion over the large distance range from
33.3 cm to infinity.
Figure 5.3: Blur diameter is less than the circle of confusion for the complete range of the depth of field (blur diameter [mm] of the blue, green and red images plotted against object distance [mm] on a log scale, with the circle of confusion marked).
5.4 Conclusion
In this chapter, the effect of optical and sensor parameters on depth and extended
DOF imaging is studied. The relationship between camera parameters and axial depth
resolution is derived to facilitate the selection of optimal lens and sensor design
parameters. It is found that the most critical parameter for finer depth resolution
is the focal length, especially for low cost imaging applications. Based on the
derived relationship, criteria are defined to choose optimal lens and sensor
parameters such as focal length, chromatic focal shift and sensor resolution. It is
shown that a narrowband spectral response of the color filter arrays is optimal for
accurate depth estimation.
Section 5.3 describes different ways of computing the optimal amount of chromatic
aberrations for enhanced DOF. By introducing this amount of chromatic aberrations,
one makes sure that the blur is imperceptible over the complete range of extended
DOF.
Chapter 6
Conclusion and Outlook
The work in this thesis has provided a thorough analysis of depth and extended DOF
imaging using axial chromatic aberrations. Besides this, a digital camera simulator
is presented to efficiently simulate the camera optics.
6.1 Summary
In chapter 2 a digital camera simulator is presented. The simulator models the
complete digital camera processing chain, i.e. optics, sensor and digital
processing. The main contribution of the simulator is in the optical simulation.
It allows a user to simulate conventional and unconventional optics with a correct
modeling of occlusions. The blur induced by the optics is generated for a sampled
3D space with commercially available lens design tools. Missing blur information
is then approximated at each pixel of the image through PCA based interpolation.
For 3D scenes, the true depth map is used to blur the image according to each
pixel's depth and location. It is shown that the two methods of filtering,
scattering and gathering, mentioned in the past literature, can be efficiently
implemented in the Fourier domain using the PCA based filtering. An efficient
algorithm for space variant filtering using PCA makes the simulation time
substantially smaller. Although the aim of the simulator is to simulate cameras,
the low cost method of space variant filtering can also be used to add the depth
of field effect in real time to computer generated scenes, which is very useful
and in demand in gaming applications. The sensor simulation includes the noise
addition, sampling and color filter array effects.
The digital part implements the traditional camera post processing steps.
Chapter 3 analyzes a method similar to depth from defocus for depth imaging.
Instead of capturing multiple defocused images, axial chromatic aberrations are
used to capture the defocus information in multiple colors with a single shot.
The previous related literature has only shown the feasibility of this approach
for limited experimental setups. Therefore, a thorough analysis is performed to
develop a method which works well for different imaging conditions. The existing
focus/blur measures were evaluated, and it is shown that these measures are
contrast dependent, which makes them infeasible for a depth from CA system. A new
blur measure based on normalized gradients is proposed which is independent of the
local image contrast. Absolute depth is inferred from the blur measures by taking
normalized ratios and applying a calibration procedure. The depth error analysis
has shown that there are some common challenges in DFD and depth from CA systems,
for example low texture areas, sensor noise and field varying blur. Since the blur
varies across the field of view due to lens aberrations, astigmatism and
manufacturing tolerances, a simple calibration procedure is proposed to correct
the field varying depth. The test image used for the calibration setup contains
sagittal and tangential edge orientations at different field positions. The
analysis shows that the behavior of the depth change is similar for different
lenses. Therefore, once this behavior is measured or obtained from the lens design
data, a single image is sufficient to determine the field varying depth error and
its correction.
The main advantage of the depth from CA method over DFD is that it is a single
shot depth imaging system, which helps in avoiding misregistration problems and
also varying image magnifications, because the lateral color shift can be reduced
during the lens design process. The major challenge for the depth from CA method
is a narrowband object reflectance spectrum. In this case, either the color
information is missing, or the blur differs from the broadband spectrum case.
Since the depth from defocus method only measures the depth for textured regions,
low cost implementations of two existing methods are used to create dense depth
maps. One method is based on image segmentation and filling of each segment with
the median depth value of this segment. Another solution uses an optimization
method to propagate the depth to neighboring regions of the same intensity. As the
optimization problem is computationally complex, it is proposed to propagate the
depth at very low resolution followed by joint bilateral upsampling.
In chapter 4, the method of obtaining an extended depth of field image using
chromatic aberrations is described. Since some work has already been done on such
methods, solutions to some existing challenges are presented. One of the existing
methods [8] only increases the MTF through deconvolution and does not correct the
chromatic aberrations. The other method [14] corrects the chromatic aberration
according to the sharpest color, but the in-focus positions still remain blurred,
as shown in section 4.4. In this work, both methods are combined using an RGBW
sensor. As the correction of color aberrations requires relative sharpness
information, the method developed in chapter 3 provides better relative sharpness
information than the existing methods. A low cost and efficient implementation of
the sharpness transport algorithm is also presented in section 4.5. The method
developed in chapter 2 for space variant filtering is utilized here to extract the
sharpness information at each pixel of the sharpest color according to the
continuously varying relative sharpness between the sharpest and blurred colors.
In chapter 5, the effect of optical and sensor parameters on depth and extended
DOF imaging is studied. The relationship between camera parameters and axial depth
resolution is derived to facilitate the selection of optimal lens and sensor
design parameters. It is found that the most critical parameter for finer depth
resolution is the focal length, especially for low cost imaging applications.
Based on the derived relationship, criteria are defined to choose optimal lens and
sensor parameters such as focal length, chromatic focal shift and sensor
resolution. It is shown that a narrowband spectral response of the color filter
arrays is optimal for accurate depth estimation and sharper extended DOF images.
The derived formulas provide the optimum amount of chromatic aberrations for
extended DOF. For the complete distance range of the extended DOF, the blur
diameter is less than the circle of confusion for at least one of the colors.
6.2 Conclusion
The thesis has thoroughly studied a computational imaging method to compute depth
and extend the DOF using chromatic aberrations. Besides this, a complete simulator
is presented to simulate the camera system.
The work has provided an efficient solution to spatially varying optical
simulation with accurate modeling of occlusions. The simulator is very useful for
simulating the optics of computational cameras, which behave differently at
occlusion boundaries compared to traditional optics. Moreover, the optical
simulation is useful for simulating the depth of field effect in real time for
computer generated scenes, e.g. in gaming applications.
The proposed blur measure and its normalized ratios given in chapter 3 provide
depth estimation for natural scenes. The analysis of the depth errors shows the
strong dependency of depth from CA on the object reflectance spectrum. However, it
is shown in chapter 5 that a narrowband color filter response can reduce this
dependency to a negligible amount. The proposed depth estimation method could be a
cheap solution for a human machine interface, as it requires only one shot
(camera) and does not suffer from misregistration and magnification problems.
The extended DOF method using chromatic aberrations and an RGBW sensor, described
in chapter 4, gives sharper images without significant color artifacts. The use of
panchromatic pixels provides low noise images, which results in a low noise gain
in the image restoration. At the same time, the low cost color correction method
in combination with the proposed depth estimation method (chapter 3) gives high
quality images.
The derived formulas and criteria in chapter 5 help in selecting optimal camera
parameters for depth and extended DOF.
6.3 Outlook
The thorough analysis in this work has provided some solutions to the existing
methods. However, there are still some possible extensions to the work to make it
more robust and practical in different imaging conditions.
The optical simulator uses multiple image planes to simulate the occlusions.
Although this is sufficient to simulate a simple scene with foreground and
background objects for analyzing the occlusion problem, it would be beneficial to
develop a simpler approach to extend this method to more complex scenes.
The blur measure proposed for depth estimation is exact for well defined edges.
For textured regions, it fails to provide an accurate measure; therefore, a robust
blur measure for different types of scene content could substantially improve the
depth results and make the method practical for many applications. Such a texture
invariant blur measure could also be useful for the extended DOF algorithm to
completely remove the color bleeding artifacts which appear due to chromatic
aberrations.
Bibliography
[1] European Machine Vision Association et al.: EMVA Standard 1288, Standard for Characterization of Image Sensors and Cameras. 2010
[2] Bae, S. ; Durand, F.: Defocus magnification. In: Computer Graphics Forum Vol. 26 Wiley Online Library, 2007, pp. 571–579
[3] Bando, Y. ; Chen, B.Y. ; Nishita, T.: Extracting depth and matte using a color-filtered aperture. In: ACM Transactions on Graphics (TOG) Vol. 27 ACM, 2008, pp. 134
[4] Blayvas, I. ; Kimmel, R. ; Rivlin, E.: Role of optics in the accuracy of depth-from-defocus systems. In: JOSA A 24 (2007), No. 4, pp. 967–972
[5] Chakrabarti, A. ; Zickler, T.: Depth and deblurring from a spectrally-varying depth-of-field. In: Proc. ECCV, 2012
[6] Chen, J. ; Venkataraman, K. ; Bakin, D. ; Rodricks, B. ; Gravelle, R. ; Rao, P. ; Ni, Y.: Digital camera imaging system simulation. In: Electron Devices, IEEE Transactions on 56 (2009), No. 11, pp. 2496–2505
[7] Chung, S.W. ; Kim, B.K. ; Song, W.J.: Detecting and eliminating chromatic aberration in digital images. In: Image Processing (ICIP), 2009 16th IEEE International Conference on IEEE, 2009, pp. 3905–3908
[8] Cossairt, O. ; Nayar, S.: Spectral focal sweep: Extended depth of field from chromatic aberrations. In: Computational Photography (ICCP), 2010 IEEE International Conference on IEEE, 2010, pp. 1–8
[9] Dowski, E.R. ; Cathey, W.T.: Extended depth of field through wavefront coding. In: Applied Optics 34 (1995), No. 11, pp. 1859–1866
[10] Eisemann, E. ; Durand, F.: Flash photography enhancement via intrinsic relighting. In: ACM Transactions on Graphics (TOG) Vol. 23 ACM, 2004, pp. 673–678
[11] Farrell, J.E. ; Xiao, F. ; Catrysse, P. ; Wandell, B.A.: A simulation tool for evaluating digital camera image quality. In: Proceedings of the SPIE Electronic Imaging Conference Vol. 5294, 2003, pp. 124
[12] Garcia, J. ; Sánchez, J.M. ; Orriols, X. ; Binefa, X.: Chromatic aberration and depth extraction. In: Pattern Recognition, 2000. Proceedings. 15th International Conference on Vol. 1 IEEE, 2000, pp. 762–765
[13] Geusebroek, J.M. ; Cornelissen, F. ; Smeulders, A.W.M. ; Geerts, H.: Robust autofocusing in microscopy. In: Cytometry 39 (2000), No. 1, pp. 1–9
[14] Guichard, F. ; Nguyen, H.P. ; Tessières, R. ; Pyanet, M. ; Tarchouna, I. ; Cao, F.: Extended depth-of-field using sharpness transport across color channels. In: Proc. SPIE Vol. 7250, 2009, pp. 72500N
[15] Häusler, G.: A method to increase the depth of focus by two step image processing. In: Optics Communications 6 (1972), No. 1, pp. 38–42
[16] Jaehne, B.: Digital Image Processing. Springer, 2005
[17] Kay, A. ; Mather, J. ; Walton, H.: Extended depth of field by colored apodization. In: Optics Letters 36 (2011), No. 23, pp. 4614–4616
[18] Kim, S. ; Lee, E. ; Hayes, M.H. ; Paik, J.: Multifocusing and Depth Estimation Using a Color Shift Model-Based Computational Camera. In: Image Processing, IEEE Transactions on 21 (2012), No. 9, pp. 4152–4166
[19] Konevsky, O.: Method of Correction of Longitudinal Chromatic Aberrations. In: Graphicon (2008)
[20] Kopf, J. ; Cohen, M.F. ; Lischinski, D. ; Uyttendaele, M.: Joint bilateral upsampling. In: ACM Transactions on Graphics 26 (2007), No. 3, pp. 96
[21] Kosloff, T.J. ; Tao, M.W. ; Barsky, B.A.: Depth of field postprocessing for layered scenes using constant-time rectangle spreading. In: Proceedings of Graphics Interface 2009 Canadian Information Processing Society, 2009, pp. 39–46
[22] Krotkov, E.: Focusing. In: International Journal of Computer Vision 1 (1988), No. 3, pp. 223–237
[23] Kumar, M. ; Morales, E.O. ; Adams, J.E. ; Hao, W.: New digital camera sensor architecture for low light imaging. In: Image Processing (ICIP), 2009 16th IEEE International Conference on IEEE, 2009, pp. 2681–2684
[24] Levin, A. ; Lischinski, D. ; Weiss, Y.: Colorization using optimization. In: ACM Transactions on Graphics (TOG) Vol. 23 ACM, 2004, pp. 689–694
[25] Lim, J. ; Kang, J. ; Ok, H.: Robust local restoration of space-variant blur image. In: Electronic Imaging 2008 International Society for Optics and Photonics, 2008, pp. 68170S–68170S
[26] Maeda, P. ; Catrysse, P.B. ; Wandell, B.A.: Integrating lens design with digital camera simulation. In: Proceedings SPIE Electronic Imaging, San Jose, CA 5678 (2005), pp. 48–58
[27] Molesini, G. ; Pedrini, G. ; Poggi, P. ; Quercioli, F.: Focus-wavelength encoded optical profilometer. In: Optics Communications 49 (1984), No. 4, pp. 229–233
[28] Nakamura, J.: Image sensors and signal processing for digital still cameras. CRC, 2006
[29] Nayar, S.K. ; Nakagawa, Y.: Shape from focus: An effective approach for rough surfaces. In: Robotics and Automation, 1990. Proceedings., 1990 IEEE International Conference on IEEE, 1990, pp. 218–225
[30] Ng, R. ; Levoy, M. ; Brédif, M. ; Duval, G. ; Horowitz, M. ; Hanrahan, P.: Light field photography with a hand-held plenoptic camera. In: Computer Science Technical Report CSTR 2 (2005)
[31] Ojeda-Castañeda, J. ; Ramos, R. ; Noyola-Isgleas, A.: High focal depth by apodization and digital restoration. In: Applied Optics 27 (1988), No. 12, pp. 2583–2586
[32] Parkkinen, J.P.S. ; Hallikainen, J. ; Jaaskelainen, T.: Characteristic spectra of Munsell colors. In: JOSA A 6 (1989), No. 2, pp. 318–322
[33] Schechner, Yoav Y. ; Kiryati, Nahum: The optimal axial interval in estimating depth from defocus. In: Computer Vision, 1999. The Proceedings of the Seventh IEEE International Conference on Vol. 2 IEEE, 1999, pp. 843–848
[34] Shapiro, L. ; Stockman, G.C.: Computer Vision. 2001
[35] Shen, C.H. ; Chen, H.H.: Robust focus measure for low-contrast images. In: Consumer Electronics, 2006. ICCE'06. 2006 Digest of Technical Papers. International Conference on IEEE, 2006, pp. 69–70
[36] Sherif, S.S. ; Dowski, E.R. ; Cathey, W.T.: Effect of detector noise in incoherent hybrid imaging systems. In: Optics Letters 30 (2005), No. 19, pp. 2566–2568
[37] Smith, L.I.: A tutorial on principal components analysis. In: Cornell University, USA 51 (2002), pp. 52
[38] Subbarao, Murali ; Tyan, Jenn-Kwei: Noise sensitivity analysis of depth-from-defocus by a spatial-domain approach. In: Proc. SPIE Vol. 3174 Citeseer, 1994, pp. 174–187
[39] Tiziani, Hans J. ; Uhde, H-M: Three-dimensional image sensing by chromatic confocal microscopy. In: Applied Optics 33 (1994), No. 10, pp. 1838–1843
[40] Wang, S. ; Hou, T. ; Border, J. ; Qin, H. ; Miller, R.: High-quality image deblurring with panchromatic pixels. In: ACM Transactions on Graphics (TOG) 31 (2012), No. 5, pp. 120
[41] Watanabe, M. ; Nayar, S.K.: Telecentric optics for focus analysis. In: Pattern Analysis and Machine Intelligence, IEEE Transactions on 19 (1997), No. 12, pp. 1360–1365
[42] Watanabe, M. ; Nayar, S.K.: Rational filters for passive depth from defocus. In: International Journal of Computer Vision 27 (1998), No. 3, pp. 203–225
[43] Yasuma, Fumihito ; Mitsunaga, Tomoo ; Iso, Daisuke ; Nayar, Shree K.: Generalized assorted pixel camera: postcapture control of resolution, dynamic range, and spectrum. In: Image Processing, IEEE Transactions on 19 (2010), No. 9, pp. 2241–2253
[44] Zalevsky, Z. ; Ben-Yaish, S.: Extended depth of focus imaging with birefringent plate. In: Optics Express 15 (2007), No. 12, pp. 7202–7210
[45] Zhou, C. ; Cossairt, O. ; Nayar, S.: Depth from diffusion. In: Computer Vision and Pattern Recognition (CVPR), 2010 IEEE Conference on IEEE, 2010, pp. 1110–1117
List of Publications
[1] Atif, Muhammad ; Jähne, Bernd: A Space-Variant (3D) Image Simulation Tool for Computational Cameras. In: International Conference on Computational Photography (ICCP), 2010. – Poster
[2] Atif, Muhammad ; Jähne, Bernd: Optimal Depth Estimation from a Single Image by Computational Imaging using Chromatic Aberrations. In: Puente León, Fernando (Ed.) ; Heizmann, Michael (Ed.): Forum Bildverarbeitung. Regensburg : KIT Scientific Publishing, November 2012, pp. 23–34
[3] Atif, Muhammad ; Jähne, Bernd: Optimal Depth Estimation from a Single Image by Computational Imaging using Chromatic Aberrations. In: tm - Technisches Messen (2013). – To be published
[4] Atif, Muhammad ; Siddiqui, Muhammad: Method and Optical System for Determining a Depth Map of an Image. April 18, 2012. – European Patent Application EP12002701
[5] Atif, Muhammad ; Siddiqui, Muhammad ; Unruh, Christian ; Kamm, Markus: Infrared Imaging System, and Method of Operating. November 8, 2012. – US Patent 20,120,281,081
[6] Atif, Muhammad ; Siddiqui, Muhammad ; Unruh, Christian ; Kamm, Markus et al.: Image System Using a Lens Unit With Longitudinal Chromatic Aberrations and Method of Operating. July 20, 2012. – WO Patent 2,012,095,322