Ultra-thin Multiple-channel LWIR Imaging Systems

M. Shankar(a), R. Willett(a), N. P. Pitsianis(a), R. Te Kolste(b), C. Chen(c), R. Gibbons(d), and D. J. Brady(a)

(a) Fitzpatrick Institute for Photonics, Duke University, Durham, NC 27708
(b) Digital Optics Corporation, Charlotte, NC 28262
(c) University of Delaware, Newark, DE 19716
(d) Raytheon Company, McKinney, TX 75071

ABSTRACT

Infrared camera systems may be made dramatically smaller by simultaneously collecting several low-resolution images with multiple narrow-aperture lenses rather than collecting a single high-resolution image with one wide-aperture lens. Conventional imaging systems consist of one or more optical elements that image a scene on the focal plane. The resolution depends on the wavelength of operation and the f-number of the lens system, assuming diffraction-limited operation. An image of comparable resolution may be obtained by using a multi-channel camera that collects multiple low-resolution measurements of the scene and then reconstructs a high-resolution image. The proposed infrared sensing system uses a three-by-three lenslet array with an effective focal length of 1.9mm and an overall system length of 2.3mm, and we achieve image resolution comparable to a conventional single-lens system having a focal length of 5.7mm and an overall system length of 26mm. The high-resolution final image generated by this system is reconstructed from the noisy low-resolution images corresponding to each lenslet; this is accomplished using a computational process known as superresolution reconstruction. The novelty of our approach to the superresolution problem is the use of wavelets and related multiresolution methods within an Expectation-Maximization framework to improve the accuracy and visual quality of the reconstructed image. The wavelet-based regularization reduces the appearance of artifacts while preserving key features such as edges and singularities.
The processing method is very fast, making the integrated sensing and processing viable for both time-sensitive applications and massive collections of sensor outputs.

1. INTRODUCTION

Practical generalized sampling strategies require balancing mathematical models against physically achievable projections. Physical constraints are as diverse as geometric restrictions on signal interconnections in two and three dimensions, thermodynamic implications of signal multiplexing, noise models, and optical aberrations. Physical design must ideally also account for natural structure in signals. As a result of fundamental coherence and spectral properties, optical signals have particularly complex native structure. Optical signals from diverse objects may produce essentially identical distributions over many projections while still producing sharply different images or interferograms. Since the end result is measurements that depend jointly on multiple signal components, the process of encoding optical sensors for generalized sampling can be termed multiplexing. The DISP group at Duke is among the world leaders in multiplex optical sensor design. The toolbox for multiplexing includes optical preprocessing and electronic sensor array processing. It is important to understand and apply both of these components in generalized sampling system design. DISP has explored compressive sensors and inference from multiplexed data in the context of the DARPA MONTAGE program. As an example, we have recently shown that infrared camera systems can be made dramatically smaller by simultaneously collecting several low-resolution images with multiple narrow-aperture lenses rather than collecting a single high-resolution image with one wide-aperture lens. This concept, based on ideas originally proposed in [1], is referred to as a TOMBO (Thin Observation Module by Bound Optics) system.
Our infrared camera uses a 3 × 3 lenslet array having an effective focal length of 1.9mm and an overall system length of 2.3mm, and we are able to achieve image resolution comparable to a conventional single-lens system having an effective focal length of 5.7mm and an overall system length of 26mm. This represents a reduction in overall length by a factor of 11.3. However, the image dynamic range and linearity of our restoration are reduced from the original; this is an area of active research. Conventional imaging systems consist of one or more optical elements that form an image of the scene on the focal plane. The linear resolution of the optical system (in the image plane) depends on the wavelength and f-number, assuming diffraction-limited operation. The angular resolution (in the object plane) depends on the wavelength and pupil diameter. Accordingly, a simple reduction in the effective focal length (while maintaining f-number) would result in a shorter optical system, but with the disadvantage of poorer angular resolution. However, the fact that the linear resolution in the image plane remains unchanged opens the opportunity to recover the lost resolution through the collection of multiple image samples. A considerable amount of work has been done on miniaturizing imaging systems following this approach, often by mimicking the small imaging systems found in nature, for example, the compound eyes of insects [2, 3, 4, 5]. The first implementation of a multi-aperture system using microlens arrays was by Tanida et al. [1]. A high-resolution image is reconstructed from the multiple low-resolution images using back projection. The reconstructed image is not of superior quality, but the system demonstrates that a reduction in the thickness of the optical system can be obtained using this concept.
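The scaling argument above can be summarized with the standard Rayleigh diffraction-limit relations (a sketch assuming a circular aperture of diameter D, focal length f, and f-number F# = f/D; the factor 1.22 is the Rayleigh criterion):

```latex
% linear resolution in the image plane: set by wavelength and f-number only
\Delta x \;\approx\; 1.22\,\lambda\,F^{\#}
% angular resolution in the object plane: set by wavelength and pupil diameter
\Delta\theta \;\approx\; 1.22\,\frac{\lambda}{D}
```

Shrinking f at fixed F# shrinks D, leaving Δx unchanged but degrading Δθ, which is exactly the loss that the collection of multiple image samples is designed to recover.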
Subsequently, efforts were focused upon improving the reconstructed image quality either through improvements in optical design or image processing techniques [6, 7, 8]. A reduction in the form factor of infrared cameras can be obtained using a similar concept. Using a microlens array with short focal lengths and f-numbers similar to those of a conventional infrared camera, the multiple images that are obtained can be used to reconstruct a higher-resolution image. We design and develop a camera similar to the TOMBO system that is suitable for use in the 8-12µm wavelength range. We also develop the algorithm to reconstruct a high-resolution image from the multiple low-resolution images. We then compare the performance of the camera with that of a conventional infrared camera.

2. SYSTEM DESCRIPTION

The optical system for each channel in our camera consists of a single convex microlens placed at a focal distance away from the silicon window of the detector. We did not incorporate a separation layer between adjacent channels, and the cross-talk that results is accounted for during processing of the data. The optical system for each channel in our infrared camera system is illustrated in Figure 1. A 3x3 convex microlens array replaces the conventional imaging optical system. The lens array is etched on a 4.5mm square silicon wafer that is 150µm thick. Each of the lenslets has an aperture size of 1.3mm, the lens has an Effective Focal Length (EFL) of 1.9mm, and the overall system length is 2.3mm (Figure 2). The uncooled microbolometer detector array used by our camera is obtained from the Thermal Eye 3500AS commercial infrared camera from L-3 Electronics (Figure 3(a)). The size of the detector is 120x160 pixels with a pitch of 30µm. The accompanying imaging lens is replaced with the microlens array.
A bandpass filter on the silicon window of the wafer-sealed detector package serves, along with the tuned cavity absorber on the surface of the detector, to limit the optical response of the detector to the 8-12µm spectral range. The lens array is precisely positioned so as to place the detector array on the focal plane of the lens array. The position of the lens is kept fixed by designing and building an enclosure that holds the lens array as well as the detector array at the required distance from the optics. The complete camera system is shown in Figure 3(b). The support structure as well as the camera enclosure is built using a rapid prototyping machine. The camera image can be viewed on a TV monitor and can be captured and saved on a computer using an appropriate USB interface. A raw image obtained from the conventional camera system is shown in Figure 3(c), while a raw image obtained from the TOMBO system is displayed in Figure 3(d). Nine low-resolution images are obtained, one corresponding to each of the microlenses. Application of the reconstruction algorithm yields a higher resolution image, as discussed next.

Figure 1. Optical system for the multi-aperture infrared camera.
Figure 2. 3x3 convex microlens array used in our camera.
Figure 3. Camera systems and raw data. (a) Commercial IR camera obtained from Raytheon Company. (b) IR camera with the microlens array. (c) Image obtained from the commercial camera. (d) Unprocessed image obtained from the TOMBO IR camera.

3. IMAGE RECONSTRUCTION

High-resolution images can be reconstructed from the TOMBO data, which consist of several blurred and noisy low-resolution images, using a computational process known as superresolution reconstruction. Superresolution image reconstruction refers to the process of reconstructing a new image with a higher resolution from this collection of low-resolution, shifted, and often noisy observations.
This allows users to see image detail and structures which are difficult, if not impossible, to detect in the raw data. The process is closely related to image deconvolution, except that the low-resolution images are not registered and their relative translations must be estimated in the process. In our processing, we use wavelets and related multiresolution methods within an expectation-maximization reconstruction process to improve the accuracy and visual quality of the reconstructed image. Simulations demonstrate the effectiveness of the proposed method, including its ability to distinguish between tightly grouped point sources using a small set of low-resolution observations. Superresolution is a useful technique in a variety of applications [7, 9], and recently researchers have begun to investigate the use of wavelets for superresolution image reconstruction [8]. We present here a method for superresolution image reconstruction based on the wavelet transform in the presence of Gaussian noise. An analogous multiscale approach in the presence of Poisson noise is described in [10]. Our experiments reveal that the noise associated with the system described in this paper is neither Gaussian nor Poisson, but the method based on the Gaussian assumption results in images with higher visual quality. The EM algorithm proposed here extends the work of [11], which addressed image deconvolution with a method that combines the efficient image representation offered by the discrete wavelet transform (DWT) with the diagonalization of the convolution operator obtained in the Fourier domain. The algorithm alternates between an E-step based on the fast Fourier transform (FFT) and a DWT-based M-step, resulting in an efficient iterative process requiring O(N log N) operations per iteration, where N is the number of pixels in the superresolution image.

3.1. Problem formulation

In the proposed method, each low-resolution observed image x_k, for k = 1, 2, . . .
, 9, is modeled as a shifted, blurred, downsampled, and noisy version of the superresolution image f. The shift is caused by the relative locations of the nine lenslets, and the blur is caused by the point spread function (PSF) of the instrument optics and the integration performed by the electronic focal plane array. The downsampling (subsampling) operator models the change in resolution between the observations and the desired superresolution image. If the noise can be modeled as additive white Gaussian noise, then we have the observation model

x_k = D B S_k f + n_k ,   k = 1, 2, . . . , 9,

where D is the downsampling operator, B is the blurring operator, S_k is the shift operator for the k-th observation, and n_k is additive white Gaussian noise with variance σ^2. By collecting the series of observations into one array x, the noise observations into another array n, and letting H be a “stacked” matrix composed of the nine matrices D B S_k for k = 1, . . . , 9, we have the model

x = H f + n.   (1)

From this formulation, it is clear that superresolution image reconstruction is a type of inverse problem in which the operator to be inverted, H, is partially unknown due to the unknown shifts of the observations. The first step of our approach is to estimate these shift parameters by registering the low-resolution observations to one another. Using these estimates, we reconstruct an initial superresolution image estimate f̂ in the second step. This estimate is used in the third step, where we re-estimate the shift parameters by registering each of the low-resolution observations to the initial superresolution estimate. Finally, we use a wavelet-based EM algorithm to solve for f̂ using the registration parameter estimates. Each of these steps is detailed below.

3.2. Registration of the observations

The first step in the proposed method is to register the observed low-resolution images to one another.
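For concreteness, the observation model can be sketched in NumPy. This is a minimal sketch, not the system code: integer-pixel circular shifts stand in for the true subpixel lenslet shifts, the blur is taken to be a 3 × 3 uniform kernel applied as a circular convolution, and all function names (shift, blur3x3, downsample, observe) are hypothetical.

```python
import numpy as np

def shift(f, sm, sn):
    # S_k: integer-pixel circular shift (a simplification; the actual
    # lenslet shifts are subpixel)
    return np.roll(np.roll(f, sm, axis=0), sn, axis=1)

def blur3x3(f):
    # B: 3x3 uniform blur applied as a circular convolution via the 2D FFT
    h = np.zeros_like(f)
    h[:3, :3] = 1.0 / 9.0
    h = np.roll(h, (-1, -1), axis=(0, 1))  # center the kernel at the origin
    return np.real(np.fft.ifft2(np.fft.fft2(f) * np.fft.fft2(h)))

def downsample(f, L=3):
    # D: keep every L-th pixel in each dimension
    return f[::L, ::L]

def observe(f, shifts, sigma=0.01, seed=0):
    # x_k = D B S_k f + n_k for each lenslet shift (s_m, s_n)
    rng = np.random.default_rng(seed)
    obs = []
    for sm, sn in shifts:
        y = downsample(blur3x3(shift(f, sm, sn)))
        obs.append(y + sigma * rng.standard_normal(y.shape))
    return obs
```

Applying observe to a 129 × 129 image with nine shifts yields nine 43 × 43 observations, mirroring the simulation setup of Section 3.4.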
Assuming small shifts and that each sampled image has the same resolution, Irani and Peleg propose a method based on a Taylor series expansion [12]. In particular, let f_1 and f_2 be the continuous images underlying the sampled images x_1 and x_2, respectively. If f_2 is equal to a shifted version of f_1, then we have the relation

f_2(t_m , t_n) = f_1(t_m + s_m , t_n + s_n),

where (t_m , t_n) denotes a location in the image domain and (s_m , s_n) is the shift. A first-order Taylor series approximation of f_2 is then

f̂_2(t_m , t_n) = f_1(t_m , t_n) + s_m ∂f_1/∂t_m + s_n ∂f_1/∂t_n .

Let x̂_2 be a sampled version of f̂_2; then x_1 and x_2 can be registered by finding the s_m and s_n which minimize ||x_2 − x̂_2||_2^2, where ||a||_2^2 = Σ_i a_i^2. This minimization is calculated with an iterative procedure which ensures that the motion being estimated at each iteration is small enough for the Taylor series approximation to be accurate. (Note that this method can be modified for the case where the low-resolution images are also rotated with respect to one another [10]; however, this is not a necessary consideration with the TOMBO imager.) After the registration parameters have been initially estimated using the above method, we use these estimates to calculate an initial superresolution image as f̂^(0) = H^T x. This initial image estimate is then used to refine the registration parameter estimates. The registration method is the same as above, but instead of registering a low-resolution observation x_2 to another low-resolution observation x_1, we instead register it to D B S_1 f̂^(0). The Taylor series based approach can produce highly accurate results.

3.3. Multiscale expectation-maximization

Reconstruction is facilitated within the expectation-maximization (EM) framework through the introduction of a particular “unobservable” or “missing” data space.
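A single pass of this Taylor-series registration reduces to a two-variable linear least-squares problem. The sketch below is illustrative only: estimate_shift is a hypothetical name, np.gradient stands in for the partial derivatives of the underlying image, and the iteration and warping that [12] uses to keep each estimated motion small are omitted.

```python
import numpy as np

def estimate_shift(x1, x2):
    # One pass of the Taylor-series registration: solve
    #   min over (s_m, s_n) of || x2 - x1 - s_m * df/dt_m - s_n * df/dt_n ||^2
    gm, gn = np.gradient(x1)                       # discrete partials of x1
    A = np.stack([gm.ravel(), gn.ravel()], axis=1) # linearized model
    d = (x2 - x1).ravel()
    (sm, sn), *_ = np.linalg.lstsq(A, d, rcond=None)
    return sm, sn
```

A full implementation would alternate this solve with warping x2 by the current estimate until convergence, as in [12].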
The key idea in the EM algorithm is that the indirect (inverse) problem can be broken into two subproblems: one which involves computing the expectation of the unobservable data (as though no blurring or downsampling took place) and one which entails estimating the underlying image from this expectation. By carefully defining the unobservable data for the superresolution problem, we derive an EM algorithm which consists of linear filtering in the E-step and image denoising in the M-step; this idea is described in detail in [11]. The Gaussian observation model in (1) can be written with respect to the discrete wavelet transform (DWT) coefficients θ, where f = W θ and W denotes the inverse DWT operator [13]:

x = H W θ + n.

Clearly, if we had x = W θ + n = f + n (i.e., if no subsampling or blurring had occurred), we would have a pure image denoising problem with white noise, for which wavelet-based denoising techniques are very fast and nearly optimal [13]. Next note that the noise in the observation model can be decomposed into two different Gaussian noises (one of which is non-white): n = α H n_1 + n_2, where α is a positive parameter, and n_1 and n_2 are independent zero-mean Gaussian noises with covariances Σ_1 = I and Σ_2 = σ^2 I − α^2 H H^T, respectively. Using n_1 and n_2, we can rewrite the Gaussian observation model as

x = H (W θ + α n_1) + n_2 ,   with z = W θ + α n_1 .

This observation is the key to our approach since it suggests treating z as missing data and using the EM algorithm to estimate θ. From these formulations of the problem, the EM algorithm produces a sequence of estimates {f^(t), t = 1, 2, . . .} by alternately applying two steps:

E-Step: Updates the estimate of the missing data using the relation

ẑ^(t) = E[ z | x, θ̂^(t) ].

In the case of Gaussian noise, this reduces to a Landweber iteration [14]:

ẑ^(t) = f̂^(t) + (α^2 / σ^2) H^T ( x − H f̂^(t) ).

Here, computing ẑ^(t) simply involves applications of the operator H.
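The Landweber-form E-step above is a single line once H and its adjoint are available as functions; a sketch, with e_step a hypothetical name and the operators passed in as callables:

```python
import numpy as np

def e_step(f_hat, x, H, Ht, alpha2, sigma2):
    # Landweber update: z = f + (alpha^2 / sigma^2) * H^T (x - H f)
    # H and Ht are callables implementing the forward operator and its adjoint.
    return f_hat + (alpha2 / sigma2) * Ht(x - H(f_hat))
```

With H built from the shift/blur/downsample chain, each call costs a few FFTs plus array indexing, consistent with the O(N log N) per-iteration count quoted above.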
Recall that H consists of shifting and blurring (which can be computed rapidly with the 2D FFT) and downsampling (which can be computed rapidly in the spatial domain). Thus the complexity of each E-Step is O(N log N).

M-Step: Updates the estimate of the superresolution image f. This constitutes updating the wavelet coefficient vector θ according to

θ̂^(t+1) = arg min_θ { ||W θ − ẑ^(t)||_2^2 / (2 α^2) + pen(θ) }

and setting f̂^(t+1) = W θ̂^(t+1). This optimization can be performed using any wavelet-based denoising procedure. For example, under an i.i.d. Laplacian prior, pen(θ) = − log p(θ) ∝ τ ||θ||_1 (where ||θ||_1 = Σ_i |θ_i| denotes the l_1 norm), and θ̂^(t+1) is obtained by applying a soft-threshold function to the wavelet coefficients of ẑ^(t). For the reconstructions presented in this paper, we applied a similar denoising method described in [15], which requires O(N) operations. The proposed method has two key advantages: first, the E-step can be computed very efficiently in the Fourier domain, and second, the M-step is a denoising procedure, and the multiscale methods employed here are near-minimax optimal.

3.4. Simulation results

To demonstrate the practical effectiveness of the proposed algorithms in a controlled setting, we conduct two simulation experiments. First we study the effect of the proposed method on an image of a wireframe “resolution chart” collected with a conventional infrared camera, as displayed in Figure 4(a). Nine low-resolution 43 × 43 observation images are generated from the original 129 × 129 image, which is distorted by a 3 × 3 uniform blur and contaminated with additive white Gaussian noise. One such observation image, corresponding to the center lenslet, is displayed in Figure 4(b). The superresolution image in Figure 4(c) is reconstructed using the wavelet-based EM algorithm described above. The normalized mean squared error of this estimate is ||f − f̂||_2^2 / ||f||_2^2 = 0.1620.
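The soft-threshold M-step of Section 3.3 has a closed form for an orthonormal W: the minimizer under the Laplacian prior is soft(W^T ẑ, τα²). The sketch below uses a hand-rolled one-level orthonormal Haar DWT as a stand-in for W (the reconstructions in this paper use the denoising method of [15] instead, and all function names here are hypothetical):

```python
import numpy as np

def haar2(f):
    # one-level 2D orthonormal Haar DWT (even-sized input assumed)
    s = np.sqrt(2.0)
    lo, hi = (f[0::2, :] + f[1::2, :]) / s, (f[0::2, :] - f[1::2, :]) / s
    ll, lh = (lo[:, 0::2] + lo[:, 1::2]) / s, (lo[:, 0::2] - lo[:, 1::2]) / s
    hl, hh = (hi[:, 0::2] + hi[:, 1::2]) / s, (hi[:, 0::2] - hi[:, 1::2]) / s
    return ll, lh, hl, hh

def ihaar2(ll, lh, hl, hh):
    # exact inverse of haar2
    s = np.sqrt(2.0)
    lo = np.empty((ll.shape[0], 2 * ll.shape[1])); hi = np.empty_like(lo)
    lo[:, 0::2], lo[:, 1::2] = (ll + lh) / s, (ll - lh) / s
    hi[:, 0::2], hi[:, 1::2] = (hl + hh) / s, (hl - hh) / s
    f = np.empty((2 * lo.shape[0], lo.shape[1]))
    f[0::2, :], f[1::2, :] = (lo + hi) / s, (lo - hi) / s
    return f

def soft(u, t):
    # soft-threshold: sign(u) * max(|u| - t, 0)
    return np.sign(u) * np.maximum(np.abs(u) - t, 0.0)

def m_step(z, tau, alpha2):
    # threshold the detail bands with tau * alpha^2; keep the approximation band
    ll, lh, hl, hh = haar2(z)
    t = tau * alpha2
    return ihaar2(ll, soft(lh, t), soft(hl, t), soft(hh, t))
```

With tau = 0 the M-step reduces to the identity (perfect reconstruction of the transform), which is a useful sanity check on the DWT pair; larger tau suppresses small detail coefficients, which is the source of the artifact reduction described above.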
In the simulation, the estimate is initialized with a least-squares superresolution estimate of relatively poor quality. While not presented here, experiments have shown that the proposed approach is competitive with the state of the art in superresolution image reconstruction. Note from the sample observation image in Figure 4(b) that several wires are indistinguishable prior to superresolution image reconstruction, but after the application of the proposed method these wires are clearly visible in Figure 4(c).

Figure 4. Superresolution results for wireframe simulation. (a) True high-resolution image. (b) One of 9 observation images (43 × 43), contaminated with Gaussian noise and a 3 × 3 uniform blur. (c) 129 × 129 result. ||f − f̂||_2^2 / ||f||_2^2 = 0.1620.

The second experiment is conducted using a lower contrast LWIR image taken with a conventional infrared camera, as displayed in Figure 5(a). Nine low-resolution observation images, each 43 × 43, are generated using the procedure outlined above, and again reconstructed using the method described in this paper. As shown in Figure 5(b), the data from any single lenslet is significantly lacking in detail, but much of this detail can be recovered using superresolution image reconstruction. The normalized mean squared error of the estimate is ||f − f̂||_2^2 / ||f||_2^2 = 0.0124. For example, the dark cup of ice, fingers, and watch are all much more clearly distinguishable in the reconstructed image.

Figure 5. Superresolution results for LWIR simulation. (a) True high-resolution image. (b) One of 9 observation images (43 × 43), contaminated with Gaussian noise and a 3 × 3 uniform blur. (c) 129 × 129 result. ||f − f̂||_2^2 / ||f||_2^2 = 0.0124.

Figure 6. Frame wound with heating wires placed at an angle.

4. RESULTS

To test and characterize the proposed system, we build a resolution chart for testing the infrared cameras out of pipe heating wires arranged on a frame.
These are coiled within a frame in such a way as to form lines spaced about two inches apart. In order to vary the spatial frequency presented by these uniformly spaced lines, the entire frame is tilted with respect to the camera as shown in Figure 6. At regions of the wire frame close to the camera, the spatial frequency as seen by the camera is low, and it progressively increases with the distance between the frame and the camera. The image obtained from a conventional infrared camera is shown in Figure 7(a) and the raw (unprocessed) image from our camera system is shown in Figure 7(b). Each of the nine low-resolution subimages corresponds to the image formed by one of the microlenses on the detector. A high-resolution image, reconstructed by applying the algorithm described in the previous section, is illustrated in Figure 7(d). Using our system, we also collect the image displayed in Figure 8(b). (For comparison, we show the image collected with a conventional LWIR system in Figure 8(a).) The image on the center lenslet is enlarged in Figure 8(c). Reconstructing a high-resolution image from the TOMBO system output using a wavelet-based superresolution reconstruction method, we obtain the image displayed in Figure 8(d). The wavelet-based regularization utilized during image reconstruction reduces the appearance of artifacts while preserving key features such as edges and singularities. The processing method is very fast, making the integrated sensing and processing viable for both time-sensitive applications (such as a helmet-mounted night vision system in defense applications) and massive collections of sensor data.

Figure 7. Superresolution results for wireframe TOMBO experiment. (a) High-resolution image taken with conventional LWIR camera. (b) Observed image taken with TOMBO camera. (c) Center lenslet image. (d) 129 × 129 reconstruction.

Figure 8.
Superresolution results for wireframe TOMBO experiment. (a) High-resolution image taken with conventional LWIR camera. (b) Observed image taken with TOMBO camera. (c) Center lenslet image. (d) 129 × 129 reconstruction.

5. CONCLUSIONS

Under certain conditions, a considerable reduction in the form factor of infrared cameras can be achieved by replacing the conventional optics with microlens arrays. Provided the optical MTF is not a limiting factor, several low-resolution images can be used to reconstruct a high-resolution image by post-processing. We have designed and implemented such a multi-channel infrared camera and compared its performance with a conventional infrared camera. In particular, we have built an infrared system with a 3x3 microlens array with an EFL of 1.9mm and an overall system length of 2.3mm, and demonstrated performance similar to a camera with an effective focal length of 5.7mm and overall system length of 26mm, with similar f-numbers. The camera suffers a reduction in dynamic range, a problem that we are currently addressing. The optics in this system are integrated separately, requiring precise alignment with the focal plane and mounting at that position. Alignment of the microlens array at such a close distance from the focal plane is quite challenging, and the accuracy could not be guaranteed. Also, because the mount is a fraction of the size of a conventional lens assembly, the dimensional precision of the part used to hold the lens array may fall short of what is required. The lens array is fabricated on a silicon wafer that is not AR coated, which leads to a reduction in the dynamic range of the camera. This implementation does not include a separation layer between adjacent channels, which also contributes to a drop in performance. Nevertheless, results obtained with this thin camera compare favorably with images captured with a conventional infrared camera.
We continue to work on refining the algorithms involved and on further quantifying the limits under which the algorithm can function effectively. We are also currently developing the next-generation conformal system, which would have the optics integrated onto the focal plane array at the time of manufacture. Inclusion of a separation layer between adjacent channels, as well as an AR-coated wafer, would improve performance. With a more thorough characterization of the system parameters, along with a rigorous approach to algorithm development and optimization, the foundation would be laid for developing practical cell-phone sized infrared cameras.

References

[1] J. Tanida, T. Kumagai, K. Yamada, S. Miyatake, K. Ishida, T. Morimoto, N. Kondou, D. Miyazaki, and Y. Ichioka, “Thin observation module by bound optics (TOMBO): concept and experimental verification,” Applied Optics, vol. 40, no. 11, pp. 1806–1813, 2001.
[2] S. Ogata, J. Ishida, and T. Sasano, “Optical sensor array in an artificial compound eye,” Opt. Eng., vol. 34, pp. 3649–3655, 1994.
[3] J. S. Sanders and C. E. Halford, “Design and analysis of apposition compound eye optical sensors,” Opt. Eng., vol. 34, pp. 222–235, 1995.
[4] K. Hamanaka and H. Koshi, “An artificial compound eye using a microlens array and its application to scale-invariant processing,” Opt. Rev., vol. 3, pp. 264–268, 1996.
[5] G. A. Horridge, “Apposition eyes of large diurnal insects as organs adapted to seeing,” Proc. R. Soc. London, vol. 207, pp. 287–309, 1980.
[6] Y. Kitamura, R. Shogenji, K. Yamada, S. Miyatake, M. Miyamoto, T. Morimoto, Y. Masaki, N. Kondou, D. Miyazaki, J. Tanida, and Y. Ichioka, “Reconstruction of a high-resolution image on a compound-eye image-capturing system,” Applied Optics, vol. 43, no. 43, 2004.
[7] R. Hardie, K. Barnard, and E. Armstrong, “Joint MAP registration and high-resolution image estimation using a sequence of undersampled images,” IEEE Transactions on Image Processing, vol.
6, pp. 1621–1633, 1997.
[8] N. Nguyen, P. Milanfar, and G. Golub, “A computationally efficient superresolution image reconstruction algorithm,” IEEE Transactions on Image Processing, vol. 10, pp. 573–583, 2001.
[9] R. Schultz and R. Stevenson, “Extraction of high-resolution frames from video sequences,” IEEE Transactions on Image Processing, pp. 996–1011, 1996.
[10] R. Willett, I. Jermyn, R. Nowak, and J. Zerubia, “Wavelet-based superresolution in astronomy,” in Proc. Astronomical Data Analysis Software and Systems XIII, (12–15 October, Strasbourg, France), 2003.
[11] M. Figueiredo and R. Nowak, “An EM algorithm for wavelet-based image restoration,” IEEE Transactions on Image Processing, vol. 12, no. 8, pp. 906–916, 2003.
[12] M. Irani and S. Peleg, “Improving resolution by image registration,” CVGIP: Graphical Models and Image Processing, vol. 53, pp. 231–239, 1991.
[13] S. Mallat, A Wavelet Tour of Signal Processing. San Diego, CA: Academic Press, 1998.
[14] L. Landweber, “An iterative formula for Fredholm integral equations of the first kind,” Amer. J. Math., vol. 73, pp. 615–624, 1951.
[15] M. Figueiredo and R. Nowak, “Wavelet-based image estimation: an empirical Bayes approach using Jeffreys’ noninformative prior,” IEEE Transactions on Image Processing, vol. 10, no. 9, pp. 1322–1331, 2001.
