# BINARIZING DOCUMENT IMAGES USING COPLANAR PREFILTER

*Liying Fan, Lixin Fan and ChewLim Tan*

School of Computing, National University of Singapore, Singapore 117543

[email protected], {fanly,tancl}@comp.nus.edu.sg

**ABSTRACT**

We propose a novel coplanar filter, which exploits the coplanarity of gray-level distribution of neighboring pixels, to pre-filter the document images. Experiments show that the proposed filter exhibits the following desired properties for document image binarization: (1) impulsive noise removal, (2) piecewise smoothing, and (3) sharp edge preservation.

**1. INTRODUCTION**

In many document processing systems, gray-level document images are first binarized to form two-level (black/white) images. The binarization process involves the assignment of pixels to either foreground or background objects. The process is often achieved by global or local thresholding [9]. Both global and local thresholding schemes assume that foreground and background pixels can be classified by comparing their intensity values with some prescribed or automatically selected thresholds. For document images corrupted by various kinds of noise, the assumption is violated wherever there is an outlier, and the binarized images may be severely blurred and degraded.

It is, thus, a common treatment to preprocess the input image using certain noise suppressing filters.

This paper proposes to use such a filter as preprocessing for document image binarization. Before we introduce the proposed filter, let us first investigate the desired properties of a noise suppressing prefilter.

First of all, an ideal filter should be able to completely remove the impulsive noise outliers, which could cause the misclassification of pixels by direct thresholding methods. In addition, to make the selection of the threshold more accurate and robust, it is also desirable that different parts of the same object have similar filter outputs in terms of intensity value. This requires that intra-region intensity variations caused by Gaussian noise or lighting variation be smoothed as much as possible. Finally, for document images, good preservation of edges and corners is crucial to the subsequent processing, e.g. optical character recognition [10].

There are numerous existing methods for image filtering. Among many others, Gaussian [1][3] and Median [2][12] filters are the two most commonly used. The former is capable of removing Gaussian noise, but smooths sharp edges to various extents. The latter is good at removing impulsive noise, yet its output is ragged and not smooth for pixels in the same region. In this paper, we propose to use a novel coplanar filter [5] as a preprocessing step for document image binarization. Working within a statistical image restoration framework, the proposed coplanar filter exploits the *coplanarity*, rather than the *continuity*, of the gray-level distribution of neighboring pixels to pre-filter the document images. Experimental results show that the coplanar filter exhibits the following desired properties for document image binarization: (1) impulsive noise removal, (2) piecewise smoothing, and (3) sharp edge preservation.

We note that Fontanot and Ramponi [6] proposed a quadratic filtering approach for preprocessing mail-address images. Mo and Mathews [7] later extended this work to adapt to the spatially varying statistics in the input images.

This paper is organized as follows: Section 2 introduces the coplanar matrices, followed by an illustration of the filter’s properties. Experimental results and comparison of performances with other filters are presented in Section 3. Section 4 concludes the paper.

**2. COPLANAR FILTER**

We first briefly outline the algorithm of coplanar filtering, followed by an illustration of its properties with examples. Before introducing the filter, some notations are defined below.

$(\cdot)^T$ denotes the transpose of a vector or matrix. Let $\{\mathbf{x} = (x_1, x_2) : 1 \le x_1 \le w,\ 1 \le x_2 \le h\}$ be the pixel coordinates of an image with width $w$ and height $h$. $I(\mathbf{x})$ is a scalar, denoting the corresponding intensity value at $\mathbf{x}$; and $\mathbf{p}(\mathbf{x}) = (\mathbf{x}, I(\mathbf{x}))^T$ denotes a 3D *pixel vector* consisting of the coordinates and intensity value of image pixel $\mathbf{x}$. $\hat{I}(\mathbf{x})$ and $\hat{\mathbf{p}}(\mathbf{x}) = (\mathbf{x}, \hat{I}(\mathbf{x}))^T$ denote the filter output of intensities and corresponding pixel vectors. Let $R(\mathbf{x}) = \{(r_1, r_2) : 0 \le |r_1 - x_1| \le u,\ 0 \le |r_2 - x_2| \le v\}$ denote a $((2u + 1) \times (2v + 1))$ local window centered at $\mathbf{x}$.

**2.1. Coplanar Filtering Algorithm**

Given as input to the filter a set of pixel vectors $\mathbf{p}(\mathbf{x})$, our objective is to estimate a set of new pixel vectors $\hat{\mathbf{p}}(\mathbf{x}) = (\mathbf{x}, \hat{I}(\mathbf{x}))^T$ as the filter output. This can be achieved using the following two steps.

**A. Estimation of Coplanar Matrix**.

The first step involves the estimation of the coplanar matrix as defined below. For each pixel vector $\mathbf{p}(\mathbf{x})$, we compute a $3 \times 3$ pseudo-covariance matrix within a local window centered at $\mathbf{x}$:

$$\mathbf{c}(\mathbf{x}) = \frac{1}{N} \sum_{\mathbf{x}_n \in R(\mathbf{x})} [\mathbf{p}(\mathbf{x}_n) - \mathbf{p}(\mathbf{x})] \cdot [\mathbf{p}(\mathbf{x}_n) - \mathbf{p}(\mathbf{x})]^T \tag{1}$$

in which $N$ is the number of neighboring pixels $\mathbf{x}_n \in R(\mathbf{x})$. Note that the above matrix characterizes the directional distribution of neighboring pixel vectors with respect to the centroid pixel vector $\mathbf{p}(\mathbf{x})$. The flatter the cluster in which neighboring pixels are scattered, the smaller the determinant of this matrix is, and vice versa. Moreover, the eigenvectors of the above matrix correspond to the principal directions of the neighboring pixel distribution. In this work, we use the matrix itself to quantify the 'coplanarity' around $\mathbf{p}(\mathbf{x})$, and denote it as the *coplanar matrix*.
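Equation (1) can be evaluated directly for a single pixel. The following is a minimal sketch, not the authors' implementation: the function name `coplanar_matrix`, the 0-based array indexing (`image[x2, x1]`), and the choice to truncate the window at the image border are all assumptions made for illustration.

```python
import numpy as np

def coplanar_matrix(image, x1, x2, u=1, v=1):
    """Pseudo-covariance matrix of eq. (1) at pixel (x1, x2), 0-based indices."""
    img = np.asarray(image, dtype=float)
    h, w = img.shape
    p_center = np.array([x1, x2, img[x2, x1]])  # pixel vector p(x)
    c = np.zeros((3, 3))
    n = 0
    # Assumption: the window R(x) is simply clipped at the image border.
    for r2 in range(max(0, x2 - v), min(h, x2 + v + 1)):
        for r1 in range(max(0, x1 - u), min(w, x1 + u + 1)):
            d = np.array([r1, r2, img[r2, r1]]) - p_center  # p(x_n) - p(x)
            c += np.outer(d, d)
            n += 1
    return c / n

# A planar intensity ramp gives a (numerically) singular matrix, while an
# isolated spike gives a clearly non-zero determinant.
ramp = np.tile(np.arange(5.0), (5, 1))
spike = np.zeros((5, 5))
spike[2, 2] = 10.0
det_ramp = np.linalg.det(coplanar_matrix(ramp, 2, 2))
det_spike = np.linalg.det(coplanar_matrix(spike, 2, 2))
```

The two determinants illustrate the remark above: pixels lying on a common plane in (x1, x2, I) space yield a flat cluster and a near-zero determinant, whereas an outlier inflates it.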

**B. Energy Minimization for the Coplanar Filtering**.

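The second step involves estimating the filter output $\hat{\mathbf{p}}(\mathbf{x}) = (\mathbf{x}, \hat{I}(\mathbf{x}))^T$ for all pixel vectors and associated coplanar matrices $(\mathbf{p}(\mathbf{x}), \mathbf{c}(\mathbf{x}))$ of the given image. The ultimate goal is to update all pixel intensities such that the filter output exhibits high coplanarity among adjacent pixels. For this purpose, one can define a coplanar-matrix-based energy function within a local region centered at $\mathbf{x}$:

$$E(\mathbf{x}) = -\sum_{\mathbf{x}_n \in R(\mathbf{x})} \frac{1}{\sqrt{(2\pi)^3 |\mathbf{c}(\mathbf{x}_n)|}} \exp\left(-\frac{d^2(\mathbf{x}, \mathbf{x}_n)}{2}\right)$$

$$d^2(\mathbf{x}, \mathbf{x}_n) = [\hat{\mathbf{p}}(\mathbf{x}) - \mathbf{p}(\mathbf{x}_n)]^T \, \mathbf{c}(\mathbf{x}_n)^{-1} \, [\hat{\mathbf{p}}(\mathbf{x}) - \mathbf{p}(\mathbf{x}_n)] \tag{2}$$

in which $\mathbf{c}(\mathbf{x}_n)$ are the coplanar matrices at neighboring pixels $\mathbf{x}_n$, and $\mathbf{c}(\mathbf{x}_n)^{-1}$ are the corresponding inverse matrices:

$$\mathbf{c}(\mathbf{x}_n)^{-1} = \begin{pmatrix} v_{11} & v_{12} & v_{13} \\ v_{21} & v_{22} & v_{23} \\ v_{31} & v_{32} & v_{33} \end{pmatrix} \tag{3}$$

and $\mathbf{v}_{x3} = (v_{13}, v_{23})^T$ denotes the column vector to be used below. To find the minima of $E(\mathbf{x})$, we set

$$\hat{I}(\mathbf{x})^{(k+1)} = \frac{\sum_{\mathbf{x}_n \in R(\mathbf{x})} w^{(k)} \left[ v_{33}\, \hat{I}(\mathbf{x}_n)^{(k)} - (\mathbf{x} - \mathbf{x}_n) \cdot \mathbf{v}_{x3} \right]}{\sum_{\mathbf{x}_n \in R(\mathbf{x})} w^{(k)} \cdot v_{33}} \tag{4}$$

in which $(k)$ denotes the $k$th iteration, and $w^{(k)}$ is computed as:

$$w^{(k)} = w^{(k)}(\mathbf{x}, \mathbf{x}_n) = \frac{1}{\sqrt{(2\pi)^3 |\mathbf{c}(\mathbf{x}_n)|}} \exp\left(-\frac{d^2(\mathbf{x}, \mathbf{x}_n)}{2}\right) \tag{5}$$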
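One way to see where an update of the form (4) comes from is sketched below. It assumes that $\mathbf{c}(\mathbf{x}_n)^{-1}$ is symmetric, that the neighboring pixel vectors and the weights are held fixed at their current values, and it introduces $\mathbf{V}_{xx}$ (the upper-left $2 \times 2$ block of the inverse), $\boldsymbol{\delta}$ and $\Delta I$ as illustrative shorthand that does not appear in the text; the full derivation remains the one in [5].

```latex
\begin{aligned}
d^2(\mathbf{x},\mathbf{x}_n)
  &= \boldsymbol{\delta}^T \mathbf{V}_{xx}\, \boldsymbol{\delta}
   + 2\,(\boldsymbol{\delta}\cdot\mathbf{v}_{x3})\,\Delta I
   + v_{33}\,\Delta I^2,
  \qquad \boldsymbol{\delta} = \mathbf{x}-\mathbf{x}_n,\quad
         \Delta I = \hat{I}(\mathbf{x}) - I(\mathbf{x}_n) \\
\frac{\partial E(\mathbf{x})}{\partial \hat{I}(\mathbf{x})}
  &= \sum_{\mathbf{x}_n \in R(\mathbf{x})} w(\mathbf{x},\mathbf{x}_n)\,
     \bigl[\boldsymbol{\delta}\cdot\mathbf{v}_{x3} + v_{33}\,\Delta I\bigr] = 0 \\
\Longrightarrow\quad
\hat{I}(\mathbf{x})
  &= \frac{\sum_{\mathbf{x}_n \in R(\mathbf{x})} w(\mathbf{x},\mathbf{x}_n)\,
           \bigl[v_{33}\, I(\mathbf{x}_n) - (\mathbf{x}-\mathbf{x}_n)\cdot\mathbf{v}_{x3}\bigr]}
          {\sum_{\mathbf{x}_n \in R(\mathbf{x})} w(\mathbf{x},\mathbf{x}_n)\, v_{33}}
\end{aligned}
```

Because the weights themselves depend on $\hat{I}(\mathbf{x})$ through $d^2$, the stationarity condition is solved as a fixed-point iteration, which is why the update carries the iteration index $(k)$.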

The detailed derivation of (4) and (5) is given in [5]. From (4) and (5), neighboring pixels whose coplanar matrices have smaller determinants have a dominant influence on the estimate, while those with larger determinants make smaller contributions. We note that this coplanarity-based weighting scheme is crucial to removing impulsive noise and preserving sharp edges. In our experiments, the

**C. The Algorithm Pseudo-Code**.

We briefly outline the pseudo-code of the proposed coplanar filtering algorithm as follows:

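**Input:** Pixel vectors $\mathbf{p}(\mathbf{x}) = (\mathbf{x}, I(\mathbf{x}))^T$

1. Estimate coplanar matrices $\mathbf{c}(\mathbf{x})$ using (1) for all $\mathbf{x}$;

**Output:** Estimated pixel vectors $\hat{\mathbf{p}}(\mathbf{x}) = (\mathbf{x}, \hat{I}(\mathbf{x}))^T$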
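The two steps can be put together as in the following NumPy sketch. It is an illustration under several assumptions that are not stated in the text: the coplanar matrices are computed once from the input and kept fixed, a small `eps` is added to their diagonal so that they can be inverted in perfectly flat regions, image borders are handled by edge replication, the neighbor intensities in the update are the current estimates, and a fixed number of iterations `n_iter` replaces a convergence test. The name `coplanar_filter` and all parameter defaults are hypothetical.

```python
import numpy as np

def coplanar_filter(image, u=1, v=1, n_iter=5, eps=1e-3):
    """Sketch of coplanar filtering: eq. (1), then iterate updates (4)-(5)."""
    img = np.asarray(image, dtype=float)
    h, w = img.shape
    offsets = [(dx, dy) for dy in range(-v, v + 1) for dx in range(-u, u + 1)]

    def shifted(arr, dx, dy):
        # Value of arr at the neighbour x_n = x + (dx, dy); edges are replicated.
        pad = [(v, v), (u, u)] + [(0, 0)] * (arr.ndim - 2)
        padded = np.pad(arr, pad, mode="edge")
        return padded[v + dy:v + dy + h, u + dx:u + dx + w]

    # Step 1: coplanar matrices c(x), eq. (1), for all pixels at once.
    c = np.zeros((h, w, 3, 3))
    for dx, dy in offsets:
        d = np.stack([np.full((h, w), float(dx)),
                      np.full((h, w), float(dy)),
                      shifted(img, dx, dy) - img], axis=-1)
        c += d[..., :, None] * d[..., None, :]
    c = c / len(offsets) + eps * np.eye(3)  # eps: assumed regulariser
    inv_c = np.linalg.inv(c)
    amp = 1.0 / np.sqrt((2.0 * np.pi) ** 3 * np.linalg.det(c))

    # Step 2: fixed-point iteration of update (4) with weights (5).
    est = img.copy()
    for _ in range(n_iter):
        num = np.zeros((h, w))
        den = np.zeros((h, w))
        for dx, dy in offsets:
            vn = shifted(inv_c, dx, dy)   # c(x_n)^{-1}
            i_n = shifted(est, dx, dy)    # current intensity at x_n
            diff = np.stack([np.full((h, w), -float(dx)),
                             np.full((h, w), -float(dy)),
                             est - i_n], axis=-1)
            d2 = np.einsum("...i,...ij,...j->...", diff, vn, diff)
            wgt = shifted(amp, dx, dy) * np.exp(-0.5 * d2)
            v33 = vn[..., 2, 2]
            dot = -dx * vn[..., 0, 2] - dy * vn[..., 1, 2]  # (x - x_n) . v_x3
            num += wgt * (v33 * i_n - dot)
            den += wgt * v33
        est = num / den
    return est

flat_out = coplanar_filter(np.full((6, 6), 5.0))
spike = np.zeros((7, 7))
spike[3, 3] = 10.0
spike_out = coplanar_filter(spike)
```

In this sketch a constant image is left unchanged and an isolated spike is pulled towards its surroundings; how closely this matches the behaviour reported in the paper depends on the details given in [5].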

**2.2. Properties of The Coplanar Filter**

Let us illustrate the properties of the proposed coplanar filter with a 1D example (Figure 1) of signal filtering. The original signal consists of step and roof edges, and the input to the filter is corrupted by Gaussian noise (σ = 1.0) and impulsive noise (p = 0.02). There are three noticeable observations about the filter outputs. Firstly, impulsive noise can be completely removed. As mentioned earlier, this is due to the fact that we make use of the coplanarity assumption of gray-level distributions.

Figure 1: Left – Noise corrupted signal; Right – Filter output (solid line) vs. original signal (dotted line).

Secondly, in relatively smooth regions, the proposed coplanar filter is able to suppress small variations and result in piecewise smoothing. Note that this is a desired property for the document image binarization. Finally, step and roof edges are well preserved without significant change of magnitudes and localizations.
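A test signal of the kind described for Figure 1 can be generated as follows. This is only a sketch for reproducing a similar experiment: the signal length, the positions and heights of the step and roof edges, the impulse amplitude, and the random seed are assumptions, since the text specifies only σ = 1.0 and p = 0.02.

```python
import numpy as np

def make_test_signal(n=200, sigma=1.0, p=0.02, impulse=20.0, seed=0):
    """Return (clean, noisy) 1D signals with a roof edge and a step edge."""
    rng = np.random.default_rng(seed)
    t = np.arange(n)
    step = np.where(t < n // 2, 0.0, 10.0)                      # step edge at n // 2
    roof = np.clip(10.0 - 0.5 * np.abs(t - n // 4), 0.0, None)  # roof peak at n // 4
    clean = step + roof
    noisy = clean + rng.normal(0.0, sigma, n)                   # Gaussian noise
    hits = rng.random(n) < p                                    # impulse locations
    noisy[hits] += impulse * rng.choice([-1.0, 1.0], hits.sum())
    return clean, noisy

clean, noisy = make_test_signal()
```

Comparing a filter's output on `noisy` against `clean` gives a quick check of the three properties listed above: outlier removal, smoothing within each segment, and preservation of the step and roof locations.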

**3. EXPERIMENTAL RESULTS**

In this section, the results of two document image binarization experiments are presented. The noisy images are first filtered using the Gaussian, Median and coplanar filters respectively. The filter outputs are then binarized using three different thresholding schemes: (1) the prescribed thresholding; (2) the global thresholding proposed by Sahoo *et al.* [9]; and (3) the local thresholding proposed by Niblack [8]. Note that we aim to illustrate the applicability of prefiltering as a preprocessing procedure for document image binarization, rather than to compare various thresholding schemes. Detailed comparisons of different document thresholding methods can be found in [9][10][11]. In our experiments, unless stated otherwise, the neighboring window size for the coplanar filter is set to $3 \times 3$ ($u = 1, v = 1$).
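For orientation, the prescribed and local schemes can be sketched as below. The local rule uses the standard Niblack form, threshold = local mean + k × local standard deviation; the window size, the value k = -0.2, the reflect padding, and the convention that pixels above the threshold are background (True) are assumptions, as the text does not give the settings used in the experiments. The function names are hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def preset_binarize(image, t=192):
    """Prescribed thresholding: True where the pixel is brighter than t."""
    return np.asarray(image, dtype=float) > t

def niblack_binarize(image, window=15, k=-0.2):
    """Niblack-style local thresholding; `window` must be odd."""
    img = np.asarray(image, dtype=float)
    r = window // 2
    padded = np.pad(img, r, mode="reflect")
    win = sliding_window_view(padded, (window, window))
    mean = win.mean(axis=(-2, -1))
    std = win.std(axis=(-2, -1))
    return img > mean + k * std

# Toy image: dark left half, bright right half.
img = np.zeros((6, 10))
img[:, 5:] = 255.0
local = niblack_binarize(img, window=3)
fixed = preset_binarize(img, t=192)
```

Note that in perfectly uniform regions the local standard deviation is zero, so the local rule compares each pixel with its own neighborhood mean; this sensitivity to small variations is one reason a smoothing prefilter matters for local thresholding.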

**3.1. Experiment 1: Synthetic Test Image**

The test images used in the first experiment are shown in Figure 2. We use synthetic images because they make the ground truth available for performance comparison. The test images are severely corrupted by Gaussian (σ = 1.0) and impulsive (p = 0.1) noise. The different filter outputs and the corresponding binarization results are shown in Figures 7 and 8. Visual inspection shows that (1) prefiltering can significantly improve the performance of document image binarization; (2) the coplanar filter outputs are sharper than those of the Gaussian and Median filters. The corresponding binarization results demonstrate that the proposed coplanar filter outperforms the other two in that it efficiently suppresses the noise and better preserves the edges and corners.

Figure 2: Left to right: Original test image 1; test image 1 corrupted by Gaussian (σ = 1.0) and impulsive (p = 0.1) noise; original test image 2; test image 2 corrupted by Gaussian (σ = 1.0) and impulsive (p = 0.1) noise.

Table 1: Modified Hausdorff Distances (MHD) between the binarized images and the ground truth. The No-filter column gives the results of the three thresholding schemes without any prefilter; the Gaussian, Median and Coplanar columns give the results with the respective filters. For each of images 1 and 2, the Preset row uses the prescribed threshold (t = 192), the Sahoo row uses Sahoo's global thresholding method [9], and the Niblack row uses Niblack's local thresholding method [8].

| Image | Threshold | No-filter | Gaussian | Median | Coplanar |
|-------|-----------|-----------|----------|--------|----------|
| 1 | Preset | 4.0003 | 1.6822 | 2.2808 | 0.7852 |
| 1 | Sahoo | 3.5006 | 2.0633 | 1.7258 | 1.4648 |
| 1 | Niblack | 4.4354 | 4.2741 | 3.4688 | 2.1024 |
| 2 | Preset | 4.4772 | 3.0529 | 2.1170 | 0.6422 |
| 2 | Sahoo | 4.2979 | 1.0048 | 1.4004 | 0.2812 |
| 2 | Niblack | 4.6711 | 3.7270 | 2.8975 | 1.4009 |

We also quantitatively measure the Modified Hausdorff Distance (MHD) [4] between the different binarization results and the ideal binarization image. MHD measures the average spatial distance between two sets of image pixels, and approaches zero when the two given images are identical. Table 1, which summarizes the comparison results, shows that among the three filters the coplanar filter consistently yields the lowest MHD under all three thresholding schemes.
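The distance can be computed as in the sketch below, which follows the usual definition from [4]: the larger of the two directed averages of nearest-neighbor distances between the point sets. Which pixels form the sets (here, the True pixels of two boolean masks) and the function name `modified_hausdorff` are assumptions; the brute-force distance matrix is also only suitable for small images.

```python
import numpy as np

def modified_hausdorff(a_mask, b_mask):
    """Modified Hausdorff Distance between the True pixels of two masks."""
    a = np.argwhere(a_mask).astype(float)
    b = np.argwhere(b_mask).astype(float)
    if len(a) == 0 or len(b) == 0:
        raise ValueError("both masks must contain at least one pixel")
    # Pairwise Euclidean distances, shape (len(a), len(b)).
    dist = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    d_ab = dist.min(axis=1).mean()  # average distance from A to nearest point in B
    d_ba = dist.min(axis=0).mean()  # average distance from B to nearest point in A
    return max(d_ab, d_ba)

a = np.zeros((5, 5), dtype=bool)
b = np.zeros((5, 5), dtype=bool)
a[0, 0] = True
b[0, 3] = True
same = modified_hausdorff(a, a)
apart = modified_hausdorff(a, b)
```

For the two single-pixel masks above the distance is the pixel separation, and for identical masks it is zero, matching the property stated in the text.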

**3.2. Experiment 2: Real Image**

More experimental results on the binarization of real document images are shown in Figures 3 to 6. For real document images corrupted with noise, the final binarized images are of improved quality when the proposed coplanar filter is used as a preprocessing stage to suppress noise. This is especially true when the original images are severely corrupted (see Figure 4 for an example).

**4. CONCLUSION**

We demonstrate that including prefiltering as a preprocessing procedure for document image binarization can significantly improve the performance. Among different filters, the proposed coplanar filter outperforms Gaussian and Median filters, in that it allows piecewise smoothing and better preserves the edges and corners.

**5. ACKNOWLEDGEMENT**

The authors would like to thank Mr. Zheng Zhang for his valuable discussions and suggestions. This research is supported in part by NSTB/MOE Research grant R-252-000-071-112/303.

**6. REFERENCES**

[1] C. Bouman and K. Sauer. A generalized Gaussian image model for edge-preserving MAP estimation. *IEEE Trans. Image Processing*, 2:296–310, July 1993.

[2] A. C. Bovik, T. Huang, and J. Munson. A generalization of median filtering using linear combination of ordering statistics. *IEEE Trans. Acoust. Speech, Signal Processing*, 31:1342–1350, July 1983.

[3] H. Derin, R. Elliott, H. Cristi, and D. Geman. Bayes smoothing algorithms for segmentation of binary images modeled by Markov random fields. *IEEE Trans. Pattern Analysis and Machine Intelligence*, 6(11):707–720, November 1984.

[4] M. Dubuisson and A. Jain. A modified Hausdorff distance for object matching. In *Proceedings of the International Conference on Pattern Recognition*, pages 566–568, Jerusalem, Israel, Oct 1994.

[5] L. Fan. A coplanar filter for image restoration. Technical report, School of Computing, National University of Singapore, 2001.

[6] P. Fontanot and G. Ramponi. A polynomial filter for the preprocessing of mail address images. In *Proc. of IEEE Winter Workshop on Nonlinear Digital Signal Processing*, volume 2, pages 1–6, 1993.

[7] S. Mo and V. Mathews. Adaptive, quadratic preprocessing of document images for binarization. *IEEE Trans. Image Processing*, 7:992–999, July 1998.

[8] W. Niblack. *An Introduction to Digital Image Processing*, pages 115–116. Prentice Hall, 1986.

[9] P. Sahoo, S. Soltani, and A. Wong. A survey of thresholding techniques. *Computer Vision, Graphics, and Image Processing*, 41:233–260, 1988.

[10] Ø. D. Trier and A. Jain. Goal-directed evaluation of binarization methods. *IEEE Trans. Pattern Analysis and Machine Intelligence*, 17(12):1191–1201, Dec. 1995.

[11] Ø. D. Trier and T. Taxt. Evaluation of binarization methods for document images. *IEEE Trans. Pattern Analysis and Machine Intelligence*, 17(3):312–315, March 1995.

[12] J. Tukey. *Exploratory Data Analysis*. Addison-Wesley, MA, 1971.

Figure 3: Row 1 to 5: Original document image; Direct binarization result using t=50; Binarization (t=50) with Gaussian filter; Binarization (t=50) with Median filter; Binarization (t=50) with Coplanar filter.

Figure 4: Left to right: Original document image; Direct binarization result using t=30; Binarization (t=30) with Gaussian filter; Binarization (t=30) with Median filter; Binarization (t=30) with Coplanar filter.

Figure 5: Row 1 to 5: Original document image; Direct binarization result using t=50; Binarization (t=50) with Gaussian filter; Binarization (t=50) with Median filter; Binarization (t=50) with Coplanar filter.

Figure 6: Row 1 to 5: Original document image; Direct binarization result using t=50; Binarization (t=50) with Gaussian filter; Binarization (t=50) with Median filter; Binarization (t=50) with Coplanar filter.

Figure 7: Row 1 (Left to right): Original test image; Gaussian filter output; Median filter output; Coplanar filter output; Row 2: Binarization of corresponding gray-level images in Row 1, using prescribed thresholding (t=192); Row 3: Binarization of corresponding gray-level images in Row 1, using the global thresholding method proposed by Sahoo et al. [9]; Row 4: Binarization of corresponding gray-level images in Row 1, using the local thresholding method proposed by Niblack [8].

Figure 8: Row 1 (Left to right): Original test image; Gaussian filter output; Median filter output; Coplanar filter output; Row 2: Binarization of corresponding gray-level images in Row 1, using prescribed thresholding (t=192); Row 3: Binarization of corresponding gray-level images in Row 1, using the global thresholding method proposed by Sahoo et al. [9]; Row 4: Binarization of corresponding gray-level images in Row 1, using the local thresholding method proposed by Niblack [8].
