1 Introduction
About the Editor
Born in Southern China, Professor
Da-Wen Sun is an internationally recognized figure for his leadership in food
engineering research and education. His
main research activities include cooling,
drying and refrigeration processes and
systems, quality and safety of food products, bioprocess simulation and optimization, and computer vision technology. In
particular, his innovative work on vacuum
cooling of cooked meats, pizza quality
inspection by computer vision, and edible films for shelf-life extension of fruit and vegetables have been widely reported in
national and international media. Results of his work have been published in over 150
peer-reviewed journal papers and more than 200 conference papers.
Dr Sun received First Class Honours BSc and MSc degrees in Mechanical Engineering and a PhD in Chemical Engineering in China before working in various universities
in Europe. He became the first Chinese national to be permanently employed in an Irish
University when he was appointed College Lecturer at National University of Ireland,
Dublin (University College Dublin) in 1995, and was then continuously promoted in
the shortest possible time to Senior Lecturer, Associate Professor and full Professor.
Dr Sun is now a Professor and Director of the Food Refrigeration and Computerised
Food Technology Research Group in the University College Dublin.
As a leading educator in food engineering, Professor Sun has significantly contributed to the field of food engineering. He has trained many PhD students, who have
made their own contributions to the industry and academia. Professor Sun has also given
lectures on advances in food engineering on a regular basis to academic institutions
internationally and delivered keynote speeches at international conferences. As a recognized authority in food engineering, he has been conferred adjunct/visiting/consulting
professorships from ten top universities in China, including Zhejiang University,
Shanghai Jiaotong University, Harbin Institute of Technology, China Agricultural University, South China University of Technology, Southern Yangtze University, etc. In
recognition of his significant contribution to food engineering worldwide and for his
outstanding leadership in the field, the International Commission of Agricultural Engineering (CIGR) awarded him the CIGR Merit Award in 2000 and again in 2006; the
Institution of Mechanical Engineers (IMechE) based in the UK named him “Food
Engineer of the Year 2004.”
Professor Sun is a Fellow of the Institution of Agricultural Engineers. He has also
received numerous awards for teaching and research excellence, including the President’s Research Fellowship, and has twice received the President’s Research Award of
University College Dublin. He is a member of the CIGR Executive Board and Honorary
Vice-President of CIGR, the editor-in-chief of Food and Bioprocess Technology – an
International Journal (Springer), the former editor of the Journal of Food Engineering (Elsevier), the series editor of the “Contemporary Food Engineering” book series
(CRC Press/Taylor & Francis), and an editorial board member for the Journal of Food
Process Engineering (Blackwell), Sensing and Instrumentation for Food Quality and
Safety (Springer), and the Czech Journal of Food Sciences. He is also a Chartered
Engineer registered in the UK Engineering Council.
Mohd. Zaid Abdullah (Chs 1, 20), School of Electrical and Electronic Engineering,
Engineering Campus, Universiti Sains Malaysia, 14300 Penang, Malaysia
Murat O. Balaban (Ch. 8), University of Florida, Food Science and Human Nutrition
Department, PO Box 110370, Gainesville, FL 32611-0370, USA
Jose Blasco (Ch. 10), IVIA (Instituto Valenciano de Investigaciones Agrarias), Cra.
Moncada-Naquera km 5, 46113 Moncada (Valencia), Spain
Sibel Damar (Ch. 8), University of Florida, Food Science and Human Nutrition
Department, PO Box 110370, Gainesville, FL 32611-0370, USA
Ricardo Díaz (Ch. 12), Instrumentation and Automation Department, Food Technological
Institute AINA, Paterna (Valencia) 46980, Spain
Cheng-Jin Du (Chs 4, 6, 18), Food Refrigeration and Computerised Food Technology,
University College Dublin, National University of Ireland, Dublin 2, Ireland
Prabal K. Ghosh (Ch. 15), Department of Biosystems Engineering, University of
Manitoba, Winnipeg, MB, Canada, R3T 5V6
Sundaram Gunasekaran (Ch. 19), Food and Bioprocess Engineering Laboratory,
University of Wisconsin-Madison, Madison, WI 53706, USA
Dave W. Hatcher (Ch. 21), Wheat Enzymes & Asian Products, Canadian Grain
Commission, Winnipeg, MB, Canada, R3C 3G8
Digvir S. Jayas (Ch. 15), Stored-Grain Ecosystems, Winnipeg, MB, Canada, R3T 2N2
Chithra Karunakaran (Ch. 15), Canadian Light Source, Saskatoon, Saskatchewan,
Canada, S7N 0X4
Olivier Kleynen (Ch. 9), Unité de Mécanique et Construction, Gembloux Agricultural
University, Passage des Déportés, 2, B-5030 Gembloux, Belgium
Vincent Leemans (Ch. 9), Unité de Mécanique et Construction, Gembloux Agricultural
University, Passage des Déportés, 2, B-5030 Gembloux, Belgium
Renfu Lu (Ch. 14), US Department of Agriculture, Agricultural Research Service, Sugar
beet and Bean Research Unit, Michigan State University, East Lansing, MI 48824, USA
Thierry Marique (Chs 13, 22), Centre Agronomique de Researches Appliquees du Hainaut
(CARAH), 7800 Ath, Belgium
Domingo Mery (Chs 13, 22), Departamento de Ciencia de la Computacion, Pontificia
Universidad Católica de Chile, Av. Vicuña Mackenna 4860 (143), Santiago, Chile
Enrique Moltó (Ch. 10), IVIA (Instituto Valenciano de Investigaciones Agrarias), Cra.
Moncada-Naquera km 5, 46113 Moncada (Valencia), Spain
Masateru Nagata (Ch. 11), Faculty of Agriculture, University of Miyazaki, Miyazaki,
889-2192 Japan
Asli Z. Odabaşi (Ch. 8), University of Florida, Food Science and Human Nutrition
Department, PO Box 110370, Gainesville, FL 32611-0370, USA
Yukiharu Ogawa (Ch. 16), Faculty of Horticulture, Chiba University, Matsudo, Chiba,
271-8510 Japan
Alexandra C.M. Oliveira (Ch. 8), Fishery Industrial Technology Center, University of
Alaska, Fairbanks, Kodiak, AK 99615, USA
Jitendra Paliwal (Ch. 15), Department of Biosystems Engineering, University of
Manitoba, Winnipeg, MB R3T 5V6, Canada
Bosoon Park (Ch. 7), US Department of Agriculture, Agricultural Research Service,
Richard B. Russell Research Center, Athens, GA 30605, USA
Franco Pedreschi (Chs 13, 22), Universidad de Santiago Chile, Departamento de Ciencia
y Tecnologia de Alimentos, Facultad Tecnologica, Av. Ecuador 3769, Santiago, Chile
Ricardo Díaz Pujol (Ch. 12), Dpto Instrumentación y Automática AINIA – Instituto
Tecnológico Agroalimentario, 46980 Paterna, Valencia, Spain
Muhammad A. Shahin (Ch. 17), Grain Research Laboratory, Canadian Grain Commission, Winnipeg, MB, Canada, R3C 3G8
Da-Wen Sun (Chs 2, 3, 4, 5, 6, 18), Food Refrigeration and Computerised Food Technology,
University College Dublin, National University of Ireland, Dublin 2, Ireland
Stephen J. Symons (Ch. 17), Grain Research Laboratory, Canadian Grain Commission,
Winnipeg, MB, Canada, R3C 3G8
Jasper G. Tallada (Ch. 11), Faculty of Agriculture, University of Miyazaki, United Graduate School of Agricultural Sciences, Kagoshima University, Miyazaki, 889-2192
Jinglu Tan (Ch. 5), Department of Biological Engineering, University of Missouri,
Columbia, MO 65211, USA
Chaoxin Zheng (Chs 2, 3), Food Refrigeration and Computerised Food Technology,
University College Dublin, National University of Ireland, Dublin 2, Ireland
Liyun Zheng (Ch. 5), Food Refrigeration and Computerised Food Technology, University
College Dublin, National University of Ireland, Dublin 2, Ireland
Based on image processing and analysis, computer vision is a novel technology for
recognizing objects and extracting quantitative information from digital images in
order to provide objective, rapid, non-contact, and non-destructive quality evaluation.
Driven by significant increases in computer power and rapid developments in image-processing techniques and software, the application of computer vision has been
extended to the quality evaluation of diverse and processed foods. In recent years
in particular, computer vision has attracted much research and development attention;
as a result, rapid scientific and technological advances have increasingly taken place
regarding the quality inspection, classification, and evaluation of a wide range of food
and agricultural products. As the first book in this area, Computer Vision Technology
for Food Quality Evaluation focuses on these recent advances.
The book is divided into five parts. Part I provides an outline of the fundamentals
of the technology, addressing the principles and techniques for image acquisition, segmentation, description, and recognition. Part II presents extensive coverage of the
application in the most researched areas of fresh and cooked meats, poultry, and
seafood. Part III details the application of computer vision in the quality evaluation of
agricultural products, including apples, citrus, strawberry, table olives, and potatoes.
Using computer vision to evaluate and classify the quality of grains such as wheat, rice
and corn is then discussed in Part IV. The book concludes with Part V, which is about
applying computer vision technology to other food products, including pizza, cheese,
bakery, noodles, and potato chips.
Computer Vision Technology for Food Quality Evaluation is written by international peers who have both academic and professional credentials, with each chapter
addressing in detail one aspect of the relevant technology, thus highlighting the truly
international nature of the work. The book therefore provides the engineer and technologist working in research, development, and operations in the food industry with
critical, comprehensive, and readily accessible information on the art and science of
computer vision technology. It should also serve as an essential reference source for
undergraduate and postgraduate students and researchers in universities and research
Image Acquisition
Mohd. Zaid Abdullah
School of Electrical and Electronic Engineering, Engineering Campus,
Universiti Sains Malaysia, 14300 Penang, Malaysia
1 Introduction
In making physical assessments of agricultural materials and foodstuffs, images are
undoubtedly the preferred method in representing concepts to the human brain. Many
of the quality factors affecting foodstuffs can be determined by visual inspection
and image analysis. Such inspections determine market price and, to some extent, the
“best-used-before” date. Traditionally, quality inspection is performed by trained
human inspectors, who approach the problem of quality assessment in two ways: seeing
and feeling. In addition to being costly, this method is highly variable and decisions
are not always consistent between inspectors or from day to day. This is, however,
changing with the advent of electronic imaging systems and with the rapid decline
in cost of computers, peripherals, and other digital devices. Moreover, the inspection
of foodstuffs for various quality factors is a very repetitive task which is also very
subjective in nature. In this type of environment, machine vision systems are ideally
suited for routine inspection and quality assurance tasks. Backed by powerful artificial intelligence systems and state-of-the-art electronic technologies, machine vision
provides a mechanism in which the human thinking process is simulated artificially.
To date, machine vision has extensively been applied to solve various food engineering problems, ranging from simple quality evaluation of food products to complicated
robot guidance applications (Tao et al., 1995; Pearson, 1996; Abdullah et al., 2000).
Despite the general utility of machine vision images as a first-line inspection tool, their
capabilities regarding more in-depth investigation are fundamentally limited. This is
due to the fact that images produced by vision cameras are formed using a narrow
band of radiation, extending from 10⁻⁴ m to 10⁻⁷ m in wavelength. For this reason,
scientists and engineers have invented camera systems that allow patterns of energy
from virtually any part of the electromagnetic spectrum to be visualized. Camera systems such as computed tomography (CT), magnetic resonance imaging (MRI), nuclear
magnetic resonance (NMR), single photon emission computed tomography (SPECT)
Computer Vision Technology for Food Quality Evaluation
ISBN: 978-0-12-373642-0
Copyright © 2008, Elsevier Inc.
All rights reserved
4 Image Acquisition Systems
and positron emission tomography (PET) operate at shorter wavelengths ranging from
10⁻⁸ m to 10⁻¹³ m. Towards the opposite end of the electromagnetic spectrum there
are infrared and radio cameras, which enable visualization to be performed at wavelengths greater than 10⁻⁶ m and 10⁻⁴ m, respectively. All these imaging modalities
rely on acquisition hardware featuring an array or ring of detectors which measure
the strength of some form of radiation, either following reflection or after the signal
has passed transversely through the object. Perhaps one thing that these camera systems have in common is the requirement to perform digital image processing of the
resulting signals using modern computing power. Whilst digital image processing is
usually assumed to be the process of converting radiant energy in a three-dimensional
world into a two-dimensional radiant array of numbers, this is certainly not so when the
detected energy is outside the visible part of the spectrum. The reason is that the technology used to acquire the imaging signals is quite different, depending on the camera
modalities. The aim of this chapter is therefore to give a brief review of the present
state-of-the-art image acquisition technologies that have found many applications in the
food industry.
Section 2 summarizes the electromagnetic spectrum which is useful in image
formation. Section 3 describes the principles of operation of machine vision technology,
along with illumination and electronics requirements. Other imaging modalities,
particularly the acquisition technologies operating at the non-visible range, are briefly
discussed in Section 4. In particular, technologies based on ultrasound, infrared, MRI
and CT are addressed, followed by some of their successful applications in food
engineering found in the literature. Section 5 concludes by addressing likely future
developments in this exciting field of electronic imaging.
2 The electromagnetic spectrum
As discussed above, images are derived from electromagnetic radiation in both visible
and non-visible ranges. Radiation energy travels in space at the speed of light in the
form of sinusoidal waves with known wavelengths. Arranged from shorter to longer
wavelengths, the electromagnetic spectrum provides information on the frequency as
well as the energy distribution of the electromagnetic radiation. Figure 1.1 shows the
electromagnetic spectrum of all electromagnetic waves.
Referring to Figure 1.1, the gamma rays with wavelengths of less than 0.1 nm constitute the shortest wavelengths of the electromagnetic spectrum. Traditionally, gamma
radiation is important for medical and astronomical imaging, leading to the development of various types of anatomical imaging modalities such as CT, MRI, SPECT,
and PET. In CT the radiation is projected onto the target from a diametrically opposed
source, whilst with others it originates from the target – by stimulated emission in the
case of MRI, and through the use of radiopharmaceuticals in SPECT and PET. At
the other end of the spectrum, the longest waves are radio waves, which have wavelengths of many kilometers. The well-known ground-probing radar (GPR) and other
microwave-based imaging modalities operate in this frequency range.
Figure 1.1 The electromagnetic spectrum comprising the visible and non-visible range. (Axis: wavelength in µm; arrows indicate decreasing wavelength, increasing energy, and increasing resolution; radio waves are labeled.)
Located in the middle of the electromagnetic spectrum is the visible range, consisting of a narrow portion of the spectrum with wavelengths ranging from 400 nm
(blue) to 700 nm (red). The popular charge-coupled device or CCD camera operates in
this range.
Infrared (IR) light lies between the visible and microwave portions of the electromagnetic band. As with visible light, infrared has wavelengths that range from near
(shorter) infrared to far (longer) infrared. The latter belongs to the thermally sensitive
region, which makes it useful in imaging applications that rely on the heat signature.
One example of such an imaging device is the indium gallium arsenide (InGaAs)-based
near-infrared (NIR) camera, which gives the optimum response in the 900–1700-nm
band (Deobelin, 1996).
Ultraviolet (UV) light is of shorter wavelength than visible light. Similar to IR, the
UV part of the spectrum can be divided, this time into three regions: near ultraviolet
(NUV) (300 nm), far ultraviolet (FUV) (30 nm), and extreme ultraviolet (EUV)
(3 nm). NUV is closest to the visible band, while EUV is closest to the X-ray region
and therefore is the most energetic of the three types. FUV, meanwhile, lies between
the near and extreme ultraviolet regions, and is the least explored of the three. To date
there are many types of CCD camera that provide sensitivity at the near-UV wavelength
range. The sensitivity of such a camera usually peaks at around 369 nm while offering
coverage down to 300 nm.
Mathematically, the wavelength (λ), the frequency (f), and the energy (E) are related
by Planck’s equation:
\[ E = hf = \frac{hc}{\lambda} \]
where h is Planck’s constant (6.626076 × 10⁻³⁴ J s), and c is the speed of light
(2.998 × 10⁸ m/s). Consequently, the energy increases as the wavelength decreases.
Therefore, gamma rays, which have the shortest wavelengths, have the highest energy
of all the electromagnetic waves. This explains why gamma rays can easily travel
through most objects without being affected. In contrast, radio waves have the longest
wavelength and hence the lowest energy. Therefore, their penetrative power is many
orders of magnitude lower than that of gamma or X-rays. Moreover, both
gamma and X-rays travel in a straight line and their paths are not affected by the
object through which these signals propagate. This is known as the hard-field effect.
Conversely, radio waves do not travel in straight lines and their paths depend strongly
on the medium of propagation. This is the soft-field effect. Both the hard- and soft-field effects have a direct effect on the quality of images produced by these signals.
The soft-field effect causes many undesirable artefacts, most notably image blurring,
and therefore images produced by gamma rays generally appear much clearer than
do images produced by radio waves. Another important attribute that is wavelength-dependent is image resolution. In theory, the image spatial resolution is essentially
limited to half of the interrogating wavelength, and therefore the spatial resolution also
increases as the wavelength decreases. Thus, the resolution of typical gamma rays is
less than 0.05 nm, enabling this type of electromagnetic wave to “see” extremely small
objects such as water molecules. In summary, these attributes, along with the physical
properties of the sensor materials, establish the fundamental limits to the capability of
imaging modalities and their applications.
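The relationships above can be checked numerically. The following is a minimal sketch in Python, using the constants quoted in the text; the example wavelengths are assumed round numbers chosen for illustration, and the function names are hypothetical.

```python
# Constants quoted in the text (SI units).
PLANCK_H = 6.626076e-34   # Planck's constant, J s
LIGHT_C = 2.998e8         # speed of light, m/s


def photon_energy(wavelength_m):
    """Photon energy E = h*c/lambda, in joules."""
    return PLANCK_H * LIGHT_C / wavelength_m


def frequency(wavelength_m):
    """Frequency f = c/lambda, in hertz."""
    return LIGHT_C / wavelength_m


def resolution_limit(wavelength_m):
    """Theoretical spatial resolution limit: half the wavelength."""
    return wavelength_m / 2.0


# Illustrative wavelengths in metres (assumed values, not from the text's figures).
examples = {
    "gamma ray (0.1 nm)": 0.1e-9,
    "blue light (400 nm)": 400e-9,
    "red light (700 nm)": 700e-9,
    "radio wave (1 km)": 1.0e3,
}

for name, lam in examples.items():
    print(f"{name}: E = {photon_energy(lam):.3e} J, "
          f"f = {frequency(lam):.3e} Hz, "
          f"resolution limit = {resolution_limit(lam):.3e} m")
```

Under these assumptions the energy falls, and the resolution limit grows, as the wavelength increases from the gamma-ray entry to the radio-wave entry, which is the trend described above.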
The following sections explain the technology of image acquisition and applications
for all the imaging modalities discussed, focusing on the visible modality or computer
vision system, since this device has extensively been used for solving various food
engineering problems. Moreover, given the progress in computer technology, computer
vision hardware is now relatively inexpensive and easy to use. To date, some personal
computers offer capability for a basic vision system by including a camera and its
interface within the system. However, there are specialized systems for vision, offering
performance in more than one aspect. Naturally, as with any specialized equipment,
such systems can be expensive.
3 Image acquisition systems
In general, images are formed by incident light in the visible spectrum falling on
a partially reflective, partially absorptive surface, with the scattered photons being
gathered up in the camera lens and converted to electrical signals either by vacuum tube
or by CCD. In practice, this is only one of many ways in which images can be generated.
Generally, thermal and ultrasonic methods, X-rays, radiowaves, and other techniques
can all generate an image. This section examines the methods and procedures by which
images are generated for computer vision applications, including tomography.
3.1 Computer vision
The hardware configuration of computer-based machine vision systems is relatively
standard. Typically, a vision system consists of:
• an illumination device, which illuminates the sample under test
• a solid-state CCD array camera, to acquire an image
• a frame-grabber, to perform the A/D (analog-to-digital) conversion of scan lines into picture elements or pixels digitized in an N-row by M-column image
• a personal computer or microprocessor system, to provide disk storage of images and computational capability with vendor-supplied software and specific application programs
• a high-resolution color monitor, which aids in visualizing images and the effects of various image analysis routines.
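The role of the frame-grabber in this list can be illustrated with a short sketch. The Python code below is a simplified model, not a description of any particular frame-grabber: the 0.7-V full-scale level, the 8-bit depth, the nearest-sample picking, and the function names are all assumptions made for illustration.

```python
def digitize_scan_line(voltages, n_columns, v_max=0.7, bits=8):
    """Sample one analog scan line at n_columns points and quantize each sample."""
    levels = 2 ** bits - 1
    pixels = []
    for col in range(n_columns):
        # Nearest-sample pick: map the column index to a position on the line.
        idx = col * len(voltages) // n_columns
        v = min(max(voltages[idx], 0.0), v_max)   # clip to the valid input range
        pixels.append(round(v / v_max * levels))
    return pixels


def grab_frame(scan_lines, n_columns, v_max=0.7, bits=8):
    """Digitize N scan lines into an N-row by M-column image (list of lists)."""
    return [digitize_scan_line(line, n_columns, v_max, bits) for line in scan_lines]


# Two hypothetical scan lines, each described by a handful of voltage samples.
lines = [
    [0.0, 0.1, 0.3, 0.7, 0.9, 0.2],
    [0.7, 0.7, 0.0, 0.0, -0.1, 0.5],
]
frame = grab_frame(lines, n_columns=3)
print(frame)
```

Each row of the result corresponds to one scan line and each entry to one pixel, with values restricted to the range 0 to 255 under the assumed 8-bit depth.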
Figure 1.2 shows a typical set-up of the kind an investigator needs to start experimenting
with machine vision applications. All essential components are commercially available,
and the price for the elementary system can be as low as £2000.00.
The set-up shown in Figure 1.2 is an example of a computer vision system that can
be found in many food laboratories, mainly for research and imaging applications. In
this case, the objective is ultimately to free human inspectors from undertaking tedious,
laborious, time-consuming, and repetitive inspection tasks, allowing them to focus on
more demanding and skilled jobs. Computer vision technology not only provides a
high level of flexibility and repeatability at a relatively low cost, but also, and more
importantly, it permits fairly high plant throughput without compromising accuracy.
The food industry continues to be among the fastest-growing segments of machine
vision application, and it ranks among the top ten industries that use machine vision
systems (Gunasekaran, 1996). Currently, several commercial vendors offer automatic
vision-based quality evaluation for the food industry.
Figure 1.2 Essential elements of a typical computer vision system. (Labels in the figure include the BNC cable, the sample under test, and the color frame-grabber.)
Even though machine vision systems have become increasingly simple to use,
the applications themselves can still be extremely complicated. A developer needs
to know precisely what must be achieved in order to ensure successful implementation
of a machine vision application. Key characteristics include not only the specific part
dimensions and part tolerances, but also the level of measurement precision required
and the speed of the production line. Virtually all manufacturing processes will produce some degree of variability and, while the best machine vision technology is robust
enough to compensate automatically for minor differences over time, the applications
themselves need to take major changes into account. Additional complexity arises for
companies with complex lighting and optical strategies, or unusual materials-handling
logistics. For these reasons, it is essential to understand the characteristics of the part and
sub-assemblies of the machine system, as well as the specifications of the production
line itself.
3.1.1 Illumination
The provision of correct and high-quality illumination, in many vision applications, is
absolutely decisive. Despite the advances of machine vision hardware and electronics,
lighting for machine vision remains an art for those involved in vision integration.
Engineers and machine vision practitioners have long recognized lighting as being an
important piece of the machine vision system. However, choosing the right lighting
strategy remains a difficult problem because there is no specific guideline for integrating lighting into machine vision applications. In spite of this, some rules of thumb
exist. In general, three areas of knowledge are required to ensure a successful level of
lighting for the vision task:
1. Understanding of the role of the lighting component in machine vision
2. Knowledge of the behavior of light on a given surface
3. Understanding of the basic lighting techniques available that will allow the light
to create the desired feature extraction.
In the vast majority of machine vision applications, image acquisition deals with
reflected light, even though the use of backlit techniques can still be found. Therefore,
the most important aspect of lighting is to understand what happens when light hits
the surface – more specifically, to know how to control the reflection so that the image
appears of a reasonably good quality.
Another major area of concern is the choice of illuminant, as this is instrumental in
the capability of any form of machine vision to represent the image accurately. This
is due to the fact that the sensor response of a standard imaging device is given by a
spectral integration process (Matas et al., 1995). Mathematically,
\[ p_k^x = \int_{\lambda} \rho_k(\lambda)\, L(\lambda)\, d\lambda \]
where p_k^x is the response of the kth sensor at location x of the sensor array, ρk(λ) is
the responsivity function of the kth sensor, and L(λ) is the light reflected from the
surface that is projected on pixel x. For a CCD camera the stimulus L(λ) is the product
of the spectral power distribution S(λ) of the light that illuminates the object, and the
spectral reflectance C(λ) of the object surface, i.e.
\[ L(\lambda) = S(\lambda)\, C(\lambda) \]
Hence, two different illuminants, S1 (λ) and S2 (λ), may yield different stimuli using
the same camera. Therefore, the illuminant is an important factor that must be taken
into account when considering machine vision integration. Frequently, knowledgeable
selection of an illuminant is necessary for specific vision applications.
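The dependence of the sensor response on the illuminant can be illustrated numerically. The sketch below, in Python, approximates the spectral integral with the trapezoidal rule, taking the stimulus as L(λ) = S(λ)C(λ). All spectra are made-up, coarsely sampled values chosen only for illustration, and the function names are hypothetical.

```python
def trapezoid(ys, xs):
    """Trapezoidal-rule integral of samples ys taken at abscissae xs."""
    return sum((xs[i + 1] - xs[i]) * (ys[i] + ys[i + 1]) / 2.0
               for i in range(len(xs) - 1))


def sensor_response(wavelengths, responsivity, illuminant, reflectance):
    """Approximate the integral of rho(l) * S(l) * C(l) over wavelength."""
    stimulus = [s * c for s, c in zip(illuminant, reflectance)]     # L = S * C
    integrand = [r * l for r, l in zip(responsivity, stimulus)]     # rho * L
    return trapezoid(integrand, wavelengths)


# Hypothetical spectra sampled every 100 nm (illustration only).
wavelengths = [400, 500, 600, 700]     # nm
rho_red = [0.0, 0.1, 0.8, 1.0]         # assumed responsivity of a red-sensitive sensor
reflectance = [0.2, 0.3, 0.6, 0.7]     # assumed spectral reflectance C
s1 = [1.0, 1.0, 1.0, 1.0]              # flat illuminant S1
s2 = [0.2, 0.5, 1.0, 1.4]              # illuminant S2, richer in long wavelengths

p1 = sensor_response(wavelengths, rho_red, s1, reflectance)
p2 = sensor_response(wavelengths, rho_red, s2, reflectance)
print(f"response under S1: {p1:.1f}, response under S2: {p2:.1f}")
```

With the same sensor and the same reflectance, the two assumed illuminants give different responses, which is why the choice of illuminant has to be considered during vision integration.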
Traditionally, the two most common illuminants are fluorescent and incandescent
bulbs, even though other light sources (such as light-emitting diodes (LEDs) and electroluminescent sources) are also useful. Figure 1.3 shows the spectral distributions of
three different light sources: the sun, an incandescent bulb, and standard cool white
fluorescent light. Referring to Figure 1.3, the only difference between daylight and
electric light is the amount of energy emitted at each wavelength. Although the
light energy itself is fundamentally the same, the optimum light source will have
more intensity than the other sources. When the light is not as intense as it should be,
three possible damaging effects occur:
1. There may not be sufficient signal-to-noise ratio at the camera
2. The electrical noise tends to increase as the light gets dimmer and less intense
3. Most importantly, a less intense light will cause a significant loss in the camera
Additionally, effects from ambient light are more likely to occur under poor lighting
Figure 1.3 Comparison in relative spectral energy distribution between daylight, incandescent, and cool white fluorescent light (Stiles and Wyszecki, 2000). (Axes: normalized spectral power versus wavelength in nm.)
Referring to Figure 1.3 again, it can be seen that the incandescent source has a fairly
normal distribution over the visible spectrum while the fluorescent source has sharp
peaks in some regions. This means that objects under an incandescent source produce
an image with a much lower signal-to-noise ratio. This is not acceptable in some
cases, especially those that are concerned with color-image processing (Daley et al.,
1993). In contrast, fluorescent bulbs are inherently more efficient, and produce more
intense illumination at specific wavelengths. Moreover, fluorescent light provides a
more uniform dispersion of light from the emitting surface, and hence does not require
the use of diffusing optics to disseminate the light source over the field of view, as
is the case with incandescent bulbs. For these reasons, a fluorescent bulb, particularly
the cool white type, is a popular choice for many machine vision practitioners (Tao
et al., 1995; Abdullah et al., 2001, 2005; Pedreschi et al., 2006). However, care must
be taken when using the fluorescent light, as this source is normally AC driven. The
50-Hz fluorescent bulb usually introduces artefacts in the image resulting from the
oversampling of the analog-to-digital converter. In order to reduce flickering, high-frequency fluorescent bulbs, operating at a frequency in the range of a few tens of
kilohertz, are preferred rather than low-frequency ones.
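The benefit of a high-frequency lamp can be illustrated with a simplified exposure model. The Python sketch below is not the sampling mechanism described above; it assumes instead that the lamp intensity varies as a cosine ripple, that a lamp on 50-Hz mains ripples at twice the mains frequency (100 Hz), and that the camera averages the light over a 1-ms exposure at 30 frames per second. The modulation depth, frame rate, exposure time, and function names are all hypothetical.

```python
import math


def mean_intensity(t_start, exposure, flicker_hz, depth=0.5):
    """Mean of I(t) = 1 + depth*cos(2*pi*f*t) over one exposure window."""
    w = 2.0 * math.pi * flicker_hz
    # Closed-form average of the cosine ripple over [t_start, t_start + exposure].
    ripple = (math.sin(w * (t_start + exposure)) - math.sin(w * t_start)) / (w * exposure)
    return 1.0 + depth * ripple


def brightness_spread(flicker_hz, fps=30.0, exposure=1e-3, n_frames=60):
    """Difference between the brightest and darkest frame in a sequence."""
    values = [mean_intensity(k / fps, exposure, flicker_hz) for k in range(n_frames)]
    return max(values) - min(values)


low = brightness_spread(100.0)    # assumed ripple of a lamp on 50-Hz mains
high = brightness_spread(30e3)    # assumed high-frequency lamp, tens of kilohertz
print(f"frame-to-frame spread at 100 Hz: {low:.4f}")
print(f"frame-to-frame spread at 30 kHz: {high:.4f}")
```

Under these assumptions the frame brightness varies strongly from frame to frame with the low-frequency lamp, whereas the high-frequency ripple averages out within each exposure.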
Apart from the illuminant, the surface geometry is also important in the illumination
design. The key factor is to determine whether the surface is specular or diffuse.
Light striking a diffuse surface is scattered because of the multitude of surface angles.
In comparison, light striking a glossy surface is reflected at the angle of incidence.
Therefore, the position of an illuminant is very important in order to achieve high
contrast in an image. There are two common geometries for the illuminators: the ring
illuminator and the diffuse illuminator (see Figure 1.4).
The ring illuminator has the simplest geometry and is intended for general-purpose
applications, especially for imaging flat surfaces. The diffuse illuminator,
meanwhile, delivers virtually 180° of diffuse illumination, and is used for imaging
challenging reflective objects. Since most food products are basically 3D objects,
the diffuse illuminator is well suited for this imaging application. However, there
has been some success in using the ring illuminator to solve lighting problems in
food engineering. For instance, a ring illuminator together with a 90-kHz ultra-high-frequency fluorescent bulb has been found to be effective in the color- and shape-grading of star fruits (Abdullah et al., 2005). In an attempt to produce uniform lighting,
Paulsen (1990) mounted a ring light in a cylindrically-shaped diffuse lighting chamber.
Figure 1.4 Two possible lighting geometries: (a) the ring illuminator; (b) the diffuse illuminator.
Such a set-up is extremely useful for visual inspection of grains and oilseed, with the
success rate reaching almost 100 percent.
In spite of the general utility of the ring illuminator, however, the majority of
machine vision applications are based on the diffuse illuminator. Heinemann et al.
(1994) employed this type of illumination system for the shape-grading of mushrooms. The same system was investigated by Steinmetz et al. (1996) in the quality
grading of melons. Both groups of authors have reported successful application of
machine vision, with a grading accuracy that exceeds 95 percent. There are many other
applications involving the diffuse illuminator and computer vision integration. Batchelor
(1985) reviewed some of the important factors to be considered when designing a good
illumination system.
3.1.2 Electronics
Capturing the image electronically is the first step in digital image processing. Two
key elements are responsible for this: the camera and the frame-grabber. The camera
converts photons to electrical signals, and the frame-grabber then digitizes these signals
to give a stream of data or bitmap image. There are many types of camera, ranging
from the older pick-up tubes such as the vidicons to the most recent solid-state imaging
devices, such as the Complementary Metal Oxide Semiconductor (CMOS) cameras. Solid-state
imaging is now the dominant technology for cameras, and it revolutionized the science of imaging
with the invention of the charge-coupled device (CCD) in 1970. As CCD cameras have less noise, higher
sensitivity and a greater dynamic range, they have also become the device of choice
for a wide variety of food engineering applications.
In general, the CCD sensor comprises a photosensitive diode and a capacitor connected in parallel. There are two different modes in which the sensor can be operated:
passive and active. Figure 1.5 shows the details of the schematics.
Referring to Figure 1.5, the photodiode converts light into electrical charges, which
are then stored in the capacitor. The charges are proportional to the light intensity. In
passive mode, these charges are transferred to a bus line when the “select” signal is
activated. In the active mode, charges are first amplified before being transferred to
a bus line, thus compensating for the limited fill factor of the photodiode. An additional
“reset” signal allows the capacitor to be discharged when an image is rescanned.
Figure 1.5 Sensor operation in (a) passive mode and (b) active mode.
Figure 1.6 Three possible CCD architectures: (a) linear, (b) interline, and (c) frame-transfer.
Depending on the sensing applications, CCD imagers come in various designs. The
simplest form is the linear CCD scanner, which is shown schematically in Figure 1.6a.
This design is used mostly in office scanner machines. It consists of a single row of
photodiodes, which capture the photons. The sensors are lined up adjacent to a CCD
shift register, which does the readout. The picture or document to be scanned is moved,
one line at a time, across the scanner by mechanical or optical means. Figures 1.6b and
1.6c show two-dimensional CCD area arrays, which are mostly associated with modern
digital cameras. The circuit in Figure 1.6b portrays the interline CCD architecture, while
Figure 1.6c shows that of a frame-transfer imager.
Basically, the interline CCD comprises a stack of vertical linear scanners connected by an additional, horizontal shift register that collects and passes on the charge
readout from the linear scanners, row by row. In the case of frame-transfer architecture, the
CCD elements, the entire surfaces of which are covered by photosensitive devices, form
the photo-sensing area. It can be seen from Figure 1.6c that the frame-transfer design
comprises integration and storage areas, forming the integration and storage frames,
respectively. The integration-frame array captures an image and transfers the charge to the
adjacent storage-frame array. In this way, the integration array can capture a new image
while the storage array reads the previous image. Both interline and frame-transfer
architectures are suitable for capturing motion images, whilst the linear scanner is best
suited for scanning still pictures. Full-frame CCD cameras with four million pixels
and a frame rate of more than 30 frames per second (fps) are now commercially available. Modern CCD cameras come with analog outputs, digital outputs, or both. The analog
signals conform to the European CCIR (Comité Consultatif International des
Radiocommunications) or US RS170 video standards. In spite of a reduced dynamic
range, analog cameras work well for slower applications (<20 MHz). For high-speed
applications, digital cameras are preferred. These cameras have internal digitization
circuitry, and usually produce a parallel digital output. Typically, data are output on an
8- to 32-bit wide parallel bus, at clocking rates of up to 40 MHz, using the RS422 or
RS644 international video standards. For the purpose of camera control and PC interfacing, analog cameras require an analog frame-grabber. Likewise, digital cameras
require a digital frame-grabber.
Figure 1.7 General structure of a frame-grabber card, showing some important elements.
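As a rough check on the figures quoted above, the following Python sketch estimates the raw data rate of a four-million-pixel camera at 30 fps and compares it with the peak throughput of a parallel bus. The 8-bit pixel depth and the 33-MHz clock used for the 32-bit PCI bus are assumptions made here for illustration; they are not stated in the text.

```python
def camera_data_rate(pixels, fps, bits_per_pixel):
    """Raw (uncompressed) camera data rate in bytes per second."""
    return pixels * fps * bits_per_pixel / 8


def bus_throughput(width_bits, clock_hz):
    """Peak throughput of a parallel bus in bytes per second."""
    return width_bits * clock_hz / 8


# Four million pixels at 30 fps; 8 bits per pixel is an assumed depth.
rate = camera_data_rate(pixels=4_000_000, fps=30, bits_per_pixel=8)

# 32-bit camera bus clocked at 40 MHz (upper figures quoted in the text).
camera_bus = bus_throughput(width_bits=32, clock_hz=40e6)

# 32-bit PCI bus; the 33-MHz clock is an assumption, not from the text.
pci_bus = bus_throughput(width_bits=32, clock_hz=33e6)

print(f"camera:     {rate / 1e6:.0f} MB/s")        # camera:     120 MB/s
print(f"camera bus: {camera_bus / 1e6:.0f} MB/s")  # camera bus: 160 MB/s
print(f"PCI bus:    {pci_bus / 1e6:.0f} MB/s")     # PCI bus:    132 MB/s
```

Under these assumptions the camera output sits close to the peak capacity of the host bus, which suggests why speed requirements appear among the criteria for choosing a frame-grabber.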
Generally, the frame-grabber comprises signal-conditioning elements, an A/D converter, a look-up table, an image buffer and a PCI bus interface. Figure 1.7 illustrates
some of the basic elements of a typical frame-grabber card, although it must be borne in
mind that the internal working circuitry of modern and state-of-the-art frame-grabbers
is more complex than in the one illustrated. Nevertheless, the basic elements remain
relatively the same from one manufacturer to another. The latest digital frame-grabbers
may feature sophisticated on-board camera control modules and on-the-fly data
resequencing capabilities, which are intended for high-speed applications. In addition
to the PCI bus, some frame-grabbers feature PC/104 capability and FireWire plug-and-play
connectivity. In summary, the three criteria in deciding on the frame-grabber are:
1. The choice of camera
2. Speed requirements
3. Computer choice.
3.2 Ultrasound
In addition to computer vision, there is growing interest in using ultrasound for food
quality evaluation. One reason for this is that changes in acoustic properties can
be related to density changes in the food product (McClements, 1995). Furthermore,
ultrasound has the ability to differentiate between the propagation velocity within various media, and the differences in acoustic impedance between different regions within
a given volume. In the past, ultrasound has been used for measuring the moisture
content of food products (Steele, 1974), predicting the intramuscular fat content of
bovine products (Morlein et al., 2005), and studying and evaluating the turgidity and
hydration of orange peel (Camarena and Martínez-Mora, 2005). In addition to these,
one of the most widespread and promising ultrasonic applications is for composition
measurement. Recent studies have shown that ultrasonic velocity measurements can
accurately be used to predict the fat, water, protein, and other chemical compositions
of meat-based products (Simal et al., 2003).
The information provided by ultrasound originates from the reflection or transmission of sound waves emitted by an external source. Refraction, absorption, and
scattering also play a role, but mainly as factors that degrade the ultrasonic measurements. A typical source/detector unit is based on a piezoelectric crystal resonating
between 1 MHz and 10 MHz. The basic physical parameters of importance are the
frequency of the wave and the acoustical impedance of the object through which the
sound wave travels. The acoustical impedance Z is itself a function of the speed of
sound v and the density ρ of the object, and the relationship Z = ρv holds. When an
ultrasonic plane wave propagates in a medium with Z = Z1 and is incident normally at
an interface with Z = Z2, the intensity reflection and transmission coefficients (αr and
αt) are given in terms of the incident (pi), reflected (pr), and transmitted (pt) sound
pressures (Wells, 1969). Mathematically,
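$$\alpha_r = \left(\frac{p_r}{p_i}\right)^2 = \left(\frac{Z_2 - Z_1}{Z_2 + Z_1}\right)^2 \tag{1.4}$$
$$\alpha_t = \left(\frac{p_t}{p_i}\right)^2 = \frac{4 Z_1 Z_2}{(Z_2 + Z_1)^2} \tag{1.5}$$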
Clearly, from equations (1.4) and (1.5), the greater the difference in impedance at the
interface, the greater the amount of energy reflected at the interface will be. Conversely, if the
impedance is similar, most of the energy is transmitted. Therefore, it may be deduced
that reflections may provide the basis for digital imaging in the same way that a computer vision system generates images. McClements (1995) published useful data on
impedance and the interaction processes for a wide range of human tissues, including
materials of interest in food technology. For instance, the density of human muscle is
typically 1000 kg/m³ and the speed of sound within muscle is about 1500 m/s. In this
case αr at a muscle–fat interface is approximately 1 percent, and this value increases
sharply to 99.9 percent at a skin–air interface. Therefore, this technique generally
requires a coupling medium between the test sample and the transducer surface. Couplants such as water, gel, and oil are routinely used for particular inspection situations;
however, it must be stressed that the use of a couplant may not always be suitable for
some specialized applications, especially when the material property might change or
contamination damage might result. Gan et al. (2005) have attempted to use a non-contact-based ultrasound inspection technique for evaluating food properties. They used a pair
of capacitive devices in air to deliver high ultrasonic energy, with the application of
a pulse-compression technique for signal recovery and analysis. The use of such a
processing technique is necessary to recover signals buried in noise, as a result of the large
acoustic impedance mismatch between air and the sample. They experimented
with this technique in monitoring palm-oil crystallization, employing a chirp signal
with a center frequency of 700 kHz and a bandwidth of 600 kHz. From the results, they
deduced that contact and non-contact measurements are well correlated.
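The dependence of reflection on impedance mismatch can be illustrated with a short Python sketch of the impedance forms of equations (1.4) and (1.5). The muscle values are those quoted in the text; the density and sound speed used for fat and air are assumed, order-of-magnitude values chosen only for illustration.

```python
def acoustic_impedance(density_kg_m3, speed_m_s):
    """Acoustic impedance Z = rho * v."""
    return density_kg_m3 * speed_m_s


def reflection_coefficient(z1, z2):
    """Intensity reflection coefficient at normal incidence, equation (1.4)."""
    return ((z2 - z1) / (z2 + z1)) ** 2


def transmission_coefficient(z1, z2):
    """Intensity transmission coefficient at normal incidence, equation (1.5)."""
    return 4.0 * z1 * z2 / (z2 + z1) ** 2


z_muscle = acoustic_impedance(1000.0, 1500.0)  # values quoted in the text
z_fat = acoustic_impedance(920.0, 1450.0)      # assumed illustrative values
z_air = acoustic_impedance(1.2, 343.0)         # assumed illustrative values

for name, z2 in (("muscle-fat", z_fat), ("muscle-air", z_air)):
    r = reflection_coefficient(z_muscle, z2)
    t = transmission_coefficient(z_muscle, z2)
    print(f"{name}: reflected {100 * r:.2f} %, transmitted {100 * t:.2f} %")
```

With these assumed values, the tissue interface reflects well under a few percent of the incident energy, whereas the interface with air reflects about 99.9 percent, which is why a couplant is normally needed.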
The majority of the studies found in the literature rely on the measurement of ultrasonic velocity, because this is the simplest and most reliable ultrasonic measurement.
It utilizes an impulse (or sequence of discrete impulses) transmitted into the medium,
and the resulting interactions provide raw data for imaging. Here, the transduction can
be sensed in two ways – either from the same viewpoint by the partial reflection of the
ultrasonic energy; or from an opposing viewpoint by transmission, where the energy is
partially attenuated. Whichever is used, the time-of-flight (TOF) and hence the velocity may be estimated from both reflection and transmission sensor data. Figure 1.8
demonstrates the general concept.
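The relationship between time-of-flight and velocity can be sketched as follows. The function names and the sample dimensions are hypothetical; the only physics assumed is that a transmitted pulse crosses the path once, whereas an echo travels to the reflector and back.

```python
def velocity_from_transmission(path_length_m, tof_s):
    """Through-transmission: the pulse crosses the sample once."""
    return path_length_m / tof_s


def velocity_from_echo(reflector_depth_m, tof_s):
    """Pulse-echo: the pulse travels to the reflector and back."""
    return 2.0 * reflector_depth_m / tof_s


# Hypothetical example: 30-mm path, and a reflector 15 mm deep,
# both with a measured time-of-flight of 20 microseconds.
v_through = velocity_from_transmission(0.030, 20e-6)
v_echo = velocity_from_echo(0.015, 20e-6)
print(f"{v_through:.0f} m/s, {v_echo:.0f} m/s")  # 1500 m/s, 1500 m/s
```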
In Figure 1.8a, A represents the composite ultrasonic transducer acting as both the
transmitter (Tx) and the receiver (Rx) probe; B represents the ultrasonic receiver probe.
The interposing area between transducer probes consists of a homogeneous fluid of
density distribution f1(x, y). Within the sample are two inhomogeneities, having densities of g1(x, y) and g2(x, y), in which g2(x, y) > g1(x, y) > f1(x, y). The interactions
between ultrasound and the sample can be explained as follows. Probe A transmits the
ultrasonic wave, which travels in a straight line until it reaches the f1(x, y) and g1(x, y)
interface, which causes reflection. This is detected by the same probe, which now acts
as a receiver. The amplified signals are fed into the y-plates of the oscilloscope, and
a timebase is provided, synchronized to the transmitter pulse. Some of the energy,
however, continues to travel until it reaches the f1(x, y) and g2(x, y) interface, where
some energy is again reflected and hence detected by A. In similar fashion, some of the
remaining energy continues to travel until it reaches probe B, where it is again detected
and measured. Consequently, probe A provides a graph, detailing the echo signal,
in which the height corresponds to the size of the inhomogeneity and the timebase
provides its range or depth. Such a pattern is known as an A-scan (see Figure 1.8b).
Figure 1.8 Ultrasonic measuring system showing (a) essential elements, (b) reflection, and (c) transmission measurements.
Figure 1.8c shows the attenuated transmitted energy as observed by probe B. Both
graphs show that information relating to the amplitude of both the transmitted and
the reflected pulses can be measured, and this can also be used for imaging. As
shown in Figure 1.8, the signals are usually rectified and filtered to present a
simple one-dimensional picture, and the timebase can be delayed to allow for a
couplant gap.
To provide a full two-dimensional image, the ultrasonic probe must be moved over
the surface of the sample. The Tx/Rx probe is connected via mechanical linkages to
position transducers, which measure its x and y coordinates and its orientation. In
this case, the output signals determine the origin and direction of the probe, while
the amplitude of the echo determines the spot brightness. As the probe is rotated and
moved over the sample, an image is built and retained in a digital store. This procedure
is known as a B-scan, and produces a “slice” through a sample, normal to the surface.
In contrast, a C-scan produces an image of a “slice” parallel to the surface. In order to
produce the C-scan image, the ultrasonic probe must again be moved but this time over
the volume of the sample. The time, together with the x and y coordinates of the image
displayed, represents the lateral movement of the beam across the plane. By time-gating
the echo signals, only those from the chosen depth are allowed to brighten the image.
C-scan images may be produced using the same equipment as for B-scanning. Most
of the studies in the literature rely on the use of A-scan or B-scan methods, probably
because the C-scan image does not provide any additional information which is useful
for further characterization.
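The time-gating step used to form a C-scan can be sketched in Python as below. This is a minimal illustration, not a description of any particular instrument: it assumes that an A-scan trace has already been recorded at each (x, y) probe position, and the sampling rate, sound speed, and gate depths are hypothetical values.

```python
def depth_to_sample(depth_m, velocity_m_s, sample_rate_hz):
    """Sample index at which an echo from depth_m arrives (round trip)."""
    return round(2.0 * depth_m / velocity_m_s * sample_rate_hz)


def c_scan(a_scans, velocity_m_s, sample_rate_hz, depth_min_m, depth_max_m):
    """Keep only echoes from the gated depth range; one pixel per A-scan."""
    start = depth_to_sample(depth_min_m, velocity_m_s, sample_rate_hz)
    stop = depth_to_sample(depth_max_m, velocity_m_s, sample_rate_hz) + 1
    return [[max((abs(s) for s in trace[start:stop]), default=0.0)
             for trace in row]
            for row in a_scans]


def make_trace(echo_index, length=100):
    """Synthetic A-scan with a single unit echo (hypothetical data)."""
    trace = [0.0] * length
    trace[echo_index] = 1.0
    return trace


# 2 x 2 grid of probe positions; at 10 MHz sampling and 1500 m/s,
# sample 40 corresponds to a depth of 3 mm and sample 80 to 6 mm.
a_scans = [[make_trace(40), make_trace(80)],
           [make_trace(80), make_trace(80)]]

image = c_scan(a_scans, 1500.0, 10e6, depth_min_m=0.0025, depth_max_m=0.0035)
print(image)  # [[1.0, 0.0], [0.0, 0.0]]
```

Only the reflector lying inside the 2.5–3.5 mm gate brightens the image; the deeper echoes are suppressed.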
Regardless of the methods, the ultrasound images generally share at least three
common drawbacks:
1. Low image spatial resolution – typically of a few millimeters
2. Low signal-to-noise ratio
3. Many artefacts.
The first of these is related to the wavelength and hence frequency of ultrasound,
which typically ranges from 2 to 10 MHz. In order to improve the resolution, some
ultrasound devices operate at frequencies higher than this; however, such devices must
be used with care because attenuation increases, and penetration depth therefore decreases, with increasing frequency. The
two factors therefore have to be balanced against each other. The second and third drawbacks
are due to the coherent nature of the sound wave and the physics of reflection. Any
coherent pulse will interfere with its reflected, refracted, and transmitted components,
giving rise to speckle, similar to the speckle observed in laser light (Fishbane et al.,
1996). On the other hand, reflection occurs when the surface has a normal component
parallel to the direction of the incident wave. Interfaces between materials that are
parallel to the wave will not reflect the wave, and are therefore not seen in ultrasound
images; such parallel interfaces form a “hole” in the ultrasound image. Despite these
drawbacks, the technique is safe and relatively inexpensive. Current research aims to
eliminate artefacts, improve image contrast, and simplify the presentation of
data, and many efforts are being directed towards three-dimensional data acquisition
and image representation.
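The link between frequency and resolution can be made concrete with the relation wavelength = velocity / frequency. The sketch below assumes a sound speed of 1500 m/s, a typical value for water-rich material; the wavelength sets only a lower bound on the achievable resolution, and practical systems resolve rather less.

```python
def wavelength_mm(velocity_m_s, frequency_hz):
    """Acoustic wavelength in millimeters."""
    return velocity_m_s / frequency_hz * 1e3


# Assumed sound speed of 1500 m/s across the 2-10 MHz range.
for f_mhz in (2, 5, 10):
    print(f"{f_mhz} MHz: {wavelength_mm(1500.0, f_mhz * 1e6):.2f} mm")
# 2 MHz: 0.75 mm
# 5 MHz: 0.30 mm
# 10 MHz: 0.15 mm
```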
3.3 Infrared
When both computer vision and ultrasound systems fail to produce the desired images,
food engineers and technologists could presumably resort to the use of much longer
wavelengths for image acquisition. In the region of 700–1000 nm lies the infrared
(IR) range, and the technique responsible for generating images with infrared light is
known as thermographic photography. Thermographic imaging is based on the simple
fact that all objects emit a certain amount of thermal radiation as a function of their
temperature. Generally, the higher the temperature of the object, the more IR radiation
it emits. A specially built camera, known as an IR camera, can detect this radiation
in a way similar to that employed in an ordinary camera for visible light. However,
unlike computer vision, thermal imaging does not require an illumination source for
spectral reflectance, which can be affected by the varied surface color of a target or by
the illumination set-up.
Thermographic signatures of food are very different for different materials, and hence
IR imaging has found many applications in the food industry, such
as the identification of foreign bodies in food products (Ginesu et al., 2004). Moreover,
many major physiological properties of foodstuffs (firmness, soluble-solid content, and
acidity) appear to be highly correlated with IR signals, implying that image analysis
of IR thermography is suitable for quality evaluation and shelf-life determination of
a number of fruit and vegetable products (Gómez et al., 2005). Therefore, thermal
imaging offers a potential alternative technology for non-destructive and non-contact
image-sensing applications. Good thermographic images can be obtained by leaving
the object at rest below the IR camera, applying a heat pulse produced by a flashlight, and monitoring the decreasing temperature as a function of time. Because of
different thermal capacities or heat conductivities, the objects will cool down at different speeds; therefore, the thermal conductivity of an object can be measured by
the decreasing temperature calculated from a sequence of IR images. Using these relatively straightforward procedures, Ginesu et al. (2004) performed experiments on
objects with different thermal properties, aiming to simulate foreign-body contamination in real experiments. Both the long (500-fps) and short (80-fps) sequence modes
were used to record the images, enabling the radiation patterns of objects with low
and high thermal capacities, respectively, to be monitored. Temperature data were presented in terms of average gray levels computed from 10 × 10 image pixels in the
neighborhood of each object. Figure 1.9 shows the results.
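The way temperature data are reduced to average gray levels can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the procedure of Ginesu et al. (2004): frames are represented as nested lists of gray values, and the synthetic frames and function names are hypothetical.

```python
def mean_gray(frame, row, col, size=10):
    """Average gray level in a size x size window centred near (row, col)."""
    half = size // 2
    top, left = max(0, row - half), max(0, col - half)
    window = [value
              for line in frame[top:top + size]
              for value in line[left:left + size]]
    return sum(window) / len(window)


def cooling_curve(frames, row, col, size=10):
    """Average gray level of one neighborhood across a sequence of IR frames."""
    return [mean_gray(frame, row, col, size) for frame in frames]


# Synthetic 20 x 20 frames (hypothetical data): a uniform scene that cools.
frames = [[[level] * 20 for _ in range(20)] for level in (200, 150, 100)]
curve = cooling_curve(frames, row=10, col=10)
print(curve)  # [200.0, 150.0, 100.0]
```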
It can be seen from Figure 1.9 that the cardboard and the wooden stick behave quite
differently from other materials, as they appear to be much hotter at the beginning but
decrease in temperature rather quickly. This is due to the fact that these materials are
dry and light, whereas foods contain a large quantity of water, which heats up more
slowly and reaches lower temperatures, thus maintaining the heat for a longer time
and cooling down slowly. By plotting and analyzing the absolute differences between
radiation curves of different materials, it is possible to distinguish between food and
foreign objects.
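A minimal sketch of this comparison is given below. The gray-level curves and the threshold are invented for illustration; in practice both would come from measured sequences and from calibration on known samples.

```python
def absolute_difference(curve_a, curve_b):
    """Point-by-point absolute difference between two cooling curves."""
    return [abs(a - b) for a, b in zip(curve_a, curve_b)]


def looks_foreign(curve, food_reference, threshold):
    """Flag a curve whose largest deviation from the food curve is too big."""
    return max(absolute_difference(curve, food_reference)) > threshold


# Hypothetical average gray levels sampled at the same instants.
food = [120, 118, 116, 115, 114]
cardboard = [200, 160, 130, 112, 100]  # hotter at first, then cools quickly

print(absolute_difference(cardboard, food))          # [80, 42, 14, 3, 14]
print(looks_foreign(cardboard, food, threshold=30))  # True
```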
Figure 1.9 Decreasing temperature curves of different materials plotted as a function of time (Ginesu et al., 2004).
Theoretically, the existence of such unique thermal signatures of different materials is due to the concept of a black body, defined as an object that does not reflect
any radiation. Planck’s law describes the radiation emission from a black body as
(Gaussorgues, 1994):
$$R(\lambda, \theta) = \frac{2\pi h c^2 \lambda^{-5}}{\exp\left(\dfrac{hc}{\lambda \sigma \theta}\right) - 1} \tag{1.6}$$
where h = 6.6256 × 10⁻³⁴ J s is Planck’s constant, σ = 1.38054 × 10⁻²³ J/K is
Boltzmann’s constant, c = 2.998 × 10⁸ m/s is the speed of light, θ is the absolute temperature in kelvin, and λ is again the wavelength. Usually objects
are not black bodies, and consequently the above law does not apply without certain
corrections. Non-black bodies absorb a fraction A, reflect a fraction R, and transmit
a fraction T . These fractions are selective, depending on the wavelength and on the
angle of incident radiation. By introducing the spectral emissivity ε(λ) to balance the
absorbance, it can be found that:
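$$A(\lambda) = \varepsilon(\lambda)$$
$$\varepsilon(\lambda) + R(\lambda) + T(\lambda) = 1$$
Using these corrections, equation (1.6) can be simplified, yielding:
$$R(\lambda, \theta) = \varepsilon(\lambda)\, R_{\mathrm{blackbody}}(\lambda, \theta)$$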
This means that the emission coefficient ε(λ) relates the ideal radiation of a black body
to that of real non-black bodies. In summary, an ideal black body is a material that is a
perfect emitter of heat energy, and therefore has an emissivity value equal to unity. In
contrast, a material with zero emissivity would be considered a perfect thermal mirror. However, most real bodies, including food objects, show wavelength-dependent
emissivities. Since emissivity varies with material, this parameter is an important factor in thermographic image formation. For accurate measurement of temperature, the
emissivity should be provided manually to the camera for its inclusion in the temperature calculation. The function that describes the thermographic image f(x, y) can be
expressed as follows:
f(x, y) = f[θ(x, y), ε(x, y)]
where x and y are the coordinates of individual image pixels, θ(x, y) is the temperature
of the target at image coordinates (x, y), and ε(x, y) is the emissivity of the sample also
at coordinates (x, y). From the computer vision viewpoint, thermographic images are
a function of two variables: the temperature and emissivity variables. Contrast in thermographic images may be the result of either different temperatures of different objects
on the scene, or different emissivities of different objects with the same temperature.
It can also be the combination of both temperature and emissivity variations.
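The role of temperature and emissivity in image contrast can be illustrated with a short Python sketch of Planck’s law scaled by emissivity. The constants are those quoted in the text (with the speed of light taken as 2.998 × 10⁸ m/s); the wavelength, temperatures, and emissivity values in the example are hypothetical.

```python
import math

PLANCK = 6.6256e-34      # Planck's constant h, J s (value quoted in the text)
BOLTZMANN = 1.38054e-23  # Boltzmann's constant, J/K (sigma in the text)
LIGHT_SPEED = 2.998e8    # speed of light c, m/s


def blackbody_emittance(wavelength_m, temperature_k):
    """Spectral emittance of a black body from Planck's law, W/m^3."""
    exponent = PLANCK * LIGHT_SPEED / (wavelength_m * BOLTZMANN * temperature_k)
    numerator = 2.0 * math.pi * PLANCK * LIGHT_SPEED ** 2 * wavelength_m ** -5
    return numerator / math.expm1(exponent)


def real_body_emittance(wavelength_m, temperature_k, emissivity):
    """Black-body emittance scaled by the spectral emissivity."""
    return emissivity * blackbody_emittance(wavelength_m, temperature_k)


wavelength = 10e-6  # 10 micrometres, within the 8-12 micrometre band

# Contrast from temperature: same emissivity, different temperatures.
warm = real_body_emittance(wavelength, 310.0, emissivity=0.95)
cool = real_body_emittance(wavelength, 300.0, emissivity=0.95)

# Contrast from emissivity: same temperature, different emissivities.
shiny = real_body_emittance(wavelength, 300.0, emissivity=0.10)

print(f"warm/cool ratio:  {warm / cool:.2f}")
print(f"shiny/cool ratio: {shiny / cool:.2f}")
```

Two objects can therefore appear different in a thermographic image either because one is warmer or because one emits less efficiently at the same temperature.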
As mentioned previously, the infrared or thermographic cameras operate at wavelengths as long as 14 000 nm (or 14 µm). The infrared sensor array is equivalent to the
CCD in the ordinary camera; sensors with a resolution of 160 × 120 pixels or higher
are widely available, and their response time is sufficient to provide live thermographic
video at 25 frames per second. However, unlike sensors used in conventional imaging
systems, the process of image formation and acquisition in thermographic cameras
is quite complex. Broadly speaking, thermographic cameras can be divided into two
types: those with cooled infrared image detectors and those without cooled detectors.
These are discussed in the following sections.
3.3.1 Cooled infrared detectors
Cooled IR detectors are typically contained in a vacuum-sealed case and cryogenically
cooled. This greatly increases their sensitivity, since their temperature is much lower
than that of the objects from which they are meant to detect radiation. Typically, cooling
temperatures range from −163°C to −265°C, with −193°C being the most common.
In a similar way to common digital cameras, which detect and convert light to electrical
charge, the IR detectors detect and convert thermal radiation to electrical signals. In
the case of IR cameras, cooling is needed in order to suppress thermally emitted
dark currents. A further advantage of cooling is suppression of noise from ambient
radiation emitted by the apparatus. Materials used for IR detection include liquid-helium-cooled bolometers, photon-counting superconducting tunnel junction arrays,
and a wide range of cheaper, narrow-gap semiconductor devices. Mercury cadmium
telluride (HgCdTe), indium antimonide (InSb) and indium gallium arsenide (InGaAs)
are the most common types of semiconductor IR detectors, with newer compositions
such as mercury manganese telluride (HgMnTe) and mercury zinc telluride (HgZnTe)
currently being developed. However, the HgCdTe detector and its extensions remain
the most common IR detectors. The principle of operation of an HgCdTe-based detector
is illustrated in Figure 1.10.
Figure 1.10 Hybrid focal plane architecture for HgCdTe-based IR detector showing (a) cell structure and (b) equivalent circuit.
In Figure 1.10, the sensor is represented by a detector diode which is mechanically
bonded to a silicon (Si) multiplexer for the read-out operation. An electrical connection
is required between each pixel and the rest of the circuitry. This is formed by the heat- and pressure-bonding of an indium bump or solder bond. Row and column shift registers
allow sequential access to each pixel. Similarly to other semiconductor devices, this
type of sensor is constructed using modern fabrication technologies such as vapor
deposition epitaxy (Campbell, 2001). In this method, the diode is made by depositing
CdTe on sapphire followed by liquid epitaxy growth of HgCdTe. A complete HgCdTe IR
detector system usually comprises a small printed circuit board (PCB), complete with
a digital signal processor (DSP) chip and an optical system responsible for focusing
the scene on to the plane of the array. At present, large two-dimensional arrays comprising
2048 × 2048 pixels, with each pixel 18 µm in size, assembled on a 40 × 40-mm device
and with a complete infrared camera system, are commercially available. They operate
in the bands 3–5 µm or 8–12 µm, and need cooling to −196°C.
There are different ways to cool the detectors – mainly by using liquefied gas,
a cryogenic engine, gas expansion, or the thermoelectric effect. The most common
method is cryogenic cooling, employing liquefied gas stored in a vacuum vessel called a
Dewar (named after Sir James Dewar, the Scottish scientist who invented the vacuum flask in 1892 and first liquefied
hydrogen in 1898). Figure 1.11 shows the construction of a typical
Dewar, highlighting all the important elements. Typically, the sensor is mounted directly
on the cold surface, with a cold shield and infrared transparent window. Usually a
protective coating such as zinc sulfide is applied on to the surface of HgCdTe in order
to increase its lifespan. The most commonly used and cheapest liquefied gas is liquid
nitrogen, which provides a sustainable cold temperature of −196°C without regular
refilling.
Figure 1.11 Schematic diagram of a typical Dewar.
Another common method of achieving cooling is through the Joule–Thomson gas
expansion method. High-pressure gas such as nitrogen or argon produces droplets
of liquefied gas at about −187°C following quick expansion. Compared to the Dewar,
this method is noisy and cumbersome. When refilling is not practical, such as for
applications in remote areas, a cooling method using a closed Stirling cycle can be
employed. This machine cools through the repetitive compression and expansion cycles
of a gas piston, and is therefore again cumbersome compared to the Dewar. Another
more practical approach to cooling is by thermoelectric elements, based on the Peltier
effect (Fraden, 1997). This method utilizes a junction of dissimilar metals
carrying a current; the temperature rises or falls depending on the direction of the
current. Current flowing in one direction cools the junction, and current flowing
in the opposite direction heats it, by the same law of physics.
Unfortunately, Peltier elements are unattractive for temperatures below −73°C owing
to high current consumption. In spite of this drawback, the thermoelectric cooling
method involves no moving parts, and is quiet and reliable. For these reasons, it is
widely used in IR cameras.
3.3.2 Uncooled IR detectors
As the name implies, uncooled thermal cameras use sensors that operate at room temperature. Uncooled IR sensors respond with changes in resistance, voltage or current when
exposed to IR radiation. These changes are then measured and compared with the values at the operating temperature of the sensor. Unlike cooled detectors, uncooled IR
cameras can be stabilized at an ambient temperature, and thus do not require bulky,
expensive cryogenic coolers. This makes such IR cameras smaller and less costly. Their
main disadvantages are lower sensitivity and a longer response time, but these problems
have almost been solved with the advent of surface micro-machining technology. Most
uncooled detectors are based on pyroelectric materials or microbolometer technology.
Pyroelectricity is the ability of some materials to generate an electrical potential when
heated or cooled. It was first discovered in minerals such as quartz, tourmaline, and
other ionic crystals. The first generation of uncooled thermal cameras looked very similar to the conventional cathode ray tube, apart from the face plate and target material
(see Figure 1.12). As infrared signals impinge on the pyroelectric plate, the surface
temperature of this plate changes. This in turn induces the charge, which accumulates
on the pyroelectric material. The electron beam scans this material, and two things may
happen depending on whether there is an absence or presence of charge. In the absence
of charge (i.e. no radiation), the electron beam is deflected toward the mesh by the
action of the x and y deflection plates. In the presence of charge, the electron beam is
focused on the spot, thus causing current to flow into an amplifier circuit. In this way a
video signal is built as the electron beam scans over the entire surface of the pyroelectric
plate. Since the accumulation of charge only occurs when the temperature of the pyroelectric material changes, the pyroelectric tube is only suitable for imaging dynamic
occurrences. This effect will benefit certain applications, such as monitoring drying
processes, where only the fast changes of temperature are recorded (Fito et al., 2004).
Figure 1.12 Schematic diagram of the first-generation pyroelectric tube.
With the advent of semiconductor technology, it is now possible to produce pyroelectric solid-state arrays with resolution reaching 320 × 240 pixels. This type of camera
offers high detectivity, but produces images at a relatively low speed (typically 1 Hz).
Furthermore, absolute temperature measurement often requires individual calibration
of each element, which significantly lengthens the image acquisition time. However,
the main advantage lies with its ability to produce an image without the need for cooling. This makes it suitable for a wide range of non-destructive applications, especially
in industry.
Another type of IR camera is based on microbolometer technology. Theoretically,
a microbolometer is a monolithic sensor capable of detecting infrared radiation through
the direct or indirect heating of a low-mass, temperature-dependent film. Popular materials include thermistors with high temperature coefficients of resistance, such as vanadium oxide (VOx ), silicon devices such as the Schottky barrier diode and transistor, and
thermoelectrics such as the silicon p–n junctions. One example of the bolometer-type
uncooled infrared focal plane array (IRFPA), with a 320 × 240-pixel array and operating at a frame rate of 60 Hz, has been investigated for use in industry (Oda et al., 2003).
Figure 1.13 Schematic representation of a bolometer detector showing (a) the cross-sectional view and (b) the plan view of each bolometer pixel.
Figure 1.13 shows the schematic structure of each bolometer pixel. The pixel is divided
into two parts: a silicon readout integrated circuit (ROIC) in the lower part, and a suspended microbridge structure in the upper part. The two parts are separated by a cavity.
The microbridge structure is composed of a diaphragm supported by two beams,
thereby thermally isolating the diaphragm from the heat sink.
Manufacture of microbolometers such as the one shown in Figure 1.13 uses microelectromechanical techniques, originally developed at Bell Labs for air-bridge isolation
integrated circuits. They are carefully engineered so that part of the IR radiation is
absorbed by the silicon passivation layers in the diaphragm and part is transmitted. The
transmitted radiation is perfectly reflected by the reflecting layer, and is again absorbed
by the passivation layers. In this way, more than 80 percent of the incident IR radiation
is absorbed. The absorbed radiation heats the diaphragm and changes the bolometer
resistance. Supplying a bias current enables the resistance change to be converted to
voltage and detected by ROIC. The analog signal voltage of the ROIC is digitized by
analog-to-digital conversion of the receiving circuits. These data are first corrected for
non-uniformity in bolometer responsivity, and are then adjusted for video output. The
pixel size of such a detector is 37 × 37 µm, and the fill factor is about 72 percent.
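The conversion of a resistance change into a signal voltage, and the meaning of the fill factor, can be sketched numerically. The pixel size and fill factor are those quoted above; the bias current, bolometer resistance, temperature coefficient of resistance, and diaphragm temperature rise are hypothetical values chosen only to show the arithmetic.

```python
def active_area_um2(pixel_size_um, fill_factor):
    """IR-sensitive area of a square pixel, in square micrometres."""
    return pixel_size_um ** 2 * fill_factor


def signal_voltage(bias_current_a, resistance_ohm, tcr_per_k, delta_t_k):
    """Voltage change across a biased bolometer for a small temperature rise.

    delta_R = R * TCR * delta_T, and delta_V = I_bias * delta_R.
    """
    delta_r = resistance_ohm * tcr_per_k * delta_t_k
    return bias_current_a * delta_r


# Values quoted in the text: 37 x 37 micrometre pixel, 72 percent fill factor.
area = active_area_um2(37.0, 0.72)
print(f"active area: {area:.0f} square micrometres")

# Hypothetical operating point: 10 uA bias, 100 kOhm film,
# TCR of -2 percent per kelvin, diaphragm heated by 10 mK.
dv = signal_voltage(10e-6, 100e3, -0.02, 0.01)
print(f"signal: {dv * 1e6:.0f} microvolts")
```

A negative temperature coefficient, as assumed here, means the voltage across the film falls as the diaphragm warms.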
3.4 Tomographic imaging
While a computer vision system is useful for surface inspections, in many specialized
investigations food technologists and scientists need to “see” an internal
view of the sample. A clear image of an object’s interior
cannot be formed with a conventional imaging instrument, because wave motion is
continuous in space and time. Wave motion brought to a focus within the region of a
particular point necessarily converges before and diverges after it, thereby inherently
contaminating the values registered outside that region. Consequently, an image formed of
the surface of a body by conventional methods can be clear, but an image depicting the
internal structure of the sample will be contaminated. Tomography was developed to overcome this limitation: the terms “computer-assisted tomography” (CAT) and “computed tomography” (CT) emerged following
the development of a CT machine in 1972 at EMI Ltd, by the Nobel Prize winner
Godfrey Hounsfield. This device has revolutionized clinical radiology. Nevertheless,
food tomography is a relatively new subject, since such an application requires high
expenditure. A typical medical CT scanner can cost tens of millions of pounds, and,
with no comparable increase in reimbursement, the purchase of such a system for uses
other than medical cannot easily be justified. However, some interesting applications
involving the use of tomography for food applications have started to emerge recently,
and such tomographic modalities are described here.
3.4.1 Nuclear tomography
As the name implies, nuclear tomography involves the use of nuclear energy for imaging the two-dimensional spatial distribution of the physical characteristics of an object,
from a series of one-dimensional projections. All nuclear-imaging modalities rely upon
acquisition hardware featuring a ring detector which measures the strength of radiation
produced by the system. There are two general classes of source of radiation, determined by the degree of control exerted over them by the user. The first class consists
of exterior sources (those outside the body), which are usually completely under the
control of the experimenter; this method is termed “remote sensing” (see Figure 1.14a).
The second group consists of interior sources (those inside the body), which are usually
beyond the direct control of the experimenter; this method is termed “remote probing”
(see Figure 1.14b). Computed tomography, where radiation is projected into the object,
falls into the first category; stimulated emission, as in the case of magnetic resonance imaging (MRI), and the use of radiopharmaceuticals in single photon-emission
computed tomography (SPECT) and positron-emission tomography (PET) fall into the
second category.
Figure 1.14 Two different geometries for tomographic imaging: (a) remote sensing and (b) remote
probing. (c) Typical scanning pattern showing two orthogonal projections.
Regardless of the scanning geometry, tomographic imaging shares one common
feature: the requirement to perform complex mathematical analysis of the resulting signals using a computer. There are many good reviews on this subject, and interested readers are referred to publications by Brooks and Di Chiro (1975, 1976), and Kak (1979).
Here, a brief description of the various tomographic modalities is provided, focusing
on the advancement of the technology since its inception more than 30 years ago.
Computed tomography (CT)
As shown in Figure 1.14, essentially CT involves scanning the source and detector
sideways to produce single-projection data. This procedure is repeated at many viewing
angles until the required set of all projection data is obtained. Image reconstruction
from the data remains one of the important tasks in CT that can be performed using a
variety of methods. The history of these reconstruction techniques began in 1917 with
the publication of a paper by the Austrian mathematician J. Radon, in which he proved
that a two-dimensional or three-dimensional object can be reconstructed uniquely from
the infinite set of all its projections (Herman, 1980). To date, there have been hundreds
of publications on computed tomography imaging. A good summary is provided by
Kak and Slaney (1988).
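The idea of forming projections and reconstructing from them can be illustrated with a toy example. The sketch below is not one of the reconstruction methods reviewed in the cited literature: it assumes a monoenergetic beam obeying the Beer-Lambert law, uses only the two orthogonal views of Figure 1.14c, and applies plain unfiltered back-projection. The phantom values and function names are hypothetical.

```python
import math


def row_projection(mu, pixel_size_cm):
    """Line integrals of the attenuation coefficient along each row."""
    return [sum(row) * pixel_size_cm for row in mu]


def column_projection(mu, pixel_size_cm):
    """Line integrals of the attenuation coefficient along each column."""
    return [sum(col) * pixel_size_cm for col in zip(*mu)]


def detected_intensity(i0, line_integral):
    """Beer-Lambert law: intensity reaching the detector along one ray."""
    return i0 * math.exp(-line_integral)


def back_project(row_proj, col_proj):
    """Unfiltered back-projection of two orthogonal views.

    Each ray sum is spread evenly back along its path, and the two views
    are averaged pixel by pixel.
    """
    n_rows, n_cols = len(row_proj), len(col_proj)
    return [[0.5 * (row_proj[r] / n_cols + col_proj[c] / n_rows)
             for c in range(n_cols)]
            for r in range(n_rows)]


# Hypothetical 3 x 3 phantom: one dense pixel (attenuation in 1/cm) in the centre.
phantom = [[0.0, 0.0, 0.0],
           [0.0, 0.6, 0.0],
           [0.0, 0.0, 0.0]]

rows = row_projection(phantom, 1.0)
cols = column_projection(phantom, 1.0)
central_ray = detected_intensity(1.0, rows[1])   # fraction of the beam transmitted
image = back_project(rows, cols)
```

With only two views the dense pixel is located correctly but appears smeared along its row and column; adding more viewing angles, and filtering the projections before back-projection, progressively removes this blur.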
When the first CT machines were introduced in 1972, the spatial resolution achievable was three line pairs per millimeter, on a grid of 80 × 80 per projection. The
time taken to perform each projection scan was approximately 5 minutes. In contrast,
a modern machine achieves 15 line pairs per millimeter, on a grid of 1024 × 1024
per projection, with a scan time per projection of less than 1 second. The projection
thickness typically ranges from 1 to 10 mm, and the density discrimination achievable
is better than 1 percent. These machines use an X-ray source which rotates in a circular path around the sample. A collimator is employed in order to produce a sharp,
pencil-beam X-ray, which is measured using detectors comprising a static ring of several hundreds of scintillators. These have sometimes been constructed from xenon
ionization chambers, but a more compact solution is offered by solid-state systems,
where a scintillation crystal is closely coupled to a photodiode. This source–detector
combination measures parallel projections, one sample at a time, by stepping linearly
across the object. After each projection, the gantry rotates to a new position and these
procedures are repeated until data are gathered at sufficient viewing angles.
The latest generation of CT machines employs a fan-beam arrangement as opposed
to parallel-beam geometry. In this way, the size of the beam can be enlarged to cover the
object field of view. Consequently, the gantry needs only to rotate, thus reducing
the acquisition time. Employing a stationary ring comprising, typically, 1000 detectors,
the data acquisition time of a modern CT scanner is generally less than 0.1 s. Figure 1.15
illustrates the essential elements of such systems.
Since CT is based upon the attenuation of X-rays, its primary strength is the imaging
of calcified objects such as bone and the denser tissues. This limits its applications in
food technology, since food objects are mostly soft or semi-fluid. This, as well as the
expense, is the reason that CT imaging was initially limited to medical applications.
Figure 1.15 Modern CT usually employs fan-beam geometry in order to reduce the data-capture time.
However, in the 30 years since its inception, its capabilities and applications have been
expanded as a result of the advancement of technology and software development.
While medical disorders are still a common reason for CT imaging, many other scientific fields – such as geology, forestry, archaeology, and food science – have found CT
imaging to be the definitive tool for diagnostic information. For instance, CT combined
with appropriate image analysis has been used to study the magnitude and gradients
of salt in dry-cured ham in the meat water phase (Vestergaard et al., 2005). In studying growth and development in animals, Kolstad (2001) used CT as a non-invasive
technique for detailed mapping of the quantity and distribution of fat in crossbred Norwegian pigs. There are other recent applications involving CT in agriculture and food
tomography, and interested readers are again directed to relevant publications (see, for
example, Sarigul et al., 2003; Fu et al., 2005; Babin et al., 2006).
Magnetic resonance imaging (MRI)
Previously known as nuclear magnetic resonance (NMR) imaging, MRI gives the density of protons or hydrogen nuclei of the body at resonant frequency. Unlike CT, MRI
provides excellent renditions of soft and delicate materials. This unique characteristic
makes MRI suitable for visualization of most food objects, and applications range from
non-invasive inspection to real-time monitoring of dynamic changes as foods are processed, stored,
packaged, and distributed. Hills (1995) gives an excellent review on MRI applications
from the food perspective.
In principle, MRI is based on the association of each spatial region in a sample
with a characteristic nuclear magnetic resonance frequency, by imposing an external
magnetic field. Without the external magnetic field, the magnetic moments would point
in all directions at random, and there would be no net magnetization. However, in
the presence of a large magnetic field, the hydrogen nuclei will preferentially align
their spins in the direction of the magnetic field. This is known as the Larmor effect,
and the frequency at which the nucleus precesses around the axis is termed the Larmor
frequency (McCarthy, 1994). This effect implies a transfer of energy from the spin
system to another system or lattice. The transfer of energy is characterized by an
exponential relaxation law with time constants T1 and T2, which are also known as the
spin–lattice relaxation and spin–spin relaxation times, respectively (McCarthy, 1994).
Figure 1.16 Block diagram of a typical MRI system.
In commercial MRI, the magnetic field ranges from 0.5 to 2.0 tesla (compared with
Earth’s magnetic field of less than 60 µT). T1 is typically of the order of 0.2–2 s,
and T2 ranges from 10 to 100 ms. According to Planck’s equation E = hf, for a field
strength of 1.5 T, f corresponds to radio waves with a frequency of approximately 64 MHz. This is
the resonant frequency of the system. Therefore, by applying a radio-frequency (RF)
field at the resonant frequency, the magnetic moments of the spinning nuclei lose
equilibrium and hence radiate a signal which is a function of the line integral of the
magnetic resonance signature in the object. This radiation reflects the distribution of
frequencies, and a Fourier transform of these signals provides an image of the spatial
distribution of the magnetization (Rinck, 2001). A basic block diagram of a typical
MRI data-acquisition system is shown in Figure 1.16. In general, the MRI system
comprises a scanner, which has a bore diameter of a few tens of centimeters; a static
magnetic field, which is generated by a superconducting coil; and RF coils, which
are used to transmit radio-frequency excitation into the material to be imaged. This
excites a component of magnetization in the transverse plane which can be detected
by an RF reception coil. The signals are transduced and conditioned prior to image
reconstruction. Current MRI scanners generate images with sub-millimeter resolution
of virtual slices through the sample. The thickness of the slices is also of the order of a
millimeter. Contrast resolution between materials depends strongly on the strength of
the magnetization, T1 , T2 , and movement of the nuclei during imaging sequences. The
most striking artefacts appear when the magnetic field is disturbed by ferromagnetic
objects. Other artefacts, such as ringing, are due to the image reconstruction algorithm
and sensor dynamics.
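The resonant frequency and the relaxation behavior discussed above can be checked with a few lines of arithmetic. This is an illustrative sketch under stated assumptions: the proton gyromagnetic ratio divided by 2π is taken as 42.577 MHz/T, the relaxation curves are the standard single-exponential forms following a 90° excitation pulse, and the function names and sample times are hypothetical.

```python
import math

GAMMA_BAR_MHZ_PER_T = 42.577   # proton gyromagnetic ratio / (2 * pi), MHz per tesla
PLANCK_J_S = 6.626e-34         # Planck's constant, J s


def larmor_frequency_mhz(b0_tesla):
    """Resonant (Larmor) frequency of hydrogen nuclei in a static field B0."""
    return GAMMA_BAR_MHZ_PER_T * b0_tesla


def photon_energy_j(frequency_hz):
    """Planck's equation E = h f."""
    return PLANCK_J_S * frequency_hz


def longitudinal_recovery(t_s, t1_s, m0=1.0):
    """Mz(t) = M0 * (1 - exp(-t / T1)): recovery along the field (spin-lattice)."""
    return m0 * (1.0 - math.exp(-t_s / t1_s))


def transverse_decay(t_s, t2_s, m0=1.0):
    """Mxy(t) = M0 * exp(-t / T2): decay in the transverse plane (spin-spin)."""
    return m0 * math.exp(-t_s / t2_s)


f_mhz = larmor_frequency_mhz(1.5)          # field strength used in the text
energy = photon_energy_j(f_mhz * 1e6)      # energy of one RF quantum at resonance
mz = longitudinal_recovery(1.0, 1.0)       # fraction recovered after one T1
mxy = transverse_decay(0.05, 0.05)         # fraction remaining after one T2
```

At 1.5 T the resonant frequency comes out at roughly 64 MHz, of the same order as the value quoted in the text. After one time constant, about 63 percent of the longitudinal magnetization has recovered and about 37 percent of the transverse magnetization remains, which is why differences in T1 and T2 between materials translate into image contrast.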
Owing to the fact that MRI provides a rapid, direct and, most importantly, non-invasive and non-destructive means of determining not only the quantity of
water present but also the structural and dynamic characteristics of the water, this relatively
new imaging technique has become useful for food engineering. There are numerous
applications of MRI, since water is the basic building block of many food materials.
Figure 1.17 shows examples of MRI-captured images within corn kernels during the
freezing process (Borompichaichartkul et al., 2005). The brighter areas show locations
where proton mobility is high, and thus water exists as a liquid. In this example, MRI
provides useful information for characterizing the physical state of water in frozen corn.
Other interesting applications include real-time monitoring of ice gradients in a
doughstick during the freezing and thawing processes (Lucas et al., 2005), mapping
the temperature distribution patterns in food sauce during microwave-induced heating
(Nott and Hall, 1999), and predicting sensory attributes related to the texture of cooked
potatoes (Thybo et al., 2004). These examples – a far from exhaustive list – serve to
emphasize the potential of MRI for revolutionizing food science and engineering.
As with CT imagers, the major drawback is the current expense of an MRI machine –
typically between £500 000 and £1 million. Consequently, at present MRI machines
are only used as a research and development tool in food science. In order for MRI
to be applied successfully on a commercial basis, the possible benefits must justify the
expense. However, the rapidly decreasing cost of electronic components, combined
with the ever-increasing need for innovation in the food industry, indicates that it should
not be too long before a commercial and affordable MRI machine is developed for food
quality control.
Figure 1.17 Examples of MRI images showing the distribution of water and its freezing behavior in different
areas within the corn kernels: (a) images captured before freezing at different moisture contents; (b) and (c)
images acquired at specified temperatures and moisture content levels (Borompichaichartkul et al., 2005).
3.4.2 Electrical tomography
Unlike nuclear imaging, electrical tomography (ET) uses electrical signals in the form
of voltage and current with magnitudes of less than tens of millivolts and milliamperes,
respectively. Therefore, the method is inherently safe and requires no expensive and
complicated hardware. Sensing modalities include electrical resistance tomography
(ERT), electrical capacitance tomography (ECT), and microwave tomography (MT).
There are a few other modalities, but the ERT and MT techniques have been successfully applied to food imaging, and therefore this discussion will focus on these two
imaging modalities only. There is much literature on this subject, but imaging examples
provided here are based on work by Henningsson et al. (2005), who investigated the
use of the ERT technique for yoghurt profiling, and on recent research in applying MT
for grain imaging (Lim et al., 2003).
Both ERT and MT are soft-field sensor systems, since the sensing field is altered
by the density distribution and physical properties of the object being imaged. Therefore, as previously discussed, this limits the resolution compared to hard-field sensors.
Nevertheless, these relatively new imaging modalities are useful for some specialized
food applications where low imaging resolution is acceptable. As shown
in Figure 1.18, ERT and MT tomographic systems can generally be subdivided into
three basic parts: the sensor, the data-acquisition system, and the image reconstruction,
interpretation, and display system.
In order to perform imaging, an electrical signal is injected into a reactor through an
array of sensors which are mounted non-invasively on the reactor surface, where the
response of the system is measured. In the case of ERT, a low-frequency AC current is
injected and voltages are measured; in MT, the reactor is irradiated with microwave signals and the transmitted or scattered fields (or both) are measured. There are many ways
that sensors can be configured to make the measurement.
Figure 1.18 Schematic block diagram of a typical ET instrument.
The ERT system, employing a four-electrode measurement protocol, uses one pair of adjacent sensors to inject current, and voltages appearing at all the other pairs of adjacent sensors are measured.
The number of independent measurements obtained using this geometry with N sensors
is N(N − 3)/2 (Barber and Brown, 1984). Similarly, the number of unique measurements in MT also depends on the sensor geometry. In the case
of multiple-offset geometry with N transmitters and M receivers, the total number
of measurements is MN (Lim et al., 2003). Using suitable reconstruction methods,
the measured data can be processed, delivering a two-dimensional image depicting
the conductivity or permittivity distributions in ERT or MT, respectively. By using the
information from several sensor planes, a three-dimensional reconstruction can be
interpolated across the sectional map (Holden et al., 1998).
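The measurement counts quoted above are easy to tabulate for a given sensor arrangement. The short sketch below simply evaluates N(N − 3)/2 for the adjacent-pair ERT protocol and MN for the multiple-offset MT geometry; the function names and the example sensor counts are hypothetical.

```python
def ert_independent_measurements(n_sensors):
    """Independent measurements for the adjacent-pair protocol: N(N - 3)/2."""
    if n_sensors < 4:
        raise ValueError("a four-electrode protocol needs at least four sensors")
    # N and N - 3 always differ in parity, so the product is even.
    return n_sensors * (n_sensors - 3) // 2


def mt_measurements(n_transmitters, n_receivers):
    """Total measurements for multiple-offset geometry: M * N."""
    return n_transmitters * n_receivers


ert_16 = ert_independent_measurements(16)   # a 16-sensor ring
mt_8x8 = mt_measurements(8, 8)              # 8 transmitters and 8 receivers
```

Because the number of unknown pixel values in the reconstructed image usually exceeds these counts, the reconstruction problem is under-determined, which is one reason why the resolution of electrical tomography is modest.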
The general applicability of ERT is illustrated by the work of Henningsson et al.
(2005), who studied velocity profiles of yoghurt and its rheological behavior in a pipe
of industrial dimensions. A cross-correlation technique was used to transform the dual-plane conductivity maps into velocity profiles. Comparing simulated and experimental
results, they discovered that ERT results have some noise (and thus uncertainty) in the
region near the wall, but the centerline velocities are very well resolved with an error of
less than 7 percent. They concluded that ERT is a useful method for determination of the
velocity profile of food; the information produced can be used in process conditioning
in order to minimize loss of product. Meanwhile, Lim et al. (2003) exploited the sensitivity of microwave signals to permittivity perturbation, which allowed them to apply
MT measurements for mapping the moisture profiles in grain. Image reconstruction
was based on the optical approach, permitting the use of straight-ray approximation
for data inversion. Examples of moisture tomograms obtained using this method are
illustrated in Figure 1.19.
Figure 1.19 Example of MT images reconstructed from grain with homogeneous moisture of 12.4%.
Higher moisture anomalies were simulated at the left-centre of the cross-section, having values of (a) 18.3%,
(b) 20.1%, and (c) 24.8%.
Tests indicate that this imaging modality would considerably
enhance results in situations where large dielectric constant differences exist in moisture regimes, such as mixtures of water and ice. For certain moisture regimes where
the difference in dielectric constant still exists but is small, it is important to consider
the electric field distortion due to diffraction and scattering effects, and to account
for these in the reconstruction.
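The cross-correlation step used to convert dual-plane ERT data into velocities can be sketched as follows. This is a simplified, single-pixel illustration and not the procedure of Henningsson et al. (2005): it assumes that the same conductivity fluctuation is seen first at the upstream plane and then, after a transit delay, at the downstream plane. The signals, plane spacing, and sampling rate are hypothetical.

```python
def cross_correlation_lag(upstream, downstream, max_lag):
    """Return the delay (in samples) at which downstream best matches upstream."""
    n = len(upstream)
    mean_u = sum(upstream) / n
    mean_d = sum(downstream) / n
    best_lag, best_score = 0, float("-inf")
    for lag in range(max_lag + 1):
        # Correlate the mean-removed signals over their overlapping part.
        score = sum((upstream[i] - mean_u) * (downstream[i + lag] - mean_d)
                    for i in range(n - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def velocity_m_per_s(plane_spacing_m, lag_samples, sample_rate_hz):
    """Velocity = plane spacing / transit time, with transit time = lag / sample rate."""
    if lag_samples == 0:
        raise ValueError("zero lag: the transit time cannot be resolved")
    return plane_spacing_m * sample_rate_hz / lag_samples


# Conductivity of one pixel over time at the two sensor planes (arbitrary units).
upstream = [0, 0, 1, 3, 1, 0, 0, 0, 0, 0, 0, 0]
downstream = [0, 0, 0, 0, 0, 1, 3, 1, 0, 0, 0, 0]

lag = cross_correlation_lag(upstream, downstream, max_lag=6)
v = velocity_m_per_s(plane_spacing_m=0.1, lag_samples=lag, sample_rate_hz=10.0)
```

Repeating the calculation for every pixel of the conductivity maps would give one velocity per pixel, that is, a velocity profile across the pipe.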
4 Conclusions
As discussed above, there are several powerful imaging modalities that are capable of
producing food images, each having particular strengths and weaknesses. CCD vision
systems, covering both the visible and infrared regions, are suitable for surface imaging,
while CT, MRI, and ET are suited to imaging internal structures. Of the latter three,
CT is suitable for imaging hard and solid objects, MRI for functional imaging, and
ET for conductivity or permittivity mapping. Some of these technologies are already
available commercially, while some are still in the development stage. Currently under
development is a system that can combine results from various modalities in order
to enhance and improve image quality further. With careful calibration, images from
different modalities can be registered and superimposed, giving rise to what is presently
known as “multimodal imaging” or the “sensor fusion technique.” With intense research
being pursued in some of the world’s leading laboratories, it will not be long before
such an emerging technology reaches food technologists and scientists.
Nomenclature
absolute temperature, K
wavelength, m
speed of sound, m/s
density, kg/m³
Boltzmann constant, 1.38054 × 10⁻²³ J/K
speed of light, 2.998 × 10⁸ m/s
E energy, J
f frequency, Hz
h Planck’s constant, 6.626076 × 10⁻³⁴ J s
T1 spin–lattice relaxation time, s
T2 spin–spin relaxation time, s
acoustical impedance
Abbreviations:
AC alternating current
A/D analog-to-digital converter
CAT computer-assisted tomography
CCD charge-coupled device
CCIR Comité Consultatif International des Radiocommunications
CdTe cadmium telluride
CMOS complementary metal oxide semiconductor
CT computed tomography
DC direct current
DSP digital signal processor
ECT electrical capacitance tomography
ERT electrical resistance tomography
EUV extreme ultraviolet
FET field effect transistor
fps frames per second
FUV far ultraviolet
GPR ground probing radar
HgCdTe mercury cadmium telluride
HgMnTe mercury manganese telluride
InGaAs indium gallium arsenide
InSb indium antimonide
IRFPA infrared focal plane array
LED light-emitting diode
MRI magnetic resonance imaging
MT microwave tomography
NIR near infrared
NMR nuclear magnetic resonance
NUV near ultraviolet
PC personal computer
PCI peripheral component interconnect
PET positron-emission tomography
RF radio frequency
ROIC read-out integrated circuit
SPECT single photon-emission computed tomography
TOF time of flight
VOx vanadium oxide
References
Abdullah MZ, Abdul-Aziz S, Dos-Mohamed AM (2000) Quality inspection of bakery products using color-based machine vision system. Journal of Food Quality,
23, 39–50.
Abdullah MZ, Fathinul-Syahir AS, Mohd-Azemi BMN (2005) Automated inspection system for color and shape grading of starfruit (Averrhoa carambola L.) using machine
vision sensor. Transactions of the Institute of Measurement and Control, 27 (2), 65–87.
Abdullah MZ, Guan LC, Mohd-Azemi, BMN (2001) Stepwise discriminant analysis
for color grading of oil palm using machine vision system. Transactions of IChemE,
Part C, 57, 223–231.
Babin P, Della Valle G, Chiron H, Cloetens P, Hoszowska J, Penot P, Réguerre AL, Salva
L, Dendieval R (2006) Fast X-ray tomography analysis of bubble growth and foam
setting during bread making. Journal of Cereal Science, 43 (3), 393–397.
Barber DC, Brown BH (1984) Applied potential tomography. Journal of Physics E
Scientific Instrument, 11 (Suppl A), 723–733.
Batchelor BG (1985) Lighting and viewing techniques. In Automated Visual Inspection
(Batchelor BG, Hill DA, Hodgson DC, eds). Bedford: IFS Publication Ltd, pp.
Borompichaichartkul C, Moran G, Srzednicki G, Price WS (2005) Nuclear magnetic resonance (NMR) and magnetic resonance imaging (MRI) studies of corn at subzero
temperatures. Journal of Food Engineering, 69 (2), 199–205.
Brooks RA, Di Chiro G (1975) Theory of image reconstruction in computed tomography.
Radiology, 117, 561–572.
Brooks RA, Di Chiro G (1976) Principles of computer assisted tomography (CAT) in
radiographic and radioisotope imaging. Physics in Medicine and Biology, 21 (5), 689–732.
Camarena F, Martínez-Mora JA (2005) Potential of ultrasound to evaluate turgidity and
hydration of orange peel. Journal of Food Engineering, 75 (4), 503–507.
Campbell SA (2001) The Science and Engineering of Microelectronic Fabrication. New
York: Oxford University Press.
Daley W, Carey R, Thompson C (1993) Poultry grading inspection using colour imaging.
SPIE Proceedings Machine Vision Applications in Industrial Inspection, 1907, 124.
Doebelin EO (1996) Measurement Systems: Application and Design. New York:
McGraw Hill.
Fishbane PM, Gasiorowicz S, Thornton ST (1996) Physics for Scientists and Engineers.
Upper Saddle River: Prentice-Hall.
Fito PJ, Ortolá MD, de los Reyes R, Fito P, de los Reyes E (2004) Control of citrus surface
drying by image analysis of infrared thermography. Journal of Food Engineering,
61 (3), 287–290.
Fraden J (1997) Handbook of Modern Sensors: Physics, Designs and Applications. New
York: American Institute of Physics Press.
Fu X, Milroy GE, Dutt M, Bentham AC, Hancock BC, Elliot JA (2005) Quantitative analysis of packed and compacted granular systems by x-ray microtomography. SPIE
Proceedings Medical Imaging and Image Processing, 5747, 1955.
Gan TH, Pallav P, Hutchins DA (2005) Non-contact ultrasonic quality measurements of
food products. Journal of Food Engineering, 77 (2), 239–247.
Gaussorgues G (1994) Infrared Thermography. London: Chapman and Hall.
Ginesu G, Guisto DG, Märgner V, Meinlschmidt P (2004) Detection of foreign bodies
in food by thermal image processing. IEEE Transactions of Industrial Electronics,
51 (2), 480–490.
Gómez AH, He Y, Pereira AG (2005) Non-destructive measurement of acidity, soluble
solids and firmness of Satsuma mandarin using Vis/NIR-spectroscopy techniques.
Journal of Food Engineering, 77 (2), 313–319.
Gunasekaran S (1996) Computer vision technology for food quality assurance. Trends in
Food Science & Technology, 7, 245–256.
Heinemann PH, Hughes R, Morrow CT, Sommer III HJ, Beelman RB, Wuest PJ (1994)
Grading of mushrooms using machine vision system. Transactions of the ASAE,
37 (5), 1671–1677.
Henningsson M, Ostergren K, Dejmek P (2005) Plug flow of yoghurt in piping as determined by cross-correlated dual-plane electrical resistance tomography. Journal of
Food Engineering, 76 (2), 163–168.
Herman G (1980) Image Reconstruction from Projections. New York: Academic Press.
Hills B (1995) Food processing: an MRI perspective. Trends in Food Science & Technology,
6, 111–117.
Holden PJ, Wang M, Mann R, Dickin FJ, Edwards RB (1998) Imaging stirred-vessel
macromixing using electrical resistant tomography. AIChE Journal, 44 (4), 780–790.
Kak AC (1979) Computerized tomography with x-ray, emission and ultrasound sources.
Proceedings of IEEE, 67 (9), 1245–1272.
Kak AC, Slaney M (1988) Principles of Computerized Tomography Imaging. New York:
IEEE Press.
Kolstad K (2001) Fat deposition and distribution measured by computer tomography in
three genetic groups of pigs. Livestock Production Science, 67, 281–292.
Lim MC, Lim KC, Abdullah MZ (2003) Rice moisture imaging using electromagnetic
measurement technique. Transactions of IChemE, Part C, 81, 159–169.
Lucas T, Greiner A, Quellec S, Le Bail A, Davanel A (2005) MRI quantification of ice
gradients in dough during freezing or thawing processes. Journal of Food Engineering,
71 (1), 98–108.
Matas J, Marik R, Kittler J (2005) Color-based object recognition under spectrally nonuniform illumination. Image and Vision Computing, 13 (9), 663–669.
McCarthy M (1994) Magnetic Resonance Imaging in Foods. New York: Chapman and Hall.
McClements DJ (1995) Advances in the application of ultrasound in food analysis and
processing. Trends in Food Science and Technology, 6, 293–299.
Morlein D, Rosner F, Brand S, Jenderka KV, Wicke M (2005) Non-destructive estimation
of the intramuscular fat content of the longissimus muscle of pigs by means of spectral
analysis of ultrasound echo signals. Meat Science, 69, 187–199.
Nott KP, Hall LD (1999) Advances in temperature validation of foods. Trends in Food
Science & Technology, 10, 366–374.
Oda N, Tanaka Y, Sasaki T, Ajisawa A, Kawahara A, Kurashina S (2003) Performance of 320 × 240 bolometer-type uncooled infrared detector. NEC Research and
Development, 44 (2), 170–174.
Paulsen M (1990) Using machine vision to inspect oilseeds. INFORM, 1 (1), 50–55.
Pearson T (1996) Machine vision system for automated detection of stained pistachio nuts.
Lebensmittel Wissenschaft und Technologie, 29 (3), 203–209.
Pedreschi F, León J, Mery D, Moyano P (2006) Development of a computer vision system
to measure the color of potato chips. Food Research International, 39, 1092–1098.
Rinck PA (2001) Magnetic Resonance in Medicine. Berlin: Blackwell.
Sarigul E, Abbott AL, Schmoldt DT (2003) Rule driven defect detection in CT images of
hardwood logs. Computers and Electronics in Agriculture, 41, 101–119.
Simal S, Benedito J, Clemente G, Femenia A, Roselló C (2003) Ultrasonic determination
of the composition of a meat-based product. Journal of Food Engineering, 58 (3),
Steele DJ (1974) Ultrasonics to measure the moisture content of food products. British
Journal of Non-destructive Testing, 16, 169–173.
Steinmetz V, Crochon M, Bellon-Maurel V, Garcia-Fernandez JL, Barreiro-Elorza P,
Vestreken L (1996) Sensors for fruit firmness assessment: comparison and fusion.
Journal of Agricultural Engineering Research, 64 (1), 15–28.
Stiles WS, Wyszecki G (2000) Color Science: Concepts and Methods: Quantitative Data
and Formulas. New York: Wiley Interscience Publishers.
Tao Y, Heinemann PH, Varghese Z, Morrow CT, Sommer III HJ (1995) Machine vision for
color inspection of potatoes and apples. Transactions of the ASAE, 38 (5), 1555–1561.
Thybo AK, Szczpinski PM, Karlsoon AH, Donstrup S, Stodkilde-Jorgesen HS,
Andersen HJ (2004) Prediction of sensory texture quality attributes of cooked potatoes
by NMR-imaging (MRI) of raw potatoes in combination with different image analysis
method. Journal of Food Engineering, 61 (1), 91–100.
Vestergaard C, Erbou SG, Thauland T, Adler-Nisen J, Berg P (2005) Salt distribution in
dry-cured ham measured by computed tomography and image analysis. Meat Science,
69, 9–15.
Wells PNT (1969) Physical Principles of Ultrasonic Diagnosis. New York: Academic Press.
Image Segmentation
Chaoxin Zheng and Da-Wen Sun
Food Refrigeration and Computerised Food Technology, University
College Dublin, National University of Ireland, Dublin 2, Ireland
1 Introduction
Owing to the imperfections of image acquisition systems, the images acquired are
subject to various defects that will affect the subsequent processing. Although these
defects can sometimes be corrected by adjusting the acquisition hardware, for example
by increasing the number of images captured for the same scene and adopting higher
quality instruments, such hardware-based solutions are time-consuming and costly.
Therefore it is preferable to correct the images, after they have been acquired and
digitized, by using computer programs, which are fast and relatively low-cost. For
example, to remove noise, smooth filters (including linear and median filters) can
be applied; to enhance contrast in low-contrast images, the image histograms can be
scaled or equalized. Such corrections of defects in images are generally called “image
pre-processing.”
After pre-processing, the images are segmented. Segmentation of food images,
which refers to the automatic recognition of food products in images, is
required after image acquisition because, in computer vision techniques, food quality evaluation is conducted entirely and
automatically by computer programs, without any human participation.
Although image segmentation is ill-defined, it can generally be described as separating images into various regions in which the pixels have
similar image characteristics. Since segmentation is an important task, in that the entire
subsequent interpretation tasks (i.e. object measurement and object classification) rely
strongly on the segmentation results, tremendous efforts are being made to develop
an optimal segmentation technique, although such a technique is not yet available.
Nevertheless, a large number of segmentation techniques have been developed. Of
these, thresholding-based, region-based, gradient-based, and classification-based segmentation are the four most popular techniques in the food industry, yet none of these
can perform with both high accuracy and efficiency across the wide range of different
food products. Consequently, other techniques combining several of the above are also
being developed, with a compromise on accuracy and efficiency. Even so, they are not
adaptable enough for use on the full diversity of food products.
Computer Vision Technology for Food Quality Evaluation
ISBN: 978-0-12-373642-0
Copyright © 2008, Elsevier Inc.
All rights reserved
This chapter reviews the image pre-processing techniques and the image segmentation techniques that can be adopted or have already been adopted in the food industry.
The feasibility of the various techniques is also discussed. This review can serve as a
foundation for applying the segmentation techniques available, and for the development
of new segmentation techniques in computer vision systems.
2 Pre-processing techniques
2.1 Noise removal
Images captured using various means are all subject to different types of noise, such
as the read-out noise while reading information from cameras, the wiring noise while
transferring video signals from cameras to computers, and the electronic noise while
digitizing video signals. All these lead to degradation of the quality of the images
when they are subsequently processed. In Figure 2.1, two images of the same scene
have been taken at an interval of less than 2 seconds, using the same image acquisition
system, and the differences are illustrated to demonstrate the noise produced during
image acquisition. It is clearly important that noise is removed after images have been
digitized and stored in computers, and the most efficient and feasible approach for
image noise removal is to “average” the image by itself.
2.1.1 Linear filter
The simplest method of averaging an image by itself is the linear filter, by which the
intensity values of pixels in the image are averaged using the intensity values of their
neighboring pixels within a small region. The filter processing can be described by the
following equation:
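$$ f'(x, y) = \frac{\sum_{i=-M}^{+M} \sum_{j=-M}^{+M} w_{i,j}\, f(x + i, y + j)}{\sum_{i=-M}^{+M} \sum_{j=-M}^{+M} w_{i,j}} \qquad (2.1) $$
where f(x, y) is the intensity value of pixel (x, y) and f'(x, y) is its filtered value, while M determines the size of the filter (a window of (2M + 1) × (2M + 1) pixels) and w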
represents the weighting of the filter. The weighting and size of the filter can be adjusted
to remove different types of noise. For instance, increasing the weighting of the central
pixel means that the central pixel dominates the averaging. Increasing the size of the
filter results in a smoother image with less noise, but the detail of the image is reduced.
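As a rough illustration of the weighted averaging described above, the following Python sketch applies a (2M + 1) × (2M + 1) weight matrix to a gray-level image stored as a list of rows. The function name `linear_filter`, the border handling (neighbors falling outside the image are simply skipped), and the sample data are assumptions made for this example.

```python
def linear_filter(image, weights):
    """Replace each pixel by the weighted average of its neighborhood."""
    m = len(weights) // 2                      # half-width M of the filter
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            total = 0.0
            weight_sum = 0.0
            for j in range(-m, m + 1):
                for i in range(-m, m + 1):
                    yy, xx = y + j, x + i
                    # Assumption: neighbors outside the image are ignored.
                    if 0 <= yy < rows and 0 <= xx < cols:
                        w = weights[j + m][i + m]
                        total += w * image[yy][xx]
                        weight_sum += w
            out[y][x] = total / weight_sum
    return out


box = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]             # equal weights
center_heavy = [[1, 1, 1], [1, 8, 1], [1, 1, 1]]    # central pixel dominates
noisy = [[10, 10, 10], [10, 100, 10], [10, 10, 10]]
smoothed = linear_filter(noisy, box)
```

With equal weights the bright central pixel is pulled strongly towards its neighbors, whereas with `center_heavy` it keeps more of its original value, which is the trade-off between noise suppression and detail noted above.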
2.1.2 Median filter
Another popular filter that is widely used is the median filter. The intensity values
of pixels in a small region within the size of the filter are examined, and the median
intensity value is selected for the central pixel. Removing noise using the median filter
does not reduce the difference in brightness of images, since the intensity values of the
filtered image are taken from the original image. Furthermore, the median filter does
not shift the edges of images, as may occur with a linear filter (Russ, 1999). These
two primary advantages have led to wide use of the median filter in the food industry
(Du and Sun, 2004, 2006a; Faucitano et al., 2005).
Figure 2.1 Illustration of noise present in images: (a) two color images of peanuts (in RGB space) taken at an
interval of less than 2 seconds; (b) their difference in the red component; (c) their difference in the green
component; (d) their difference in the blue component. Contrast was enhanced in images (b), (c), and (d).
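A minimal Python sketch of a median filter is given below, assuming a gray-level image stored as a list of rows. The name `median_filter`, the clipping of the window at the image border, and the use of `statistics.median_low` (so that every output value is an intensity that actually occurs in the window) are choices made for this illustration.

```python
from statistics import median_low


def median_filter(image, m=1):
    """Replace each pixel by the median of its (2m+1) x (2m+1) window."""
    rows, cols = len(image), len(image[0])
    out = [row[:] for row in image]
    for y in range(rows):
        for x in range(cols):
            # Assumption: the window is clipped at the image border.
            window = [image[yy][xx]
                      for yy in range(max(0, y - m), min(rows, y + m + 1))
                      for xx in range(max(0, x - m), min(cols, x + m + 1))]
            out[y][x] = median_low(window)
    return out


spike = [[10, 10, 10], [10, 100, 10], [10, 10, 10]]
step_edge = [[0, 0, 100, 100] for _ in range(4)]
```

Filtering `spike` removes the isolated bright pixel entirely, while filtering `step_edge` leaves the vertical edge where it was, in line with the two advantages stated above.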
2.2 Contrast enhancing
Sometimes the images captured are of low contrast; in other words, the intensity values
of the images lie within a small range of intensity levels, and thus pixels with different
intensity values are not well distinguished from each other. An image in which the
intensity values range from 100 to 109 is shown in Figure 2.2a; it is impossible
to perceive the difference in intensity values between its pixels. Contrast enhancement
is designed to increase the difference in intensity values among pixels so
that they can be easily distinguished by human or computer vision. Most
contrast-enhancement methods use the image histogram, which is a plot showing the occurrence
of intensity values in images (Jain, 1989).
Figure 2.2 Illustrations of (a) a low-contrast image, and (b) the high-contrast image obtained after histogram scaling.
2.2.1 Histogram scaling
In histogram scaling, the original histogram is transferred from one scale to another,
mostly from a smaller scale to a larger one. Accordingly, the difference between two
neighboring intensity values is increased. For instance, Figure 2.2b is the transformed
image of Figure 2.2a, whose histogram has been reallocated linearly from [100, 109] to the
scale of [0, 200], so that the difference between neighboring intensity values of
the original image is increased from 1 to about 22 (200/9), which can easily be observed. The transform function used for histogram scaling can be linear or non-linear, and one-to-one
or multiple-to-one.
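The linear case can be sketched in a few lines of Python. The function name `scale_intensities` and the sample image are assumptions for illustration; the target range [0, 200] follows the example in the text. When the end points 100 and 109 are mapped exactly onto 0 and 200, adjacent original levels end up 200/9 apart, i.e. about 22 after rounding.

```python
def scale_intensities(image, new_min=0, new_max=200):
    """Linearly map the occupied intensity range onto [new_min, new_max]."""
    old_min = min(min(row) for row in image)
    old_max = max(max(row) for row in image)
    if old_max == old_min:                 # flat image: nothing to stretch
        return [[new_min for _ in row] for row in image]
    gain = (new_max - new_min) / (old_max - old_min)
    return [[round(new_min + (v - old_min) * gain) for v in row]
            for row in image]


low_contrast = [[100, 101, 102], [103, 104, 105], [107, 108, 109]]
stretched = scale_intensities(low_contrast)
```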
2.2.2 Histogram equalization
Most of the transform functions for histogram scaling are limited to the specific cases for which they were proposed.
Therefore, it is important to develop a flexible and hopefully optimal function that can
be employed for different types of images. Taking this into consideration, histogram
equalization has been developed, in which a much more uniform histogram is generated
from the original histogram by spreading out the number of pixels at the histogram peaks
and selectively compressing those at the histogram valleys (Gauch, 1992). Histogram
equalization can be simply described by equation (2.2):
$$ j' = \frac{\sum_{i=l}^{j} H(i)}{\sum_{i=l}^{L} H(i)} \qquad (2.2) $$
where H denotes the original histogram, and l and L are the minimum and maximum
intensity values, respectively. The parameter i is the ith intensity value in the histogram;
j and j' stand for the intensity value in the original histogram, and its corresponding
intensity value in the equalized histogram, respectively. Sometimes the contrast needs
to be constrained to a limited range for the purpose of retaining visual information of
objects in images, especially those with homogeneous intensity values. Therefore, the
contrast-limited adaptive histogram equalization method was developed, and it has been
applied to pork images to facilitate the segmentation of pores (Du and Sun,
2006a). In this method, the contrast of the images is enhanced by first dividing each
image into non-overlapping small regions, and then enhancing the contrast in each
small region.
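The global form of histogram equalization can be sketched as a lookup table built from the cumulative histogram. In the Python sketch below, each level j is mapped to the cumulative fraction of pixels with intensity up to j; rescaling that fraction back onto [l, L], the function name `equalize`, and the toy histogram are assumptions made for illustration. The contrast-limited adaptive variant is not shown.

```python
def equalize(hist, l, L):
    """Return a mapping from each level j in [l, L] to its equalized level."""
    total = sum(hist[l:L + 1])
    mapping = {}
    running = 0
    for j in range(l, L + 1):
        running += hist[j]                       # cumulative pixel count up to j
        fraction = running / total               # cumulative fraction in [0, 1]
        mapping[j] = l + round((L - l) * fraction)
    return mapping


# Toy histogram on 8 levels with all pixels crowded into levels 2..4.
hist = [0, 0, 6, 6, 4, 0, 0, 0]
lookup = equalize(hist, 0, 7)
```

In this toy case the three occupied levels 2, 3 and 4 are spread out to 3, 5 and 7, so neighboring intensities become easier to tell apart.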
3 Segmentation techniques
3.1 Thresholding-based segmentation
In thresholding-based segmentation the image histogram is partitioned into two classes
using a single value, called bi-level thresholding (Figure 2.3), or into multiple classes
using multiple values, called multilevel thresholding, based on the characteristics of the
histogram. In bi-level thresholding, pixels with intensity values less than the threshold
are set as background and the others as object, or vice versa. In multilevel
thresholding, pixels with intensity values between two successive thresholds are
assigned to one class. However, in tri-level thresholding, only two classes are normally
defined, i.e. one with intensity values between the two thresholds, and the other with
intensity values outside the two thresholds. Theoretically, the number of threshold levels can
be increased without limit according to the number of objects present in images; however,
the computation load increases exponentially with the number of levels. For example, searching for
four-level thresholds in a gray image requires on the order of O(L^3) calculations,
where L is the number of gray levels of the image (typically 256 for a gray image). This large computational load makes multilevel thresholding beyond tri-level infeasible, and therefore
only bi-level and tri-level thresholding are used in practice.
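The way thresholds turn intensities into classes, and the growth of an exhaustive threshold search, can be sketched as follows. The helper names `apply_thresholds` and `exhaustive_search_size`, and the convention that a pixel equal to a threshold falls into the upper class, are assumptions made for this example.

```python
from bisect import bisect_right
from math import comb


def apply_thresholds(image, thresholds):
    """Label each pixel with the number of thresholds it reaches (0, 1, 2, ...)."""
    ts = sorted(thresholds)
    return [[bisect_right(ts, v) for v in row] for row in image]


def exhaustive_search_size(levels, n_thresholds):
    """Number of distinct threshold combinations an exhaustive search must test."""
    return comb(levels, n_thresholds)


image = [[12, 80, 200], [90, 130, 40]]
bi_level = apply_thresholds(image, [64])          # classes 0 and 1
multi_level = apply_thresholds(image, [64, 128])  # classes 0, 1 and 2
# Tri-level thresholding with two classes: inside versus outside the band.
inside_band = [[int(label == 1) for label in row] for row in multi_level]
```

For 256 gray levels, `exhaustive_search_size(256, 1)` is 256, whereas the three thresholds needed for four-level thresholding give `exhaustive_search_size(256, 3)`, roughly 2.8 million combinations, which illustrates the cubic growth mentioned above.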
It is obvious that the threshold for the segmentation described above is a fixed
value (called the global threshold) across the whole image. There is another kind of
threshold, called the local threshold, which is an adaptive value determined by the local
characteristics of pixels. However, only the global threshold is popularly used in the
food industry, mainly because the global threshold is selected from the image histogram
rather than the image itself. Therefore the computing speed is not affected by the image
size, as might be the case in local-threshold methods. As the adaptive threshold is hardly
used in the food industry, it is not discussed further here. However, for the segmentation
of complex food images, such as toppings of pizzas (see Figure 2.4; Sun, 2000), the
global threshold is not adequate. One explanation for this is that the number of classes
defined by the global threshold is restricted to two (object and background), which is
far fewer than the number required to segment complex food images, since there are many
food products with different intensity values to be segmented.
Figure 2.3 Thresholding the histogram of a beef image: (a) image of beef; (b) thresholding the
histogram (horizontal axis: intensities of image; vertical axis: occurrence of intensities); (c) image (a) binarized by the threshold.
3.1.1 Threshold selection
There are four main methods or algorithms for the selection of the global threshold:
manual selection, isodata algorithm, objective function, and histogram clustering.
Manual selection
The simplest global thresholding method is manual selection, in which the threshold
is manually selected by researchers using graphical user interface image-processing
software such as Photoshop (Adobe Systems Incorporated, USA), Aphelion (AAI,
Inc., USA), Optimas (Media Cybernetics, Inc., USA), etc. Although this method is the
simplest and easiest to implement, it is not suitable for online automatic food-quality
evaluation using computer vision without any human participation. Therefore, methods
for automatically selecting a threshold have been developed.
Figure 2.4 Images of pizza toppings: (a) original image; (b) segmented image of (a).
Isodata algorithm
The first automatic threshold selection method was probably the isodata algorithm,
which was originally proposed by Ridler and Calvard (1978). In the algorithm,
a threshold is first guessed (in most cases the average intensity value
of the image is used) and then used to segment the histogram into two classes, i.e. A and B.
The average intensity values, mA and mB, of the two classes are calculated, and the new
threshold is then determined as the average of mA and mB. The threshold is updated
iteratively from the new average intensity values until convergence is achieved.
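The iteration just described can be sketched in Python on a histogram given as a list of pixel counts per intensity level. The name `isodata_threshold`, the convergence tolerance `tol`, the iteration cap `max_iter`, and the convention that class A holds the levels at or below the threshold are assumptions made for this illustration.

```python
def isodata_threshold(hist, tol=0.5, max_iter=100):
    """Iteratively refine a threshold as the midpoint of the two class means."""
    total = sum(hist)
    # Initial guess: the average intensity of the image.
    t = sum(i * h for i, h in enumerate(hist)) / total
    for _ in range(max_iter):
        n_a = sum(h for i, h in enumerate(hist) if i <= t)
        n_b = total - n_a
        if n_a == 0 or n_b == 0:           # one class is empty: stop
            break
        m_a = sum(i * h for i, h in enumerate(hist) if i <= t) / n_a
        m_b = sum(i * h for i, h in enumerate(hist) if i > t) / n_b
        new_t = (m_a + m_b) / 2
        converged = abs(new_t - t) < tol
        t = new_t
        if converged:
            break
    return t


balanced = [5, 10, 5, 0, 0, 0, 0, 5, 10, 5]     # two modes around levels 1 and 8
unbalanced = [0, 20, 0, 0, 0, 0, 0, 0, 10, 0]   # modes of unequal size
```

For both toy histograms the class means settle at 1 and 8, so the threshold converges to their midpoint.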
Alternatively, the objective function method might be used. Here, the histogram is
preliminarily normalized and regarded as probability distributions using equation (2.3):
$$ h(j) = \frac{H(j)}{\sum_{i=l}^{L} H(i)} \qquad (2.3) $$
The distribution is classified into two groups (i.e. objects and background) using
a threshold, which is an intensity value iteratively selected from the minimum to the
maximum of the intensity values. The optimal threshold is determined as the one that
maximizes an objective function, which evaluates the success of each candidate threshold
from the interaction of the two classes. Two kinds of objective functions
are mostly used: variance-based and entropy-based.
In the variance-based objective function (Otsu, 1979), the optimal threshold t is
selected to maximize the between-class variance, which can be calculated by
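$$ \sigma_B^2(t) = \frac{[\mu(L)\,\omega(t) - \mu(t)]^2}{\omega(t)\,[1 - \omega(t)]} \qquad (2.4) $$
where ω and µ are the zeroth- and first-order cumulative moments of the probability distribution, respectively, i.e. ω(t) is the sum of h(i) and µ(t) is the sum of i·h(i), both taken over i from l to t.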
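A compact Python sketch of the variance-based selection is shown below. It evaluates the between-class variance for every candidate threshold using running sums for the zeroth- and first-order cumulative moments. The function name `otsu_threshold`, the tie-breaking rule (the lowest maximizing level is kept), and the toy histograms are assumptions of this example.

```python
def otsu_threshold(hist):
    """Return the level t that maximizes the between-class variance."""
    total = sum(hist)
    mu_total = sum(i * c for i, c in enumerate(hist)) / total   # mu(L)
    best_t, best_var = None, -1.0
    count = 0        # cumulative pixel count up to t
    weighted = 0     # cumulative sum of i * H(i) up to t
    for t, c in enumerate(hist):
        count += c
        weighted += t * c
        if count == 0 or count == total:   # one class would be empty
            continue
        omega = count / total              # zeroth-order cumulative moment
        mu = weighted / total              # first-order cumulative moment
        var_between = (mu_total * omega - mu) ** 2 / (omega * (1.0 - omega))
        if var_between > best_var:
            best_t, best_var = t, var_between
    return best_t


balanced = [5, 10, 5, 0, 0, 0, 0, 5, 10, 5]
unbalanced = [0, 20, 0, 0, 0, 0, 0, 0, 10, 0]
```

Here class A contains the levels up to and including the returned value, so for `balanced` the threshold sits at the upper end of the lower mode.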
In the entropy-based objective function, the optimal threshold is selected as the
intensity value at which the sum of the entropies of the two classes is maximized. However,
different ways of calculating the sum entropy lead to different entropy thresholding methods, such as those proposed by Pun (1980), Kapur et al. (1985), Sahoo et al. (1997), etc.
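One way to compute the sum entropy, in the spirit of the approach attributed above to Kapur et al., is to normalize each class to its own probability distribution and add the two Shannon entropies. The Python sketch below follows that reading; the function names, the use of natural logarithms, and the tie-breaking rule are assumptions of this example, and the other cited variants differ in how the entropy is defined.

```python
import math


def class_entropy(counts):
    """Shannon entropy of one class, normalized to its own pixel count."""
    n = sum(counts)
    return -sum((c / n) * math.log(c / n) for c in counts if c > 0)


def entropy_threshold(hist):
    """Return the level t that maximizes the summed entropy of both classes."""
    best_t, best_entropy = None, -math.inf
    for t in range(len(hist) - 1):
        lower, upper = hist[:t + 1], hist[t + 1:]
        if sum(lower) == 0 or sum(upper) == 0:   # skip empty classes
            continue
        total_entropy = class_entropy(lower) + class_entropy(upper)
        if total_entropy > best_entropy:
            best_t, best_entropy = t, total_entropy
    return best_t


balanced = [5, 10, 5, 0, 0, 0, 0, 5, 10, 5]
```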
Researchers have compared these two objective functions. However, most of the comparisons are empirical; in other words, the performance
of the two objective functions is compared by applying each of them to segment a set of images. No theoretical comparison has so far been conducted, and thus
the comparison results depend on the set of images being used. Nevertheless,
some advantages and disadvantages of the two methods have already been found. It
is suggested that the variance-based objective function generally performs better than
the entropy-based one, except for images in which the population (the number of pixels) of one class is much larger than that of the other (Read, 1982).
The worst case, in which the variance-based objective function produces erroneous
results, occurs in images in which the ratio of the population of one class to that of the other is
lower than 0.01 (Kittler and Illingworth, 1985). In contrast, the entropy-based objective
functions retain a more stable performance across images with different population ratios, yet there is a major problem with entropy-based methods. When the probability
of an intensity value is very small, the entropy of the value is exponentially
larger than those of other values, which can introduce large computation
errors (Sahoo et al., 1997). The threshold selected is then much less reliable.
Histogram clustering
The clustering method that is mainly used in threshold selection is k-means clustering.
An intensity value from l to L is picked as the threshold to segment the histogram into
two classes, object and background, with mean intensity values of mA and mB . If the
threshold satisfies the criterion that every intensity value of class A (B) is closer to mA
(mB ) than to mB (mA ), the threshold is selected as a candidate threshold. Afterwards,
the partition error of each candidate threshold is computed using equation (2.5), and the
one with the smallest partition error is chosen as the optimal threshold.
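The candidate test and the final selection can be sketched as follows. In this Python example the partition error is taken to be the sum of squared deviations of each intensity from the mean of its own class, weighted by the histogram count; that reading, the function name `kmeans_threshold`, and the restriction of candidates to occupied levels are assumptions made for illustration.

```python
def kmeans_threshold(hist):
    """Pick the consistent two-class threshold with the smallest partition error."""
    levels = [i for i, c in enumerate(hist) if c > 0]     # occupied levels only
    best_t, best_error = None, float("inf")
    for t in levels[:-1]:
        a = [i for i in levels if i <= t]
        b = [i for i in levels if i > t]
        m_a = sum(i * hist[i] for i in a) / sum(hist[i] for i in a)
        m_b = sum(i * hist[i] for i in b) / sum(hist[i] for i in b)
        # Candidate criterion: every level is closer to its own class mean.
        consistent = (all(abs(i - m_a) <= abs(i - m_b) for i in a)
                      and all(abs(i - m_b) <= abs(i - m_a) for i in b))
        if not consistent:
            continue
        # Assumed partition error: weighted squared deviation from class means.
        error = (sum(hist[i] * (i - m_a) ** 2 for i in a)
                 + sum(hist[i] * (i - m_b) ** 2 for i in b))
        if error < best_error:
            best_t, best_error = t, error
    return best_t


balanced = [5, 10, 5, 0, 0, 0, 0, 5, 10, 5]
```

For `balanced`, only the split between the two modes passes the closeness criterion, so it is also the one returned.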
$$ e(t) = \sum_{i=l}^{L} H(i)\,[i - \mu(t)]^2 \qquad (2.5) $$
Other techniques
Besides the techniques described above, there are many other thresholding-based
segmentation techniques – for example, the minimum error technique (Kittler and
Illingworth, 1986), the moment-preserving technique (Tsai, 1985), the window extension method (Hwang et al., 1997), and the fuzzy thresholding technique (Tobias and
Seara, 2002). As these techniques are less popular and much more complex than the
isodata algorithm, objective function, and histogram clustering methods, they are only
mentioned here for completeness. Among the above automatic threshold selection
methods, no single one performs better overall than all of the others.
Therefore, it is recommended that several methods be tried in order to identify the one with
the best performance.
Furthermore, for the purpose of eliminating the effects of noise in segmentation, two-dimensional histogram thresholding can be used. The two-dimensional histogram
is constructed by considering the co-occurrence of the intensity value of each pixel and the
average intensity value of the pixel and its neighboring pixels (Abutaleb, 1989).
The threshold for a two-dimensional histogram is illustrated in Figure 2.5. Although
two-dimensional thresholding performs better than one-dimensional thresholding,
a far greater computation load is required for the two-dimensional technique; for
this reason, it is less popular in the food industry. Although the techniques described
above are all bi-level thresholding, apart from the isodata algorithm, most of them
can be easily expanded to tri-level thresholding simply by increasing the number of
classes segmented by the threshold to three: object, background1, and background2
(or object1, object2, and background).
Figure 2.5 Illustration of thresholding on a two-dimensional histogram (Zheng et al., 2006). Region A is
regarded as being object (background), and B as being background (object). Regions C and D are regarded
as noise and edges, and thus are ignored in threshold selection.
3.1.2 Image-opening and -closing
After image thresholding, some defects might be present in the images: for example,
some parts of objects might be misclassified as background, and some small regions of
background might be mistakenly segmented as objects. Consequently, image-opening
and image-closing are used for post-processing images segmented by thresholding.
Image-opening is image erosion, which removes boundary pixels from an object, followed by image
dilation, which merges neighboring pixels of an object into the object; it eliminates small
background regions that were mistakenly segmented as objects. Conversely, image-closing is
image dilation followed by image erosion, which restores the parts of objects that were misclassified as
background. An example is provided in Figure 2.6. To remove small defects, opening
consisting of one round of erosion and dilation, and closing consisting of one round of
dilation and erosion, is sufficient. When the size of the defects increases, more rounds
of erosion or dilation are required, and detail on the boundary of products may be lost.
Therefore, if the size of the defects in images after thresholding-based segmentation
is relatively large, an alternative thresholding technique rather than post-processing
should be adopted.
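The two post-processing operations can be sketched on a binary mask (1 for object, 0 for background) as follows. The sketch uses the standard morphological convention, in which opening is erosion followed by dilation and closing is dilation followed by erosion. The 3 × 3 neighborhood, the clipping of the neighborhood at the image border, and all function names are assumptions made for this example.

```python
def _neighborhood(mask, y, x):
    """Values in the 3 x 3 window around (y, x), clipped at the border."""
    rows, cols = len(mask), len(mask[0])
    return [mask[yy][xx]
            for yy in range(max(0, y - 1), min(rows, y + 2))
            for xx in range(max(0, x - 1), min(cols, x + 2))]


def dilate(mask):
    """A pixel becomes object if any pixel in its window is object."""
    return [[int(any(_neighborhood(mask, y, x))) for x in range(len(mask[0]))]
            for y in range(len(mask))]


def erode(mask):
    """A pixel stays object only if every pixel in its window is object."""
    return [[int(all(_neighborhood(mask, y, x))) for x in range(len(mask[0]))]
            for y in range(len(mask))]


def opening(mask, rounds=1):
    """Erosion then dilation: removes small spurious object regions."""
    for _ in range(rounds):
        mask = erode(mask)
    for _ in range(rounds):
        mask = dilate(mask)
    return mask


def closing(mask, rounds=1):
    """Dilation then erosion: fills small holes inside objects."""
    for _ in range(rounds):
        mask = dilate(mask)
    for _ in range(rounds):
        mask = erode(mask)
    return mask


# A 5 x 5 object in a 9 x 9 image, with a one-pixel hole or a stray speck.
solid = [[int(2 <= y <= 6 and 2 <= x <= 6) for x in range(9)] for y in range(9)]
holed = [row[:] for row in solid]
holed[4][4] = 0
speckled = [row[:] for row in solid]
speckled[0][8] = 1
```

One round of closing restores `holed` to the solid object, and one round of opening removes the speck from `speckled`, while the object boundary is unchanged in both cases.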
Figure 2.6 Image-opening and -closing for defects removal of the segmented image in Figure 2.3:
(a) opening with 3 rounds of erosion and dilation; (b) closing with 2 rounds of dilation and
3.2 Region-based segmentation
There are two region-based segmentation techniques: growing-and-merging (GM), and
splitting-and-merging (SM) (Navon et al., 2005). In the GM methods, a pixel is initially
selected as a growing region. Pixels neighboring the region are iteratively merged into
the region, if the pixels have similar characteristics (e.g. intensity and texture) to
the region concerned, until no more pixels can be merged. Afterwards, the growing
procedure is repeated with another pixel that has not been merged into any regions, until
all the pixels in the image have been merged into various regions. It usually happens
that images are over-segmented, which means that there are some regions that are too
small to remain as independent regions, mostly due to the presence of noise. Therefore,
post-processing is generally conducted to merge the over-segmented regions into their
nearby independent regions of larger area.
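A simplified Python sketch of the growing step is given below. A pixel is merged when its intensity differs from the current mean of the region by no more than a tolerance; each pixel is examined in a single breadth-first pass over 4-connected neighbors, and the final merging of over-segmented regions is omitted. The similarity rule, the tolerance `tol`, and the function name `grow_regions` are assumptions made for illustration.

```python
from collections import deque


def grow_regions(image, tol=10):
    """Label 4-connected regions grown from successive unlabeled seed pixels."""
    rows, cols = len(image), len(image[0])
    labels = [[0] * cols for _ in range(rows)]     # 0 means "not yet merged"
    next_label = 0
    for sy in range(rows):
        for sx in range(cols):
            if labels[sy][sx]:
                continue
            next_label += 1                         # start a new growing region
            labels[sy][sx] = next_label
            total, count = image[sy][sx], 1
            queue = deque([(sy, sx)])
            while queue:
                y, x = queue.popleft()
                for yy, xx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= yy < rows and 0 <= xx < cols
                            and not labels[yy][xx]
                            and abs(image[yy][xx] - total / count) <= tol):
                        labels[yy][xx] = next_label
                        total += image[yy][xx]
                        count += 1
                        queue.append((yy, xx))
    return labels


image = [[10, 12, 200], [11, 13, 205], [9, 12, 198]]
regions = grow_regions(image)
```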
In the SM methods, the whole image is initially regarded as one large region, and is split
iteratively into smaller regions with uniform image characteristics (e.g. color, gradient,
and texture). The segmentation is terminated when there are no longer any regions with
non-uniform characteristics to be split. As with GM, to overcome the problem of
over-segmentation, very small regions are merged into neighboring regions that are
large enough to be independent regions.
Region-based segmentation methods are usually proposed for the purpose of segmenting complex images in which the number of classes is large and unknown.
However, in the segmentation of food images, the number of classes is normally already
assigned as two – i.e. food products and background, or defect and non-defect. Further
to this, region-based techniques are usually time-consuming. Therefore, region-based
methods are less popular in the applications of computer vision in the food industry.
One of the few instances of the use of a region-based method is a stick growing-and-merging algorithm proposed by Sun and Du (2004), mostly for the segmentation of
pizza toppings, which cannot be segmented using thresholding-based methods.
3.3 Gradient-based segmentation
Computing the image gradient is favored simply because boundaries of local contrast
can be readily observed in the gradient images, and thus the edges of objects
can also be easily detected. Once the edges of objects in images are located, image
segmentation is accomplished at the same time. Therefore, gradient-based segmentation is
also called “edge detection.” Typically, in gradient-based segmentation, the gradient
of an image is computed using convolution gradient operators, and a threshold t is set to
distinguish effective edges, whose gradient is larger than t. The threshold can usually
be selected from the cumulative gradient histogram of the image, working on the
scheme that the 5–10 percent of pixels with the largest gradient are chosen as edges
(Jain, 1989).
3.3.1 Gradient operator
Considering the image as a function f of the intensity value of pixels (x, y), the gradient
g can be computed by:
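$$ g = \sqrt{\left(\frac{\partial f}{\partial x}\right)^2 + \left(\frac{\partial f}{\partial y}\right)^2} \qquad (2.6) $$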
In digital images, a gradient operator is similar to an averaging operator (for noise
removal), which is a weighted convolution operator utilizing the neighboring pixels
for the operation. However, unlike the averaging operator, the weightings of a gradient
operator are not exclusively positive integers. Indeed, at least one negative integer is
present in the weighting so that the intensity value of the central pixel can be subtracted
from the values of the neighboring pixels, in order to increase the contrast among
adjacent pixels for computing gradients. Gradients can be computed in a total of eight
directions (see Figure 2.7). Furthermore, the sum of the weights of a gradient operator is
usually 0. Some of the well-known gradient operators that have been widely used are
the Sobel, Prewitt, Roberts, and Kirsch operators (Russ, 1999).
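As an illustration, the Python sketch below computes the gradient magnitude with the horizontal and vertical Sobel kernels and then marks pixels whose gradient exceeds a threshold t. The function names, the zero gradient assigned to border pixels, and the sample image are assumptions of this example.

```python
import math

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]     # responds to vertical edges
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]     # responds to horizontal edges


def gradient_magnitude(image):
    """Gradient magnitude from the two Sobel responses at each interior pixel."""
    rows, cols = len(image), len(image[0])
    g = [[0.0] * cols for _ in range(rows)]        # border pixels are left at 0
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            gx = sum(SOBEL_X[j][i] * image[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            gy = sum(SOBEL_Y[j][i] * image[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            g[y][x] = math.hypot(gx, gy)
    return g


def edge_mask(magnitude, t):
    """Mark pixels whose gradient magnitude is larger than the threshold t."""
    return [[int(v > t) for v in row] for row in magnitude]


step_edge = [[0, 0, 100, 100] for _ in range(4)]
g = gradient_magnitude(step_edge)
edges = edge_mask(g, 100)
```

Both kernels contain negative weights and sum to zero, so a region of constant intensity produces no response.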
Figure 2.7 Eight possible directions in which to compute the gradient.
3.3.2 Laplace operator
Although most of the operators described above are competent when the intensity
transition in images is very abrupt, as the intensity transition range gradually gets
wider the gradient operators might not be as effective as they are supposed
to be. Consequently, the second-order derivative operators depicted below might be
considered as alternatives to the gradient operators:
$$ \nabla^2 f = \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} \qquad (2.7) $$
Similarly, the second-order derivative operators are also convolution operators in digital
images. The following is one of the widely used derivative operators, the Laplace
operator, in which the second-order derivative is determined by subtracting intensity
values of the neighboring pixels from the value of the central pixel:
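A commonly used 3 × 3 kernel that matches this description weights the central pixel by 4 and each of its four direct neighbors by −1. The Python sketch below applies that kernel; the choice of the four-neighbor form (rather than the eight-neighbor variant), the name `laplace`, and the zero value assigned to border pixels are assumptions made for illustration.

```python
LAPLACE_4 = [[0, -1, 0],
             [-1, 4, -1],
             [0, -1, 0]]


def laplace(image):
    """Central pixel times 4 minus its four direct neighbors (interior only)."""
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]        # border pixels are left at 0
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            out[y][x] = sum(LAPLACE_4[j][i] * image[y + j - 1][x + i - 1]
                            for j in range(3) for i in range(3))
    return out


ramp = [[10 * x for x in range(5)] for _ in range(5)]    # constant slope
spike = [[0, 0, 0], [0, 50, 0], [0, 0, 0]]               # single noisy pixel
```

The response is zero everywhere on `ramp`, because a constant slope has no second derivative, while the isolated pixel in `spike` is amplified fourfold, which illustrates the sensitivity to noise.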
However, the Laplace operator is very sensitive to noise, and thus it is not rated as a
good edge detector. Instead, some generalized Laplace operators might be used, such
as the approximation of the Laplacian of Gaussian function, which is a powerful zero-crossing detector for edge detection (Marr and Hildreth, 1980). To our knowledge these
operators have not yet been employed in the food industry, so they are not discussed
further here.
3.3.3 Other techniques
The first quantitative measurements of the performance of edge detectors, including
the assessment of the optimal signal-to-noise ratio and the optimal locality, and the
maximum suppression of false response, were performed by Canny (1986), who also
proposed an edge detector taking into account all three of these measurements. The
Canny edge detector was used in the food industry for boundary extraction of food
products (Du and Sun, 2004; 2006b; Jia et al., 1996).
Another popular gradient-based technique is the active contour model (ACM),
otherwise known as “Snakes,” which transforms the problem of edge detection into an
energy optimization problem. An active and deformable contour of the object is first
defined and then, step-by-step, the active contour is moved towards the real object
contour by minimizing the energy. The primary disadvantage of the ACM is that the
initial contour sometimes cannot be close enough to the object edge, causing failure of
convergence of the active contour with the object edge. Fortunately, this problem can
be solved by the gradient vector flow (GVF), which can overcome the defect of the
traditional external flow and move the active contour towards the desired object edge
more efficiently. So far, the ACM method has been proposed for the segmentation of
touching, adjacent rice kernels (Wang and Chou, 1996). However, the technical details
of the ACM and GVF are far beyond our discussion here. Readers interested in these
techniques can refer to the original work on ACM and GVF by Kass et al. (1988) and
Xu and Prince (1998), respectively.
3.4 Classification-based segmentation
Classification-based segmentation is the second most popular method, after
thresholding-based segmentation, used in the food industry. Classification-based segmentation is a pixel-orientated method in which each pixel is regarded as being
an independent observer whose variables are generally obtained by image features
(e.g. color, shape, and texture). Afterwards, a matrix that contains every pixel as an
observer is obtained as the input of the classification. Each observer is then classified
(object and background, or defect and non-defect, etc.) according to its variables, using
a learning model (Du and Sun, 2006c). Normally, a set of images that has been successfully
segmented manually using human vision is provided as the training set (called supervised learning) in the classification. Coefficients of the learning model are obtained
so that the testing image set can be classified using the same model with the acquired
coefficients. An example of the supervised classification procedure is illustrated in
Figure 2.8. Although having the training image set is an advantage, it is not absolutely
necessary because there are some unsupervised learning techniques available, such as
clustering and the self-organizing map, by which the observers can be clustered into
different classes without any a priori knowledge. Nevertheless, unsupervised
training is not as accurate as supervised training in most cases; therefore, it is still preferable
to use the training image set (supervision) if possible.
One drawback of the classification-based methods compared with gradient-based
and region-based techniques is that the goal of the segmentation needs to be known prior
to carrying out segmentation – in other words, the number of classes that the images
can be segmented into should be given. For instance, in the segmentation of a food
product from the background, segmenting into two classes (i.e. object and background)
is the segmentation goal; in defect detection in apples, the goal of segmentation is
defect and non-defect. Fortunately, in most segmentation cases in the food industry
the goal of segmentation is known beforehand. Therefore, classification-based
segmentation is widely used in the food industry. Another drawback of this technique
is that its performance is subject to two major factors, i.e. the features obtained from
images as variables of the observers and the learning models used.
Figure 2.8 Classification-based segmentation. Diagram labels: pixel features (Color, Texture, . . .), pixel classes (A / B / . . .), and the learning model.
3.4.1 Features extraction
Since pixel intensity value is the primary information stored within pixels, it is the most
popular and important feature used for classification. The intensity value for each pixel
is a single value for a gray-level image, or three values for a color image. An alternative
approach to the acquisition of intensity values from a single image is the multispectral
imaging technique, with which more than one image of the same product at the same
location can be obtained at different wavelengths. Afterwards, intensity values of the
same pixel are acquired from the various images as the classification features of pixels.
This technique has drawn strong interest from researchers carrying out work in apple-quality evaluation using computer vision technology (Leemans et al., 1999; Blasco
et al., 2003; Kleynen et al., 2005). Sometimes, to acquire more information about a
pixel, its features can be extracted from a small region that is centered on the pixel.
Therefore, besides the intensity value, the image texture – which is an important factor
of the product surface for pattern recognition due to its powerful discrimination ability
(Amadasun and King, 1989) – can also be extracted as a classification feature of pixels.
For further technical information on the extraction of image texture features, refer to
the review by Zheng et al. (2006).
3.4.2 Classification methods
Dimension reduction
Since a large amount of data is present in the input matrix for classification,
it is generally preferred that the dimension of the original matrix is reduced
before classification. Although principal component analysis (PCA) is a powerful
dimension-reduction method, it is mostly used for the purpose of reducing classification variables. Consequently, PCA is not suitable for classification-based segmentation
because classification-based segmentation demands a reduction in the number of classification observers. Accordingly, the self-organizing map (SOM) has been developed.
The SOM, generalized by extracting the intrinsic topological structure of the input
matrix from the regularizations and correlations among observers, is an unsupervised
neural network in which each neuron represents a group of observers with similar
variables. Afterwards, the SOM can be used for classification rather than the original
observers, and the observers are assigned to the class of the neuron that the observers
belong to (Chtioui et al., 2003; Marique et al., 2005).
Classification
Although there are several different types of techniques available at this stage –
i.e. statistical technique (ST), neural network (NN), support vector machine (SVM),
and fuzzy logic (FL) – only the Bayesian theory (a ST method) and fuzzy clustering (combination of ST and FL) have been proposed in the food industry so far. The
Bayesian theory generates the Bayesian probability P(Ci |X ) for a pixel (observer) to
belong to the class Ci by its features (variables) X using the following equation:
$$ P(C_i \mid X) = \frac{P(X \mid C_i)\, P(C_i)}{P(X)} $$
where P(X|Ci) is the probability of an observer belonging to Ci having the variable X;
P(Ci) is the a priori probability of classifying an observer into class Ci; and P(X) is
the a priori probability of an observer having the variable X. A threshold on the
Bayesian probability is then selected, and if the probability of an observer is larger than the
threshold, the observer is classified into the class Ci.
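A small numerical sketch in Python may help to make the rule concrete. The class names, the single discrete feature ("dark" or "bright"), and all probability values below are invented purely for illustration; P(X) is obtained by summing P(X|Ci)P(Ci) over the classes.

```python
def posterior(likelihoods, priors, x):
    """Return P(C_i | X = x) for every class C_i."""
    # P(X): total probability of observing x over all classes.
    evidence = sum(likelihoods[c][x] * priors[c] for c in priors)
    return {c: likelihoods[c][x] * priors[c] / evidence for c in priors}


# Hypothetical values, e.g. estimated from a manually segmented training set.
priors = {"object": 0.3, "background": 0.7}
likelihoods = {
    "object": {"dark": 0.9, "bright": 0.1},
    "background": {"dark": 0.2, "bright": 0.8},
}

post = posterior(likelihoods, priors, "dark")
threshold = 0.5
is_object = post["object"] > threshold
```

With these made-up numbers a dark pixel has a posterior probability of about 0.66 of being object, so it exceeds the threshold of 0.5 and is assigned to that class.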
Fuzzy clustering is a combination of conventional k-means clustering and a fuzzy
logic system in order to simulate the experience of complex human decisions and
uncertain information (Chtioui et al., 2003; Du and Sun, 2006c). In fuzzy clustering,
each observer is assigned a fuzzy membership value for each class, and an objective
function is then developed based on the fuzzy membership values. The objective function
is minimized iteratively, until convergence is reached, by updating the fuzzy
membership values according to the observers and the number of iterations. The criterion
determining the convergence of the objective function is generally that the
difference between the values of the objective function in two successive iterations is
sufficiently small.
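The iteration can be sketched with the widely used fuzzy c-means formulation, applied here to one-dimensional intensity values. Using that particular formulation, the fuzziness exponent m = 2, the stopping tolerance `eps`, and the function name `fuzzy_c_means` are assumptions of this example rather than details given in the text.

```python
def fuzzy_c_means(values, centers, m=2.0, eps=1e-6, max_iter=100):
    """Alternate membership and center updates until the objective settles."""
    k = len(centers)
    previous = float("inf")
    memberships = []
    for _ in range(max_iter):
        # Membership of each value in each class, from relative distances.
        memberships = []
        for x in values:
            d = [max(abs(x - c), 1e-12) for c in centers]   # avoid division by 0
            memberships.append(
                [1.0 / sum((d[i] / d[j]) ** (2.0 / (m - 1.0)) for j in range(k))
                 for i in range(k)])
        # Class centers as membership-weighted means.
        centers = [
            sum(u[i] ** m * x for u, x in zip(memberships, values))
            / sum(u[i] ** m for u in memberships)
            for i in range(k)]
        # Objective function: membership-weighted squared distances.
        objective = sum(u[i] ** m * (x - centers[i]) ** 2
                        for u, x in zip(memberships, values) for i in range(k))
        if abs(previous - objective) < eps:     # change is sufficiently small
            break
        previous = objective
    return centers, memberships


values = [10, 12, 11, 200, 205, 198]
centers, memberships = fuzzy_c_means(values, [0.0, 255.0])
```

For these two well-separated groups the centers settle close to the group means, and each value receives a membership near 1 for its own class.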
3.5 Other segmentation techniques
3.5.1 Watershed
The concept of watersheds, which are introduced into digital images for morphological processing, originally comes from topography. In morphological processing,
images are represented as topographical surfaces on which the elevation of each point
is assigned as the intensity value of the corresponding pixel. Before the detection of
watersheds in images, two concepts (i.e. the minimum and catchment basin) need to
be defined. The minimum is a set of connected pixels with the lowest intensity value
in images, while the catchment basin, covering the minimum, is another set of pixels
in which water only flows across pixels to the minimum inside (Vincent and Soille,
1991). As water gradually floods from the minimum of a catchment basin,
dams corresponding to watersheds are built surrounding the catchment basin to prevent water from spilling into another catchment basin. Accordingly, regions are formed
using the watersheds, and image segmentation is accomplished at the same time.
The watersheds can be constructed from different types of images: grayscale (Vincent
and Soille, 1991), binary (Casasent et al., 2001), and gradient (Du and Sun, 2006a).
Owing to the presence of noise and local irregularities, far more minima exist than desired,
from which far more catchment basins are formed, causing the over-segmentation of
images. To overcome this problem, several algorithms have been designed. One method for preventing over-segmentation is to eliminate the undesired minima, using morphological
operators such as opening and closing. One such method was proposed by Du and
Sun (2006a) to segment pores in pork ham images. In other methods, post-processing
is conducted to merge the over-segmented regions with similar image characteristics
together again. Such a method with a graphic algorithm to determine the similarity of
merging neighboring regions was developed by Navon et al. (2005).
3.5.2 Hybrid-based segmentation
Although a large number of segmentation techniques have been developed to date,
no universal method can perform with ideal efficiency and accuracy across the
infinite diversity of imagery (Bhanu et al., 1995). Therefore, several
techniques often need to be combined in order to improve the segmentation results and
increase the adaptability of the methods. For instance, Hatem and Tan (2003) developed
an algorithm with an accuracy of 83 percent for the segmentation of cartilage and
bone in images of vertebrae by using thresholding twice. First, the
images were segmented by a simple threshold to form candidate regions of cartilage
and bone. Subsequently, another two thresholds – one based on size and the other on
elongation – were used to filter out segmented regions that were not
real cartilage or bone. Although classification-based segmentation yields better
results than thresholding-based methods in the segmentation of longissimus
dorsi beef images, classification-based methods are considerably slower. Therefore,
in a study by Subbiah et al. (2004), a classification-based method was first employed
to segment the longissimus dorsi in a set of
images, from which an ideal threshold for histogram thresholding was automatically
computed and then used to segment the rest of the images. This algorithm retained the
accuracy of the classification-based segmentation (being only 0.04 percent lower),
and meanwhile reduced the computation time by 40 percent.
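A two-stage scheme of the kind attributed to Hatem and Tan (2003), an intensity threshold followed by size and elongation filters, might be sketched as follows. The function names, the 4-connectivity, the bounding-box definition of elongation, and the direction of the filters (keeping regions that are large enough and not too elongated) are all assumptions made for illustration, not details taken from that study.

```python
def threshold(image, t):
    """Binary mask: 1 where the intensity exceeds the threshold t."""
    return [[1 if p > t else 0 for p in row] for row in image]

def label_regions(mask):
    """Group foreground pixels into 4-connected regions (lists of (row, col))."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    regions = []
    for r in range(h):
        for c in range(w):
            if mask[r][c] and not seen[r][c]:
                stack, pixels = [(r, c)], []
                seen[r][c] = True
                while stack:
                    y, x = stack.pop()
                    pixels.append((y, x))
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            stack.append((ny, nx))
                regions.append(pixels)
    return regions

def bounding_box_elongation(pixels):
    """Assumed elongation measure: long side over short side of the bounding box."""
    ys = [y for y, _ in pixels]
    xs = [x for _, x in pixels]
    height = max(ys) - min(ys) + 1
    width = max(xs) - min(xs) + 1
    return max(height, width) / min(height, width)

def hybrid_segment(image, t, min_size, max_elongation):
    """First threshold on intensity, then filter the regions by size and elongation."""
    regions = label_regions(threshold(image, t))
    return [p for p in regions
            if len(p) >= min_size and bounding_box_elongation(p) <= max_elongation]
```

On a small test image containing a compact blob, a single noisy pixel, and a thin line, `hybrid_segment(image, 5, min_size=2, max_elongation=3)` would keep only the compact blob.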
4 Conclusions
Owing to the imperfections of image acquisition systems, image pre-processing such
as image filtering and histogram manipulation is performed to remove noise and
enhance contrast for the purpose of facilitating subsequent processing. Later, image
segmentation is conducted to discriminate food products from the background for
further analysis.
Thresholding-based segmentation segments images by their histograms using an
optimal threshold that can be chosen by manual selection, isodata algorithm, objective functions, clustering, and many other techniques. Image-closing and -opening are
sometimes employed to correct the segmentation errors produced by thresholding.
In region-based segmentation, two schemes might be considered – growing-and-merging, and splitting-and-merging. Gradient-based segmentation, also known as edge
detection, segments images by detecting the edges of objects, utilizing gradient
operators, derivative operators, and active contour models. In classification-based segmentation, pixels are allocated to different classes (e.g. objects and background) by
features such as intensity and texture. Other techniques, such as the use of watersheds,
have also been developed.
Despite this, because image segmentation is by nature still an ill-defined problem, none of the methods described can perform ideally across diverse images. It has
been suggested recently that several techniques might be combined together for the
sake of improving the segmentation result and simultaneously increasing segmentation
Nomenclature
i, j
x, y
first-order cumulative
between-class variance
zeroth-order cumulative
partition error
transformed image
normalized histogram
maximum intensity
minimum intensity
size of image filters
average intensity
weight of image filters
fuzzy logic
neural networks
PCA principal component analysis
SOM self-organizing map
statistical learning
SVM support vector machines
References
Abutaleb AS (1989) Automatic thresholding of grey-level pictures using two-dimensional
entropies. Pattern Recognition, 47 (1), 22–32.
Amadasun M, King R (1989) Textural features corresponding to textural properties. IEEE
Transactions on Systems, Man, and Cybernetics, 19 (5), 1264–1274.
Bhanu B, Lee S, Ming J (1995) Adaptive image segmentation using a genetic algorithm.
IEEE Transactions on Systems, Man, and Cybernetics, 25 (12), 1543–1567.
Blasco J, Aleixos N, Moltó E (2003) Machine vision system for automatic quality grading
of fruit. Biosystems Engineering, 85 (4), 415–423.
Canny J (1986) A computational approach to edge detection. IEEE Transactions on Pattern
Analysis and Machine Intelligence, 8 (6), 679–698.
Casasent D, Talukder A, Keagy P, Schatzki T (2001) Detection and segmentation of items
in X-ray imagery. Transactions of the ASAE, 44 (2), 337–345.
Chtioui Y, Panigrahi S, Backer LF (2003) Self-organizing map combined with a
fuzzy clustering for color image segmentation. Transactions of the ASAE, 46 (3),
Du C-J, Sun D-W (2004) Shape extraction and classification of pizza base using computer
vision. Journal of Food Engineering, 64 (4), 489–496.
Du C-J, Sun D-W (2006a) Automatic measurement of pores and porosity in pork ham
and their correlations with processing time, water content and texture. Meat Science,
72 (2), 294–302.
Du C-J, Sun D-W (2006b) Estimating the surface area and volume of ellipsoidal ham using
computer vision. Journal of Food Engineering, 73 (3), 260–268.
Du C-J, Sun D-W (2006c) Learning techniques used in computer vision for food quality
evaluation: a review. Journal of Food Engineering, 72 (1), 39–55.
Faucitano L, Huff P, Teuscher F, Gariepy C, Wegner J (2005) Application of computer image
analysis to measure pork marbling characteristics. Meat Science, 69 (3), 537–543.
Gauch JM (1992) Investigations of image contrast space defined by variations on histogram
equalization. CVGIP: Graphical Models and Image Processing, 54 (4), 269–280.
Hatem I, Tan J (2003) Cartilage and bone segmentation in vertebra images. Transactions
of the ASAE, 46 (5), 1429–1434.
Hwang H, Park B, Nguyen M, Chen Y-R (1997) Hybrid image processing for robust extraction of lean tissue on beef cut surface. Computers and Electronics in Agriculture,
17 (3), 281–294.
Jain AK (1989) Fundamentals of Digital Image Processing. Englewood Cliffs:
Kapur JN, Sahoo PK, Wong AKC (1985) A new method for gray level picture thresholding
using the entropy of the histogram. Computer Vision, Graphics, and Image Processing,
29, 273–285.
Kass M, Witkin A, Terzopoulos D (1988) Snakes: active contour models. International Journal
of Computer Vision, 1 (4), 321–331.
Kittler J, Illingworth J (1985) On threshold selection using clustering criteria. IEEE
Transactions on Systems, Man, and Cybernetics, 15 (5), 652–665.
Kittler J, Illingworth J (1986) Minimum error thresholding. Pattern Recognition, 19 (1),
Kleynen O, Leemans V, Destain M-F (2005) Development of a multi-spectral vision system
for the detection of defects on apples. Journal of Food Engineering, 69 (1), 41–49.
Leemans V, Magein H, Destain M-F (1999) Defect segmentation on ‘Jonagold’ apples
using color vision and a Bayesian classification method. Computers and Electronics
in Agriculture, 23 (1), 43–53.
Marique T, Pennincx S, Kharoubi A (2005) Image segmentation and bruise identification
on potatoes using a Kohonen’s self-organizing map. Journal of Food Science, 70 (7),
Marr D, Hildreth E (1980) Theory of edge detection. Proceedings of the Royal Society of
London B, 207, 187–217.
Navon E, Miller O, Averbuch A (2005) Image segmentation based on adaptive local
thresholds. Image and Vision Computing, 23 (1), 69–85.
Otsu N (1979) A threshold selection method from gray-level histograms. IEEE Transactions
on Systems, Man, and Cybernetics, 9 (1), 62–66.
Pun T (1980) A new method for gray-level picture thresholding using the entropy of the
histogram. Signal Processing, 2 (3), 223–237.
Read W (1982) Comments on two papers in pattern recognition. IEEE Transactions on
Systems, Man, and Cybernetics, 12, 429–430.
Ridler TW, Calvard S (1978) Picture thresholding using an iterative selection method.
IEEE Transactions on Systems, Man, and Cybernetics, 8 (8), 630–632.
Russ JC (1999) The Image Processing Handbook, 3rd edn. Boca Raton: CRC Press.
Sahoo P, Wilkins C, Yeager J (1997) Threshold selection using Renyi’s entropy. Pattern
Recognition, 30 (1), 71–84.
Subbiah J, Ray N, Kranzler GA, Acton ST (2004) Computer vision segmentation of
the longissimus dorsi for beef quality grading. Transactions of the ASAE, 47 (4),
Sun D-W (2000) Inspecting pizza topping percentage and distribution by a computer vision
method. Journal of Food Engineering, 44 (4), 245–249.
Sun D-W, Du C-J (2004) Segmentation of complex food images by stick growing and
merging algorithm. Journal of Food Engineering, 61 (1), 17–26.
Tobias OJ, Seara R (2002) Image segmentation by histogram thresholding using fuzzy sets.
IEEE Transactions on Image Processing, 11 (12), 1457–1465.
Tsai WH (1985) Moment-preserving thresholding: a new approach. Computer Vision,
Graphics, and Image Processing, 29 (3), 377–393.
Vincent L, Soille P (1991) Watersheds in digital spaces: an efficient algorithm based
on immersion simulations. IEEE Transactions on Pattern Analysis and Machine
Intelligence, 13 (6), 583–598.
Wang Y-C, Chou J-J (1996) Automatic segmentation of touching rice kernels with an active
contour model. Transactions of the ASAE, 47 (5), 1803–1811.
Xu C, Prince JL (1998) Snakes, shapes, and gradient vector flow. IEEE Transactions on
Image Processing, 7 (3), 359–363.
Zheng C, Sun D-W, Zheng L (2006) Recent applications of image texture for evaluation of
food qualities – a review. Trends in Food Science & Technology, 17 (3), 113–128.
Object Measurement
Chaoxin Zheng and Da-Wen Sun
Food Refrigeration and Computerised Food Technology,
University College Dublin, National University of Ireland,
Dublin 2, Ireland
1 Introduction
After image segmentation, where objects are discriminated from the background, the
characteristics of objects, known as object measurements, are calculated. These measurements are the core elements in a computer vision system, because they contain
useful information for image understanding and interpretation, and object classification (Ballard and Brown, 1982). In the food industry, these object measurements
carry the direct information that can be used for quality evaluation and inspection.
Unsuccessful extraction of the proper object measurements would probably result in
the failure of the computer vision system for food quality inspection.
In computers, images are stored and processed in the form of matrices. Elements of
the matrices are referred to as pixels, in which two types of information are presented –
geometric information (i.e. the location of pixels in images) and surface information
(the intensity values associated with pixels). From the geometric information, two
different object measurements can be obtained: size and shape. From the surface information, color and texture can be extracted. These four measurements – size, shape,
color, and texture – are rated as the primary types of object measurements that can be
acquired from any images (Du and Sun, 2004a).
A great number of methods have been developed for the acquisition of object measurements, including size, shape, color, and texture, over the past few decades. Even
so, there is not yet a perfect method for each type of measurement, and especially
for texture measurements. This is because of the lack of a formal and scientific definition of image texture while facing the infinite diversity of texture patterns (Zheng
et al., 2006a). Each method also has limitations that prevent it from
working properly under certain circumstances. For example, the Fourier transform, which is
a potential method for extracting shape measurements, will not work properly when
there are re-entrants on the boundary of objects (Russ, 1999).
Computer Vision Technology for Food Quality Evaluation
ISBN: 978-0-12-373642-0
Copyright © 2008, Elsevier Inc.
All rights reserved
The objective of this chapter is to review the current methods available for the
extraction of object measurements. The advantages and disadvantages of most methods
are also discussed in order to provide those researchers in the food industry who intend
to pursue computer vision for quality evaluation with some guidelines on choosing
effective object measurement methods.
2 Size
Since three-dimensional (3-D) information regarding objects is lost during image
acquisition unless special techniques such as structured lighting are used (Baxes, 1994),
size measurements of objects in digital images are restricted to being one-dimensional
(1-D) and two-dimensional (2-D). The measurements of volume and surface area,
which are 3-D measurements, are thus less popular. Length, width, area, and perimeter
are the preferred measurements, and especially the latter two. The area and perimeter
are calculated simply by counting the number of pixels belonging to an object, and
summing the distance between every two neighboring pixels on the boundary of the
object, respectively. No matter how irregular the shape of the object, or what its orientation is, measurements of area and perimeter are stable and efficient once the object
has been successfully segmented from the background.
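The two counting rules just described can be sketched as follows. The sketch assumes that segmentation has already produced a binary mask and an ordered list of boundary pixels (for example, from a boundary-following routine); the function names are illustrative only.

```python
import math

def area(mask):
    """Area: the number of pixels that belong to the object."""
    return sum(1 for row in mask for p in row if p)

def perimeter(boundary):
    """Perimeter: summed distance between successive boundary pixels.

    `boundary` is an ordered list of (x, y) pixels; the last pixel is
    connected back to the first to close the contour.
    """
    total = 0.0
    for (x0, y0), (x1, y1) in zip(boundary, boundary[1:] + boundary[:1]):
        total += math.hypot(x1 - x0, y1 - y0)
    return total
```

Horizontal and vertical steps contribute 1 to the perimeter and diagonal steps contribute the square root of 2, so the result does not depend on where the boundary list starts.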
Calculation of the length and width is much more complex than that of area and
perimeter, especially for food objects, which generally have very irregular shapes. Nevertheless, some measurements for length and width have been developed by researchers
and are used in the food industry. The measurements most commonly used are Feret’s
Diameter, the major axis, and the minor axis (Zheng et al., 2006b). Feret’s Diameter
is defined as the difference between the largest and the smallest of the coordinates of
an object at different rotations (Figure 3.1). The major axis is the longest line that can
be drawn across the object, and is obtained by examining the distance between every
two boundary pixels and taking the longest. The minor axis, defined as the longest line
that can be drawn across the object perpendicular to the major axis, can therefore be
determined after determining the major axis. Further to these, the major and minor axes
can also be defined as those of an ellipse which is fitted to the object using ellipse-fitting
methods (Russ, 1999; Mulchrone and Choudhury, 2004; Zheng et al., 2006c).
Figure 3.1 Illustration of Feret’s Diameter.
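Feret's Diameter and the major axis, as defined above, can be sketched as follows. The sketch assumes that the object is given as a list of (x, y) pixel coordinates and that a rotation is expressed as an angle in radians; projecting the coordinates onto a direction is used here as an equivalent of rotating the object, and the function names are hypothetical.

```python
import math

def feret_diameter(points, angle):
    """Largest minus smallest coordinate of the object along direction `angle`."""
    projections = [x * math.cos(angle) + y * math.sin(angle) for x, y in points]
    return max(projections) - min(projections)

def major_axis(boundary):
    """Longest distance between any two boundary pixels, with its two end points."""
    best = (0.0, None, None)
    for i, (x0, y0) in enumerate(boundary):
        for x1, y1 in boundary[i + 1:]:
            d = math.hypot(x1 - x0, y1 - y0)
            if d > best[0]:
                best = (d, (x0, y0), (x1, y1))
    return best
```

The exhaustive pairwise search in `major_axis` grows with the square of the number of boundary pixels, which illustrates why length and width are more costly to obtain than area and perimeter.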
One drawback to length and width measurements is that the orientation at which
the length and width are measured must be determined prior to the calculation. Since
the shape of food products generally changes during processing, the orientation at
which the length and width are calculated needs constantly to be updated. Far more
calculations will thus be required, and this is undesirable for on-line food quality
evaluation. Consequently, area and perimeter measurements are preferable to length
and width measurements for the evaluation of the size of products such as tomatoes
(Tu et al., 2000; Devaux et al., 2005), pork (Collewet et al., 2005; Faucitano et al.,
2005), and grains (Srikaeo et al., 2006).
3 Shape
Shape, as with size, is another geometric measurement of food products. Furthermore, shape plays an important part in the purchase decision of customers (Leemans
and Destain, 2004), and this establishes the significance of shape measurement in
the applications of computer vision for food quality inspection. Typical applications
of shape include the evaluation of product acceptance to customers, using machine
learning techniques (Du and Sun, 2004a, 2006; Leemans and Destain, 2004), and
the discrimination of products with different characteristics (Ghazanfari and Irudayaraj, 1996; Zion et al., 1999, 2000). An example illustrating pizza bases with
different shapes is shown in Figure 3.2 (Du and Sun, 2004b). Along with these
applications, many methods have been developed to characterize product shapes,
including two major categories – size-dependent measurements and size-independent
measurements.
Figure 3.2 Pizza bases of different shapes (Du and Sun, 2004b): (a) flowing; (b) poor alignment; (c) poor processing; (d) standard.
3.1 Size-dependent measurements
Size-dependent measurements (SDM) are descriptors of shape. These descriptors are
formed by the proper combinations of size measurements. The SDM that have been
applied in the food industry include (Zheng et al., 2006b):
Compactness, which is the ratio of the area to the square of the perimeter
Elongation, which is the ratio of the major axis to the minor axis
Convexity, which is the ratio of the convex perimeter to the perimeter
Roughness, which is the ratio of the area to the square of the major axis.
It can be seen that the definitions of these SDM are easy to understand, and that their
calculation is also straightforward. Compactness provides a good example of how to
describe shape by using SDM. For a perfectly circular food product, compactness reaches
its largest value, which equals 1 when the ratio is scaled by 4π. Variations of the shape,
as more and more corners are
added to the product, will gradually reduce the value of compactness.
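The four SDM are simple ratios of size measurements, as the following sketch shows. The function names are illustrative, and the optional 4π scaling of compactness is an assumption introduced so that a circle gives exactly 1.

```python
import math

def compactness(area, perimeter, normalized=True):
    """Area over squared perimeter; scaled by 4*pi so that a circle gives 1."""
    ratio = area / perimeter ** 2
    return 4.0 * math.pi * ratio if normalized else ratio

def elongation(major_axis, minor_axis):
    """Major axis over minor axis."""
    return major_axis / minor_axis

def convexity(convex_perimeter, perimeter):
    """Convex perimeter over perimeter."""
    return convex_perimeter / perimeter

def roughness(area, major_axis):
    """Area over squared major axis."""
    return area / major_axis ** 2
```

For a circle of radius 2 (area 4π, perimeter 4π) the scaled compactness is 1, whereas a unit square (area 1, perimeter 4) gives π/4, or about 0.785, which illustrates how corners reduce the value.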
3.2 Size-independent measurements
The ideal measurement of shape is that which can be used to discriminate one shape
adequately from another. In other words, with this ideal measurement, every shape
has a unique value (Russ, 1999). It is thus a matter of concern that size-dependent
measurements (SDM) may be insufficient to characterize the shape of every food
product because of the great irregularities of shape – consider, for example, a head
of broccoli, and the entire body of a fish. The chance of two different, very irregular
shapes having the same value under these simple combinations of size measurements
is still very large. Size-independent measurements (SIM), including region-based and
boundary-based methods, have consequently been developed.
3.2.1 Region-based method
The region-based method, also known as the spatial moment, is based on the statistical
characteristics of object regions. As pixels are the basic elements forming an object
region in digital images, the spatial moment consists of the statistics regarding the
spatial information of all pixels inside the object (Jain, 1989). The most basic measurement by spatial moment is the centre of mass $(\bar{x}, \bar{y})$, which can be calculated by
the following equations:
$$\bar{x} = \frac{1}{N}\sum_{x}\sum_{y} x \qquad (3.1)$$
$$\bar{y} = \frac{1}{N}\sum_{x}\sum_{y} y \qquad (3.2)$$
where N is the total number of pixels inside the object, and (x, y) are the coordinates of
a pixel. The (p, q) order of the central moment can thus be obtained by:
$$M_{pq} = \sum_{x}\sum_{y} (x - \bar{x})^{p}\,(y - \bar{y})^{q} \qquad (3.3)$$
Strictly speaking, the spatial moment measures the properties of an object rather than those
of its shape. It is an effective method for the purpose of discriminating one shape
from another, whereas its ability to describe how object shapes change is
limited (Zheng et al., 2006b). Applications of the spatial moment can be found in the
classification of fish species (Zion et al., 1999, 2000), where the Fourier transform
cannot work properly owing to the re-entrants present on the boundary of fish bodies.
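The centre of mass and the central moment defined in this section can be computed directly from the list of pixel coordinates inside the object, as in the following sketch (the function names are illustrative).

```python
def centre_of_mass(pixels):
    """Mean x and mean y over the N pixels inside the object."""
    n = len(pixels)
    return (sum(x for x, _ in pixels) / n, sum(y for _, y in pixels) / n)

def central_moment(pixels, p, q):
    """Central moment of order (p, q) about the centre of mass."""
    x_bar, y_bar = centre_of_mass(pixels)
    return sum((x - x_bar) ** p * (y - y_bar) ** q for x, y in pixels)
```

Because the coordinates are taken relative to the centre of mass, the central moments do not change when the object is translated within the image.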
3.2.2 Boundary-based method
In contrast to the region-based method, the boundary-based method obtains shape measurements by first representing the boundary with the spatial information of boundary
pixels, and then analyzing and extracting measurements from the spatial information.
Boundary representation
The simplest way to represent an object boundary is by extracting and storing the coordinates (x, y) of every pixel on the boundary in a vector. Another method of boundary
representation is called the chain code. In this method, eight directions of a pixel are
defined. Since the boundary is constituted by connected pixels, a single pixel is selected
and the directions of subsequent pixels are stored in the chain code one by one until
finally the initial pixel is reached. Furthermore, the radius from every pixel on the
boundary to the center of the object can be used for boundary representation (Figure 3.3),
and thus another method has been developed in which the radii are described as a
function of their angles by the following equation:
$$r = f(\theta) \qquad (3.4)$$
Although the boundary can be represented or effectively reconstructed with the methods described above, these representations are too sensitive to the size and orientation
of objects, and thus are not directly used as shape measurements (Baxes, 1994). Instead,
Fourier transform and autoregressive models, sometimes combined with principal component analysis, are usually applied to extract the shape measurements from the vector,
chain code, and radius function, so that effects arising from the size or orientation of
the object can be eliminated.
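The chain code and the radius function can both be derived from an ordered list of boundary pixels, as sketched below. The numbering of the eight directions (0 for a step in +x, increasing in steps of 45 degrees) and the use of the mean of the boundary pixels as the object center are assumptions made for this illustration.

```python
import math

# Assumed direction numbering: code 0 is a step in +x, codes increase by 45 degrees.
DIRECTIONS = {(1, 0): 0, (1, 1): 1, (0, 1): 2, (-1, 1): 3,
              (-1, 0): 4, (-1, -1): 5, (0, -1): 6, (1, -1): 7}

def chain_code(boundary):
    """Direction of every step along an ordered, closed, 8-connected boundary."""
    steps = zip(boundary, boundary[1:] + boundary[:1])
    return [DIRECTIONS[(x1 - x0, y1 - y0)] for (x0, y0), (x1, y1) in steps]

def radius_function(boundary):
    """(angle, radius) pairs sorted by angle, measured from the boundary centroid."""
    cx = sum(x for x, _ in boundary) / len(boundary)
    cy = sum(y for _, y in boundary) / len(boundary)
    pairs = []
    for x, y in boundary:
        angle = math.atan2(y - cy, x - cx) % (2.0 * math.pi)
        pairs.append((angle, math.hypot(x - cx, y - cy)))
    return sorted(pairs)
```

For a shape with re-entrants, `radius_function` can return several different radii at nearly the same angle, which is the multi-valued behavior that makes such boundaries difficult to analyze as a function of angle.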
Figure 3.3 Representing an object’s boundary by using radius function: (a) the boundary of a beef joint;
(b) the radius function of (a).
Boundary analysis and classification
Fourier transform
Fourier transform (FT) decomposes the boundary representation, in most cases the
radius function, into a summation of a series of cosine and sine terms at increasing
frequency, as in equation (3.5):
$$F(v) = \sum_{\theta=0}^{N-1} f(\theta)\, e^{-i 2\pi v \theta / N} \qquad (3.5)$$
where F(v) is the vth coefficient of the FT, and N is the total number of frequencies. The
coefficients are further used for food quality evaluation in two different ways. In the
first approach, principal component analysis is applied to all the coefficients in order to
compress the data dimension by selecting the first few principal components containing
the significant information about object shape. The selected principal components are
later employed to classify or predict the quality of food products such as pizza
bases (Du and Sun, 2004b) and apples (Currie et al., 2000). In the second approach,
the absolute values of the coefficients are summed to give shape measurements. Such an
application was set up by Zheng et al. (2006d) to predict the shrinkage of large,
cooked beef joints as affected by water-immersion cooking.
The advantages of FT are clear. Using the Fourier coefficients rather than
the original radii can eliminate the effects of the location, size, and orientation of
objects on the shape measurements (Schwarcz and Shane, 1969), which is difficult to
achieve by using other methods. However, the method of FT has one disadvantage. For
object shapes with re-entrants, the boundary function described in equation (3.4) has
multiple values at the same angle, which will therefore cause a failure in constructing
the Fourier series. Although the problem can be solved by the integration of another
parameter into the radius function (Russ, 1999), a far greater computation load will
also be experienced. FT is consequently only preferred for the description of shapes
without re-entrants, in the food industry.
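The second approach, summing the absolute values of the coefficients, can be sketched with a direct implementation of the discrete transform. The sketch assumes that the radius function has been sampled at N equally spaced angles; dividing by the magnitude of the zero-frequency coefficient to remove the effect of size is an assumption of this illustration, not a step stated in the cited work.

```python
import cmath

def fourier_coefficients(radii):
    """Discrete Fourier coefficients F(v) of a sampled radius function."""
    n = len(radii)
    return [sum(r * cmath.exp(-2j * cmath.pi * v * k / n) for k, r in enumerate(radii))
            for v in range(n)]

def shape_measurement(radii, count):
    """Sum of the magnitudes of the first `count` non-zero-frequency coefficients.

    Magnitudes ignore the starting angle (orientation); dividing by |F(0)|
    is an assumed normalization that removes the effect of object size.
    """
    coeffs = fourier_coefficients(radii)
    scale = abs(coeffs[0])
    return sum(abs(c) / scale for c in coeffs[1:count + 1])
```

A circle sampled as `[2, 2, 2, 2]` gives a measurement of essentially zero, while a boundary whose radius alternates between 1 and 2 gives the same non-zero value whether it is rotated by one sample or enlarged by a constant factor.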
Autoregressive models
The measurements obtained from Fourier transform are useful for the classification of
different shapes. For the purpose of extracting the global measurements or the similar
characteristics of a group of shapes, autoregressive models can be used (Kashyap and
Chellappa, 1981). This method is described by the following equations:
$$u_x(j) = \sum_{k} a_x(k)\, u_x(j-k) + \varepsilon_x(j) \qquad (3.6)$$
$$x(j) = u_x(j) + \mu_x \qquad (3.7)$$
$$u_y(j) = \sum_{k} a_y(k)\, u_y(j-k) + \varepsilon_y(j) \qquad (3.8)$$
$$y(j) = u_y(j) + \mu_y \qquad (3.9)$$
where u is the zero mean stationary random sequence (Jain, 1989), j denotes the jth pixel on
the boundary, ε is the uncorrelated sequence with zero mean and a specified variance,
and µ is the ensemble mean (Jain, 1989) of x( j) and y( j), which are the x and y
coordinates, respectively, of pixel j. In equations (3.6)–(3.9), the values of a and ε are
specific to each shape, and are therefore considered to be shape measurements.
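As a minimal sketch of how the coefficients a and the variance of ε might be estimated, the following code fits a first-order model to one coordinate sequence by least squares. Both the model order (a single coefficient) and the least-squares estimator are assumptions made for brevity; the text does not prescribe either.

```python
def fit_ar1(coords):
    """Fit u(j) = a * u(j - 1) + e(j) to one boundary coordinate sequence.

    Returns (mu, a, variance): the mean removed from the coordinates, the
    least-squares coefficient, and the variance of the residual sequence.
    Assumes the coordinates are not all identical.
    """
    n = len(coords)
    mu = sum(coords) / n
    u = [c - mu for c in coords]                 # zero-mean sequence
    num = sum(u[j] * u[j - 1] for j in range(1, n))
    den = sum(u[j - 1] ** 2 for j in range(1, n))
    a = num / den                                # least-squares estimate
    residuals = [u[j] - a * u[j - 1] for j in range(1, n)]
    variance = sum(e * e for e in residuals) / len(residuals)
    return mu, a, variance
```

The same function would be applied separately to the x and the y coordinates of the boundary pixels, giving one set of values for each coordinate.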
4 Color
Color provides the basic information for human perception. Further to this, color
is also elementary information that is stored in pixels to constitute a digital image.
Color is hence rated as one of the most important object measurements for image
understanding and object description. According to the tri-chromatic theory, which holds that color
can be discriminated by the combination of three elementary color components (Young,
1802; MacAdam, 1970), three digital values are assigned to every pixel of a color image.
Two typical statistical measurements, the mean and the variance, are obtained
from each component as color measurements. Different types of values stored for the
three color components, and different color reproduction methods using these three
values, lead to different color spaces. These spaces can be generally classified into three
types: hardware-orientated, human-orientated, and instrumental. The measurements of
color are dependent on these spaces.
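The mean and variance of each color component can be computed as in the following sketch, which assumes that the object is given as a list of three-component pixel values in whatever color space is being used; the function name is illustrative.

```python
def channel_statistics(pixels):
    """Mean and (population) variance of each of the three color components."""
    n = len(pixels)
    stats = []
    for channel in range(3):
        values = [p[channel] for p in pixels]
        mean = sum(values) / n
        variance = sum((v - mean) ** 2 for v in values) / n
        stats.append((mean, variance))
    return stats
```

The six resulting numbers (a mean and a variance for each component) form the color measurements of the object in the chosen space.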
4.1 Hardware-orientated spaces
Hardware-orientated spaces are developed in order to facilitate hardware processing,
such as capturing, storing, and displaying. The most popular hardware-orientated space
is the RGB (red, green, blue) space, so-called because this is the way in which cameras
sense natural scenes and display phosphors work (Russ, 1999). RGB is consequently
used in most computers for image acquisition, storage, and display. Color in the RGB
space is defined by coordinates on three axes, i.e. red, green, and blue, as illustrated
in Figure 3.4. Apart from RGB, another popular hardware-orientated space is the YIQ
(luminance, in-phase, quadrature) space, which is mainly used for television transmission. RGB space is transformed into YIQ space by using equation (3.10) to separate
the luminance and the chrominance information in order to facilitate compression
applications (Katsumata and Matsuyama, 2005).
$$\begin{bmatrix} \hat{Y} \\ \hat{I} \\ \hat{Q} \end{bmatrix} = \begin{bmatrix} 0.299 & 0.587 & 0.114 \\ 0.596 & -0.275 & -0.321 \\ 0.207 & -0.497 & \ldots \end{bmatrix} \begin{bmatrix} \hat{R} \\ \hat{G} \\ \hat{B} \end{bmatrix} \qquad (3.10)$$
Figure 3.4 Illustration of the RGB color space.
As well as YIQ space, YUV, YCbCr, and YCC spaces are also used in color transmission; the principles are similar to that of YIQ, and thus they are not further discussed
here. CMYK (cyan, magenta, yellow, black) is also a hardware-orientated color space.
However, CMYK is mainly employed in printing and copying output, and hence is not
used for color measurements in the food industry.
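The separation of luminance from chrominance can be illustrated with the `colorsys` module of the Python standard library. Note that `colorsys.rgb_to_yiq` uses its own coefficients, which may differ slightly from those of equation (3.10); the wrapper name `luminance_chrominance` is hypothetical.

```python
import colorsys

def luminance_chrominance(r, g, b):
    """Split an RGB triple (components in [0, 1]) into luminance and chrominance."""
    y, i, q = colorsys.rgb_to_yiq(r, g, b)
    return y, (i, q)

# A neutral gray carries luminance only; both chrominance values are close to zero.
gray_luminance, gray_chrominance = luminance_chrominance(0.5, 0.5, 0.5)
```

Because all of the brightness information ends up in the first component, the two chrominance components can be stored or transmitted at lower precision, which is the property exploited in compression.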
By combining values from each component in the hardware-orientated spaces, color
can be effectively measured. Even a very small variation in color can be sensed.
The hardware-orientated spaces are therefore popular in evaluating color changes of
food products during processing. For instance, small variations of color measurements
obtained from the RGB space can be used to describe changes of temperature and time
during the storage of tomatoes (Lana et al., 2005). Nevertheless, hardware-orientated
spaces are non-linear with regard to the visual perception of human eyes, and consequently are not capable of evaluating the sensory properties of food products. In order
to achieve this, human-orientated color spaces are used.
4.2 Human-orientated spaces
Human-orientated spaces, which include HSI (hue, saturation, intensity), HSV (hue,
saturation, value), and HSL (hue, saturation, lightness), have been developed with the
aim of corresponding to the concepts of tint, shade, and tone, which are defined by an
artist based on the intuitive color characteristics. Hue is measured by the distance of the
current color position from the red axis, which is manifested by the difference in color
wavelengths (Jain, 1989). Saturation is a measurement of the amount of color – i.e.
the amount of white light that is present in the monochromatic light (Jain, 1989; Russ,
1999). The last component – intensity, value, or lightness – refers to the brightness
or luminance, defined as the radiant intensity per unit projected-area by the spectral
sensitivity associated with the brightness sensation of human vision (Hanbury, 2002).
Compared with RGB space, which is defined by cuboidal coordinates, the coordinates
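used to define color in HSI, HSV, and HSL are cylindrical (see Figure 3.5). The relationship between the RGB space and the HSI space can be described by the following
equations:
$$\hat{H} = \begin{cases} \left[\dfrac{\pi}{2} - \tan^{-1}\!\left(\dfrac{2\hat{R} - \hat{G} - \hat{B}}{\sqrt{3}\,(\hat{G} - \hat{B})}\right) + \pi\right] \Big/ 2\pi & \text{if } (\hat{G} < \hat{B}) \\[2ex] \left[\dfrac{\pi}{2} - \tan^{-1}\!\left(\dfrac{2\hat{R} - \hat{G} - \hat{B}}{\sqrt{3}\,(\hat{G} - \hat{B})}\right)\right] \Big/ 2\pi & \text{if } (\hat{G} > \hat{B}) \end{cases} \qquad (3.11)$$
$$\hat{S} = 1 - \frac{\min(\hat{R}, \hat{G}, \hat{B})}{\hat{I}} \qquad (3.12)$$
$$\hat{I} = \frac{\hat{R} + \hat{G} + \hat{B}}{3} \qquad (3.13)$$
Figure 3.5 Illustration of the HSI color space.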
As specified above, HSI space has been developed by considering the concept of
visual perception in human eyes; color measurements obtained from HSI are thus
better related to the visual significance of food surfaces. There is therefore greater
correlation between the color measurements from human-orientated spaces and the
sensory scores of food products. This has been demonstrated by a study in which color measurements from the HSV space were found to perform better than those from
the RGB space in evaluating the acceptance of pizza toppings (Du and Sun, 2005).
However, the drawback of human-orientated spaces is that they, like human vision, are
not sensitive to a small amount of color variation. Therefore, human-orientated color
spaces are not suitable for evaluating changes of product color during processing.
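The conversion from RGB to HSI in equations (3.11)–(3.13) can be sketched as follows. The sketch assumes RGB components in the range 0 to 1 and returns hue as a fraction of a full turn; the values returned when the green and blue components are equal, or when the pixel is black, are assumptions, since the equations do not define those cases.

```python
import math

def rgb_to_hsi(r, g, b):
    """Convert RGB components in [0, 1] to (hue, saturation, intensity)."""
    intensity = (r + g + b) / 3.0
    # Assumption: a black pixel (intensity 0) is given zero saturation.
    saturation = 0.0 if intensity == 0 else 1.0 - min(r, g, b) / intensity
    if g == b:
        # Assumption: the hue formula is undefined here, so 0 is returned.
        hue = 0.0
    else:
        angle = math.pi / 2 - math.atan((2 * r - g - b) / (math.sqrt(3) * (g - b)))
        if g < b:
            angle += math.pi
        hue = angle / (2 * math.pi)
    return hue, saturation, intensity
```

For an orange pixel `(1.0, 0.5, 0.0)` the hue is 1/12 of a turn (30 degrees from the red axis), and for `(1.0, 0.0, 0.5)` it is 11/12 of a turn, so the two branches of the hue equation cover the two halves of the color circle.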
4.3 Instrumental spaces
Instrumental spaces are developed for color instruments, such as the colorimeter and
colorimetric spectrophotometer. Many of these spaces are standardized by the CIE
(Commission Internationale de l’Éclairage) under the specifications of lighting source,
observer, and methodology spectra (Rossel et al., 2006). The earliest such space is
the one named XYZ, where Y represents the lightness while X and Z are two primary
virtual components (Wyszecki and Stiles, 1982). Equation (3.14) can be used to convert
color measurements linearly from RGB space to XYZ space.
$$\begin{bmatrix} \hat{X} \\ \hat{Y} \\ \hat{Z} \end{bmatrix} = \begin{bmatrix} 0.412453 & 0.357580 & 0.180423 \\ 0.212671 & 0.715160 & 0.072169 \\ 0.019334 & 0.119194 & 0.950227 \end{bmatrix} \begin{bmatrix} \hat{R} \\ \hat{G} \\ \hat{B} \end{bmatrix} \qquad (3.14)$$
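Although it is useful in defining color, XYZ is not ideal for the description of color
perception in human vision. CIE La∗b∗ and CIE Lu∗v∗ color spaces, which are
non-linear transformations of XYZ as described below, were thus introduced and adopted
in many color measuring instruments.
$$\hat{L} = \begin{cases} 116 \times (\hat{Y}/Y')^{1/3} - 16 & \text{if } (\hat{Y}/Y') > 0.008856 \\ 903.3 \times (\hat{Y}/Y') & \text{otherwise} \end{cases} \qquad (3.15)$$
$$a^{*} = 500\,[(\hat{X}/X')^{1/3} - (\hat{Y}/Y')^{1/3}] \qquad (3.16)$$
$$b^{*} = 200\,[(\hat{Y}/Y')^{1/3} - (\hat{Z}/Z')^{1/3}] \qquad (3.17)$$
$$u^{*} = 13 \times \hat{L} \times (u' - u'') \qquad (3.18)$$
$$v^{*} = 13 \times \hat{L} \times (v' - v'') \qquad (3.19)$$
where X′, Y′, and Z′ are the values corresponding to the standardized point D65:
$$\begin{bmatrix} X' \\ Y' \\ Z' \end{bmatrix} = \begin{bmatrix} 95.047 \\ 100 \\ 108.883 \end{bmatrix} \qquad (3.20)$$
Here, u′, u″, v′, and v″ are determined by equations (3.21)–(3.24), respectively:
$$u' = \frac{4\hat{X}}{\hat{X} + 15\hat{Y} + 3\hat{Z}} \qquad (3.21)$$
$$u'' = \frac{4X'}{X' + 15Y' + 3Z'} \qquad (3.22)$$
$$v' = \frac{9\hat{Y}}{\hat{X} + 15\hat{Y} + 3\hat{Z}} \qquad (3.23)$$
$$v'' = \frac{9Y'}{X' + 15Y' + 3Z'} \qquad (3.24)$$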
The color component L is referred to as the lightness or luminance, while a∗ (u∗) is
defined along the axis of red–green, and b∗ (v∗) is defined along the axis of yellow–
blue. A positive value of a∗ (u∗) indicates that red is the dominant color, while a negative
value suggests the dominance of green. The same applies to the b∗ (v∗) component on the
yellow–blue axis – a positive value indicates that yellow is dominant, while a negative
value suggests the dominance of blue (Russ, 1999).
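The chain of conversions from RGB through XYZ to L, a∗, and b∗ can be sketched as follows. The sketch assumes RGB components in the range 0 to 1, scales XYZ by 100 so that white has a Y value of 100, and takes the D65 reference white as (95.047, 100, 108.883); the function names are illustrative, and the cube-root expressions are applied exactly as written in the equations of this section, without any special treatment of very dark colors.

```python
# Matrix of equation (3.14); RGB components are assumed to lie in [0, 1].
RGB_TO_XYZ = (
    (0.412453, 0.357580, 0.180423),
    (0.212671, 0.715160, 0.072169),
    (0.019334, 0.119194, 0.950227),
)
D65_WHITE = (95.047, 100.0, 108.883)  # assumed reference white (X', Y', Z')

def rgb_to_xyz(r, g, b):
    """Linear conversion to XYZ, scaled so that white has Y = 100."""
    return tuple(100.0 * (m[0] * r + m[1] * g + m[2] * b) for m in RGB_TO_XYZ)

def xyz_to_lab(x, y, z, white=D65_WHITE):
    """Non-linear conversion from XYZ to (L, a*, b*)."""
    xr, yr, zr = x / white[0], y / white[1], z / white[2]
    if yr > 0.008856:
        lightness = 116.0 * yr ** (1.0 / 3.0) - 16.0
    else:
        lightness = 903.3 * yr
    a_star = 500.0 * (xr ** (1.0 / 3.0) - yr ** (1.0 / 3.0))
    b_star = 200.0 * (yr ** (1.0 / 3.0) - zr ** (1.0 / 3.0))
    return lightness, a_star, b_star
```

A pure white pixel maps to a lightness of about 100 with a∗ and b∗ close to zero, while a pure red pixel gives a positive a∗, in line with the interpretation of the red–green axis given above.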
Since color measured by computer vision can be easily compared to that obtained
from instruments, these instrumental color spaces offer a possible way of evaluating the
performance of computer vision systems in measuring object color. Such an application
was previously established by O’Sullivan et al. (2003) for the grading of pork color.
5 Texture
Starting in the 1950s, when the first research paper on image texture appeared (Kaizer,
1955), image texture analysis has been another active research topic in computer
vision and image processing. Texture effectively describes the properties of elements
constituting the object surface, and texture measurements are therefore believed to contain substantial information for the pattern recognition of objects (Amadasun and King, 1989).
Although texture can be roughly defined as the combination of some innate image properties, including fineness, coarseness, smoothness, granulation, randomness, lineation,
and hummockiness, a strictly scientific definition of texture has still not been established
(Haralick, 1979). Accordingly, there is no ideal method for measuring texture. Nevertheless, a great number of methods have been developed, and these are categorized
into statistical, structural, transform-based, and model-based methods (Zheng et al.,
2006a). These methods capture texture measurements in two different ways – by the
variation of intensity across pixels, and by the intensity dependence between pixels and
their neighboring pixels (Bharati et al., 2004).
5.1 Statistical methods
In statistical methods, a matrix containing higher-order image histogram information is constructed from the intensities of pixels and their neighboring pixels. Statistics of the matrix
elements are then obtained as texture measurements. Statistical methods are effective
in capturing micro-texture but are not ideal for analyzing macro-texture (Haralick,
1979), and thus they are suitable for analyzing images from video cameras. Some of the
applications include classification of beef tenderness (Li et al., 1999), identification
of grains (Paliwal et al., 2003a, 2003b), and sorting of apples (Fernández et al., 2005).
Currently developed statistical methods include the co-occurrence matrix (Haralick
et al., 1973), the run-length matrix (Galloway, 1975), and the neighboring dependence
matrix (Sun and Wee, 1983).
5.1.1 Co-occurrence matrix
The co-occurrence matrix P is built according to the intensity co-occurrence between
pixels and their neighboring pixels, which can be described by equation (3.25):
$$
P(i, j, d, \theta) = N\left\{ \big((x_1, y_1), (x_2, y_2)\big) \in W \times W \;\middle|\;
\begin{array}{l}
\max(|x_1 - x_2|, |y_1 - y_2|) = d \\
\angle\big((x_1, y_1), (x_2, y_2)\big) = \theta \\
I(x_1, y_1) = i,\; I(x_2, y_2) = j
\end{array}
\right\} \tag{3.25}
$$
where i and j are two intensity values; (x1, y1) and (x2, y2) denote two
pixels separated by the distance d at the orientation θ; N is the number of elements in the set; and W is the size of the image. The matrix
is normalized, and texture measurements consisting of fourteen statistics are obtained
from it (Haralick et al., 1973). However, only seven of these are rated as important
texture measurements (Gao and Tan, 1996a, 1996b; Zheng et al., 2006a), and these are
listed in the appendix to this chapter.
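As an illustration of the construction, the sketch below counts intensity pairs for a single pixel offset and normalizes the result. The function name and the `(dx, dy)` parameterization are assumptions of this example; pairs are counted in one direction only, whereas some implementations also count the reverse pair to obtain a symmetric matrix.

```python
import numpy as np

def cooccurrence(image, dx, dy, levels):
    """Normalized co-occurrence matrix for one pixel offset.

    image  : 2-D array of integer intensities in [0, levels)
    dx, dy : column and row offset of the neighbor; together they encode
             the distance d = max(|dx|, |dy|) and the orientation theta
    """
    image = np.asarray(image)
    rows, cols = image.shape
    P = np.zeros((levels, levels), dtype=float)
    for y in range(rows):
        for x in range(cols):
            y2, x2 = y + dy, x + dx
            # Only count pairs whose second pixel lies inside the image.
            if 0 <= y2 < rows and 0 <= x2 < cols:
                P[image[y, x], image[y2, x2]] += 1
    total = P.sum()
    return P / total if total > 0 else P

img = np.array([[0, 0, 1],
                [0, 1, 1],
                [2, 2, 1]])
P = cooccurrence(img, dx=1, dy=0, levels=3)  # d = 1, horizontal direction
```

In this small image six horizontal pairs exist, two of which are the pair (0, 1), so the corresponding entry of the normalized matrix is one third.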
5.1.2 Run-length matrix
Extraction of the run-length matrix R can be described by equation (3.26):
$$R(i, j, T) = N\left\{ pr \;\middle|\; L(pr) = i,\; I(pr) = j \right\} \tag{3.26}$$
where T is the threshold used for merging pixels into pixel-runs, pr indicates a pixel-run,
L is the length of a pixel-run, and I is its average intensity. A pixel-run is
a chain of connected pixels of similar intensity in the same row. Like the
co-occurrence matrix, the run-length matrix is normalized, and texture measurements
are obtained from it as five statistics (Galloway, 1975), which are also presented
in the appendix.
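The following sketch builds an unnormalized run-length matrix row by row, indexed as in equation (3.26) by run length and then by run intensity. The merging rule is an assumption made for this example: a pixel joins the current run while its intensity stays within T of the first pixel of that run, and the run intensity is the rounded mean.

```python
import numpy as np

def run_length_matrix(image, levels, T=0):
    """Count pixel-runs; R[length, intensity] is the number of runs."""
    image = np.asarray(image)
    rows, cols = image.shape
    # Row index 0 (length zero) is never used; it keeps indexing direct.
    R = np.zeros((cols + 1, levels), dtype=float)
    for row in image:
        start = 0
        for x in range(1, cols + 1):
            # Close the run at the end of the row, or when the intensity
            # departs from the first pixel of the run by more than T.
            if x == cols or abs(int(row[x]) - int(row[start])) > T:
                length = x - start
                mean_intensity = int(round(float(np.mean(row[start:x]))))
                R[length, mean_intensity] += 1
                start = x
    return R

img = np.array([[0, 0, 1, 1, 1],
                [2, 2, 2, 2, 2]])
R = run_length_matrix(img, levels=3)          # exact-match runs
R_merged = run_length_matrix(img, levels=3, T=1)
R_normalized = R / R.sum()
```

With T = 0 the first row splits into a run of two and a run of three pixels; with T = 1 it merges into a single run of five.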
5.1.3 Neighboring dependence matrix
The neighboring dependence matrix (NDM) is dependent on two parameters, i.e.
distance d and threshold T . Construction of the NDM is described by equation (3.27):
$$
Q(i, j, d, T) = N\left\{ (x, y) \;\middle|\;
\begin{array}{l}
I(x, y) = i \\
N\left\{ (x_1, y_1) \;\middle|\;
\begin{array}{l}
|I(x, y) - I(x_1, y_1)| \le T \\
\max(|x - x_1|, |y - y_1|) \le d
\end{array}
\right\} = j
\end{array}
\right\} \tag{3.27}
$$
where (x, y) and (x1, y1) denote a pixel and its neighboring pixel. The NDM is normalized before the statistical texture measurements (see appendix) are extracted.
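A direct, unoptimized reading of this definition is sketched below. The function name is hypothetical, and border pixels are handled here with truncated neighborhoods, which is an assumption; an implementation could equally skip the border.

```python
import numpy as np

def neighboring_dependence_matrix(image, levels, d=1, T=0):
    """Q[i, j]: number of pixels of intensity i with j similar neighbors."""
    image = np.asarray(image, dtype=int)
    rows, cols = image.shape
    max_neighbors = (2 * d + 1) ** 2 - 1
    Q = np.zeros((levels, max_neighbors + 1), dtype=float)
    for y in range(rows):
        for x in range(cols):
            # Window of pixels within chessboard distance d (clipped).
            y0, y1 = max(0, y - d), min(rows, y + d + 1)
            x0, x1 = max(0, x - d), min(cols, x + d + 1)
            window = image[y0:y1, x0:x1]
            similar = np.count_nonzero(np.abs(window - image[y, x]) <= T)
            Q[image[y, x], similar - 1] += 1  # exclude the pixel itself
    return Q

flat = np.zeros((3, 3), dtype=int)
Q = neighboring_dependence_matrix(flat, levels=2, d=1, T=0)
```

For a uniform 3 × 3 image the center pixel has eight similar neighbors, each edge pixel five, and each corner pixel three.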
5.2 Structural methods
Structural methods are based on some textural elements or structural primitives that
occur repeatedly under the constraint of certain placement rules (Starovoitov et al.,
1998). Such methods are particularly popular in the analysis of textiles (Palm, 2004). However, in
the food industry, because the texture patterns in food images are very irregular, it is
impossible to identify a textural element or a structural primitive that can describe
the texture constitution of food surfaces (Zheng et al., 2006a). Structural methods are
therefore rarely used in the food industry and are not further discussed here.
5.3 Transform-based methods
Transform-based methods extract texture measurements from images that are transformed from the original image using the convolution mask, Fourier transform, and
wavelet transform methods. Because they can be tuned through the parameters of the image transform,
transform-based methods are suitable for both micro-texture and macro-texture patterns. However, processing the transformed images greatly increases the
computation and storage load, which
significantly reduces analysis speed. This is undesirable in the food industry, especially for on-line food quality inspection, because the inspection of every
product must be completed within the time taken to convey the product
through the evaluation system.
5.3.1 Convolution mask
With the convolution mask (CM), images are transformed by equation (3.28) from the
spatial domain into the feature domain for the revelation of objects such as edges, spots,
and lines (Patel et al., 1996).
$$I'(x, y) = \sum_{k} \sum_{l} N(k, l)\, I(x + k, y + l) \tag{3.28}$$
where I′ is the intensity of the transformed image, from which texture measurements
can be obtained by statistics, mostly the mean and standard deviation. The most popular
CM used to extract image texture is Laws' mask, consisting of nine operators
that are obtained by multiplying pairs of three vectors – [−1, 0, 1], [1, 2, 1], and
[−1, 2, −1]. Another CM, the Gabor filter, has become more and more popular in
texture classification in recent years, because the Gabor filter processes and extracts
texture measurements with regard to three important parameters: space, frequency, and
orientation. However, further detail of the Gabor filter is beyond our discussion here;
interested readers might refer to the works by Daugman (1985), Kruizinga and Petkov
(1999), and Setchell and Campbell (1999).
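The nine operators can be generated as outer products of the three vectors, as in the sketch below. The labels L3, E3, and S3 are the conventional names for these vectors and do not appear in the text; the summation limits (a window starting at its top-left corner, valid region only) are likewise an assumption of this example.

```python
import numpy as np

L3 = np.array([1, 2, 1])      # local average
E3 = np.array([-1, 0, 1])     # edge
S3 = np.array([-1, 2, -1])    # spot

vectors = {"L3": L3, "E3": E3, "S3": S3}
# Nine 3 x 3 masks, one for every ordered pair of vectors.
masks = {a + b: np.outer(va, vb)
         for a, va in vectors.items()
         for b, vb in vectors.items()}

def apply_mask(image, mask):
    """Weighted sum of each pixel window with the mask (valid region only)."""
    image = np.asarray(image, dtype=float)
    mh, mw = mask.shape
    out_h = image.shape[0] - mh + 1
    out_w = image.shape[1] - mw + 1
    out = np.zeros((out_h, out_w))
    for k in range(mh):
        for l in range(mw):
            out += mask[k, l] * image[k:k + out_h, l:l + out_w]
    return out

def texture_features(image):
    """Mean and standard deviation of each mask response."""
    feats = {}
    for name, mask in masks.items():
        response = apply_mask(image, mask)
        feats[name] = (response.mean(), response.std())
    return feats

feats = texture_features(np.ones((5, 5)))
```

On a uniform image only the pure averaging mask responds, since every mask that contains the edge or spot vector sums to zero.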
5.3.2 Fourier transform
Images are transformed into new forms by Fourier transform (FT) with regard to their
spatial frequency of pixel intensities. From the FT magnitude images, texture measurements relating to the variation of pixel intensity can be obtained by statistical means.
As images are in the form of two-dimensional matrices with discrete intensity values,
a two-dimensional discrete FT is normally applied, which can be typically written as
in equation (3.29):
$$
F(v_x, v_y) = \sum_{x=0}^{N_x - 1} \sum_{y=0}^{N_y - 1} f(x, y)\, e^{-j(2\pi/N_x) v_x x}\, e^{-j(2\pi/N_y) v_y y} \tag{3.29}
$$
where v denotes the Fourier coefficients. FT has been used in the food industry for
measuring the color changes in the surface of chocolate (Briones and Aguilera, 2005).
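As a sketch, the two-dimensional discrete FT in its standard form can be evaluated directly and checked against NumPy's `numpy.fft.fft2`; simple statistics of the magnitude image then serve as texture measurements. The function name `dft2` and the choice of mean and standard deviation as the statistics are assumptions of this example.

```python
import numpy as np

def dft2(f):
    """Direct evaluation of the standard 2-D discrete Fourier transform."""
    f = np.asarray(f, dtype=float)
    ny, nx = f.shape
    x = np.arange(nx)
    y = np.arange(ny)
    # Wx[vx, x] = exp(-j 2 pi vx x / Nx); Wy is built the same way.
    Wx = np.exp(-2j * np.pi * np.outer(x, x) / nx)
    Wy = np.exp(-2j * np.pi * np.outer(y, y) / ny)
    return Wy @ f @ Wx.T   # rows index vy, columns index vx

rng = np.random.default_rng(0)
f = rng.random((4, 6))
F = dft2(f)

# Texture measurements from the FT magnitude image.
magnitude = np.abs(F)
features = (magnitude.mean(), magnitude.std())
```

The direct form costs far more than the fast Fourier transform and is shown only to make the definition concrete; in practice `numpy.fft.fft2` would be used.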
5.3.3 Wavelet transform
The use of wavelet transform (WT) to extract texture measurements is based on the
multiresolution representation scheme, which is believed to be a formal representation
for any entities, including image texture (Mallat, 1989; Meyer, 1994). With WT, images
are decomposed into different resolutions from which texture measurements regarding
the different textural properties, from global texture at coarse resolution to local texture
at fine resolution, can be obtained. Performance of WT has been found to exceed that of
statistical methods in the food industry, including in the prediction of the chemical and
physical properties of beef (Huang et al., 1997) and the sensory characteristics of pork
(Cernadas et al., 2005). Three two-dimensional wavelets in three different directions –
horizontal (along the x axis), vertical (along the y axis), and diagonal (along y = x),
are first defined respectively as follows:
$$\Psi^{H}(x, y) = \phi(x)\psi(y) \tag{3.30}$$
$$\Psi^{V}(x, y) = \psi(x)\phi(y) \tag{3.31}$$
$$\Psi^{D}(x, y) = \psi(x)\psi(y) \tag{3.32}$$
where φ is the scaling function, and ψ is the one-dimensional wavelet. Afterwards,
wavelet decomposition can be performed using equations (3.33)–(3.36), as proposed
by Mallat (1989):
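$$A_{2^i} = \int_{-N_x}^{N_x} \int_{-N_y}^{N_y} I(x, y)\, \phi_{2^i}(x - 2^{-i}n)\, \phi_{2^i}(y - 2^{-i}m)\, dx\, dy \tag{3.33}$$
$$H_{2^i} = \int_{-N_x}^{N_x} \int_{-N_y}^{N_y} I(x, y)\, \phi_{2^i}(x - 2^{-i}n)\, \psi_{2^i}(y - 2^{-i}m)\, dx\, dy \tag{3.34}$$
$$V_{2^i} = \int_{-N_x}^{N_x} \int_{-N_y}^{N_y} I(x, y)\, \psi_{2^i}(x - 2^{-i}n)\, \phi_{2^i}(y - 2^{-i}m)\, dx\, dy \tag{3.35}$$
$$D_{2^i} = \int_{-N_x}^{N_x} \int_{-N_y}^{N_y} I(x, y)\, \psi_{2^i}(x - 2^{-i}n)\, \psi_{2^i}(y - 2^{-i}m)\, dx\, dy \tag{3.36}$$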
where A, H, V, and D represent the approximation, horizontal signals, vertical signals,
and diagonal signals, respectively, of the original image at the resolution of 2^i. Parameters m and n stand for two sets of integers. An illustration of the wavelet transform for
beef images is displayed in Figure 3.6 (Zheng et al., 2006e).
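A discrete counterpart of this decomposition can be sketched with the Haar wavelet, chosen here only because it is the simplest case; the text does not prescribe a particular wavelet. The function names and the use of mean squared coefficients as texture measurements are assumptions of this example.

```python
import numpy as np

def haar_level(image):
    """One level of a 2-D Haar decomposition (both sides must be even)."""
    img = np.asarray(image, dtype=float)
    # Scaling (sum) and wavelet (difference) filters along x (columns).
    lo_x = (img[:, 0::2] + img[:, 1::2]) / np.sqrt(2)
    hi_x = (img[:, 0::2] - img[:, 1::2]) / np.sqrt(2)
    # The same pair of filters along y (rows).
    A = (lo_x[0::2, :] + lo_x[1::2, :]) / np.sqrt(2)   # scaling in x and y
    H = (lo_x[0::2, :] - lo_x[1::2, :]) / np.sqrt(2)   # scaling in x, wavelet in y
    V = (hi_x[0::2, :] + hi_x[1::2, :]) / np.sqrt(2)   # wavelet in x, scaling in y
    D = (hi_x[0::2, :] - hi_x[1::2, :]) / np.sqrt(2)   # wavelet in x and y
    return A, H, V, D

def wavelet_energies(image, levels):
    """Mean squared H, V, D coefficients at each resolution."""
    feats = []
    A = image
    for _ in range(levels):
        A, H, V, D = haar_level(A)
        feats.append((np.mean(H ** 2), np.mean(V ** 2), np.mean(D ** 2)))
    return feats

# Rows alternate between 0 and 1, so intensity varies only along y.
stripes = np.tile(np.array([[0.0], [1.0]]), (2, 4))
A, H, V, D = haar_level(stripes)
```

Each further level is obtained by decomposing the approximation A again, which moves from local texture at fine resolution towards global texture at coarse resolution.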
5.4 Model-based methods
In model-based methods, a model with unknown coefficients simulating the dependence of pixels and their neighboring pixels is first set up. By regressing the model with
information from images, the coefficients can be calculated and used as texture measurements.
Different models have led to different model-based methods, namely fractal
models and the autoregressive model.
5.4.1 Fractal model
Figure 3.6 Wavelet transform of a beef image (Zheng et al., 2006e): (a) original image; (b) wavelet
transform of the region within the white boundary in (a), with panels labeled Stage 1 to Stage 4.
Surface intensity, i.e. the intensity value of pixels plotted against their coordinates in
an image, is obtained and assumed to be a fractal (Pentland, 1984), which is defined
as an object that remains the same regardless of the scale of observation (Quevedo
et al., 2002). Texture measurements are thus obtained from the fractal dimension (FD),
i.e. the dimension of the fractal (surface intensity in images), which can be determined
by equation (3.37):
$$L(\phi) = C\phi^{1 - \mathrm{FD}} \tag{3.37}$$
where L is a unit measurement such as perimeter, surface area, or volume; φ indicates
the scale used; C is a constant associated with the unit measurement; and FD can be
determined by a logarithmic regression against the observation scale φ. Using
different unit measurements leads to different fractal methods, such
as the blanket method, the box counting method, and the frequency domain method
(Quevedo et al., 2002). Fractal models are useful for describing the surface variation
of food products such as pumpkin and chocolate (Quevedo et al., 2002).
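The logarithmic regression can be made concrete as follows. Taking logarithms of the power law gives log L = log C + (1 − FD) log φ, so FD is one minus the slope of a straight-line fit. The function name and the synthetic measurements below are assumptions used purely for illustration.

```python
import numpy as np

def fractal_dimension(scales, measurements):
    """Estimate FD and C from L(phi) = C * phi**(1 - FD) by log-log regression."""
    log_phi = np.log(np.asarray(scales, dtype=float))
    log_L = np.log(np.asarray(measurements, dtype=float))
    slope, intercept = np.polyfit(log_phi, log_L, 1)
    return 1.0 - slope, np.exp(intercept)

# Synthetic data generated with FD = 1.26 and C = 3 (illustrative values).
scales = np.array([1.0, 2.0, 4.0, 8.0, 16.0])
measurements = 3.0 * scales ** (1 - 1.26)
fd, c = fractal_dimension(scales, measurements)
```

With real images the unit measurements come from the chosen fractal method, and the quality of the straight-line fit indicates how well the fractal assumption holds.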
5.4.2 Autoregressive model
The autoregressive model, which is a stochastic model-based approach, explicitly
describes the spatial relationship between pixels and their neighboring pixels while
characterizing image texture (Kartikeyan and Sarkar, 1991). The dependency between
pixels and their neighboring pixels in an image is expressed as a linear model, whose
coefficients are later determined as texture measurements by regressing the model
(Haralick, 1979; Thybo et al., 2004). However, there is no fast way to compute
the regression coefficients, and thus the method is not commonly used in the food
industry.
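To make the linear model concrete, the sketch below regresses each pixel on three causal neighbors by ordinary least squares. The neighborhood (left, upper, upper-left), the intercept term, and the function names are assumptions of this example rather than the formulation of the cited works; the synthetic image is generated from known coefficients so that the fit can be checked.

```python
import numpy as np

def fit_ar_texture(image):
    """Least-squares fit of
        I(x, y) = c + a1*I(x-1, y) + a2*I(x, y-1) + a3*I(x-1, y-1)
    returning the coefficients (a1, a2, a3) and the intercept c.
    """
    img = np.asarray(image, dtype=float)
    target = img[1:, 1:].ravel()
    design = np.column_stack([
        img[1:, :-1].ravel(),    # left neighbor
        img[:-1, 1:].ravel(),    # upper neighbor
        img[:-1, :-1].ravel(),   # upper-left neighbor
        np.ones(target.size),    # intercept
    ])
    solution, *_ = np.linalg.lstsq(design, target, rcond=None)
    return solution[:3], solution[3]

def synthesize(coeffs, size, seed=0):
    """Noise-free image that obeys the model exactly away from the border."""
    rng = np.random.default_rng(seed)
    img = np.zeros((size, size))
    img[0, :] = rng.random(size)
    img[:, 0] = rng.random(size)
    a1, a2, a3 = coeffs
    for y in range(1, size):
        for x in range(1, size):
            img[y, x] = (a1 * img[y, x - 1] + a2 * img[y - 1, x]
                         + a3 * img[y - 1, x - 1])
    return img

true_coeffs = (0.5, 0.3, -0.2)
estimated, intercept = fit_ar_texture(synthesize(true_coeffs, size=12))
```

The fitted coefficients are the texture measurements. The design matrix has one row per pixel, which is why the regression becomes costly for full-size images.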
6 Combined measurements
Recently, there has been a trend towards using more than one kind of object measurement (size, shape, color, and texture) in applications of computer vision in the
food industry. This is driven by two factors. The first is the rapid development of computer hardware, which has significantly increased computing speed and
storage capacity, so that the number of object measurements considered has little or no
impact on the computing speed. The second is that quality evaluation is the most important task for which computer vision is used in the food industry.
Food quality is complex, being determined by the combination of sensory, nutritive,
hygienic-toxicological, and technological properties (McDonald, 2001). More than one
quality attribute is therefore considered in most manual food quality grading
systems. Furthermore, both geometrical measurements (size and shape) and surface
measurements (color and texture) provide useful information for defect detection and class discrimination of food products (Paliwal et al., 2003a, 2003b; Diaz
et al., 2004). The precision of computer vision
systems can therefore be improved when more object measurements are taken into account. For
instance, the correlation coefficient was found to be only 0.30 when marbling characteristics (size measurements) and color measurements were used to indicate beef
tenderness, whereas introducing texture measurements into the classification variables
significantly increased the correlation coefficient, to 0.72 (Li et al., 1999).
7 Conclusions
There are four kinds of object measurements that can be obtained from images – size,
shape, color, and texture – and which contain significant information for food quality
evaluation. Size and shape are two geometrical measurements, while color and texture
are measurements of the object surface.
Area, perimeter, width, and length are four of the primary measurements of object
size. Area and perimeter are preferable to length and width, because they are more
reliable and more easily extracted.
Shape measurements can be categorized into two groups – size-dependent measurements (SDM) and size-independent measurements (SIM). The former work mostly for
objects whose shape is more or less regular, while the latter are especially suitable for
describing shapes with great irregularities.
Color measurements are dependent on the color spaces used, which include
hardware-orientated, human-orientated, and instrumental spaces. Hardware-orientated
spaces are developed for the purpose of facilitating computer hardware processes;
human-orientated spaces aim to help the human understanding of color; and
instrumental spaces are employed for the comparison of computer measurements with
those obtained from instruments.
Techniques that are available for the extraction of texture measurements include
statistical, structural, transform-based, and model-based methods. Statistical methods
are well suited to the analysis of micro-texture patterns. Although transform-based
methods are suitable for both micro- and macro-texture patterns, a great deal of computation and computer storage is required. The model-based methods are limited by
the lack of a fast way to regress the model.
By the proper integration of different types of object measurements, the accuracy of
computer vision for food quality inspection may be increased.
Nomenclature
I, I
i, j, k, l
m, n
p, q
uncorrelated sequence
ensemble mean
scaling function
one-dimensional wavelet
two-dimensional wavelet
color component of a∗
color component of blue
color component b∗
diagonal signal
Fourier transform
color component of green
horizontal signal
color component of hue
color component of intensity
index parameters
unit measurement
color component of luminance
set of integers
number of elements in the set
co-occurrence matrix
order of the moments
neighboring dependence matrix
color component of quadrature
run-length matrix
color component of red
u , u
ν , ν
X , Y , Z x, y, x1 , y1 , x2 , y2
x̄, ȳ
color component of saturation
Fourier coefficients
zero mean stationary random sequence
parameters used to calculate u∗ color component
color component of u∗
vertical signal
parameters used to calculate v∗ color component
color component of v∗
size of images
color component of X
values of XYZ space at standard point D65
center of mass
color component of Y
color component of Z
x, y
horizontal signal
diagonal signal
vertical signal
CM convolution mask
FD fractal dimension
FT Fourier transform
SDM size-dependent measurements
SIM size-independent measurements
WT wavelet transform
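Appendix
Statistical measurements of co-occurrence matrix
Angular second moment (ASM):
$$\mathrm{ASM} = \sum_{k} \sum_{l} P^{2}(k, l)$$
Contrast (CT):
$$\mathrm{CT} = \sum_{j} j^{2} \left( \sum_{|k - l| = j} P(k, l) \right)$$
Mean value (µ):
$$\mu = \sum_{k} \sum_{l} k\, P(k, l)$$
Sum of squares (SOS):
$$\mathrm{SOS}(\sigma^{2}) = \sum_{k} \sum_{l} (k - \mu)^{2} P(k, l)$$
Correlation (CR):
$$\mathrm{CR} = \frac{\sum_{k} \sum_{l} (kl) P(k, l) - \mu^{2}}{\sigma^{2}}$$
Inverse difference moment (IDM):
$$\mathrm{IDM} = \sum_{k} \sum_{l} \frac{P(k, l)}{1 + (k - l)^{2}}$$
Entropy (ET):
$$\mathrm{ET} = -\sum_{k} \sum_{l} P(k, l) \log(P(k, l))$$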
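The seven co-occurrence statistics can be computed in a few lines once the matrix is available. The sketch below is illustrative: the function name is hypothetical, the contrast is written in the equivalent form of a sum of (k − l)² P(k, l), the correlation is divided by σ² as in Haralick's standard definition, and zero entries are skipped in the entropy to avoid taking the logarithm of zero.

```python
import numpy as np

def cooccurrence_statistics(P):
    """Seven texture statistics of a co-occurrence matrix."""
    P = np.asarray(P, dtype=float)
    P = P / P.sum()                      # normalize
    k, l = np.indices(P.shape)
    mean = np.sum(k * P)
    sos = np.sum((k - mean) ** 2 * P)    # sigma squared
    nonzero = P[P > 0]
    return {
        "ASM": np.sum(P ** 2),
        "CT": np.sum((k - l) ** 2 * P),
        "mean": mean,
        "SOS": sos,
        "CR": (np.sum(k * l * P) - mean ** 2) / sos if sos > 0 else 0.0,
        "IDM": np.sum(P / (1.0 + (k - l) ** 2)),
        "ET": -np.sum(nonzero * np.log(nonzero)),
    }

# Two intensity levels that only ever co-occur with themselves.
stats = cooccurrence_statistics([[0.5, 0.0],
                                 [0.0, 0.5]])
```

For this diagonal matrix the contrast is zero, the correlation is one, and the entropy equals log 2.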
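Statistical measurements of run-length matrix
Short run (SR):
$$\mathrm{SR} = \frac{\sum_{k} \sum_{l} R(k, l) / l^{2}}{\sum_{k} \sum_{l} R(k, l)}$$
Long run (LR):
$$\mathrm{LR} = \frac{\sum_{k} \sum_{l} l^{2} R(k, l)}{\sum_{k} \sum_{l} R(k, l)}$$
Non-uniformity (NU):
$$\mathrm{NU} = \frac{\sum_{k} \left( \sum_{l} R(k, l) \right)^{2}}{\sum_{k} \sum_{l} R(k, l)}$$
Run-length non-uniformity (RLE):
$$\mathrm{RLE} = \frac{\sum_{l} \left( \sum_{k} R(k, l) \right)^{2}}{\sum_{k} \sum_{l} R(k, l)}$$
Run percent (RP), describing the graininess of images:
$$\mathrm{RP} = \frac{\sum_{k} \sum_{l} R(k, l)}{\sum_{k} \sum_{l} l\, R(k, l)}$$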
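Statistical measurements of neighboring dependence matrix
Small number emphasis (SNE):
$$\mathrm{SNE} = \frac{\sum_{k} \sum_{l} Q(k, l) / l^{2}}{\sum_{k} \sum_{l} Q(k, l)}$$
Large number emphasis (LNE):
$$\mathrm{LNE} = \frac{\sum_{k} \sum_{l} l^{2} Q(k, l)}{\sum_{k} \sum_{l} Q(k, l)}$$
Second moment (SM):
$$\mathrm{SM} = \frac{\sum_{k} \sum_{l} Q^{2}(k, l)}{\sum_{k} \sum_{l} Q(k, l)}$$
Number of non-uniformity (NNU):
$$\mathrm{NNU} = \frac{\sum_{l} \left( \sum_{k} Q(k, l) \right)^{2}}{\sum_{k} \sum_{l} Q(k, l)}$$
Entropy of the matrix (EM):
$$\mathrm{EM} = \frac{\sum_{k} \sum_{l} Q(k, l) \log(Q(k, l))}{\sum_{k} \sum_{l} Q(k, l)}$$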
References
Amadasun M, King R (1989) Textural features corresponding to textural properties. IEEE
Transactions on Systems, Man, and Cybernetics, 19, 1264–1274.
Ballard DH, Brown CM (1982) Computer Vision. Englewood Cliffs: Prentice-Hall.
Baxes GA (1994) Digital Image Processing Principle and Applications. New York: John
Wiley & Sons.
Bharati MH, Liu JJ, MacGregor JF (2004) Image texture analysis: methods and comparisons. Chemometrics and Intelligent Laboratory Systems, 72, 57–71.
Briones V, Aguilera JM (2005) Image analysis of changes in surface color of chocolate.
Food Research International, 38, 87–94.
Cernadas E, Carrión P, Rodriguez PG, Muriel E, Antequera T (2005) Analyzing magnetic
resonance images of Iberian pork loin to predict its sensorial characteristics. Computer
Vision and Image Understanding, 98, 345–361.
Collewet G, Bogner P, Allen P, Busk H, Dobrowolski A, Olsen E, Davenel A (2005)
Determination of the lean meat percentage of pig carcasses using magnetic resonance
imaging. Meat Science, 70, 563–572.
Currie AJ, Ganeshanandam S, Noiton DA, Garrick D, Shelbourne CJA, Oraguzie N (2000)
Quantitative evaluation of apple fruit shape (Malus × domestica Borkh.) by principal
component analysis of Fourier descriptors. Euphytica, 111, 221–227.
Daugman JG (1985) Uncertainty relation for resolution in space, spatial frequency, and
orientation optimized by two-dimensional visual cortical filters. Journal of the Optical
Society of America, 2, 1160–1169.
Devaux MF, Barakat A, Robert P, Bouchet B, Guillon F, Navez B, Lahaye M (2005) Mechanical breakdown and cell structure of mealy tomato pericarp tissue. Postharvest Biology
and Technology, 37, 209–221.
Diaz R, Gil L, Serrano C, Blasco M, Moltó E, Blasco J (2004) Comparison of three algorithms in the classification of table olives by means of computer vision. Journal of
Food Engineering, 61, 101–107.
Du CJ, Sun D-W (2004a) Recent development in the applications of image processing
techniques for food quality evaluation. Trends in Food Science & Technology, 15,
Du CJ, Sun D-W (2004b) Shape extraction and classification of pizza base using computer
vision. Journal of Food Engineering, 64, 489–496.
Du CJ, Sun D-W (2005) Comparison of three methods for classification of pizza topping using different color space transformations. Journal of Food Engineering, 66,
Du CJ, Sun D-W (2006) Learning techniques used in computer vision for food quality
evaluation: a review. Journal of Food Engineering, 72, 39–55.
Faucitano L, Huff P, Teuscher F, Gariepy C, Wegner J (2005) Application of computer
image analysis to measure pork marbling characteristics. Meat Science, 69, 537–543.
Fernández L, Castillero C, Aguilera JM (2005) An application of image analysis to
dehydration of apple discs. Journal of Food Engineering, 67, 185–193.
Galloway MM (1975) Texture analysis using grey level run lengths. Computer
Graphics and Image Processing, 4, 172–179.
Gao X, Tan J (1996a) Analysis of expanded-food texture by image processing part I:
geometric properties. Journal of Food Process Engineering, 19, 425–444.
Gao X, Tan J (1996b) Analysis of expanded-food texture by image processing part II:
mechanical properties. Journal of Food Process Engineering, 19, 445–456.
Ghazanfari A, Irudayaraj J (1996) Classification of pistachio nuts using a string matching
technique. Transactions of the ASAE, 39, 1197–1202.
Hanbury A (2002) The taming of the hue, saturation, and brightness color space. In
CVWW ’02 – Computer Vision Winter Workshop (Wildenauer H, Kropatsch WG, eds).
Bad Aussee, Austria, pp. 234–243.
Haralick RM (1979) Statistical and structural approaches to texture. Proceedings of the
IEEE, 67, 786–804.
Haralick RM, Shanmugam K, Dinstein I (1973) Textural features for image classification.
IEEE Transactions on Systems, Man, and Cybernetics, 3, 610–621.
Huang Y, Lacey RE, Moore LL, Miller RK, Whittaker AD, Ophir J (1997) Wavelet textural
features from ultrasonic elastograms for meat quality prediction. Transactions of the
ASAE, 40, 1741–1748.
Jain AK (1989) Fundamentals of Digital Image Processing. Englewood Cliffs:
Kaizer H (1955) A quantification of texture on aerial photographs. Technology Note 121,
AD 69484, Boston University Research Laboratory, Boston, MA, USA.
Kartikeyan B, Sarkar A (1991) An identification approach for 2-D autoregressive models
in describing textures. Graphical Models and Image Processing, 53, 121–131.
Kashyap RL, Chellappa R (1981) Stochastic models for closed boundary analysis:
representation and reconstruction. IEEE Transactions on Information Theory, 27,
Katsumata N, Matsuyama Y (2005) Database retrieval for similar images using ICA and
PCA bases. Engineering Applications of Artificial Intelligence, 18, 705–717.
Kruizinga P, Petkov N (1999) Nonlinear operator for oriented texture. IEEE Transactions
on Image Processing, 8, 1395–1407.
Lana MM, Tijskens LMM, van Kooten O (2005) Effects of storage temperature and fruit
ripening on firmness of fresh cut tomatoes. Postharvest Biology and Technology, 35,
Leemans V, Destain MF (2004) A real-time grading method of apple based on features
extracted from defects. Journal of Food Engineering, 61, 83–89.
Li J, Tan J, Martz FA, Heymann H (1999) Image texture features as indicators of beef
tenderness. Meat Science, 53, 17–22.
MacAdam DL (1970) Sources of Color Science. Cambridge: MIT Press.
Mallat SG (1989) A theory for multiresolution signal decomposition: the wavelet representation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 11, 674–693.
McDonald K (2001) Effect of Vacuum Cooling on Processing Time, Mass Loss, Physical
Structure and Quality of Large Cooked Beef Products. PhD Thesis, University College
Dublin, National University of Ireland.
Meyer Y (1994) Wavelets: Algorithms & Applications. Philadelphia: Society for Industrial
and Applied Mathematics.
Mulchrone KF, Choudhury KR (2004) Fitting an ellipse to an arbitrary shape: implication
for strain analysis. Journal of Structural Geology, 26, 143–153.
O’Sullivan MG, Byrne DV, Martens H, Gidskehaug LH, Andersen HJ, Martens M (2003)
Evaluation of pork color: prediction of visual sensory quality of meat from instrumental and computer vision methods of color analysis. Meat Science, 65, 909–918.
Paliwal J, Visen NS, Jayas DS, White NDG (2003a) Cereal grain and dockage identification
using machine vision. Biosystems Engineering, 85, 51–57.
Paliwal J, Visen NS, Jayas DS, White NDG (2003b) Comparison of a neural network and a
non-parametric classifier for grain kernel identification. Biosystems Engineering, 85,
Palm C (2004) Color texture classification by integrative co-occurrence matrices. Pattern
Recognition, 37, 965–976.
Patel D, Davies ER, Hannah I (1996) The use of convolution operators for detecting
contaminants in food images. Pattern Recognition, 29, 1019–1029.
Pentland AP (1984) Fractal-based description of natural scenes. IEEE Transactions on
Pattern Analysis and Machine Intelligence, 6, 661–674.
Quevedo R, Carlos LG, Aguilera JM, Cadoche L (2002) Description of food surfaces
and microstructural changes using fractal image texture analysis. Journal of Food
Engineering, 53, 361–371.
Rossel RAV, Minasny B, Roudier P, McBratney AB (2006) Color space models for soil
science. Geoderma, in press.
Russ JC (1999) Image Processing Handbook, 3rd edn. Boca Raton: CRC Press.
Schwarcz HP, Shane KC (1969) Measurement of particle shape by Fourier analysis.
Sedimentology, 13, 213–231.
Setchell CJ, Campbell NW (1999) Using color Gabor texture features for scene understanding. Proceedings of the 7th International Conference on Image Processing and Its
Applications, pp. 372–376, Manchester, UK.
Srikaeo K, Furst JE, Ashton JF, Hosken RW (2006) Microstructural changes of starch
in cooked wheat grain as affected by cooking temperatures and times. LWT – Food
Science and Technology, 39, 528–533.
Starovoitov VV, Jeong SY, Park RH (1998) Texture periodicity detection: features, properties, and comparisons. IEEE Transactions on Systems, Man, and Cybernetics – Part A:
Systems and Humans, 28, 839–849.
Sun C, Wee WG (1983) Neighbouring grey level dependence matrix for texture classification. Computer Vision, Graphics, and Image Processing, 23, 341–352.
Thybo AK, Szczypiński PM, Karlsson AH, Dønstrup S, Stødkilde-Jørgensen HS, Andersen HJ (2004) Prediction of sensory texture quality attributes of cooked potatoes by
NMR-imaging (MRI) of raw potatoes in combination with different imaging analysis
methods. Journal of Food Engineering, 61, 91–100.
Tu K, Jancsók P, Nicolaï B, Baerdemaeker JD (2002) Use of laser-scattering imaging to
study tomato-fruit quality in relation to acoustic and compression measurements.
International Journal of Food Science and Technology, 35, 503–510.
Wyszecki G, Stiles WS (1982) Color Science: Concepts and Methods, Quantitative Data
and Formulae, 2nd edn. New York: John Wiley & Sons.
Young T (1802) On the theory of light and colors. Philosophical Transactions of the Royal
Society of London, 92, 20–71.
Zheng C, Sun D-W, Zheng L (2006a) Recent development of image texture for evaluation
of food qualities – a review. Trends in Food Science & Technology, 17, 113–128.
Zheng C, Sun D-W, Zheng L (2006b) Recent developments and applications of image features for food quality evaluation and inspection – a review. Trends in Food Science &
Technology, 17, 642–655.
Zheng C, Sun D-W, Zheng L (2006c) Estimating shrinkage of large cooked beef joints
during air-blast cooling by computer vision. Journal of Food Engineering, 72, 56–62.
Zheng C, Sun D-W, Zheng L (2006d) Predicting shrinkage of ellipsoid beef joints as
affected by water immersion cooking using image analysis and neural network. Journal
of Food Engineering, 79, 1243–1249.
Zheng C, Sun D-W, Zheng L (2006e) Classification of tenderness of large cooked beef joints
using wavelet and Gabor textural features. Transactions of the ASAE, 49, 1447–1454.
Zion B, Shklyar A, Karplus I (1999) Sorting fish by computer vision. Computers and
Electronics in Agriculture, 23, 175–197.
Zion B, Shklyar A, Karplus I (2000) In-vivo fish sorting by computer vision. Aquacultural
Engineering, 22, 165–179.
Object Classification
Cheng-Jin Du and Da-Wen Sun
Food Refrigeration and Computerised Food Technology,
University College Dublin, National University of Ireland,
Dublin 2, Ireland
1 Introduction
The classification technique is one of the essential features for food quality evaluation
using computer vision, as the aim of computer vision is ultimately to replace the
human visual decision-making process with automatic procedures. Backed by powerful
classification systems, computer vision provides a mechanism in which the human
thinking process is simulated artificially, and can help humans in making complicated
judgments accurately, quickly, and very consistently over a long period (Abdullah
et al., 2004). Using sample data, a classification system can generate an updated basis
for improved classification of subsequent data from the same source, and express the
new basis in intelligible symbolic form (Michie, 1991). Furthermore, it can learn
meaningful or non-trivial relationships automatically in a set of training data, and
produce a generalization of these relationships that can be used to interpret new, unseen
test data (Mitchell et al., 1996).
Generally, classification identifies objects by assigning them to one of a finite
set of classes, which involves comparing the measured features of a new object with
those of a known object or other known criteria and determining whether the new object
belongs to a particular category of objects. Figure 4.1 shows the general classification
system configuration used in computer vision for food quality evaluation. Using image-processing techniques, the images of food products are quantitatively characterized by
a set of features, such as size, shape, color, and texture. These features are objective data
used to represent the food products, which can be used to form the training set. Once
the training set has been obtained, the classification algorithm extracts the knowledge
base necessary to make decisions on unknown cases. Based on the knowledge, intelligent decisions are made as outputs and fed back to the knowledge base at the same
time, which generalizes the method that inspectors use to accomplish their tasks. The
computationally hard part of classification is inducing a classifier – i.e., determining
the optimal values of whatever parameters the classifier will use. Classifiers can give
simple yes or no answers, and they can also give an estimate of the probability that an
object belongs to each of the candidate classes.
Figure 4.1 The general configuration of the classification system.
A wide variety of approaches has been taken towards this task in food quality evaluation. Among the applications where classification techniques have been employed for
building a knowledge base, artificial neural network (ANN) and statistical approaches
are the two main methods. Fuzzy logic and the decision tree have also been used for
classification. Besides the above classical classification approaches, the support vector
machine (SVM) is an emerging classification technique that has been demonstrated to be feasible for performing such a task. All these approaches have a common
objective: to simulate a human decision-maker’s behavior, while having the advantage
of consistency and, to a variable extent, explicitness. The fundamentals of these classification techniques as applied to food quality evaluation will be discussed in detail
in the following sections.
2 Artificial neural network
Initially inspired by the biological nervous system, ANN approaches combine the
complexity of some statistical techniques with the machine-learning objective
of imitating human intelligence, and are characterized by their self-learning capability.
The key element of an ANN is the novel structure of its information-processing system,
which models the functionality of a nervous system. Through a learning process, like
humans, it can solve specific problems such as classification. ANNs have been applied
to the classification of a number of types of food product, including cereal grains (Luo et al.,
1999; Paliwal et al., 2001), fruits (Kavdir and Guyer, 2002; Li et al., 2002), fish
(Storbeck and Daan, 2001), meat (Li et al., 2001; Chao et al., 2002), and vegetables
(Nagata and Cao, 1998; Shahin et al., 2002).
2.1 Structure of neural network
A neural network is a collection of interconnected nodes or processing elements (PEs),
each of which is a key element of an ANN and is relatively simple in operation. The
common structure of a PE is shown in Figure 4.2. Each input path is associated with
a standardized signal using a transfer function (TF) and weighting. A PE has many
inputs from several of the “upstream” PEs in the network. All weighted inputs are summed,
and the transfer function is applied to the sum to produce a (generally non-linear) function
of the inputs. The PE then generates an output, and sends
it “downstream” to the input paths of another group of PEs. The input weighting can be
changed adaptively, which makes this PE very flexible and powerful. The algorithms
for adjustment of weighting will be discussed in the following section. The transfer
functions can be classified into three categories: linear, threshold, and sigmoid. The
output of a linear function is proportional to the total weighted input. For the threshold
function, the output is set at one of two levels, depending on whether the total input is
greater than or less than some threshold value. Since the output of a sigmoid function
varies continuously but not linearly as the input changes, sigmoid functions are the most
widely used transfer functions.
Figure 4.2 Common structure of a processing element (+ = sum, TF = transfer function).
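As a minimal sketch of the processing element described above, the three categories of transfer function can be written as follows. The function names, the slope and threshold parameters, and the example weights are illustrative assumptions rather than part of the original text.

```python
import math

def linear(s, slope=1.0):
    """Linear transfer function: output proportional to the total weighted input."""
    return slope * s

def threshold(s, theta=0.0):
    """Threshold transfer function: one of two levels, depending on the input."""
    return 1.0 if s > theta else 0.0

def sigmoid(s):
    """Sigmoid transfer function: continuous, non-linear, bounded in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-s))

def processing_element(inputs, weights, tf=sigmoid):
    """Sum the weighted inputs, then pass the total through a transfer function."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return tf(total)

# Two hypothetical normalized image attributes and their arc weights.
print(processing_element([0.2, 0.8], [0.5, -0.25]))
```

Swapping the `tf` argument shows how the same weighted sum leads to different output behavior for each transfer function.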
Figure 4.3 illustrates the general topology of an ANN. The complete network represents a very complex set of interdependencies, and may incorporate any degree of
non-linearity in theory. For food quality evaluation, very general functions can be modeled to transform physical properties into quality factors. ANN technology allows the
extension of computer vision technology into the areas of color, content, shape, and
texture inspection at near-human levels of performance, and can provide the decision-making and classification capabilities to succeed in these inspection tasks (Domenico
and Gary, 1994).
The input layer represents the raw information fed into the network, which normally
consists of the image attributes of food products, such as size, shape, color, and texture.
The input values are generally normalized, usually in the range of [0–1]. The number of
PEs in an input layer is typically defined based on different attribute types and attribute
domain. A neural network can have one or more hidden layers. Hidden layer(s) are
constructed for the process of learning by computations on their node and arc weights.
The activity of hidden layers is determined by the activities of the input PEs and the
weighting on the connections between the input and the hidden PEs. The result of
classification is the output of a PE in the output layer. Typically, there is one output PE
for each class. The behavior of the output layer depends on the activity of the hidden
layers, the weights and transfer functions between the hidden and output layers. The
PEs of input, hidden, and output layers are connected by arcs. Each arc is assigned an
initial random weighting, usually in [−0.5, 0.5], used in training, and may be modified
in the learning process.
Figure 4.3 The general topology of an artificial neural network (input layer, hidden layer, output layer; there may be several hidden layers).
The number of layers and the number of PEs per layer are the “art” of an ANN
designer. There is no quantifiable, best answer to the structure of an ANN for food
classification. Generally, as the complexity in the relationship between the input data
and the desired output increases, the number of PEs in the hidden layer should also
increase. The single-layer organization constitutes the most general case, and is of more
potential computational power than hierarchically structured multi-layer organizations.
The additional hidden layer(s) might be required when the process being modeled is
separable into multiple stages. The number of PEs in the hidden layer(s) should be less
than the amount of training data available. If too many PEs are used, the training set
will be memorized and lead to over-fitting. As a result, generalization of the data will
not occur, and the network will become useless on new data sets. However, too few
PEs will reduce the classification accuracy. The exact number of PEs in the hidden
layer(s) should be determined via experimentation.
2.2 Learning process
The knowledge of the ANN is contained in the values of connection weights. Learning involves adjustments to the values of weighting by passing the information about
response success backward through the network. Modifying the knowledge stored in
an ANN as a function of experience implies a learning rule of how to adapt the values
of the weights. For a simple PE, the fixed incremental rule could be used to adjust
weighting. The algorithm could be described as follows:
1. Initializing weights with small random numbers
2. Selecting a suitable value for the learning rate coefficient γ, ranging from 0 to 1
3. Running a sample feature vector $x = (x_1, x_2, \ldots, x_d)$ of dimension d from a
training set as input
4. Applying the summation of weighted input $S = \sum_{i=0}^{d} w_i x_i$ and transfer function
tf to obtain an output $y = tf(S)$
5. Comparing the output with the expected class c from the training set; if the output
does not match, modifying arc weights according to $w_i = w_i + \gamma (c - y) x_i$
6. Running the next sample and repeating steps 3–5
7. Repeating steps 3–6 until the weights converge.
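The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a threshold transfer function with two class labels (0 and 1), a constant input x0 = 1 so that w0 acts as a bias, and hypothetical function names and toy data.

```python
import random

def train_fixed_increment(samples, labels, gamma=0.1, max_epochs=1000, seed=0):
    """Train a single threshold PE with the fixed incremental rule.

    samples: list of d-dimensional feature vectors; labels: 0 or 1.
    A constant x0 = 1 is prepended so that w[0] acts as a bias (assumption).
    """
    rng = random.Random(seed)
    d = len(samples[0])
    w = [rng.uniform(-0.5, 0.5) for _ in range(d + 1)]     # step 1
    for _ in range(max_epochs):                            # step 7
        changed = False
        for x, c in zip(samples, labels):                  # steps 3 and 6
            xb = [1.0] + list(x)
            s = sum(wi * xi for wi, xi in zip(w, xb))      # step 4
            y = 1 if s > 0 else 0
            if y != c:                                     # step 5
                w = [wi + gamma * (c - y) * xi for wi, xi in zip(w, xb)]
                changed = True
        if not changed:        # a full pass without corrections: converged
            break
    return w

def predict(w, x):
    s = w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))
    return 1 if s > 0 else 0

# Toy, linearly separable data (logical AND), used purely for illustration.
samples = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [0, 0, 0, 1]
weights = train_fixed_increment(samples, labels)
print([predict(weights, x) for x in samples])
```

Because the toy classes are linearly separable, the loop stops as soon as one complete pass makes no correction.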
The concept of this algorithm is to find a linear discriminant plane, by moving a
fixed distance, where no misclassification error occurs. If the feature vectors are linearly separable, the algorithm will converge and a correct, error-free solution is found.
Unfortunately, most feature vectors of food products are not linearly separable. To
cope with this problem, one of the alternative algorithms developed for adjusting the
values of weights is the delta rule, which is used in feed-forward networks. The weights
are changed in proportion to the error δ, as in equation (4.1):
$$w_i(k+1) = w_i(k) + \gamma \delta x_i(k) = w_i(k) + \gamma [c(k) - S(k)] x_i(k) \qquad (4.1)$$
where k indicates the kth iteration of the classifier, and c(k) is the class of the kth
training pattern.
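A small sketch of the delta rule may help to show its behavior on data that are not linearly separable. The function names and the XOR-like toy data are assumptions made for illustration; the first component of each sample is taken to be a constant bias input of 1.

```python
def delta_rule_epoch(w, samples, targets, gamma=0.05):
    """One pass over the data: w_i <- w_i + gamma * (c - S) * x_i."""
    for x, c in zip(samples, targets):
        s = sum(wi * xi for wi, xi in zip(w, x))   # S(k), the weighted sum
        delta = c - s                              # error term
        w = [wi + gamma * delta * xi for wi, xi in zip(w, x)]
    return w

def sum_squared_error(w, samples, targets):
    return sum((c - sum(wi * xi for wi, xi in zip(w, x))) ** 2
               for x, c in zip(samples, targets))

# XOR-like targets: no linear discriminant separates them without error.
samples = [(1.0, 0.0, 0.0), (1.0, 0.0, 1.0), (1.0, 1.0, 0.0), (1.0, 1.0, 1.0)]
targets = [0.0, 1.0, 1.0, 0.0]

w = [0.0, 0.0, 0.0]
before = sum_squared_error(w, samples, targets)
for _ in range(200):
    w = delta_rule_epoch(w, samples, targets)
after = sum_squared_error(w, samples, targets)
print(before, after)
```

Unlike the fixed incremental rule, the update is proportional to the size of the error, so the weights settle near a least-squares compromise instead of cycling indefinitely.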
Another solution is the back-propagation learning rule proposed by Rumelhart et al.
(1986), which has become one of the most important methods for training neural
networks. In order to avoid confusion, a clear notation is described first:
$y_j^{[s]}$: output state of jth PE in layer s
$w_{ji}^{[s]}$: connection weight joining ith PE in layer (s − 1) to jth PE in layer s
$S_j^{[s]}$: summation of weighted inputs to jth PE in layer s
A PE in the output layer determines its activity by two steps. First, it computes
the total weighted input $S_j^{[o]}$ using the formula:
$$S_j^{[o]} = \sum_i w_{ji}^{[o]} y_i^{[o-1]} \qquad (4.2)$$
where $y_i^{[o-1]}$ is the output state of the ith unit in the previous layer. Then the PE
calculates the output state $y_j^{[o]}$ using the transfer function of the total weighted input $S_j^{[o]}$.
Typically, the following sigmoid function is used:
$$y_j^{[o]} = tf\left(S_j^{[o]}\right) = \frac{1}{1 + e^{-S_j^{[o]}}} \qquad (4.3)$$
Once the activities of all the output units have been determined, the network computes
the global error function E, which is given by
$$E = \frac{1}{2} \sum_j \left(c_j - y_j^{[o]}\right)^2 \qquad (4.4)$$
where $c_j$ denotes the desired output, and $y_j^{[o]}$ denotes the actual output produced by the
network with its current set of weights.
Based on equations (4.2)–(4.4) described above, a standard back-propagation
algorithm is given as follows:
1. Initializing weights with small random numbers
2. Selecting a suitable value for the learning rate coefficient γ, ranging from 0 to 1
3. Running a sample feature vector x from the training set as input, and obtaining
an output vector y[o] at the output layer of the network
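4. Calculating the local error and delta weight for each PE in the output layer as
$$e_j^{[o]} = \frac{\partial E}{\partial S_j^{[o]}} = \frac{\partial E}{\partial y_j^{[o]}} \frac{\partial y_j^{[o]}}{\partial S_j^{[o]}} = -\left(c_j - y_j^{[o]}\right) tf'\left(S_j^{[o]}\right)$$
where $tf'\left(S_j^{[o]}\right) = y_j^{[o]}\left(1 - y_j^{[o]}\right)$ if the sigmoid function is used as the transfer
function.
The delta weight of an output layer node can be given by:
$$\Delta w_{ji}^{[o]} = -\gamma e_j^{[o]} y_i^{[o-1]}$$
5. Calculating the local error and delta weight for each PE in the hidden layers using
the following equations respectively:
$$e_j^{[s]} = tf'\left(S_j^{[s]}\right) \sum_i e_i^{[s+1]} w_{ij}^{[s+1]}$$
$$\Delta w_{ji}^{[s]} = -\gamma e_j^{[s]} y_i^{[s-1]}$$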
6. Updating all the weights in the network by adding the delta weights to the
corresponding previous weights
7. Running the next sample and repeating steps 3–6
8. Repeating steps 3–7 until the changes in weights are reduced to some predetermined level.
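The algorithm can be sketched for a network with one hidden layer as follows. This is an illustrative sketch, not a definitive implementation: the local error is taken as the derivative of E with respect to the weighted input (so that the delta weight is −γ times the local error times the incoming state), a constant bias input of 1 is appended to each layer, training runs for a fixed number of passes instead of testing the weight changes, and the function names and XOR data are assumptions.

```python
import math
import random

def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))

def forward(x, w_hid, w_out):
    """Return input states, hidden states (each with a bias 1 appended) and outputs."""
    xb = list(x) + [1.0]
    y_hid = [sigmoid(sum(w * v for w, v in zip(row, xb))) for row in w_hid]
    hb = y_hid + [1.0]
    y_out = [sigmoid(sum(w * v for w, v in zip(row, hb))) for row in w_out]
    return xb, hb, y_out

def global_error(data, w_hid, w_out):
    """E = 1/2 * sum_j (c_j - y_j)^2, accumulated over all samples."""
    total = 0.0
    for x, target in data:
        y_out = forward(x, w_hid, w_out)[2]
        total += 0.5 * sum((c - y) ** 2 for c, y in zip(target, y_out))
    return total

def train_backprop(data, n_hidden=3, gamma=0.5, epochs=2000, seed=1):
    rng = random.Random(seed)
    n_in, n_out = len(data[0][0]), len(data[0][1])
    # Step 1: small random initial weights (last column is the bias weight).
    w_hid = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    w_out = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)] for _ in range(n_out)]
    for _ in range(epochs):
        for x, target in data:                                   # steps 3 and 7
            xb, hb, y_out = forward(x, w_hid, w_out)
            # Step 4: output-layer local error, e_j = dE/dS_j.
            e_out = [-(c - y) * y * (1.0 - y) for c, y in zip(target, y_out)]
            # Step 5: hidden-layer local error, back-propagated through w_out.
            e_hid = [hb[j] * (1.0 - hb[j])
                     * sum(e_out[i] * w_out[i][j] for i in range(n_out))
                     for j in range(n_hidden)]
            # Step 6: add the delta weights (-gamma * e_j * y_i) to the weights.
            for j in range(n_out):
                for i in range(n_hidden + 1):
                    w_out[j][i] += -gamma * e_out[j] * hb[i]
            for j in range(n_hidden):
                for i in range(n_in + 1):
                    w_hid[j][i] += -gamma * e_hid[j] * xb[i]
    return w_hid, w_out

# XOR is a standard example of classes that are not linearly separable.
xor = [((0, 0), (0,)), ((0, 1), (1,)), ((1, 0), (1,)), ((1, 1), (0,))]
w_hid, w_out = train_backprop(xor)
print(global_error(xor, w_hid, w_out))
```

Whether the XOR error falls close to zero depends on the random initial weights, the learning rate, and the number of hidden PEs, which is why these are normally tuned by experiment.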
3 Statistical classification
Statistical classification (SC) utilizes the statistical properties of the observations from
the training set. It is generally characterized by having an explicit underlying probability
model, for example Bayesian theory, which is mathematically rigorous and provides a
probabilistic approach to inference. Based on a well-established field of mathematics,
SC has been proven successful in applications of computer vision for quality evaluation
of food products. Generally, there are three kinds of SC techniques used in applications:
Bayesian classification, discriminant analysis, and nearest neighbor.
3.1 Bayesian classification
Bayesian classification is a probabilistic approach to learning and inference based on a
different view of what it means to learn from data, in which probability is used to represent uncertainty about the relationship being learnt. Before we have seen any data, our
prior opinions about what the true relationship might be are expressed in a probability
distribution. After we look at the data, our revised opinions are captured by a posterior
distribution. Bayesian learning can produce the probability distributions of the quantities of interest, and make the optimal decisions by reasoning about these probabilities
together with observed data (Mitchell, 1997). In order to improve the objectivity of
the inspection, Bayesian classifiers have been implemented for the automated grading
of apples (Shahin et al., 1999), mandarins and lemons (Aleixos et al., 2002), raisins
(Okamura et al., 1993), carrots (Howarth and Searcy, 1992), and sweet onions (Shahin
et al., 2002).
Suppose there are n classes (c1, c2, . . . , cn) and A summarizes all prior assumptions
and experience. The Bayesian rule tells how the learning system should update its
knowledge as it receives a new observation. Before being given a new observation with
feature vector x, the learning system knows only A. Afterwards, it knows xA, i.e. x
and A. Bayes’ rule then tells how the learning system should adapt P(ci |A) into P(ci |xA)
in response to the observation x as follows:
$$P(c_i|xA) = \frac{P(c_i|A)\,P(x|c_i A)}{P(x|A)}$$
where P(ci |xA) is usually called the posterior probability and P(ci |A) the prior
probability of class ci (it should be noted that this distinction is relative to the observation; the posterior probability for one observation is the prior probability for the next
observation); P(x|ci A) is the class-conditional probability density for observation x in
class ci and the prior assumptions and experience A. Both P(ci |A) and P(x|ci A) could
be determined if (c1 , c2 , . . . , cn ) are exhaustive and mutually exclusive – in other words
if exactly one of ci is true while the rest are false. P(x|A) is the probability
of the observation x conditional on the prior assumptions and experience A, and can be derived by
$$P(x|A) = \sum_{k=1}^{n} P(c_k|A)\,P(x|c_k A)$$
The Bayesian decision rule selects the category with minimum conditional risk. In
the case of minimum-error rate classification, the rule will select the category with the
maximum posterior probability. The classification procedure is then to compare the
values of all the P(ci |xA) and assign the new observation to class ci if
$$P(c_i|xA) > P(c_j|xA) \quad \text{for all } j \neq i$$
Figure 4.4 illustrates the structure of a Bayesian classifier. So far, we have explicitly
denoted that the probabilities are conditional to the prior assumptions and experience A.
In most cases the context will make it clear which are the prior assumptions, and usually
A is left out. This means that probability statements like P(x) and P(ci |x) should be
understood to mean P(x|A) and P(ci |xA) respectively, where A denotes the assumptions
appropriate for the context.
Figure 4.4 Structure of a Bayesian classifier.
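The minimum-error-rate rule can be illustrated with a short sketch. Everything specific here is an assumption made for the example: the two grade names, the prior values, a single color feature, and Gaussian class-conditional densities with made-up means and standard deviations.

```python
import math

def gaussian_pdf(x, mean, std):
    """Univariate normal density, used here as an assumed class-conditional model."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2.0 * math.pi))

def posteriors(x, priors, likelihoods):
    """Bayes' rule: P(c_i|x) = P(c_i) P(x|c_i) / sum_k P(c_k) P(x|c_k)."""
    joint = {c: priors[c] * likelihoods[c](x) for c in priors}
    evidence = sum(joint.values())                 # P(x)
    return {c: p / evidence for c, p in joint.items()}

def classify(x, priors, likelihoods):
    """Assign x to the class with the maximum posterior probability."""
    post = posteriors(x, priors, likelihoods)
    return max(post, key=post.get)

# Hypothetical two-grade problem with one normalized color feature.
priors = {"accept": 0.7, "reject": 0.3}
likelihoods = {
    "accept": lambda x: gaussian_pdf(x, mean=0.8, std=0.1),
    "reject": lambda x: gaussian_pdf(x, mean=0.5, std=0.1),
}
print(classify(0.75, priors, likelihoods))
```

In practice the priors and the class-conditional densities would be estimated from the training set rather than fixed by hand.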
3.2 Discriminant analysis
Discriminant analysis is a very useful multivariate statistical technique which takes into
account the different variables of an object and works by finding the so-called discriminant functions in such a way that the differences between the predefined groups are
maximized. The obtained discriminant rules provide a way to classify each new object
into one of the previously defined groups. Discriminant analysis has been demonstrated
as plausible for the classification of apples (Leemans and Destain, 2004), corn (Zayas
et al., 1990), edible beans (Chtioui et al., 1999), poultry carcasses (Park et al., 2002),
mushrooms (Vízhányó and Felföldi, 2000) and muffins (Abdullah et al., 2000), and
for individual kernels of CWRS wheat, CWAD wheat, barley, oats, and rye, based on
morphological features (Majumdar and Jayas, 2000a), color features (Majumdar and
Jayas, 2000b), and textural features (Majumdar and Jayas, 2000c).
The most famous approach of discriminant analysis was introduced by Fisher for two-class
problems (Fisher, 1936). By considering two classes of d-dimensional observations x with means $\mu_1$ and $\mu_2$, Fisher discriminant analysis seeks a linear combination
of features $w \cdot x$ that has a maximal ratio of between-class variance to within-class
variance as follows:
$$J(w) = \frac{w^T M_B w}{w^T M_W w}$$
where $M_B = (\mu_1 - \mu_2)(\mu_1 - \mu_2)^T$ and $M_W = \sum_{i=1,2} \sum_{k} (x_k^i - \mu_i)(x_k^i - \mu_i)^T$, with the inner sum running over the samples $x_k^i$ of class i, are
the between- and within-class scatter matrices respectively. The intuition behind maximizing J(w) is to seek a linear direction for which the projected classes are well
separated. If the within-class scatter matrix $M_W$ has full rank, the maximum separation
occurs when $w = M_W^{-1}(\mu_1 - \mu_2)$. When $M_W$ is singular, it cannot be inverted. The
problem can be tackled in different ways; one method is to use a pseudo inverse instead
of the usual matrix inverse (Rao and Mitra, 1971).
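For two-dimensional features the Fisher direction can be computed by hand, as in the sketch below. The restriction to two features, the explicit 2 × 2 inverse, the function names, and the toy samples are all assumptions made to keep the example self-contained.

```python
def mean(vectors):
    n = len(vectors)
    return [sum(v[k] for v in vectors) / n for k in range(len(vectors[0]))]

def scatter(vectors, mu):
    """Sum of outer products (x - mu)(x - mu)^T for 2-D vectors."""
    m = [[0.0, 0.0], [0.0, 0.0]]
    for v in vectors:
        d = [v[0] - mu[0], v[1] - mu[1]]
        for r in range(2):
            for c in range(2):
                m[r][c] += d[r] * d[c]
    return m

def fisher_direction(class1, class2):
    """w = M_W^{-1} (mu1 - mu2) for two classes of 2-D samples."""
    mu1, mu2 = mean(class1), mean(class2)
    s1, s2 = scatter(class1, mu1), scatter(class2, mu2)
    mw = [[s1[r][c] + s2[r][c] for c in range(2)] for r in range(2)]
    det = mw[0][0] * mw[1][1] - mw[0][1] * mw[1][0]
    if abs(det) < 1e-12:
        raise ValueError("M_W is singular; a pseudo inverse would be needed")
    inv = [[mw[1][1] / det, -mw[0][1] / det],
           [-mw[1][0] / det, mw[0][0] / det]]
    diff = [mu1[0] - mu2[0], mu1[1] - mu2[1]]
    return [inv[0][0] * diff[0] + inv[0][1] * diff[1],
            inv[1][0] * diff[0] + inv[1][1] * diff[1]]

def project(w, x):
    return w[0] * x[0] + w[1] * x[1]

# Toy samples for two well-separated classes.
class1 = [(1.0, 2.0), (2.0, 1.0), (2.0, 3.0), (3.0, 2.0)]
class2 = [(6.0, 7.0), (7.0, 6.0), (7.0, 8.0), (8.0, 7.0)]
w = fisher_direction(class1, class2)
print(w, [project(w, x) for x in class1], [project(w, x) for x in class2])
```

A new sample can then be classified by comparing its projection with a threshold, for example the midpoint between the projected class means.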
Fisher discriminant analysis is a very reasonable measurement of class separability.
Several approaches could be applied to generalize it for more than two classes, for
example the method developed by Rao (1948). The most common approach is to
substitute variance for covariance and simple ratios for ratios of determinants, which
is based on the fact that the determinant of a covariance matrix, known as generalized
variance, is the product of the variances along principal component directions.
Given a set of l d-dimensional samples represented by x, where each case belongs
to one of n known classes, X is the l × d matrix of all the samples, U
is the l × d matrix whose rows all equal the overall sample mean, M is the n × d matrix of class means, and G is the l × n
class membership matrix that indicates which class each sample belongs to ($g_{ij} = 1$ if
and only if sample i is assigned to class j, or else $g_{ij} = 0$), then the within-class and
between-class sample covariance matrices are:
$$CM_W = \frac{(X - GM)^T (X - GM)}{l - n}$$
$$CM_B = \frac{(GM - U)^T (GM - U)}{n - 1}$$
Then the problem of multiple discriminant analysis could be considered as finding a
d × (n − 1) projection matrix W for which the projected samples XW are well separated.
Thus the two-class criterion is generalized to seeking the projection that maximizes the
ratio of the determinants of the between-class to the within-class covariance matrices
of the projected samples:
$$J(W) = \frac{\left|W^T CM_B W\right|}{\left|W^T CM_W W\right|}$$
The projection matrix W can be computed by solving the following generalized
eigenvector problem:
$$CM_B W_i = \lambda_i CM_W W_i$$
where $W_i$ is the ith column of W.
If the classes are Gaussian with equal covariance and their mean vectors are well
separated, the discriminant can achieve the optimal result with the minimum classification error. However, when the distributions are non-Gaussian or the mean vectors of
the classes are close to each other, the performance of the discriminant will be poorer.
3.3 Nearest neighbor
As well as the Bayesian classification and discriminant analysis, the nearest-neighbor
method is also feasible for classification of foods. For example, it has been applied to
classify healthy and six types of damaged Canadian Western Red Spring wheat kernels using selected morphological and color features extracted from the grain sample
images (Luo et al., 1999). Nearest neighbor is a non-parametric classification technique performed by assigning the unknown case to the class most frequently represented
among the nearest samples. Without a priori assumptions about the distributions from
which the training examples are drawn, the nearest-neighbor classifier could achieve
consistently high performance in spite of its simplicity. It involves a training set of both
positive and negative cases. A new sample is classified by calculating the distance to
the nearest training case; the sign of that point then determines the classification of the
sample.
The k-nearest-neighbor (k-NN) classifier extends this idea by taking the k nearest
points, i.e. the closest neighbors around the new observation with feature vector x. The
classification is usually performed by a majority voting rule, which states that the new
sample to be assigned should be the label occurring most among the neighbors. Several
design choices arise when using this classifier. The first choice is to find a suitable
distance measurement; the second is the number of neighbors k – choosing a large
k generally results in a linear classifier, whereas a small k results in a non-linear one,
which influences the generalization capability of the k-NN classifier. Furthermore, the
design of the set of prototypes is also an important issue.
The most common distance metric used to calculate the distances between samples
is Euclidean distance. Given two samples $x_i$ and $x_j$, the Euclidean distance between
the two samples is defined as:
$$D_E(x_i, x_j) = \|x_i - x_j\| = \sqrt{(x_i - x_j)^T (x_i - x_j)} \qquad (4.17)$$
Other measures can also be used, such as the city-block distance and Mahalanobis
distance, defined respectively as follows:
$$D_C(x_i, x_j) = \sum_{k=1}^{d} |x_{ik} - x_{jk}|$$
$$D_M(x_i, x_j) = \sqrt{(x_i - x_j)^T CM^{-1} (x_i - x_j)}$$
where CM represents the covariance matrix. The city-block distance is also known
as the Manhattan distance, boxcar distance or absolute value distance. It represents
the distance between points in a city road grid, and examines the absolute differences
between the coordinates of a pair of feature vectors. Mahalanobis distance takes the
distribution of the points (correlations) into account, and is a very useful way of determining the “similarity” of a set of values from an “unknown” sample to a set of values
measured from a collection of “known” samples. The Mahalanobis distance is the same
as the Euclidean distance if the covariance matrix is the identity matrix.
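The three distance measures and the majority voting rule can be sketched as follows. The function names, the toy training set, and the class labels are illustrative assumptions; the Mahalanobis distance is written with a square root so that it reduces to the Euclidean distance for an identity covariance matrix.

```python
import math
from collections import Counter

def euclidean(a, b):
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

def city_block(a, b):
    return sum(abs(p - q) for p, q in zip(a, b))

def mahalanobis(a, b, cov_inv):
    """sqrt((a - b)^T CM^{-1} (a - b)); cov_inv is the inverse covariance matrix."""
    d = [p - q for p, q in zip(a, b)]
    n = len(d)
    return math.sqrt(sum(d[r] * cov_inv[r][c] * d[c]
                         for r in range(n) for c in range(n)))

def knn_classify(x, training, k=3, distance=euclidean):
    """Majority vote among the k training samples closest to x.

    training: list of (feature_vector, label) pairs.
    """
    neighbors = sorted(training, key=lambda item: distance(x, item[0]))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Hypothetical kernels described by two normalized features.
training = [((1.0, 1.0), "sound"), ((1.2, 0.9), "sound"), ((0.9, 1.1), "sound"),
            ((3.0, 3.1), "damaged"), ((3.2, 2.9), "damaged"), ((2.9, 3.0), "damaged")]
print(knn_classify((1.1, 1.0), training, k=3))
print(knn_classify((3.0, 3.0), training, k=3, distance=city_block))
```

Passing the distance as a parameter makes it easy to compare metrics on the same training set.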
Choosing the correct k is a hard problem. Too large (or too small) a k may result
in non-generalizing classifiers. The choice of k is often performed through the leave-one-out cross-validation method on the training set. Leave-one-out cross-validation
(Martens and Martens, 2001) can make good use of the available data and provide
an almost unbiased estimate of the generalization ability of a model. At the start, the
first observation is held out as a single-element test set, with all other observations
as the training set. After that, the second observation is held out, then the third, and
so on. This of course still requires independent test sets for accurate error estimation
and comparison of different k-NN classifiers.
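A minimal sketch of choosing k by leave-one-out cross-validation is given below. It assumes Euclidean distance and majority voting; the function names, the candidate values of k, and the toy data are hypothetical.

```python
import math
from collections import Counter

def knn_predict(x, training, k):
    nearest = sorted(training, key=lambda item: math.dist(x, item[0]))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def loo_accuracy(data, k):
    """Hold out each sample in turn and classify it using all the others."""
    correct = 0
    for i, (x, label) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        correct += knn_predict(x, rest, k) == label
    return correct / len(data)

def choose_k(data, candidates):
    """Return the candidate k with the highest leave-one-out accuracy
    (the first such candidate when several are tied)."""
    return max(candidates, key=lambda k: loo_accuracy(data, k))

# Two small, well-separated toy classes.
data = [((1.0, 1.0), "A"), ((1.2, 0.9), "A"), ((0.9, 1.1), "A"),
        ((3.0, 3.1), "B"), ((3.2, 2.9), "B"), ((2.9, 3.0), "B")]
for k in (1, 3, 5):
    print(k, loo_accuracy(data, k))
print(choose_k(data, [1, 3, 5]))
```

With k = 5 every held-out sample is outvoted by the other class, which illustrates how too large a k can destroy generalization on a small training set.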
The design of the set of prototypes is the most difficult and challenging task. The
simplest approach is to select the whole training set as prototypes. However, this simple
approach requires huge memory and execution time for large databases, and hence the number
of prototypes should be reduced in practice. The strategies for reducing the number of
stored prototypes can be divided into three types: condensing, editing, and clustering
algorithms. Condensing algorithms aim to keep those points that are near the class
border from the training data, which form the class boundaries (Hart, 1968). Editing
algorithms retain those training data that fall inside the class borders, and tend to form
homogeneous clusters since only the points that are at the centre of natural groups in the
data are retained (Wilson, 1972). It is also feasible to use any clustering algorithm, such
as k-means, to form a set of labeled prototypes (Devroye et al., 1996). The advantage
for clustering algorithms is that prototypes are not constrained to training points, and
thus more flexible classifiers can be designed.
4 Fuzzy logic
Fuzzy logic is introduced as a representation scheme and calculus for uncertain or
vague notions, and could provide a completely different method for applications such as
the classification of food products. Compared with traditional classification techniques,
fuzzy classification groups individual samples into classes that do not have sharply
defined boundaries. It embodies the nature of the human mind in some sense, as the
concepts of possibility and probability are emphasized in this logic. In contrast with
the absolute values and categories in the traditional Boolean logic, it more closely mimics
human behavior for decision-making and reasoning by extending the handling of the
intermediate categories to partially true or partially false. Thus it can simulate the
human experience of generating complex decisions using approximate and uncertain
information. The application of fuzzy logic in food quality evaluation includes the
grading of apples (Shahin et al., 2001) and tomatoes (Jahns et al., 2001).
The introduction of fuzzy set theory by Zadeh (1965) marked the beginning of a new
way of solving classification problems by providing a basis for a qualitative approach
to the analysis of a complex system. By incorporating the basics of fuzzy set theory, in
which linguistic or “fuzzy” terms rather than relationships between precise numerical
values are employed to describe system behavior and performance, a classification
system can make a decision in a similar way to humans. The fuzzy classifier is inherently
robust, does not require precise inputs, and can obtain a definite conclusion even based
upon vague, ambiguous, imprecise, and noisy input or knowledge. Figure 4.5 shows a
typical structure of a fuzzy classification system, which essentially defines a non-linear
mapping of the input data vector into a scalar output using fuzzy rules.
Given an input vector x, the first step for a fuzzy classification system is
to transform crisp input variables into linguistic variables by creating fuzzy sets and
membership functions. The second step is to construct a fuzzy rule base. By computing
the logical product for each of the effective rules, a set of fuzzy outputs is produced.
Finally, the fuzzy outputs are processed and combined in some manner to produce a
crisp (defuzzified) output.
Figure 4.5 Structure of a fuzzy classification system (input x; creating fuzzy sets and membership functions; constructing fuzzy rule base; producing fuzzy outputs and defuzzification; output y).
4.1 Creating fuzzy sets and membership functions
4.1.1 Fuzzy set
The very basic notion of a fuzzy classification system is a fuzzy set. A fuzzy set S in
a fuzzy space X could be represented as a set of ordered pairs:
$$S = \{(x, \tau(x)) \mid x \in X\}$$
where x is a generic element, and τ(x) characterizes its grade of membership. In
Boolean logic, every element is true or false – i.e. restricted to just two values, 1 or
0 – and thus imposes rigid membership. In contrast, fuzzy sets have more flexible
membership requirements that allow for partial membership in a set. Each element
of a fuzzy set has a degree of membership, which can be a full member (100 percent
membership) or a partial member (between 0 and 100 percent membership) – i.e. the
membership value assigned to an element can be 0, 1, or any value in between.
Compared with the crisp sets in Boolean logic, fuzzy sets are more flexible in
applications. The flexibility of fuzzy set design allows different relationships between
the neighbor sets. Fuzzy sets in a fuzzy universe can be fully separated, or they can be
arranged in an overlapping manner. Hence, in fuzzy logic the freedom of both shape
and association of the fuzzy sets provides a broad base for applying fuzzy logic.
The design of a series of fuzzy sets depends on the characteristics and complexity
of the classification problem. Although some formal procedures have been proposed
for obtaining fuzzy set mapping, there is still no theoretically universal method (Dutta,
1993). A principle called “minimum normal form,” which requires at least one element
of the fuzzy set domain to have a membership value of one, is most widely used.
4.1.2 Membership function
The mathematical function that defines the degree of an element’s membership in a
fuzzy set is called the membership function. In literature, a variety of membership
functions have been used, including linear, sigmoid, beta curve, triangular curve, and
trapezoidal curve (Sonka et al., 1999). The more complex the membership functions
are, the greater the computing overhead needed to implement them.
The membership function is a graphical representation of the magnitude of participation of each input variable. The number 1 assigned to an element means that the
element is in the set S, and 0 means that the element is definitely not in the set S. All
other values mean a graduated membership of the set S. In such a way, the membership function associates a weight with each of the inputs that are processed, defines
the functional overlap between inputs, and ultimately determines an output response.
These weighting factors determine the degree of influence or of membership.
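As an illustration, two of the membership function shapes mentioned above can be sketched as follows. The breakpoint parameters (a, b, c, d) and the example of a “medium diameter” fuzzy set are assumptions introduced for the sketch.

```python
def triangular(x, a, b, c):
    """Triangular membership: 0 at a and c, 1 at the peak b (a < b < c)."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

def trapezoidal(x, a, b, c, d):
    """Trapezoidal membership: rises on [a, b], equals 1 on [b, c], falls on [c, d]."""
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)

# Hypothetical fuzzy set "medium diameter" (in mm) for a fruit-grading task.
for diameter in (55.0, 65.0, 70.0, 80.0):
    print(diameter, triangular(diameter, 60.0, 70.0, 80.0))
```

A value of 1 means full membership of the set, 0 means no membership, and intermediate values express graduated membership.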
4.2 Constructing a fuzzy rule base
A fuzzy rule base contains a set of fuzzy rules, whose forms are usually expressed in
IF–THEN. Each fuzzy rule consists of two parts, i.e. an antecedent block (between the
IF and THEN) and a consequent block (following THEN). Depending on the classification system, it may not be necessary to evaluate every possible input combination,
since some may rarely or never occur. Restricting the evaluation in this way can simplify
the processing logic and perhaps even improve the fuzzy logic system performance.
In fuzzy logic, the AND, OR, and NOT operators of Boolean logic are usually defined
as the minimum, maximum, and complement, as in Zadeh’s (1965) paper. So for the fuzzy
variables x1 and x2:
NOT x1 = 1 − truth(x1)
x1 AND x2 = minimum(truth(x1), truth(x2))
x1 OR x2 = maximum(truth(x1), truth(x2))
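These operators translate directly into code. In the sketch below the rule, the variable names, and the truth values are hypothetical and serve only to show how an antecedent with several parts is resolved to a single number.

```python
def fuzzy_not(a):
    return 1.0 - a

def fuzzy_and(a, b):
    return min(a, b)

def fuzzy_or(a, b):
    return max(a, b)

# Hypothetical rule: IF color is red AND (size is large OR shape is round)
# THEN grade is high. The truth values are assumed degrees of membership.
color_is_red = 0.7
size_is_large = 0.4
shape_is_round = 0.9

rule_strength = fuzzy_and(color_is_red, fuzzy_or(size_is_large, shape_is_round))
print(rule_strength)
```

For truth values restricted to 0 and 1 the three functions reduce to the ordinary Boolean operators.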
There are also other operators, called linguistic hedges. Hedges play the same role
in fuzzy production rules that adjectives and adverbs, such as “very” or “somewhat,” play in English sentences. By modifying the fuzzy set’s membership function, hedges
allow the generation of fuzzy statements through a mathematical formula. According
to their impact on the membership function, the hedges are divided into three groups:
concentrator, dilator, and contrast hedges. The concentrator hedge intensifies the fuzzy
region as $\tau_{con(S)}(x) = \tau_S^{n}(x)$, where n ≥ 1. In contrast, the dilator hedge dilutes the force
of fuzzy set membership function by $\tau_{dil(S)}(x) = \tau_S^{1/n}(x)$. The contrast hedge changes
the nature of the fuzzy region by making it either less fuzzy (intensification) or more
fuzzy (diffusion):
$$\text{if } \tau_S \geq 0.5: \quad \tau(S) = \frac{1}{2}\,\tau_S^{1/2}(S)$$
$$\text{if } \tau_S < 0.5: \quad \tau(S) = 1 - \frac{1}{2}\,\tau_S^{1/2}(S)$$
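The concentrator and dilator hedges can be sketched as simple power functions of the membership value. The sketch assumes that the dilator takes the n-th root of the membership value, and uses n = 2 as the default, a common convention for “very” and “somewhat”; the function names are hypothetical, and the contrast hedge is not covered.

```python
def concentrate(tau, n=2):
    """Concentrator hedge (e.g. "very"): raises the membership value to the power n."""
    return tau ** n

def dilate(tau, n=2):
    """Dilator hedge (e.g. "somewhat"): takes the n-th root of the membership value."""
    return tau ** (1.0 / n)

membership = 0.64          # assumed degree to which a sample is "ripe"
print(concentrate(membership))   # degree to which it is "very ripe"
print(dilate(membership))        # degree to which it is "somewhat ripe"
```

For any membership value between 0 and 1 the concentrated value is no larger, and the dilated value no smaller, than the original, while full and zero membership are left unchanged.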
4.3 Producing fuzzy outputs and defuzzification
The interpretation of an IF–THEN rule can be evaluated as follows. All fuzzy statements
in the antecedent block are first mapped to a degree of membership between 0 and 1. If
there are multiple parts in the antecedent, fuzzy logic operators are applied to resolve
the antecedent to a single number between 0 and 1. After that, the conclusions of the
consequent block are combined to form a logical sum.
The fuzzy outputs for all rules are finally aggregated into a single composite output
fuzzy set. The fuzzy set is then passed on to the defuzzification process for crisp output
generation – that is, to choose one representative value as the final output. This process
is often complex, since the resulting fuzzy set might not translate directly into a crisp
value. Several heuristic defuzzification methods exist. One of them is the centroid
method, which is widely used in the literature. This method finds the “balance” point
of the solution fuzzy region by calculating the weighted mean of the output fuzzy
region. The weighted strengths of each output member function are multiplied by their
respective output membership function center points and summed. This area is then
divided by the sum of the weighted member function strengths, and the result is taken
as the crisp output.
Besides the centroid method, the max method chooses the element with the highest
magnitude. This method produces a continuous output function and is easy to implement; however, it does not combine the effects of all applicable rules. The weighted
averaging method is another approach that works by weighting each membership function in the output by its respective maximum membership value. Nonetheless, it fails
to give increased weighting to more rule votes per output member function.
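The centroid and max methods described in this section can be sketched in a simplified discrete form, where each output membership function is represented by its center point and its weighted strength. The function names and the three output grades are assumptions made for the example.

```python
def centroid_defuzzify(strengths, centers):
    """Weighted mean of the output membership function center points."""
    total = sum(strengths)
    if total == 0:
        raise ValueError("no rule fired; the crisp output is undefined")
    return sum(s * c for s, c in zip(strengths, centers)) / total

def max_defuzzify(strengths, centers):
    """Center point of the output membership function with the highest strength."""
    best = max(range(len(strengths)), key=lambda i: strengths[i])
    return centers[best]

# Hypothetical output sets "low", "medium", "high" grade with centers 1, 2, 3.
centers = [1.0, 2.0, 3.0]
strengths = [0.1, 0.6, 0.3]
print(centroid_defuzzify(strengths, centers))
print(max_defuzzify(strengths, centers))
```

The centroid result lies between the centers and reflects every rule that fired, whereas the max result ignores all but the strongest output.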
5 Decision tree
The decision tree acquires knowledge in the form of a tree, which can also be rewritten
as a set of discrete rules to make it easier to understand. The main advantage of the
decision tree classifier is its ability to use different feature subsets and decision
rules at different stages of classification. As shown in Figure 4.6, a general decision
tree consists of one root node, a number of internal and leaf nodes, and branches.
Leaf nodes indicate the class to be assigned to a sample. Each internal node of a tree
corresponds to a feature, and branches represent conjunctions of features that lead to
those classifications. For food quality evaluation using computer vision, the decision
tree has been applied to the problem of meat quality grading (Song et al., 2002) and
the classification of “in the shell” pistachio nuts (Ghazanfari et al., 1998).
Figure 4.6 A general decision tree structure with root, internal, and leaf nodes.
The performance of a decision tree classifier depends on how well the tree is constructed from the training data. A decision tree normally starts from a root node, and
proceeds to split the source set into subsets, based on a feature value, to generate subtrees. This process is repeated on each derived subset in a recursive manner until leaf
nodes are created. The problem of constructing a truly optimal decision tree seems
not to be easy. As one of the well-known decision tree methods, C4.5 is an inductive
algorithm developed by Quinlan (1993); this is described in detail below.
To build a decision tree from training data, C4.5 employs an approach which uses
information-theoretic measures based on “gain” and “gain ratio.” Given a training
set TS, each sample has the same structure. Usually, the training set TS of food products
is partitioned into two classes – AL (acceptable level) and UL (unacceptable level). The
information (I) needed to identify the class of an element of TS is then given by
$$I(TS) = -\frac{|AL|}{|TS|} \log_2 \frac{|AL|}{|TS|} - \frac{|UL|}{|TS|} \log_2 \frac{|UL|}{|TS|}$$
If the training set TS is partitioned on the basis of the value of a feature xk into sets
TS1 , TS2 , . . . , TSn , the information needed to identify the class of an element of TS
can be calculated by the weighted average of I (TSi ) as follows:
$$I(x_k, TS) = \sum_{i=1}^{n} \frac{|TS_i|}{|TS|}\, I(TS_i)$$
The information gained on a given feature is the difference between the information
needed to identify an element of TS and the information needed to identify an element
of TS after the value of the feature has been obtained. Therefore, the information
gained on xk is
$$\mathrm{gain}(x_k, TS) = I(TS) - I(x_k, TS)$$
The root of the decision tree is the attribute with the greatest gain. The process of
building the decision tree is repeated, where each node locates the feature with the
greatest gain among the attributes not yet considered in the path from the root.
The gain measurement has disadvantageous effects regarding the features with a
large number of values. To cope with this problem, the gain ratio is introduced instead
of the gain. For example, the gain ratio of xk is defined as:
$$\mathrm{gainratio}(x_k, TS) = \frac{\mathrm{gain}(x_k, TS)}{\mathrm{split}(x_k, TS)}$$
$$\mathrm{split}(x_k, TS) = -\sum_{i=1}^{n} \frac{|TS_i|}{|TS|} \log_2 \frac{|TS_i|}{|TS|}$$
where split(xk , TS) is the information due to the split of TS on the basis of the value
of feature xk .
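The quantities defined above can be computed directly from class counts. The following is a minimal sketch, not Quinlan's C4.5 implementation: the function names and the toy colour feature are hypothetical, and logarithms are taken to base 2 so that information is measured in bits.

```python
from collections import Counter
from math import log2


def information(labels):
    """I(TS): information (entropy, in bits) of the class labels of a set."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def partition(feature_values, labels):
    """Group the labels by feature value, giving the subsets TS_1 .. TS_n."""
    subsets = {}
    for value, label in zip(feature_values, labels):
        subsets.setdefault(value, []).append(label)
    return list(subsets.values())


def gain(feature_values, labels):
    """gain(x_k, TS) = I(TS) - I(x_k, TS)."""
    total = len(labels)
    weighted = sum(len(s) / total * information(s)
                   for s in partition(feature_values, labels))
    return information(labels) - weighted


def split_info(feature_values, labels):
    """split(x_k, TS): information due to the split itself."""
    total = len(labels)
    return -sum(len(s) / total * log2(len(s) / total)
                for s in partition(feature_values, labels))


def gain_ratio(feature_values, labels):
    """gainratio(x_k, TS) = gain / split (defined as 0 when split is 0)."""
    s = split_info(feature_values, labels)
    return gain(feature_values, labels) / s if s > 0 else 0.0


# Hypothetical toy data: one colour feature and AL/UL labels for six samples.
colour = ["dark", "dark", "light", "light", "light", "dark"]
quality = ["UL", "UL", "AL", "AL", "AL", "AL"]
print(gain(colour, quality), gain_ratio(colour, quality))
```

In a tree builder, the feature with the largest gain ratio among those not yet used on the current path would be chosen for the next node.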
Sometimes, the decision tree obtained by recursively partitioning a training set as
described above may become quite complex, with long and uneven paths. To deal with
this shortcoming, the decision tree is pruned by replacing a whole sub-tree with a leaf
node through an error-based strategy (Quinlan, 1993).
6 Support vector machine
The support vector machine (SVM) is a state-of-the-art classification algorithm with
a sound theoretical foundation in statistical learning theory (Vapnik, 1995). Instead
of minimizing the misclassification on the training set, SVM determines the decision
function by structural risk minimization in order to avoid overfitting. It performs classification by finding maximal margin hyperplanes between different classes, defined in terms of a subset of the
input data. The vectors in this subset are
called support vectors. If the input data are not linearly separable, SVM first maps the
data into a high- (possibly infinite-) dimensional feature space, and then classifies the
data by the maximal margin hyperplanes. Furthermore, SVM is capable of classification in a high-dimensional feature space with relatively few training data. SVM was originally
developed for the problem of binary classification. More recently, it has also shown
a great deal of potential in multi-class problems. As a relatively novel learning technique, SVM has been successfully applied to classification problems
such as electronic nose data (Pardo and Sberveglieri, 2002; Trihaas and Bothe, 2002),
bakery process data (Rousu et al., 2003), and pizza grading (Du and Sun, 2004,
2005a, 2005b).
6.1 Binary classification
The classification of food products into acceptable and unacceptable quality levels can be treated as a binary categorization problem. Suppose that there are l
samples in the training data, and each sample is denoted by a vector xi. Binary classification can then be described as the task of finding a classification decision function
f : xi → yi , yi ∈ {−1, +1} using training data with an unknown probability distribution
P(x, y). Subsequently, the classification decision function f is used to classify
the unseen test data. If f (x) > 0, the input vector x is assigned to the class y = +1, i.e. the
acceptable quality level; otherwise it is assigned to the class y = −1, i.e. the unacceptable quality level.
The classification decision function f is found by minimizing the expected classification risk as follows:
$$CR(f) = \int |y - f(x)|\, dP(x, y) \qquad (4.30)$$
Unfortunately, the expected classification risk shown in equation (4.30) cannot be
calculated directly because the probability distribution P(x, y) is unknown. Instead, the
“empirical risk” ERemp ( f ) is applied to approximate the expected classification risk
on the training set (Burges, 1998):
$$ER_{emp}(f) = \frac{1}{2l}\sum_{i=1}^{l} |y_i - f(x_i)| \qquad (4.31)$$
Although there is no probability distribution appearing in equation (4.31), the classification decision function f still cannot be found correctly because the empirical risk
might differ greatly from the expected classification risk for small sample sizes. Structural risk minimization (SRM) is a technique suggested by Vapnik (1995) to solve the
problem of capacity control in learning from “small” training data. With a probability
of 1 − η (where 0 ≤ η ≤ 1), the following bound holds on the expected classification
risk (Vapnik, 1995):
$$CR(f) \le ER_{emp}(f) + \sqrt{\frac{VCD\,(\log(2l/VCD) + 1) - \log(\eta/4)}{l}} \qquad (4.32)$$
where VCD is the Vapnik Chervonenkis dimension of the set of functions from which
the classification decision function f is chosen. The second term on the right-hand side
of equation (4.32) is the so-called “VC confidence.” SRM attempts to find the function
for minimizing the upper bound by training.
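As a rough numerical illustration of the VC confidence, the sketch below evaluates the second term of the bound in the form given by Burges (1998), sqrt((VCD (log(2l/VCD) + 1) − log(η/4)) / l), using natural logarithms. The function name and the sample values are assumptions chosen only to show how the term behaves.

```python
from math import log, sqrt


def vc_confidence(vcd, l, eta):
    """Second term of the risk bound: grows with VCD, shrinks as l grows."""
    return sqrt((vcd * (log(2 * l / vcd) + 1) - log(eta / 4)) / l)


# Hypothetical settings: bound holding with probability 1 - eta = 0.95.
for n_samples in (100, 1000, 10000):
    print(n_samples, round(vc_confidence(vcd=10, l=n_samples, eta=0.05), 3))
```

For a fixed training-set size the term increases with VCD, which is why SRM trades a lower empirical risk against the capacity of the function set.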
For the linearly separable training vectors xi , the classification function has the
following form:
$$f(x) = \mathrm{sgn}(\omega^{T} x + b) \qquad (4.33)$$
where ω is normal to the hyperplane and b is a bias term, which should satisfy the
following conditions:
$$y_i(\omega^{T} x_i + b) \ge 1, \quad i = 1, 2, \ldots, l \qquad (4.34)$$
SVM seeks the optimal separating hyperplane that maximizes the margin
between positive and negative samples. The margin is $2/\|\omega\|$; thus the optimal separating hyperplane is the one minimizing $\frac{1}{2}\omega^{T}\omega$, subject to the constraints shown in equation
(4.34), which is a convex quadratic programming problem.
For the linearly non-separable case, the constraints in equation (4.34) are relaxed
by introducing a new set of non-negative slack variables {ξi |i = 1, 2, . . . , l} as the
measurement of violation of the constraints (Vapnik, 1995), as follows:
$$y_i(\omega^{T} x_i + b) \ge 1 - \xi_i, \quad i = 1, 2, \ldots, l \qquad (4.35)$$
The optimal hyperplane is the one that minimizes the following formula:
$$\frac{1}{2}\omega^{T}\omega + \lambda \sum_{i=1}^{l} \xi_i \qquad (4.36)$$
where λ is a parameter used to penalize variables ξi , subject to constraints in
equation (4.35).
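To make the role of the slack variables concrete, the following sketch evaluates the soft-margin objective for a given hyperplane. It does not solve the quadratic programme: ω and b are supplied by hand, and each ξi is taken as the smallest non-negative value satisfying equation (4.35), i.e. max(0, 1 − yi(ωᵀxi + b)). The function names and the toy data are hypothetical.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def slack_variables(w, b, X, y):
    """Smallest slack xi_i >= 0 with y_i (w.x_i + b) >= 1 - xi_i."""
    return [max(0.0, 1.0 - yi * (dot(w, xi) + b)) for xi, yi in zip(X, y)]


def soft_margin_objective(w, b, X, y, lam):
    """0.5 * w.w + lam * sum(xi_i), the quantity minimized in the text."""
    return 0.5 * dot(w, w) + lam * sum(slack_variables(w, b, X, y))


# Hypothetical toy data: the third sample lies on the wrong side of the plane.
X = [(3.0, 0.0), (0.0, 0.0), (1.5, 0.0)]
y = [+1, -1, -1]
w, b = (1.0, 0.0), -1.0

print(slack_variables(w, b, X, y))            # only the third sample needs slack
print(soft_margin_objective(w, b, X, y, 1.0))
```

A larger λ makes constraint violations more expensive, pushing the solution toward fewer misclassified training samples at the cost of a narrower margin.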
For a non-linearly separable case, the training vectors xi can be mapped into a high
dimensional feature space (HDFS) by a non-linear transformation ϕ(·). The training
vectors become linearly separable in the feature space HDFS and then separated by
the optimal hyperplane as described before. In many cases the dimension of HDFS
is infinite, which makes it difficult to work with ϕ(·) explicitly. Since the training
algorithm only involves inner products in HDFS, a kernel function k(xi , xj ) is used to
solve the problem, which defines the inner product in HDFS:
$$k(x_i, x_j) = \langle \varphi(x_i), \varphi(x_j) \rangle \qquad (4.37)$$
Besides a linear kernel, polynomial kernels and Gaussian radial basis function (RBF)
kernels are usually applied in practice, which are defined as:
$$k(x_i, x_j) = (x_i^{T} x_j + b)^{m} \qquad (4.38)$$
$$k(x_i, x_j) = \exp\left(-\|x_i - x_j\|^{2} / 2\sigma^{2}\right) \qquad (4.39)$$
where b is the bias term and m is the degree of polynomial kernels.
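The three kernels can be written in a few lines. This is a minimal sketch using plain Python sequences as vectors; the function names and default parameter values are assumptions, not part of the original text.

```python
from math import exp


def linear_kernel(xi, xj):
    """Plain inner product of the two input vectors."""
    return sum(a * b for a, b in zip(xi, xj))


def polynomial_kernel(xi, xj, b=1.0, m=2):
    """(xi . xj + b)^m with bias term b and polynomial degree m."""
    return (linear_kernel(xi, xj) + b) ** m


def rbf_kernel(xi, xj, sigma=1.0):
    """exp(-||xi - xj||^2 / (2 sigma^2)), the Gaussian RBF kernel."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(xi, xj))
    return exp(-squared_distance / (2.0 * sigma ** 2))


print(polynomial_kernel((1, 2), (3, 4), b=1.0, m=2))
print(rbf_kernel((0, 0), (1, 1), sigma=1.0))
```

The RBF kernel equals 1 for identical vectors and decays toward 0 as the vectors move apart, with σ controlling how quickly.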
The classification function then has the following form in terms of kernels:
$$f(x) = \mathrm{sgn}\left(\sum_{i=1}^{l} y_i \alpha_i k(x_i, x) + b\right) \qquad (4.40)$$
where αi can be obtained by solving a convex quadratic programming problem subject
to linear constraints. The support vectors are those xi with αi > 0 in equation (4.40).
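Once the coefficients are known, evaluating the kernel form of the decision function is straightforward. The sketch below assumes a linear kernel and a two-sample training set, x1 = (1, 0) with y1 = +1 and x2 = (−1, 0) with y2 = −1, for which the solution α1 = α2 = 0.5, b = 0 can be checked by hand (it gives ω = (1, 0) and a margin of 2). The function names are hypothetical, and samples with a non-positive sum are assigned to class −1.

```python
def linear_kernel(xi, xj):
    """Inner product; any other kernel function could be passed instead."""
    return sum(a * b for a, b in zip(xi, xj))


def svm_decision(x, support_vectors, labels, alphas, b, kernel=linear_kernel):
    """sgn(sum_i y_i alpha_i k(x_i, x) + b), returning +1 or -1."""
    total = sum(y * a * kernel(sv, x)
                for sv, y, a in zip(support_vectors, labels, alphas)) + b
    return 1 if total > 0 else -1


# Hand-checked toy solution (both training samples are support vectors).
support_vectors = [(1.0, 0.0), (-1.0, 0.0)]
labels = [+1, -1]
alphas = [0.5, 0.5]
bias = 0.0

print(svm_decision((2.0, 3.0), support_vectors, labels, alphas, bias))
print(svm_decision((-0.5, 1.0), support_vectors, labels, alphas, bias))
```

Only samples with αi > 0 contribute to the sum, so in practice the loop runs over the support vectors rather than the whole training set.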
To illustrate the performance of SVM classifiers, a two-dimensional data set with
five samples for each class is shown in Figure 4.7, where the samples of class +1 are
represented by the lighter dots and the samples of class −1 by the darker dots.
The performance of a linear SVM is illustrated in Figure 4.8a. If the input data are
not linearly separable, SVM first maps the data into a high-dimensional feature space
using a kernel function, such as the polynomial kernel (equation (4.38)) or the Gaussian
RBF kernel (equation (4.39)), and then classifies the data by the maximal margin
hyperplanes, as shown in Figures 4.8b and 4.8c, respectively.
Figure 4.7 An illustrated data set.
6.2 Multi-classification
Although SVM was originally developed for the problem of binary classification,
several SVM algorithms have been developed for handling multi-class problems; of
these, one approach is to use a combination of several binary SVM classifiers, such as
one-versus-all (Vapnik, 1998), one-versus-one (Kressel, 1999), and the directed acyclic
graph (DAG) SVM (Platt et al., 2000), while another approach is to directly use a single
optimization formulation (Crammer and Singer, 2001). Owing to its computational
cost and complexity, the single-formulation approach is usually avoided.
The multi-classification of samples with n classes can be considered as constructing
and combining several binary categorization problems. The earliest approach for multi-classification using SVM was one-versus-all. Multi-classification with this method can
be described as the task of constructing n binary SVMs. The ith SVM is trained with the
samples from the ith class labeled positive, and the samples from all the other classes labeled negative.
The n classification decision functions are:
$$f^{i}(x) = \sum_{j} y_j^{i}\,\alpha_j^{i}\,k(x_j^{i}, x) + b^{i}, \quad i = 1, \ldots, n \qquad (4.41)$$
where $y_j^{i}$ ∈ {+1, −1}, k is a kernel function, $b^{i}$ is a bias term, and $\alpha_j^{i}$ is the coefficient
obtained by solving a convex quadratic programming problem. Given an unknown
sample (denoted by x), the input vector x is assigned to the class that has the largest
value of the decision function in equation (4.41).
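The one-versus-all decision rule amounts to an argmax over the n decision values. In the sketch below the trained binary SVMs are stood in for by hypothetical scoring functions on a one-dimensional input; only the combination step reflects the method described above.

```python
def one_versus_all_predict(decision_functions, x):
    """Return the index of the class whose decision function is largest."""
    scores = [f(x) for f in decision_functions]
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical stand-ins for three trained binary SVMs (classes 0, 1, 2).
decision_functions = [
    lambda x: 1.0 - x,             # class 0 versus the rest
    lambda x: 1.0 - abs(x - 2.0),  # class 1 versus the rest
    lambda x: x - 3.0,             # class 2 versus the rest
]

print([one_versus_all_predict(decision_functions, x) for x in (0.0, 2.0, 5.0)])
```

Because the raw decision values of independently trained classifiers are compared, this rule implicitly assumes that their output scales are comparable.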
Figure 4.8 Performance of (a) a linear SVM classifier; (b) a polynomial SVM classifier; (c) an RBF SVM classifier. (Legend: support vectors of class +1; support vectors of class −1.)
Another approach using a combination of several binary SVM classifiers is called
the one-versus-one method. Multi-classification with this method can be described as
the task of constructing n(n − 1)/2 binary SVMs, one classifier C ij for every pair of
distinct classes, i.e. the ith class and the jth class, where i ≠ j, i = 1, . . . , n; j = 1, . . . , n.
Each classifier C ij is trained with the samples in the ith class with positive labels, and
the samples in the jth class with negative labels. The classification decision functions
can be constructed as detailed below:
$$f^{ij}(x) = \sum_{k} y_k^{ij}\,\alpha_k^{ij}\,k(x_k, x) + b^{ij}, \quad i \neq j,\; i = 1, \ldots, n;\; j = 1, \ldots, n \qquad (4.42)$$
where the sum runs over the training samples of the ith and jth classes,
$y_k^{ij}$ ∈ {+1, −1}, k(·, ·) is a kernel function, $b^{ij}$ is a bias term, and $\alpha_k^{ij}$ is the coefficient
obtained by solving a convex quadratic programming problem. Given an unknown
sample, if the decision function in equation (4.42) states that the input vector x is in
the ith class, the classifier C ij casts one vote for the ith class; otherwise it casts one vote for
the jth class. When all the votes from the n(n − 1)/2 classifiers are
obtained, the unknown sample x is assigned to the class with the most votes.
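The voting scheme can be sketched as follows. The pairwise classifiers are hypothetical threshold functions on a one-dimensional input rather than trained SVMs, and ties are broken in favour of the lowest class index, which is an assumption the text does not specify.

```python
def one_versus_one_predict(pairwise, n_classes, x):
    """Majority vote over the n(n-1)/2 pairwise decision functions.

    pairwise maps (i, j) with i < j to a function that is positive when
    x is judged to belong to class i and non-positive for class j.
    """
    votes = [0] * n_classes
    for (i, j), f in pairwise.items():
        votes[i if f(x) > 0 else j] += 1
    # max() returns the first maximum, so ties go to the lowest class index.
    return max(range(n_classes), key=votes.__getitem__)


# Hypothetical stand-ins for the three classifiers C01, C02 and C12.
pairwise = {
    (0, 1): lambda x: 1.0 - x,
    (0, 2): lambda x: 1.5 - x,
    (1, 2): lambda x: 2.0 - x,
}

print([one_versus_one_predict(pairwise, 3, x) for x in (0.5, 1.7, 3.0)])
```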
The third approach is the directed acyclic graph SVM, which is a learning algorithm
designed by combining many two-class classifiers into one multi-class classifier using
a decision graph. The training phase of the multi-classification is the same as in the one-versus-one method, i.e. it constructs n(n − 1)/2 binary classifiers. However, in the test
phase it utilizes a multi-class learning architecture called the decision directed
acyclic graph (DDAG). Each node of the DDAG is associated with a one-versus-one
classifier. Supposing there are five categories in the samples, Figure 4.9 illustrates the
DDAG procedure of multi-classification. Given an unknown sample x, first the binary
decision function at the root node is evaluated. Then, if the value of the binary decision
function is −1, the node exits via the left edge; otherwise, if the value is +1, via the
right edge. Similarly, the binary decision function of the next internal node is then
evaluated. The class of x is the one associated with the final leaf node.
Figure 4.9 The DDAG for classification of samples with five categories.
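One common way to evaluate a DDAG is to keep a list of candidate classes and let each node eliminate one of them, so that only n − 1 of the n(n − 1)/2 classifiers are evaluated per sample. The sketch below follows that list-based formulation, which is an assumption about the traversal; the source describes it only in terms of left and right edges. The pairwise classifiers are again hypothetical threshold functions.

```python
def ddag_predict(pairwise, n_classes, x):
    """Walk the decision DAG: each node tests first vs. last candidate."""
    candidates = list(range(n_classes))
    evaluations = 0
    while len(candidates) > 1:
        i, j = candidates[0], candidates[-1]
        evaluations += 1
        if pairwise[(i, j)](x) > 0:
            candidates.pop()       # classifier favours class i: drop class j
        else:
            candidates.pop(0)      # classifier favours class j: drop class i
    return candidates[0], evaluations


# Hypothetical stand-ins for the three classifiers C01, C02 and C12.
pairwise = {
    (0, 1): lambda x: 1.0 - x,
    (0, 2): lambda x: 1.5 - x,
    (1, 2): lambda x: 2.0 - x,
}

print([ddag_predict(pairwise, 3, x) for x in (0.5, 1.7, 3.0)])
```

The second value returned is the number of classifiers evaluated, which is always n − 1 regardless of the path taken through the graph.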
7 Conclusions
A number of classification techniques have been introduced in this chapter, including
the artificial neural network, Bayesian classification, discriminant analysis, nearest
neighbor, fuzzy logic, the decision tree, and the support vector machine. All of the
above methods have shown feasibility for the classification of food products, with
varying degrees of success. Given the proliferation of classification techniques, it is not an easy
task to select an optimal method that can be applied to different food products. It is
impossible to offer one technique as a general solution because each classification
technique has its own strengths and weaknesses and is suitable for particular kinds
of problem. As a result, one of the most interesting directions for further work is
to combine several techniques for the classification of food products. Another trend
is to adopt relatively novel classification techniques, such as SVM.
αi , αij , αn
b, bi , bij
c1 , c2 , . . . , cn
C ij
coefficient obtained by solving a quadratic programming problem
error between the actual class and predicted class
delta weight
learning rate coefficient
mean vector
normal to the hyperplane
sigma term of Gaussian radial basis function kernels
membership function
non-linear transformation
parameter used to penalize variables ξi
slack variables
probability of the bound holding
prior assumptions and experience
bias term
classes from number 1 to n
desired output class
classifier for the ith class and the jth class
class of the kth training pattern
between-class sample covariance matrix
within-class and sample covariance matrix
classification risk
Nomenclature 103
gain(xk , TS)
gainratio(xk , TS)
i, j, k, n
J (w)
J (W )
k(xi , xj )
P(ci |A)
P(ci |xA)
P(x|ci A)
split(xk , TS)
x, x1 , x2
x1 , x2 , . . . , xd
city-block distance
Euclidean distance
Mahalanobis distance
local error
empirical risk
classification decision function
matrix of class membership matrix
information gained on feature xk
ratio between the information gained and the information due to
the split of TS
ratio of between-class variance to within-class variance
ratio of the determinants of the within-class to the between-class
covariance matrices
kernel function
number of samples in a training set
degree of polynomial kernels
matrix of class means
between-class scatter matrix
within-class scatter matrix
prior probability of class ci
conditional probability to the prior assumptions and experience A
posterior probability
class-conditional probability density for observation x in class ci
and the prior assumptions and experience A
summation of weighted input
summation of weighted inputs to jth processing element in layer s
information due to the split of TS on the basis of the value of
feature xk
transfer function
means of all the group of samples
arc weight
weight vector
projection matrix
connection weight joining ith processing element in layer (s − 1) to
jth processing element in layer s
fuzzy variables
features from number 1 to d
sample feature vector
fuzzy space
matrix of all the group of samples
output class
output vector
output state of jth processing element in layer s
acceptable level
artificial neural network
directed acyclic graph
global error function
high-dimensional feature space
information needed to identify the class of an element
processing element
fuzzy set
structural risk minimization
support vector machine
training set
unacceptable level
Vapnik Chervonenkis dimension
References
Abdullah MZ, Aziz SA, Mohamed AMD (2000) Quality inspection of bakery products
using a color-based machine vision system. Journal of Food Quality, 23 (1), 39–50.
Abdullah MZ, Guan LC, Lim KC, Karim AA (2004) The applications of computer vision
system and tomographic radar imaging for assessing physical properties of food.
Journal of Food Engineering, 61 (1), 125–135.
Aleixos N, Blasco J, Navarrón F, Moltó E (2002) Multispectral inspection of citrus in real-time using machine vision and digital signal processors. Computers and Electronics
in Agriculture, 33 (2), 121–137.
Burges C (1998) A tutorial on support vector machines for pattern recognition. Data Mining
and Knowledge Discovery, 2 (2), 1–43.
Chao K, Chen Y-R, Hruschka WR, Gwozdz FB (2002) On-line inspection of poultry
carcasses by a dual-camera system. Journal of Food Engineering, 51 (3), 185–192.
Chtioui Y, Panigrahi S, Backer LF (1999) Rough sets theory as a pattern classification tool
for quality assessment of edible beans. Transactions of the ASAE, 42 (4), 1145–1152.
Crammer K, Singer Y (2001) On the algorithmic implementation of multiclass kernel-based
vector machines. Journal of Machine Learning Research, 2, 265–292.
Devroye L, Györfi L, Lugosi G (1996) A Probabilistic Theory of Pattern Recognition. New
York: Springer-Verlag.
Domenico S, Gary W (1994) Machine vision and neural nets in food processing and
packaging – natural way combinations. In Food Processing Automation III –
Proceedings of the FPAC Conference, ASAE, Orlando, Florida, USA.
Du C-J, Sun D-W (2004) Shape extraction and classification of pizza base using computer
vision. Journal of Food Engineering, 64 (4), 489–496.
Du C-J, Sun D-W (2005a) Pizza sauce spread classification using color vision and support
vector machines. Journal of Food Engineering, 66 (2), 137–145.
Du C-J, Sun D-W (2005b) Comparison of three methods for classification of pizza topping
using different color spaces. Journal of Food Engineering, 68 (3), 277–287.
Dutta S (1993) Fuzzy Logic Applications: Technological and Strategic Issues. INSEAD
(European Institute of Business Administration), Boulevard de Constance, 77305
Fontainebleau Cedex, France.
Fisher RA (1936) The use of multiple measurements in taxonomic problems. Annals of
Eugenics, 7, 179–188.
Ghazanfari A, Wulfsohn D, Irudayaraj J (1998) Machine vision grading of pistachio nuts
using gray-level histogram. Canadian Agricultural Engineering, 40 (1), 61–66.
Hart PE (1968) The condensed nearest neighbour rule. IEEE Transactions on Information
Theory, 14, 515–516.
Howarth MS, Searcy SW (1992) Inspection of fresh market carrots by machine vision.
In Food Processing Automation II – Proceedings of the 1992 Conference, ASAE,
Lexington Center, Lexington, Kentucky, USA.
Jahns G, Nielsen HM, Paul W (2001) Measuring image analysis attributes and modelling
fuzzy consumer aspects for tomato quality grading. Computers and Electronics in
Agriculture, 31, 17–29.
Kavdir I, Guyer DE (2002) Apple sorting using artificial neural networks and spectral
imaging. Transactions of the ASAE, 45 (6), 1995–2005.
Kressel UH-G (1999) Pairwise classification and support vector machines. In Advances
in Kernel Methods: Support Vector Learning (Schölkopf B, Burges CJC, Smola AJ,
eds.). Cambridge: MIT Press, pp. 255–268.
Leemans V, Destain M-F (2004) A real-time grading method of apples based on features
extracted from defects. Journal of Food Engineering, 61 (1), 83–89.
Li J, Tan J, Shatadal P (2001) Classification of tough and tender beef by image texture
analysis. Meat Science, 57, 341–346.
Li QZ, Wang MH, Gu WK (2002) Computer vision based system for apple surface defect
detection. Computers and Electronics in Agriculture, 36 (2–3), 215–223.
Luo X, Jayas DS, Symons SJ (1999) Comparison of statistical and neural network methods
for classifying cereal grains using machine vision. Transactions of the ASAE, 42 (2),
Majumdar S, Jayas DS (2000a) Classification of cereal grains using machine vision: I.
Morphology models. Transactions of the ASAE, 43 (6), 1669–1675.
Majumdar S, Jayas DS (2000b) Classification of cereal grains using machine vision: II.
Color models. Transactions of the ASAE, 43 (6), 1677–1680.
Majumdar S, Jayas DS (2000c) Classification of cereal grains using machine vision: III.
Texture models. Transactions of the ASAE, 43 (6), 1681–1687.
Martens H, Martens M (2001) Chapter 6. Analysis of two data tables X and Y: Partial Least
Squares Regression (PLSR). In Multivariate Analysis of Quality: an Introduction.
London: John Wiley & Sons, pp. 111–125.
Michie D (1991) Methodologies from machine learning in data analysis and software. The
Computer Journal, 34 (6), 559–565.
Mitchell RS, Sherlock RA, Smith LA (1996) An investigation into the use of machine
learning for determining oestrus in cows. Computers and Electronics in Agriculture,
15 (3), 195–213.
Mitchell T (1997) Machine Learning. New York: McGraw-Hill.
Nagata M, Cao Q (1998) Study on grade judgment of fruit vegetables using machine vision.
Japan Agricultural Research Quarterly, 32 (4), 257–265.
Okamura NK, Delwiche MJ, Thompson JF (1993) Raisin grading by machine vision.
Transactions of the ASAE, 36 (2), 485–492.
Paliwal J, Visen NS, Jayas DS (2001) Evaluation of neural network architectures for cereal
grain classification using morphological features. Journal of Agricultural Engineering
Research, 79 (4), 361–370.
Pardo M, Sberveglieri G (2002) Support vector machines for the classification of electronic
nose data. In Proceedings of the 8th International Symposium on Chemometrics in
Analytical Chemistry, Seattle, USA.
Park B, Lawrence KC, Windham WR, Chen Y-R, Chao K (2002) Discriminant analysis
of dual-wavelength spectral images for classifying poultry carcasses. Computers and
Electronics in Agriculture, 33 (3), 219–231.
Platt JC, Cristianini N, Shawe-Taylor J (2000) Large margin DAGs for multiclass classification. In Proceedings of Neural Information Processing Systems. Cambridge: MIT
Press, pp. 547–553.
Quinlan JR (1993) C4.5: Programs for Machine Learning. San Mateo: Morgan Kaufmann.
Rao C, Mitra S (1971) Generalized Inverse of Matrices and Its Applications. New York:
John Wiley & Sons.
Rao CR (1948) The utilization of multiple measurements in problems of biological classification (with discussion). Journal of the Royal Statistical Society, Series B, 10,
Rousu J, Flander L, Suutarinen M, Autio K, Kontkanen P, Rantanen A (2003) Novel computational tools in bakery process data analysis: a comparative study. Journal of Food
Engineering, 57 (1), 45–56.
Rumelhart DE, Hinton GE, Williams RJ (1986) Learning internal representations by error
propagation. In Parallel Distributed Processing, Vol. 1 (Rumelhart D, McClelland J, eds).
Cambridge: MIT Press, pp. 318–362.
Shahin MA, Tollner EW, Evans MD, Arabnia HR (1999) Watercore features for sorting Red
Delicious apples: a statistical approach. Transactions of the ASAE, 42 (6), 1889–1896.
Shahin MA, Tollner EW, McClendon RW (2001) Artificial intelligence classifiers for sorting apples based on watercore. Journal of Agricultural Engineering Research, 79 (3),
Shahin MA, Tollner EW, Gitaitis RD, Sumner DR, Maw BW (2002) Classification of
sweet onions based on internal defects using image processing and neural network
techniques. Transactions of the ASAE, 45 (5), 1613–1618.
Song YH, Kim SJ, Lee SK (2002) Evaluation of ultrasound for prediction of carcass meat
yield and meat quality in Korean native cattle (Hanwoo). Asian Australasian Journal
of Animal Sciences, 15 (4), 591–595.
Sonka M, Hlavac V, Boyle R (1999) Image Processing, Analysis and Machine Vision.
El Dorado Hills: PWS Publishing.
Storbeck F, Daan B (2001) Fish species recognition using computer vision and a neural
network. Fisheries Research, 51 (1), 11–15.
Trihaas J, Bothe HH (2002) An application of support vector machines to E-nose data.
In Proceedings of the 9th International Symposium on Olfaction & Electronic Nose,
Rome, Italy.
Vapnik V (1995) The Nature of Statistical Learning Theory. New York: Springer-Verlag.
Vapnik V (1998) Statistical Learning Theory. New York: John Wiley & Sons.
Vízhányó T, Felföldi J (2000) Enhancing color differences in images of diseased
mushrooms. Computers and Electronics in Agriculture, 26 (2), 187–198.
Wilson D (1972) Asymptotic properties of nearest neighbor rules using edited data. IEEE
Transactions on Systems, Man and Cybernetics, 2, 408–421.
Zadeh L (1965) Fuzzy sets. Information and Control, 8, 338–353.
Zayas I, Converse H, Steele J (1990) Discrimination of whole from broken corn kernels
with image analysis. Transactions of the ASAE, 33 (5), 1642–1646.
Quality Evaluation of
Meat Cuts
Liyun Zheng¹, Da-Wen Sun¹ and Jinglu Tan²
¹ Food Refrigeration and Computerised Food Technology,
University College Dublin, National University of Ireland,
Dublin 2, Ireland
² Department of Biological Engineering, University of Missouri,
Columbia, MO 65211, USA
1 Introduction
Currently meat quality is evaluated through visual appraisal of certain carcass characteristics, such as marbling (intramuscular fat), muscle color, and skeletal maturity.
Although the visual appraisal method has been serving the meat industry for many years,
the subjective evaluation leads to some major intrinsic drawbacks, namely inconsistencies and variations of the results in spite of the fact that the graders are professionally
trained (Cross et al., 1983). This has seriously limited the ability of the meat industry to provide consumers with products of consistent quality, and subsequently its
As there is always a desire from the meat industry for objective measurement methods, many research efforts have been devoted to developing instruments or devices.
One obvious and popular approach is to measure the mechanical properties of meat as
indicators of tenderness, with the most well known perhaps being the Warner-Bratzler
shear-force instrument. For cooked meat, the shear strength correlates well with sensory
tenderness scores (Shackelford et al., 1995); however, such a method is not practical
for commercial fresh-meat grading.
To overcome this problem, one of the most promising methods for objective assessment of meat quality from fresh-meat characteristics is to use computer vision (Brosnan
and Sun, 2002; Sun, 2004). Recently, applications of computer vision for food quality
evaluation have been extended to food in many areas, such as pizza (Sun, 2000; Sun
and Brosnan, 2003a, 2003b; Sun and Du, 2004; Du and Sun, 2005a), cheese (Wang and
Sun, 2002a, 2002b, 2004), and cooked meats (Zheng et al., 2006a; Du and Sun, 2005b,
2006a, 2006b). However, for fresh meats, research began in the early 1980s. For example, Lenhert and Gilliland (1985) designed a black-and-white (B/W) imaging system for
lean-yield estimation, and the application results were reported by Cross et al. (1983)
and Wassenberg et al. (1986). Beef quality assessment by image processing started with
the work by Chen et al. (1989) to quantify the marbling area percentage in six standard
USDA marbling photographs, and later on McDonald and Chen (1990a, 1990b) used
morphological operations to separate connected muscle tissues from the longissimus
dorsi (LD) muscle. For quality evaluation of other fresh meat, such as pork and lamb,
early studies were performed by Kuchida et al. (1991) and Stanford (1998). The composition (fat and protein %) of pork was analyzed based on color video images (Kuchida
et al., 1991) and video-image analysis was also used for on-line classification of lamb
carcasses (Stanford, 1998). Since then, research has been progressing well in this area.
To develop a computer vision system (CVS) for objective grading of meat quality, several steps are essential. Although the existing human grading system has many
intrinsic drawbacks, any new systems designed as a replacement must still be compared
with the human system before they can be accepted. Furthermore, the existing human
grading system is qualitative, whereas the quantitative characteristics that contribute
to the human grading are not always obvious. Therefore, it is necessary to search for
image features that are related to human scores for marbling abundance, muscle color,
and maturity – and, eventually, official grades such as USDA grades. Moreover, to
improve the usefulness of the grading system, new instrumentally-measurable characteristics are needed to enhance the power of the grades in predicting eating quality,
such as tenderness.
2 Quality evaluation of beef
2.1 Characterizing quality attributes
Meat images can be processed by computer vision to characterize quality attributes
such as those defined in the Japanese Beef Marbling Standard and in the USDA beef
grading system. Color-image features have been extracted to predict human scores of
color, marbling, and maturity (Tan, 2004). Studies have also been conducted to predict
the Japanese Beef Color Standard (BCS) number based on beef images (Kuchida et al.,
2.1.1 Color and marbling
Computer vision has been demonstrated to be a rapid and objective alternative
approach for measuring beef color and marbling. The pioneering work in this area was
conducted by McDonald and Chen (1990a, 1990b, 1991, 1992). Based on reflectance
characteristics, fat and lean in the longissimus dorsi (LD) muscle were discriminated
to generate binary muscle images (McDonald and Chen, 1990a, 1990b, 1991, 1992). A
certain degree of correlation between total fat surface area and sensory panel scores for
marbling was achieved, with an r² of 0.47. Data from these early studies also suggest
that it is not reliable to measure only the visible total fat area to distinguish multiple
categories of marbling (McDonald and Chen, 1991). In order to improve marbling score
prediction, McDonald and Chen (1992) later proposed the use of a Boolean random
model to describe the spatial marbling distribution, and made significant improvements
in prediction accuracy.
In Japan, Kuchida and his co-workers (1992, 1997a, 1997b, 1998, 2000a, 2000b,
2001a, 2001b, 2001c) conducted a series of studies in using computer vision to determine marbling scores of beef. Kuchida et al. (1997a) used images of ribeye muscle
from 16 Japanese Black steer carcasses and various models of beef marbling standards
(BMS 2-12) to assess fat as a percentage of ribeye muscle, the number of intramuscular fat deposits (marblings), and the characteristics (area and form coefficient) of each
marbling. The BMS (Beef Marbling Standard) used was developed by the Japanese
National Institute of Animal Industry for the evaluation of beef marbling in 1988.
Kuchida et al. (1997a) showed that, in ribeye muscle, the fat percentage as determined
by computer vision correlated highly with the marbling score as determined visually
(r = 0.87, P < 0.01). In order to establish correlation between fat area ratio and marbling, Kuchida et al. (2000a) used the marbling degree of semispinalis capitis (SC) and
semispinalis dorsi (SD) muscles from cattle as supplementary parameters in the BMS
to evaluate images of 99 cross-sections of SC, SD and LD muscles. It was shown that
image features of cross-sections of the ribeye muscle had the potential for prediction
of crude fat content (Kuchida et al., 2000b). Ratios of fat over lean area in longissimus
dorsi muscles from 22 Japanese Black cattle (steers) were also determined in their
studies (Kuchida et al., 2001c). In order to improve the results, a stepwise multiple
regression with the BMS value assigned by a grader as the dependent variable was
conducted, using 148 independent covariates. It was shown that the BMS value could
be predicted reasonably accurately by multiple regression with six covariates selected
by the stepwise approach (Kuchida et al., 2001b).
In addition, similar work was also conducted by Ishii et al. (1992) and Ushigaki
et al. (1997). The suitability of using the BMS value for evaluating beef marbling
was compared with that of the marbling score. Based on the relationship between the
BMS value (or marbling score) and the ratio of the intramuscular fat area over the total
ribeye area (determined by image analysis), it was found that the BMS value is a more
appropriate scale than the marbling score for evaluating beef marbling.
In the USA, the color-image processing technique has also been applied to the
assessment of muscle color and marbling scores of beef ribeye steaks. Tan (2004)
extracted color-image features to predict human scores of color, marbling, and maturity.
Sixty whole beef ribs representing various marbling and color scores were obtained
from a local supplier, and 5-cm thick slices were taken from the samples. Each slice
was then cut into two 2.5-cm thick samples. The two freshly cut opposite steak surfaces
were used for analysis; one for image acquisition and the other for sensory analysis.
Besides the sensory evaluations, a color-image system was also used to capture
sample images, with the illumination and camera settings carefully selected to feature an appropriate resolution for revealing the small marbling flecks. The captured
images were then subject to a series of processing techniques: image filter, background removal, segmentation of fat from muscles, isolation of the LD muscles, and
segmentation of marbling from the LD muscle (Gao et al., 1995; Lu and Tan, 1998;
Lu, 2002). Figure 5.1 shows an original image and the corresponding segmented one.
The holes in the LD muscles give the image of marbling flecks.
Figure 5.1 Beef image: (a) original; (b) segmented LD muscle. The holes in the LD muscle give the
marbling image.
Features relating to the color of the entire LD muscle were extracted from the muscle
images. The LD muscle color was characterized by the means (µR, µG, and µB) and
standard deviations (σR, σG, and σB) of the red, green, and blue color components. Features representing the amount and spatial distribution of marbling were extracted, and
the size of each marbling fleck was also calculated. To account for the effects of fleck
size, marbling was classified into three categories according to area: A1 < 2.7 mm²,
A2 = 2.7–21.4 mm², and A3 > 21.4 mm². Several marbling features were computed
to measure the marbling abundance: Dci (number of marbling flecks in size category
Ai per unit ribeye area), Dai (sum of marbling area in size category Ai per unit ribeye
area), Dc (number of all marbling flecks per unit ribeye area), and Da (total marbling
area per unit ribeye area).
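The marbling abundance features defined above can be illustrated with a short sketch. It assumes that the area of every fleck and of the ribeye has already been measured in mm²; the function names and the treatment of flecks lying exactly on a category boundary are assumptions made for illustration, not details taken from the cited study.

```python
# Size-category boundaries in mm^2, as quoted in the text
# (A1 < 2.7, A2 = 2.7-21.4, A3 > 21.4).
SMALL_MAX_MM2 = 2.7
LARGE_MIN_MM2 = 21.4


def size_category(area_mm2):
    """Return 1, 2 or 3 for size categories A1, A2 and A3.

    Assumption: flecks exactly on a boundary are assigned to A2.
    """
    if area_mm2 < SMALL_MAX_MM2:
        return 1
    if area_mm2 <= LARGE_MIN_MM2:
        return 2
    return 3


def marbling_features(fleck_areas_mm2, ribeye_area_mm2):
    """Compute count densities (Dc*) and area densities (Da*) per unit ribeye area."""
    if ribeye_area_mm2 <= 0:
        raise ValueError("ribeye area must be positive")
    counts = {1: 0, 2: 0, 3: 0}
    areas = {1: 0.0, 2: 0.0, 3: 0.0}
    for area in fleck_areas_mm2:
        category = size_category(area)
        counts[category] += 1
        areas[category] += area
    features = {}
    for category in (1, 2, 3):
        features[f"Dc{category}"] = counts[category] / ribeye_area_mm2
        features[f"Da{category}"] = areas[category] / ribeye_area_mm2
    # Global densities over all flecks, regardless of size category.
    features["Dc"] = sum(counts.values()) / ribeye_area_mm2
    features["Da"] = sum(areas.values()) / ribeye_area_mm2
    return features
```

For example, `marbling_features([1.0, 2.0, 5.0, 30.0], 1000.0)` places two flecks in A1 and one each in A2 and A3, so that Da is the total fleck area (38 mm²) divided by the ribeye area.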
Marbling was rated by a 10-member trained panel according to the USDA
marbling scorecards in a 9-point scale where 1 = devoid, 2 = practically devoid,
3 = traces, 4 = slight, 5 = small, 6 = modest, 7 = moderate, 8 = slightly abundant, and
9 = moderately abundant. Color was evaluated according to a beef color guide in an
8-point scale: 1 = bleached red, 2 = very light cherry red, 3 = moderately light cherry
red, 4 = cherry red, 5 = slightly dark red, 6 = moderately dark red, 7 = dark red, and
8 = very dark red. The panel averages were used as the sensory scores.
The results indicated that the blue mean (µB) was not significant for color prediction,
whereas the red and green means (µR and µG) were significant. This suggests that although
all three color components varied, the blue component might not affect the panelists’
scoring. The fact that µR was significant in marbling prediction showed that the judges’
opinions were influenced by the lean color. Results also showed that both the count and
area densities of small marbling flecks (Dc1 and Da1) influenced the sensory scores,
which was expected. The area density of large flecks (Da3) was also significant in
influencing the sensory scores, indicating that the panelists were easily influenced by
the presence of a few large marbling flecks, although in the sensory analysis they were
instructed not to put more weight on larger flecks. Therefore, the global marbling area
density (Da) was influential in the scoring. Statistical analysis indicated that µR and
µG were significant for color scores, while µR, Dc1, Da1, Da3, and Da were useful for
marbling scores. The r² values of regression were 0.86 and 0.84 for color and marbling,
respectively, showing the usefulness of the above features in explaining the variations
in sensory scores.
The above study (Tan, 2004) shows that the image features characterizing the spatial
variation of marbling are not significant in the regression. McDonald and Chen (1992)
also indicated that information on the spatial distribution of marbling does not correlate
significantly with marbling scores.
In order to improve the results, Tan et al. (1998) used fuzzy logic and artificial
neural network techniques to analyze the sensory scores. In this study (Tan et al.,
1998), the fuzzy sets, fuzzy variables, and sample membership grades were represented
by the sensory scales, sensory attributes, and sensory responses, respectively. Multi-judge responses were formulated as a fuzzy membership vector or fuzzy histogram
of response, which gave an overall panel response free of unverifiable assumptions
implied in conventional approaches. Then, from the image features selected by backward elimination, neural networks were employed to predict the sensory responses in
their naturally fuzzy and complex form. Finally, a maximum method of defuzzification
was used to give a crisp grade of the majority opinion. With the fuzzy set and
neural network techniques, a 100 percent classification rate was achieved for
color and marbling, which further verified the usefulness of the image features.
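The idea of a fuzzy histogram of panel responses followed by maximum defuzzification can be sketched as follows. This is only an illustration under simple assumptions: each judge gives one integer score, the membership grade of a scale point is taken to be the fraction of judges choosing it, and ties are broken toward the lower score. The function names are hypothetical, and the neural network stage of the cited work is not reproduced.

```python
from collections import Counter


def fuzzy_histogram(responses, scale):
    """Membership grade of each scale point = fraction of judges who chose it."""
    if not responses:
        raise ValueError("at least one judge response is required")
    counts = Counter(responses)
    n_judges = len(responses)
    return [counts.get(point, 0) / n_judges for point in scale]


def defuzzify_max(membership, scale):
    """Maximum method: return the scale point with the largest membership grade.

    Ties are resolved in favour of the lower scale point (an arbitrary choice).
    """
    best_index = max(range(len(scale)), key=lambda i: membership[i])
    return scale[best_index]


# Ten judges rating one sample on the 9-point marbling scale.
scale = tuple(range(1, 10))
panel = [5, 5, 6, 5, 4, 6, 5, 5, 6, 5]
membership = fuzzy_histogram(panel, scale)
crisp_grade = defuzzify_max(membership, scale)
```

Here six of the ten judges chose 5 (small), so the crisp grade is 5, while the membership vector still records the minority opinions at 4 and 6.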
The artificial neural network technique was also used to enhance the robustness of
a hybrid image-processing system which can automatically distinguish lean tissues in
beef cut images with complex surfaces, thus generating the lean tissue contour (Hwang
et al., 1997). Furthermore, Subbiah (2004) also developed a fuzzy algorithm to segment
fat and lean tissue in beef steaks (longissimus dorsi). The fat and lean were differentiated
using a fuzzy c-means clustering algorithm combined with convex hull procedures. The LD was
segmented from the steak using morphological operations of erosion and dilation. After
each erosion–dilation iteration, a convex hull was fitted to the image to measure the
compactness. Iterations were continued, to yield the most compact LD. The algorithm
has been found to segment the steaks with a classification error of 1.97 percent.
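As a rough illustration of fuzzy c-means clustering applied to fat/lean separation, the sketch below clusters one-dimensional pixel intensities into two groups. It is a generic textbook form of the algorithm and not the procedure of the cited study: the use of a single intensity value per pixel, the fuzzifier m = 2, the deterministic initialization, and all names are assumptions, and the erosion, dilation, and convex hull steps are omitted.

```python
def _memberships(values, centers, exponent):
    """Fuzzy membership of every value in every cluster."""
    rows = []
    for x in values:
        dists = [abs(x - c) for c in centers]
        if any(d == 0.0 for d in dists):
            # A value sitting exactly on a centre belongs fully to it.
            hits = [1.0 if d == 0.0 else 0.0 for d in dists]
            total = sum(hits)
            rows.append([h / total for h in hits])
        else:
            rows.append([1.0 / sum((d_i / d_j) ** exponent for d_j in dists)
                         for d_i in dists])
    return rows


def fuzzy_c_means_1d(values, n_clusters=2, m=2.0, n_iter=100, tol=1e-6):
    """Cluster scalar values; returns (centers, memberships, hard_labels)."""
    if n_clusters < 2 or m <= 1.0:
        raise ValueError("need n_clusters >= 2 and fuzzifier m > 1")
    lo, hi = min(values), max(values)
    if lo == hi:
        raise ValueError("all values are identical")
    # Deterministic start: centres spread evenly between min and max.
    centers = [lo + (hi - lo) * i / (n_clusters - 1) for i in range(n_clusters)]
    exponent = 2.0 / (m - 1.0)
    for _ in range(n_iter):
        u = _memberships(values, centers, exponent)
        new_centers = []
        for i in range(n_clusters):
            weights = [row[i] ** m for row in u]
            new_centers.append(
                sum(w * x for w, x in zip(weights, values)) / sum(weights))
        shift = max(abs(a - b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    u = _memberships(values, centers, exponent)
    labels = [max(range(n_clusters), key=lambda i: row[i]) for row in u]
    return centers, u, labels


# Hypothetical normalized intensities: dark lean pixels and bright fat pixels.
pixels = [0.10, 0.12, 0.15, 0.80, 0.85, 0.90]
centers, memberships, labels = fuzzy_c_means_1d(pixels)
```

Each pixel keeps a graded membership in both clusters, and a hard fat/lean label is obtained only at the end by taking the cluster with the larger membership.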
Computer vision was also tested by Dasiewicz et al. (2002) to analyze the color of LD
from 30 dairy and beef cattle. A significant correlation was found between either texture
and CIE-Lab color features or texture and the pH values, regardless of the meat type
(dairy and beef). This study confirmed the advantage of using computer-image analysis
as a tool for evaluating chemical composition and marbling characteristics of beef cuts.
2.1.2 Skeletal maturity
The USDA beef grading system uses the lean color and the degree of cartilage ossification at the tips of the dorsal spine of the sacral, lumbar, and thoracic vertebrae to
determine the physiological maturity of beef carcasses. Such an evaluation is subjective
and prone to human biases in spite of the professional training received by the graders.
Therefore, in order to improve the consistency of results and obtain a more precise
description of products, objective measurement of the physiological maturity of cattle
carcasses is desirable. The computer vision technique is one such objective method.
In a study conducted by Hatem and Tan (1998), color images from 110 beef carcasses
with USDA maturity scores ranging from “A” (young) to “E” (old) were taken. For “A”
maturity the cartilage in the thoracic vertebrae is free of ossification, and for “B” maturity there is some evidence of ossification. Then, the cartilage becomes progressively
ossified with age until it appears as bone. As the degree of cartilage ossification in the
vertebrae is the most important indicator of skeletal maturity, only images focused on
the thoracic vertebra around the thirteenth to fifteenth ribs were taken (Figure 5.2). The
images were initially segmented to isolate the bone from the cartilage using color and
spatial features (Hatem and Tan, 2000, 2003; Hatem et al., 2003). The hue component
in the HSI (hue, saturation, and intensity) color space was found to be effective in
segmenting the cartilage areas, while the component in the CIE-Lab color space gave
good results for segmenting the bones. A set of morphological operations was conducted to refine and combine the segmented cartilage and bone into a bone–cartilage
object, which was then used to characterize the degree of cartilage ossification.
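Segmentation on the hue component can be sketched with the commonly used arccos form of the HSI hue angle. The conversion formula is the standard textbook one; the helper names and the idea of selecting pixels by a simple hue window are assumptions for illustration, and the actual thresholds and spatial features used in the cited studies are not given here.

```python
import math


def hsi_hue_degrees(r, g, b):
    """Hue angle in degrees, in [0, 360), for RGB values in [0, 1].

    Returns None for gray pixels (r == g == b), where hue is undefined.
    """
    numerator = 0.5 * ((r - g) + (r - b))
    # The radicand is non-negative in exact arithmetic; clamp rounding noise.
    denominator = math.sqrt(max(0.0, (r - g) ** 2 + (r - b) * (g - b)))
    if denominator == 0.0:
        return None
    ratio = max(-1.0, min(1.0, numerator / denominator))
    theta = math.degrees(math.acos(ratio))
    return 360.0 - theta if b > g else theta


def hue_mask(image, hue_min, hue_max):
    """Mark pixels whose hue lies in [hue_min, hue_max] (hypothetical window)."""
    mask = []
    for row in image:
        mask_row = []
        for r, g, b in row:
            hue = hsi_hue_degrees(r, g, b)
            mask_row.append(hue is not None and hue_min <= hue <= hue_max)
        mask.append(mask_row)
    return mask
```

With this definition, pure red, green, and blue map to hue angles of 0, 120, and 240 degrees respectively, so a window such as 100 to 140 degrees would select only greenish pixels.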
Compared with bone, the color of cartilage is normally lighter. Therefore, the color variation
along the bone–cartilage object differs with the degree of ossification (Hatem and
Tan, 1998). For animals with “A” maturity, which have more cartilage and thus give a
longer segment of light colors along the length of the carcass, the average hue values
calculated along the length of the bone–cartilage object are useful image features.
Therefore, Hatem and Tan (1998) used these hue values as input vectors to a neural
network and the maturity score as the output of the network. The network was trained
by using the back-propagation algorithm. The trained neural network could then be
used as a maturity score predictor. The samples from the 110 beef carcasses were
divided into five subsets, four for training and the fifth for testing, in a rotating
manner. The fuzzy set technique (Tan et al., 1998) was incorporated in the use of scoring
by the professional grader. The maturity scores predicted by the neural network were
compared with the human scores to calculate the correct classification rate. Results
show that the average rates for the five rotations varied from 55 to 77 percent (Tan
et al., 1998). The above algorithm was applied to another group of samples of 28 cattle
of known age, most of which were of “A” maturity while the rest were of “B” maturity.
An average 75 percent classification rate was obtained, indicating the generality and
robustness of the above procedure (Hatem et al., 2003).
Figure 5.2 Image of vertebrae: (a) original; (b) outline of bone–cartilage objects.
2.2 Predicting quality and yield grades
Generally speaking, beef carcasses are yield-graded by visual appraisal of the twelfth
rib surface and other parts of a carcass. In the USA, USDA standards are used
for the grading of the carcass; i.e., carcass yield (lean percentage) is determined
by considering (1) the amount of external fat, or the fat thickness over the ribeye muscle;
(2) the amount of kidney, heart, and pelvic fat; (3) the area of the ribeye muscle; and
(4) the carcass weight (Lu and Tan, 1998). Computer vision has been investigated as a
tool to achieve the above grading (Lu et al., 1998; Soennichsen et al., 2005, 2006).
Figure 5.3 Fat area for the beef sample shown in Figure 5.1.
In an early study conducted by Lu et al. (1998), beef carcasses (247 for quality
grading, 241 for yield grading) of the same maturity were selected in a commercial
packing plant and prepared according to normal industrial practice. Each carcass was
graded by an official USDA grader with an eight-point scale (prime, choice, select,
standard, commercial, utility, cutter, and canner) for quality, and a five-point scale
(1 to 5) for yield. Immediately after the official grading, digital color images of the
ribbed surfaces (steaks) were captured under constant lighting conditions. The images
went through various steps of segmentation to obtain the regions of interest and to
extract relevant color and marbling features. Figure 5.3 shows an example of the extracted fat
area image processed based on the image shown in Figure 5.1. As fat thickness is an
important indicator of lean meat yield, the back-fat area was partitioned into the dorsal
part (the upper-left half of the fat area in Figure 5.3) and the ventral part (the lower-right
half of the fat area in Figure 5.3). The thickness was then computed in the direction
approximately perpendicular to the back curvature (lower boundary of the fat area in
Figure 5.3), with the average thickness for both parts being used as the fat thickness.
Divergence maximization using linear and non-linear transforms was employed to
maximize the differences among classes (Lu and Tan, 1998). For quality classification,
only linear transforms were applied; for yield classification, linear, quadratic, and cubic
transforms were employed. Supervised classifiers were trained for both quality and
yield classification. The data set was randomly partitioned into ten subsets, nine of
them for training and the tenth for testing, in a rotating fashion until each of the ten
subsets was tested.
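The rotating partition described above is what is now usually called k-fold cross-validation. A minimal sketch of the index bookkeeping is given below; the function names, the fixed random seed, and the way samples are dealt into subsets are assumptions for illustration, and the classifier itself is left out.

```python
import random


def rotating_folds(n_samples, k, seed=0):
    """Yield (train_indices, test_indices) for each of the k rotations."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and the number of samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # random partition, reproducible
    folds = [indices[i::k] for i in range(k)]  # k subsets of near-equal size
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def average_rate(rates):
    """Mean correct-classification rate over all rotations."""
    return sum(rates) / len(rates)


# Ten rotations over 247 samples: nine subsets train, one tests, in turn.
splits = list(rotating_folds(247, 10))
```

Every sample appears in exactly one test subset, so the average of the per-rotation rates summarizes performance on data not used for training in that rotation.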
For quality classification, the correct rate varied with the rotations of the procedure.
For a total of ten rotations, three were 100 percent, four were 90–99 percent, and the
remaining three were 60–70 percent, giving an average rate of 85.3 percent.
For yield classification, the correct rate was above 50 percent for eight out of the ten
rotations with the linear transform. Using quadratic and cubic transforms did not significantly improve the correct rate. The linear transform yielded the best performance,
with an average rate of 64.2 percent. The quality classification result was considered
excellent, while the yield result was reasonably good.
Cannell et al. (2002) employed a dual-component computer vision system (CVS) to
predict commercial beef subprimal yields and to enhance USDA yield grading. In the
system, the first video camera captures an image of the outside surface and contour of
unribbed beef, while the second records an image of the exposed twelfth/thirteenth rib
interface after ribbing. Before the carcasses from 296 steer and heifer cattle were cut
into industry-standard subprimal cuts, the carcasses were evaluated by the CVS and
by USDA official graders and on-line graders. The results indicated that the CVS predicted wholesale cut yields more accurately than did the on-line yield grading. When
the estimated ribeye area was replaced by the computer vision measurement in determination of USDA yield grade, accuracy of the cutability prediction similar to that of
USDA official graders was achieved. The dual-component CVS was also employed
by Steiner et al. (2003) to enhance the application of USDA yield grade standards at
commercial chain speeds for cattle carcasses. The system measured the longissimus
muscle area of carcasses at the twelfth/thirteenth rib interface and combined the measured data with on-line grader estimates of yield grades, resulting in an increase in the
accuracy of yield grade prediction.
In a separate study, Shackelford et al. (2003) used a specially developed image
analysis system for on-line prediction of the yield grade, longissimus muscle area,
and marbling score of 800 cattle carcasses at two beef-processing plants. Prediction
equations developed incorporating hot carcass weight and image features could account
for 90, 88 and 76 percent of variation in calculated yield grade, longissimus muscle area,
and marbling score, respectively. By comparison, official USDA yield grade as applied
by on-line graders was able to account for only 73 percent of variation. Therefore the
system had the potential for improving accuracy of yield grade determination; however,
it could not accurately predict the marbling.
BeefCam is a video-imaging technology that scans beef carcasses into color-differentiated images from which the subsequent eating quality can be predicted. For
instance, BeefCam can be used to measure lean color as an indicator of beef tenderness, since the color relates to the pH values of the lean tissue. Wyle et al. (2003)
tested the prototype BeefCam system to sort cattle carcasses into expected palatability groups. The system was either used alone or in combination with USDA quality
grades assigned by line-graders. A total of 769 carcasses from four commercial, geographically dispersed beef packing plants were used. These carcasses were divided into
three USDA quality groups – Top Choice, 241 carcasses; Low Choice, 301 carcasses;
Select, 227 carcasses. Before each use, the system was calibrated with a white standard
card. Images of longissimus muscles at the twelfth/thirteenth rib interface were then
captured. These images were processed and analyzed using two regression models: one
only used BeefCam data while the other also used a coded value for quality grade. These
two models were validated with 292 additional carcasses at another plant. Quality
data were also obtained from Warner-Bratzler shear force measured after 14 days of
aging and from sensory measurements on corresponding cooked strip loin steaks. Results
confirmed the usefulness of the BeefCam system, as sorting by BeefCam reduced
the number of carcasses in the “certified” group that generated steaks of tough or
unacceptable overall palatability.
Research was also conducted in Europe to study the capability of CVS for grading
carcasses according to the official EUROP scheme (EC 1208/1981). Soennichsen et al.
(2005, 2006) applied image analysis to grade 1519 calf carcasses. The CVS predicted
accurately the fat class on a 15-point scale (EUROP grid with subclasses); however, its
accuracy was poorer for conformation, suggesting that a special scale was needed for
calf carcasses. The system also predicted the weight of retail cuts with high accuracy,
with the residual estimation error of primal cuts and retail cuts being 1.4–5.2 percent.
Prediction of the total and the saleable meat weight was also very accurate, with residual
estimation errors of 2.1 and 3.5 percent, respectively.
2.3 Predicting carcass composition
Early studies using image features to predict beef carcass composition such as
lean, fat, and bone can be traced back to the late 1990s. Karnuah et al. (1999,
2001) established equations for predicting cattle-carcass percentages of total lean, total
fat, and total bone composition, using data collected from 73 Japanese Black steers
slaughtered at 27–40 months of age. The composition data were fitted into various
multiple linear regression equations. Correlation coefficients between predicted values and actual values obtained on dissection for weight of lean, fat, and bone were
0.70–0.72, whereas those for percentages of lean, fat, and bone were much lower.
Anada and Sasaki (1992) and Anada et al. (1993) analyzed the fifth/sixth rib cross-section of beef carcasses to measure the areas of lean, fat and bone, and their total.
The dimensions of the longissimus and trapezius muscles, and the distance between
the centers of gravity of these two muscles were also measured. A stepwise regression
analysis was used to select the best regression equations to predict carcass composition
(as weight and percentage of lean, fat, and bone). The total area or fat area was the best
predictor for percentage lean; percentage fat area gave the best prediction for fat or bone
percentage; while the distance between the centers of gravity of the two muscles was
an important predictor for weight of fat and bone. Karnuah et al. (1994) also measured
beef composition using fifth/sixth rib cross-sections. Images from 28 fattened cattle
were captured to measure individual muscle area, circumference, length of long and
short axes, total cross-sectional area, total muscle area, total fat area, total bone area,
eccentricity, direction of long axis, and distance between the centers of gravity of any
two muscles. Results indicated that excellent repeatability measurements were achieved
in using the eccentricity and direction of long axis, total area, total muscle area, total
fat area, and total bone area of the carcass cross-section for the prediction of carcass
composition.
Images of cross-sections cut at other locations in beef carcasses were also used to
predict composition. Nade et al. (2001) used images from cross-sectional ribloin cut
between the sixth and seventh rib bones of 24 Japanese Black cattle (steer) carcasses.
Predictive equations were derived for estimating composition parameters such as total
area, muscle area, fat area, ratio of muscle to fat, and shape of the longissimus and
trapezius muscles. The actual weight and ratio of muscle to fat were determined through
physical dissection from the left side of the carcass. The ribeye area, ratio of muscle to
total area, and carcass weight were used to predict the muscle weight. The ribeye area,
ratio of fat to total area, and carcass weight were used to estimate the amount of muscle
in the carcass, the fat weight and the amount of fat in the carcass. Results indicated that
the ribeye area, the ratio of fat to total area, and the carcass weight are important parameters for carcass composition prediction. Lu and Tan (2004) predicted lean yield by
measuring the twelfth rib surface of cattle carcasses and compared the CVS results with
USDA yield characteristics and USDA yield grades. Different multiple linear regression models were developed for data from each set of measurements on 241 cattle carcasses, and the models were found to be suitable for lean yield prediction. Results also
indicated that percentage of ribeye area was a more useful predictor of lean yield than
fat thickness. Marbling count and marbling area density were also useful for prediction.
However, prediction of lean percentage was not as accurate as that of lean weight.
2.4 Predicting tenderness
As discussed previously, marbling and color are two key grading factors in beef quality, especially for young animals such as those classified as “A” or “B” maturity in the USDA
system. However, these two quality factors are weak predictors of meat texture attributes
such as tenderness. Meat texture is a measure of the fineness of a cut surface, which
is influenced by the size of the muscle fibers and/or muscle-fiber bundles visible on a
transversely cut surface. The texture of muscles can vary from a velvety, light structure
to a coarse, rough structure, and may also be influenced by the amount of connective tissue and marbling. Therefore, meat surface texture can be a good indicator of tenderness.
Research on predicting meat texture is the most challenging of computer vision
applications for meat quality evaluation. Fortunately, meat texture can be related to
image texture, which is an important characteristic of images. Image texture, which usually
refers to the fineness, coarseness, smoothness, granulation, randomness or lineation
of images, or how mottled, irregular or hummocky images are, can be quantitatively
evaluated (Haralick, 1973). For image texture analysis, a variety of techniques are
available (Zheng et al., 2006b, 2007), including statistical, structural and spectral
approaches (Du and Sun, 2004). Among them, the statistical approach is most commonly used, with methods including the gray level co-occurrence matrix (GLCM), the gray
level difference method (GLDM) and the gray level run length matrix (GLRM) (Du and
Sun, 2004, 2006c). Therefore, in order to find better quantitative predictors for
meat texture attributes, computer vision has been investigated as a tool; for example,
Sun and co-workers (Du and Sun, 2005b, 2006a, 2006b; Zheng et al., 2006b) have been
using computer vision to predict eating quality attributes of cooked meats. For fresh
meat cuts, Li et al. (1999, 2001) characterized muscle texture by image processing,
and used color, marbling, and textural features to predict beef tenderness measured by
traditional methods such as Warner-Bratzler shear forces and sensory evaluations.
2.4.1 Correlation with Warner-Bratzler shear force
In the experiments performed by Li et al. (1999), 265 carcasses, all of “A” maturity,
were selected to differ in USDA quality grades in a commercial packing plant. A rib
section (posterior end) was removed and vacuum-packaged; this was later cut into
2.54-cm thick steaks and cooked for Warner-Bratzler shear-force measurements. Eight
cores of 1.27-cm diameter from each cooked steak were removed parallel to the muscle
fibers, and sheared with a Warner-Bratzler instrument. The shear force varied from
12.25 to 51.35 N, but the average data were used in analysis.
Images of the ribbed surfaces were captured in the plant immediately following
quality grading, and segmented into muscle, fat and marbling. Image textural features,
based on pixel value, run length, and spatial dependence, were computed as predictors
of tenderness, as the image texture of the beef muscle surface is directly or indirectly
related to tenderness. Figure 5.4 shows differences in image textures of beef samples with varying tenderness. These differences can be measured by image processing
(Li et al., 1999).
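Two of the texture descriptions mentioned here, run length and spatial dependence, can be sketched in a few lines. In this illustration a pixel run is taken to be a sequence of consecutive pixels in one row whose values stay within a tolerance of the first pixel of the run, and spatial dependence is summarized by a gray level co-occurrence matrix for one pixel offset. The function names, the tolerance rule, and the choice of contrast as the example statistic are assumptions, not details of the cited work.

```python
def pixel_runs(row, tolerance=0):
    """Lengths of runs of consecutive pixels with the same or close values."""
    runs = []
    if not row:
        return runs
    start_value = row[0]
    length = 1
    for value in row[1:]:
        if abs(value - start_value) <= tolerance:
            length += 1
        else:
            runs.append(length)
            start_value = value
            length = 1
    runs.append(length)
    return runs


def cooccurrence_matrix(image, levels, dx=1, dy=0):
    """Count how often gray level i is followed by level j at offset (dx, dy).

    The matrix is not symmetrized; each ordered pixel pair is counted once.
    """
    matrix = [[0] * levels for _ in range(levels)]
    height, width = len(image), len(image[0])
    for y in range(height):
        for x in range(width):
            x2, y2 = x + dx, y + dy
            if 0 <= x2 < width and 0 <= y2 < height:
                matrix[image[y][x]][image[y2][x2]] += 1
    return matrix


def contrast(matrix):
    """Contrast statistic: sum of (i - j)^2 weighted by normalized counts."""
    total = sum(sum(row) for row in matrix)
    if total == 0:
        raise ValueError("co-occurrence matrix is empty")
    return sum((i - j) ** 2 * count
               for i, row in enumerate(matrix)
               for j, count in enumerate(row)) / total


# A tiny image with three gray levels (0, 1, 2).
image = [[0, 0, 1],
         [1, 1, 1],
         [2, 2, 0]]
glcm = cooccurrence_matrix(image, levels=3)
```

A fine, uniform texture gives long runs and a co-occurrence matrix concentrated near the diagonal (low contrast), whereas a coarse or mottled texture spreads the counts away from the diagonal.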
A pixel run is defined as a set of connected pixels in the same row having the same or
close intensity values. Textural features can be obtained from the statistics of the pixel
runs. Pixel value spatial dependence can be described by the so-called co-occurrence
matrix. A total of 14 statistical measures (Haralick et al., 1973) were employed to extract
textural features from this matrix. The textural features having the highest correlation
with shear force were selected for subsequent analyses. Principal component regression
(PCR) was performed to test the improvement in shear-force prediction after the textural
features were added to the color and marbling features. PCR consists of principal
component analysis (PCA) followed by multiple linear regression (MLR); partial
least squares (PLS) regression was also applied. The SAS stepwise procedure (Anon, 2000) was performed to
select the variables significant for shear-force prediction. Classification analysis was
also used to classify the beef samples. The prediction of shear-force values involves
comparison among three groups of quality predictors: color and marbling scores graded
by USDA official graders; color and marbling features obtained from images; and
color, marbling, and textural features from images. When the first group of features
was used to predict shear force, the prediction was very poor (r² < 0.05); however,
when the second group of features was used, the predictions were slightly improved
to r² = 0.16, where r² is the coefficient of determination. The last group of features yielded
the best prediction results, with r² being 0.18 using PCR and 0.34 using partial
least squares (PLS). However, the prediction results were still poor. The classification
procedure was thus refined as follows. Based on the shear-force
values (≤1.71 kg, 1.71 kg–3.09 kg, and ≥ 3.09 kg), the beef samples were segregated
into three categories. Among them, 198 samples were used as calibration data and
45 samples were used as test data. The SAS Discriminant procedure (Anon, 2000)
with a linear discriminant function was used to classify the samples into a category.
The calibration samples could be classified with 76 percent accuracy, and the test
samples with 77 percent accuracy.
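The two kinds of figures quoted in this paragraph, a coefficient of determination for the regression and an accuracy for the three-category classification, can be computed as sketched below. The category thresholds are those given in the text; the function names and the handling of values exactly at a threshold are assumptions, and the discriminant function that produces the predicted categories is not shown.

```python
# Shear-force thresholds in kg, as given in the text.
TENDER_MAX_KG = 1.71
TOUGH_MIN_KG = 3.09


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0.0:
        raise ValueError("observed values have no variation")
    return 1.0 - ss_res / ss_tot


def shear_category(shear_kg):
    """Return 0, 1 or 2 for the low, middle and high shear-force groups."""
    if shear_kg <= TENDER_MAX_KG:
        return 0
    if shear_kg < TOUGH_MIN_KG:
        return 1
    return 2


def classification_accuracy(true_labels, predicted_labels):
    """Fraction of samples assigned to their true category."""
    hits = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return hits / len(true_labels)
```

An r² of 0.34 therefore means that about one third of the variation in measured shear force is explained by the model, which is why grouping samples into three broad categories gave a more usable result than predicting the value itself.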
Figure 5.4 These saturation images of two samples of different tenderness exhibit different image textures.
The upper sample is less tender.
The above results show the possibilities of using combined color, marbling, and
textural features to improve the models for shear-force prediction; however, the prediction accuracy is still far from satisfactory. Therefore, color, marbling, and muscle
image textures may still not contain sufficient information to define cooked-meat
shear force. Nevertheless, the inclusion of textural features brought about significant
improvement. Therefore, the image texture of muscles is at least a significant indicator
of the mechanical properties of beef (Du and Sun, 2006a).
Wheeler et al. (2002) compared the accuracy of three objective systems (a portable
prototype BeefCam image analysis system, slice shear-force values, and colorimetry) for identifying beef cuts which can be guaranteed as being tender. Longissimus
muscles at the twelfth rib from 708 carcasses were assessed. Steaks were cooked
for sensory analysis and Warner-Bratzler shear-force determination. As indicated by
Li et al. (1999), color features alone (either by BeefCam or colorimetry) were inadequate in predicting tenderness, and slice shear values remained the accurate method
for identifying certifiably tender beef. However, if a BeefCam module was integrated
with a CVS (Vote et al., 2003), the CVS/BeefCam reading for longissimus muscle
areas correlated well with shear values. CVS/BeefCam loin color values were effective
in classifying carcasses into groups which produced steaks of varying shear values,
whereas CVS/BeefCam fat color values were generally ineffective.
2.4.2 Correlation with sensory tenderness
Image features of beef samples have been investigated to correlate with tenderness
as determined by sensory analysis (Tan, 2004). In the study conducted by Li et al.
(1999), beef samples were obtained from the short loins of pasture-finished steers and
feedlot-finished steers of the same maturity grade. Two sets (97 pieces in each set)
of short strip loins were used: one for sensory tenderness evaluation performed by a
trained ten-person panel, and the other for image analysis.
Images of the beef samples were acquired individually for all the samples under the
same conditions. The acquired images were processed. Features were extracted, and
37 of them were selected as predictors for further analysis. Of the 97 beef samples,
72 formed a training subset and the remaining 25 samples were used as a test subset.
PCR and PLS were performed to test the improvement in tenderness prediction resulting
from adding texture features. The SAS stepwise procedure (Anon, 2000) was then
performed to select the significant variables.
PCR was applied to all the samples, and results indicated that the r² was increased
to 0.72 after adding the texture features, as compared with 0.30 for using color and
marbling features alone. In the PLS analysis, the first 14 factors explaining most of
the variations were used for regression. When only the color and marbling features were used,
the r² for the training data set and test data set were 0.35 and 0.17 respectively, which
were increased to 0.70 and 0.62 respectively after adding texture features. Similar to
shear-force prediction, the above improvements confirmed the significant contribution
made by the textural features to beef tenderness prediction.
A neural network (NN) model with one hidden layer was also developed. The
14 factors from the PLS analysis were used as inputs, and the tenderness scores as
the output. The backpropagation algorithm was employed to train the neural network,
and the test data subset was used to test the model. The r² for the prediction by NN
was 0.70, which is similar to those from PCR and PLS (Li et al., 1999).
In a further study carried out by Li et al. (2001), samples from 59 crossbred
steers were used, which were divided into “tough” (tenderness score <8) and “tender”
(otherwise) groups. Color images (480 × 512 pixels) were captured from an area of
about 250 mm2 of the LD muscle for each sample, and each of these was then divided
into several sub-images of 64 × 64 pixels. Of these smaller images, 45 from the tough
group (tenderness scores ranged from 4.17 to 6.67) and 45 from the tender group
(tenderness scores from 8.54 to 11.99) were randomly selected for further analysis. A
wavelet-based method was developed to decompose each texture image into 35 different rectangular textural primitives of different sizes (Li et al., 1999). The degree
of presence of each primitive type was measured with the percentage of the image
area occupied by the primitive, which was referred to as the primitive fraction. Variance and correlation analysis was used to reduce the primitive fractions from 35 to 10.
The ten primitive fractions were significantly different between the tough and tender
groups, and were not significantly correlated to each other. Multivariate classification based on Fisher’s linear discriminant (Anon, 2000) was applied to these 10
primitive fractions with the leave-one-out scheme. Of the 90 samples consisting of
45 tough and 45 tender samples, 89 were used to train Fisher’s linear discriminant and the remaining one for testing. This procedure was repeated rotationally until
each of the 90 samples was tested. Results indicated that 38 tough and 37 tender samples were correctly classified, giving an overall correct classification rate of
83.3 percent. However, if a neural-network classifier was trained and used in the same
rotational manner, a lower overall correct classification rate (only 82.3 percent) was obtained.
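The leave-one-out scheme with a two-class Fisher linear discriminant can be sketched in plain Python as follows. This is a generic form of the method, assuming two classes labeled 0 and 1, a decision threshold midway between the projected class means, and a non-singular within-class scatter matrix; all names are hypothetical and the toy data stand in for the primitive fractions. The reported rate itself follows from simple arithmetic: (38 + 37) / 90 = 83.3 percent.

```python
def mean_vector(rows):
    return [sum(col) / len(rows) for col in zip(*rows)]


def scatter_matrix(rows, mean):
    d = len(mean)
    s = [[0.0] * d for _ in range(d)]
    for row in rows:
        diff = [x - m for x, m in zip(row, mean)]
        for i in range(d):
            for j in range(d):
                s[i][j] += diff[i] * diff[j]
    return s


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular within-class scatter matrix")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def fit_fisher(class0, class1):
    """Return (w, threshold) with w = Sw^-1 (m1 - m0)."""
    m0, m1 = mean_vector(class0), mean_vector(class1)
    s0, s1 = scatter_matrix(class0, m0), scatter_matrix(class1, m1)
    sw = [[a + b for a, b in zip(r0, r1)] for r0, r1 in zip(s0, s1)]
    w = solve(sw, [b - a for a, b in zip(m0, m1)])
    return w, 0.5 * (dot(w, m0) + dot(w, m1))


def leave_one_out_rate(samples, labels):
    """Train on all samples but one, test on the held-out one, and rotate."""
    correct = 0
    for i, (held_out, true_label) in enumerate(zip(samples, labels)):
        rest = [(s, l) for j, (s, l) in enumerate(zip(samples, labels)) if j != i]
        w, threshold = fit_fisher([s for s, l in rest if l == 0],
                                  [s for s, l in rest if l == 1])
        predicted = 1 if dot(w, held_out) > threshold else 0
        correct += predicted == true_label
    return correct / len(samples)


# Toy two-feature data standing in for primitive fractions (hypothetical values).
samples = [[0, 0], [1, 0], [0, 1], [1, 1], [5, 5], [6, 5], [5, 6], [6, 6]]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
rate = leave_one_out_rate(samples, labels)
```

Because each sample is classified by a discriminant that never saw it during training, the resulting rate is a less optimistic estimate than one computed on the training data.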
3 Quality evaluation of pork
3.1 Characterizing color and marbling attributes
As discussed previously, color and marbling are the two main quality attributes in meat.
Color perception plays a major role in consumer evaluation of meat quality. Consumers
need to be satisfied with the visual quality before purchase. Computer vision has been
extensively tested in predicting the quality attributes of beef muscles; however, research
on pork quality evaluation mainly focuses on the assessment of pork color (Lu et al.,
1997, 2000; Tan et al., 2000).
Early work on the assessment of pork color was conducted by Lu et al. (1997,
2000). Forty-four pork loins were randomly picked and cut at the tenth rib, and the
muscle color was evaluated by a trained seven-member sensory panel. The color was
rated using the following five-point scale: 1 = pale-purplish gray, 2 = grayish pink,
3 = reddish pink, 4 = purplish red, and 5 = dark purplish red. After sensory analysis,
images were immediately captured under the same lighting conditions, and segmented
into background, muscle, and fat (Lu et al., 2000). Color-image features, including
the means (µR, µG, and µB) and standard deviations (σR, σG, and σB) of the red, green,
and blue values of the segmented loin muscle areas, were extracted and used
to predict sensory color scores with both statistical models and a neural network.
The partial least squares (PLS) technique was used to determine the latent variables for
further analysis by multiple linear regression (MLR) and a neural network (NN) trained
with the backpropagation algorithm. The correlation coefficient between the predicted
and the sensory color scores was 0.75 for the NN, with 93.2 percent of the samples
having a prediction error lower than 0.6; for the statistical model, the correlation
coefficient was 0.52 and 84.1 percent of the samples had a prediction error of 0.6
or lower, which was considered negligible from a practical point of view. Therefore,
image processing with NN is an effective tool for predicting consumer responses to
fresh pork color. Later, Tan et al. (2000) also used computer vision to predict color
scores visually assessed by a five-member untrained panel for fresh loin chops. A total
of 203 pork loin chops in three separate experiments were evaluated. After training
with image features from pork classified by the panel, the computer vision system
(CVS) could predict the color scores of loin chops accurately, with up to 86 percent agreement with
visually assessed scores. Therefore, by combining with an effective tracking system,
CVS could sort retail meat cuts into uniform quality/color classes before shipping, and
operate at on-line speeds with accuracy and repeatability similar to that of stationary
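The color-image features used by Lu et al. (2000), and the way in which prediction quality was summarized, can be sketched as follows. This is a minimal illustration only: the pixel values, the mask, and the function names are hypothetical, and the tolerance is applied as "0.6 or lower", which is an assumption about how the boundary case was treated.

```python
from statistics import mean, pstdev


def color_features(pixels, mask):
    """Means and standard deviations of R, G and B over the muscle pixels.

    pixels: list of (R, G, B) tuples; mask: 1 for muscle, 0 otherwise.
    """
    muscle = [p for p, m in zip(pixels, mask) if m]
    features = {}
    for idx, channel in enumerate("RGB"):
        values = [p[idx] for p in muscle]
        features["mu_" + channel] = mean(values)
        features["sigma_" + channel] = pstdev(values)
    return features


def share_within(predicted, sensory, tol=0.6):
    """Percentage of samples whose absolute prediction error is <= tol."""
    hits = sum(1 for p, s in zip(predicted, sensory) if abs(p - s) <= tol)
    return 100.0 * hits / len(sensory)


# Hypothetical segmented image: three muscle pixels and one background pixel
pixels = [(200, 90, 100), (180, 80, 90), (20, 20, 20), (220, 100, 110)]
mask = [1, 1, 0, 1]
features = color_features(pixels, mask)

# Hypothetical predicted and sensory color scores on the five-point scale
agreement = share_within([3.1, 2.0, 4.5, 1.2], [3.0, 3.0, 4.0, 1.0])
print(features, agreement)
```

If all 44 loins were scored, the reported figures of 93.2 percent and 84.1 percent would correspond to 41 and 37 samples, respectively.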
Unlike beef, which offers a better color contrast, pork marbling assessment by CVS
is a more challenging task owing to the low contrast of pork coloration. A couple of
preliminary studies have been performed, but the characteristics of marbling fat were
not determined in these studies (Murray, 1996; Scholz et al., 1995a, 1995b). In the
work of Scholz et al. (1995a, 1995b), a three-chip RGB camera and an S-VHS camera
were each used to acquire the images of pig (swine) carcasses, and degrees of marbling
of LD muscle were assessed. The three-chip RGB camera gave the best results, with
a degree of certainty of 73 percent in 130 carcasses; the S-VHS camera gave
a degree of certainty of only 46 percent in 21 carcasses, which is still better than
subjective visual assessment.
Recently, Faucitano et al. (2005) developed a computer vision system for quantitative
description of marbling fat and studied its capability. Intramuscular fat (IMF) content
and pork tenderness varied among three lines of pigs differing in their marbling and IMF
patterns. A commercial image analysis program, “ImageC,” was used. The pork image
was imported as a tif file, converted to gray scale using the Micro “LinCo 2” in ImageC,
and a threshold function was performed. A median filter (5 × 5) was then applied
followed by a green-channel selection on the original color image. As the variance of
the green channel was the largest for fat in meat samples, the green component was used
for binarization. The procedure described above was again performed, resulting in a
gray-scale image of the filtered original image (Figure 5.5). Thresholding segmentation
was used to select the fat flecks, and finally the marbling characteristics (such as size,
number, and area of marbling flecks and their proportions on the muscle surface) were
calculated. Results on 60 pork chops (longissimus) showed significant correlation
between marbling characteristics obtained by image analysis and both intramuscular
fat content and shear-force values, indicating the possibility of evaluating the contribution
of marbling to variability in the eating quality of pork.
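The general sequence described above (smoothing, thresholding of the green channel, and measurement of the fat flecks) can be sketched in a few lines. This is not the ImageC procedure itself: the 3 × 3 median window (the study used 5 × 5), the threshold value, the 4-connectivity rule, and all function names are assumptions made for a small, self-contained illustration.

```python
from statistics import median


def median_filter(img, k=3):
    """k x k median filter; windows are clipped at the image border."""
    h, w, r = len(img), len(img[0]), k // 2
    out = [row[:] for row in img]
    for y in range(h):
        for x in range(w):
            window = [img[j][i]
                      for j in range(max(0, y - r), min(h, y + r + 1))
                      for i in range(max(0, x - r), min(w, x + r + 1))]
            out[y][x] = median(window)
    return out


def segment(green, threshold, k=3):
    """Smooth the green channel, then mark pixels brighter than threshold."""
    smooth = median_filter(green, k)
    return [[1 if v > threshold else 0 for v in row] for row in smooth]


def fleck_sizes(binary):
    """Pixel count of each 4-connected fat fleck in a binary image."""
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    sizes = []
    for y in range(h):
        for x in range(w):
            if binary[y][x] and not seen[y][x]:
                stack, size = [(y, x)], 0
                seen[y][x] = True
                while stack:  # flood fill of one fleck
                    cy, cx = stack.pop()
                    size += 1
                    for ny, nx in ((cy + 1, cx), (cy - 1, cx),
                                   (cy, cx + 1), (cy, cx - 1)):
                        if (0 <= ny < h and 0 <= nx < w
                                and binary[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            stack.append((ny, nx))
                sizes.append(size)
    return sizes


def marbling_stats(binary):
    """Number, sizes and area percentage of flecks on the muscle surface."""
    sizes = fleck_sizes(binary)
    total = len(binary) * len(binary[0])
    return {"count": len(sizes), "sizes": sizes,
            "fat_area_pct": 100.0 * sum(sizes) / total}


# Hypothetical binary image (1 = fat) containing three flecks
binary = [[0, 1, 1, 0, 0],
          [0, 1, 0, 0, 1],
          [0, 0, 0, 0, 1],
          [1, 0, 0, 0, 0]]
stats = marbling_stats(binary)

# Hypothetical green channel: one bright block plus an isolated noisy pixel
green = [[30] * 6 for _ in range(6)]
for y in range(1, 4):
    for x in range(1, 4):
        green[y][x] = 220
green[5][5] = 220
print(stats, marbling_stats(segment(green, threshold=128)))
```

In this sketch the median filter removes the isolated bright pixel before thresholding, which is the usual reason for smoothing ahead of fleck counting.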
3.2 Predicting carcass grades
The overall objective of pork carcass grading is to identify the true commercial
value of a carcass by segregating carcasses based on lean content. In Canada, the
grading system has elements of both classification and grading schemes. The classification scheme objectively describes the carcass in terms of estimated yield and
carcass weight, and the grading scheme assigns a yield and weight class (Fortin,
1989). In value-based systems such as the Canadian system, the negative relationship between the subcutaneous fat thickness in the carcass and the amount of lean
in the carcass is used to estimate the lean content of the carcass. The fat thickness
is measured at a single site on the carcass mid-line at two locations (maximum loin
and maximum shoulder). However, the accuracy of single-site technology is limited
(Fisher, 1990).
Figure 5.5 Pork sample: (a) formaldehyde-treated, after cutting; (b) following oil-red treatment;
(c) after rinsing; (d) image after the CIA binarization process (Faucitano et al., 2005).
In order to improve accuracy, new concepts have been introduced. For example,
Sather et al. (1996) studied the use of ultrasound imaging technology for predicting
carcass quality in terms of lean meat yield. Meanwhile, a Danish company developed a
digitized three-dimensional ultrasound system (AutoFom) which generates a full scan
of the carcass (Busk et al., 1999). Using AutoFom, Busk et al. (1999) reported a
residual mean square error of 1.84 for estimating lean meat percentage.
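The negative relationship between fat thickness and lean content, and a residual error of the kind reported for such predictions, can be illustrated with a simple least-squares line. The data below are hypothetical, and the residual error is computed here as the root of the residual mean square with n − 2 degrees of freedom, which is an assumption; the studies cited may define their error statistic differently.

```python
from math import sqrt


def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y = a + b x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def residual_error(x, y, slope, intercept):
    """Root of the residual mean square, using n - 2 degrees of freedom."""
    ss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    return sqrt(ss / (len(x) - 2))


# Hypothetical carcasses: fat thickness (mm) and lean meat percentage
fat_mm = [12, 15, 18, 21, 24]
lean_pct = [62.0, 60.5, 58.0, 56.5, 54.0]

slope, intercept = fit_line(fat_mm, lean_pct)
error = residual_error(fat_mm, lean_pct, slope, intercept)
print(slope, intercept, error)
```

A negative slope expresses the relationship exploited by value-based grading: the thicker the subcutaneous fat, the lower the estimated lean content.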
On the other hand, Soennichsen et al. (2001, 2002) conducted investigations regarding using a computer vision system (CVS) for assessment of the commercial quality
of pig (swine) carcasses in small-scale slaughterhouses in Germany. Performance of
the CVS apparatus alone was compared with that of CVS plus ultrasonic probe data
and the AutoFom carcass grading instrument. Relative estimation error for weight of
cuts was <5 percent for CVS in all cases except the loin (approx. 7 percent), which
was slightly improved by the addition of ultrasonic probe data. However, AutoFom
was slightly more accurate than the CVS techniques for most cuts. When the CVS was
applied to evaluate the lean meat content of the belly cut from half carcasses, it was
necessary to utilize ultrasonic probe data in order to achieve a low relative estimation
error comparable with that of the AutoFom instrument.
Figure 5.6 The two components of the Lacombe Computer Vision System for grading pig carcasses
consist of a computer vision system to provide two- and three-dimensional measurements of a carcass, and
an ultrasound system to provide fat thickness, depth and area of the loin (m. longissimus dorsi) (Fortin et al., 2003).
In order to improve grading accuracy further, Fortin et al. (2003) focused on an
approach integrating ultrasound with image analysis, which was significantly different
from the approach of scanning the whole carcass. They developed a CVS prototype
containing two components: video imaging to capture images of the carcass, and
ultrasound imaging to scan a cross-section of the loin muscle. Figure 5.6 shows the
set-up of such a system (Fortin et al., 2003). The system was used at a commercial
abattoir to grade 241 carcasses (114 barrows and 127 gilts), which fell into three
weight categories (<80.9, 80.9–89.0, and >89 kg) and three fat-thickness categories
(<15, 15–21, and >21 mm). Saleable pork yields were determined from full cut-out
values. Linear, two-dimensional, angular, and curvature measurements by the CVS
provided an accurate and almost instantaneous assessment of conformation. Muscle
area and fat depth 7 cm off the mid-line, measured by ultrasound at three-fourths of the
last rib, alongside two- and three-dimensional measurements of the lateral side of the
carcass, gave the best estimations of saleable pork yields. Results suggest that the CVS
offers a novel approach to grading swine carcasses and appears to have commercial
3.3 Predicting carcass composition
In addition to using computer vision systems to predict pork color and marbling, and
grade carcasses, such systems have also been investigated as a tool to predict the
composition of pork carcasses (Kuchida et al., 1991; Schwerdtfeger et al., 1993; Branscheid et al., 1995). Kuchida et al. (1991) described a procedure for estimating the
composition (fat and protein percentage) of pork, and found that for fresh pork both
the fat and protein percentages were significantly correlated with the color features
from RGB space and CIE-Lab space. For all except the “a” value, correlations were
slightly higher for fat percentage than for protein percentage.
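Correlations of this kind between a color feature and a chemical component can be computed with the Pearson coefficient. The sketch below uses hypothetical lightness and fat-percentage values; it does not reproduce the data or the exact statistical procedure of Kuchida et al. (1991).

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical samples: mean lightness of the image and fat percentage
lightness = [48.0, 50.5, 52.0, 55.0, 57.5]
fat_pct = [1.8, 2.4, 2.9, 3.6, 4.3]

r_fat = pearson_r(lightness, fat_pct)
print(r_fat)
```

Repeating the calculation for each color feature and each component gives a table of correlations of the type that study compared for fat and protein.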
Branscheid et al. (1996) and Branscheid and Dobrowolski (1996) conducted studies
using CVS data for the prediction of tissue composition of a carcass and major cuts, the
lean meat weight, total weight, and color of these cuts. Experiments were conducted
on 109 carcasses, and the cross-section at the sixth/seventh thoracic vertebra level was
examined. Predictions of weight of major cuts and weight of lean meat were mostly
accurate, while predictions of tissue composition of the major cuts (especially the belly)
were also good.
Branscheid et al. (2003, 2004) compared a CVS alone or in combination
with grading instruments such as AutoFom for the estimation of carcass composition,
weight of cuts, and lean meat percentage in the carcass and the belly cut. A total of
143 swine carcasses (representing two commercial crossbred lines and carcass weights
of 75–115 kg) were studied. For all the carcass parameters studied, estimations using
the CVS were fully comparable to those using grading instruments. The combination of
CVS data with grading instrument data improved prediction accuracy for all parameters
studied. A similar CVS system was also studied by McClure et al. (2003), to predict
swine carcass composition and pork cutability from 278 swine carcasses differing in
gender. Hot carcass weight and percentage carcass lean were analyzed by the imaging
system prior to prediction. Results showed that when hot carcass weight was included
as a predictive factor, the CVS was able to predict the weight of total saleable product,
fat corrected lean, bone-in ham, bone-in loin, loin lean, and belly with good levels
of accuracy. New prediction models and equations were also developed to give some
further improvement in estimation accuracy and precision for total saleable product
yield and fat-corrected lean. However, performance was still no better than that with
existing instrumental technology.
4 Quality evaluation of lamb
Unlike beef and pork, research on using computer vision systems to evaluate the quality
of lamb is quite limited. A study regarding prediction of the saleable meat yield of
sheep carcasses using image analysis based on carcass shape and color data collected
from 1211 lambs of known gender, breed type, and carcass weight has been reported
(Stanford et al., 1998).
Like beef and pork, the use of computer vision systems for the grading of lamb
carcasses is a major research area, and commercial systems especially designed for the
classification of lamb carcasses have been developed. In the USA, a lamb vision system
(LVS) has been developed. Brady et al. (2003) assessed the system, based on 246 lamb
carcasses. Hot carcass weights (HCW) were recorded and carcasses were assessed
by official graders. Using LVS + HCW, 65–87 percent and 72–86 percent of the variation
in weight were predicted for boneless and bone-in cuts, respectively. LVS + HCW
estimates accounted for 60–62 percent of variation in saleable meat yields, which was
improved by including LD muscle area and/or adjusted fat thickness in the output
models. On the whole, the accuracy of the LVS + HCW system was better than that of
yield grades in predicting red-meat yields from the carcasses. In Australia, a VIAScan
lamb vision system has also been developed. Several studies have been conducted
to assess the system. Hopkins (1996) showed the possibility of using VIAScan to
predict muscularity, but the predictive models were not robust enough for use in other
populations of lamb carcasses. Stanford et al. (1998) included principal component
analysis (PCA) to provide a more robust model. Recently, Hopkins et al. (2004) used
the VIAScan system to predict lean meat yields of 360 sheep carcasses consisting of
ewes and wethers, and illustrated the potential for its automatic prediction of meat
yield, fat depth, shape, and individual cut yield.
Regarding predicting eating quality, as previously discussed, most studies have been
conducted on beef, with approaches used including GLCM (Shiranita et al., 1998;
Li et al., 1999), GLRM (Li et al., 1999; 2001), the fractal approach (Ballerini and
Bocchi, 2001), and wavelets (Li et al., 2001). Work on predicting lamb texture is very
recent. Chandraratne et al. (2006a) investigated the effectiveness of geometric and
textural features extracted from lamb-chop images in predicting lamb carcass grades.
In this study, 12 geometric and 90 textural (co-occurrence) features were extracted
from each of the acquired images. After dimensionality reduction, six feature sets were
generated and used for classification, which included three sets of principal component
(PC) scores and three sets of reduced features. Results indicated that 66.3 percent and
76.9 percent overall classification, based on 6 PC scores (geometric) and 14 PC scores
(geometric and texture) respectively, could be achieved. Based on 6 geometric and 14
(geometric and textural) reduced features, it was also possible to achieve 64.4 percent
and 79.4 percent overall classification of lamb carcasses, respectively. If carcass weight
was included as a parameter, the overall classification accuracy of both feature sets was
increased to 85 percent. Chandraratne et al. (2006b) further investigated the usefulness
of raw-meat surface characteristics (geometric and texture) in predicting lamb tenderness. Besides the 12 geometric features measured from each of the acquired lamb-chop
images, further textural features (including 36 difference histogram, 90 co-occurrence
and 10 run length textural features) were also extracted. After dimensionality reduction, four feature sets consisting of six geometric, four difference histogram, eight
co-occurrence and four run length features were generated, and these were utilized
individually and in different combinations to predict cooked lamb tenderness by using
neural network, linear and non-linear regression analyses. The neural network analysis
produced the highest coefficient of determination (r²) of 0.746 using 14 (geometric and co-occurrence) features, and the non-linear regression analysis produced the
highest r² of 0.602 using 22 (geometric, co-occurrence, difference histogram, and
run length) features. The linear regression analysis was least successful. These studies
show the predictive potential of combining image analysis with texture analysis for lamb
grade prediction.
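The co-occurrence features referred to above are derived from a gray-level co-occurrence matrix (GLCM). A minimal sketch of how such a matrix and two widely used statistics can be computed is given below. The choice of contrast and energy, the horizontal offset of one pixel, the non-symmetric counting, and the four-level test image are assumptions for illustration; the 90 co-occurrence features used by Chandraratne et al. are not listed here.

```python
def glcm(img, levels, dx=1, dy=0):
    """Normalized co-occurrence matrix for the pixel offset (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    h, w = len(img), len(img[0])
    for y in range(h):
        for x in range(w):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w:
                counts[img[y][x]][img[ny][nx]] += 1
    total = sum(map(sum, counts))
    return [[c / total for c in row] for row in counts]


def contrast(p):
    """Sum of (i - j)^2 p(i, j): large when neighboring levels differ."""
    n = len(p)
    return sum((i - j) ** 2 * p[i][j] for i in range(n) for j in range(n))


def energy(p):
    """Sum of p(i, j)^2: large for uniform or strongly ordered textures."""
    return sum(v * v for row in p for v in row)


# Small test image with four gray levels (0 to 3)
image = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 2, 2, 2],
         [2, 2, 3, 3]]

p = glcm(image, levels=4)
texture = {"contrast": contrast(p), "energy": energy(p)}
print(texture)
```

For this image there are 12 horizontal pixel pairs, giving a contrast of 7/12 and an energy of 1/6. Computing such statistics for several offsets and directions is one way in which a large set of co-occurrence features arises.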
5 Future perspectives
Although quality evaluation of fresh meats, including beef, pork, and lamb, using
computer vision has made excellent progress, challenges still remain; in the meantime,
there are many research opportunities with great potential.
Given the complex nature of meat images, one of the most challenging issues in
computer vision is to develop excellent segmentation algorithms, because no existing
algorithm is totally effective for meat-image segmentation. Segmenting a meat image
into parts of interest, without human intervention, in a reliable and consistent manner, is
a prerequisite to the success of all subsequent operations leading to successful computer
vision-based meat grading. There have been successes in employing multivariate and
non-linear approaches (Gao et al., 1995; Lu and Tan, 1998) that will lead to unsupervised image segmentation algorithms effective for meat-image processing (Lu, 2002).
Unsupervised learning techniques, e.g. clustering and self-organizing maps, appear to
be the key to robust meat-image segmentation. Other than unsupervised techniques,
supervised learning techniques – especially the support vector machine (SVM) – can
also be applied. The SVM is a state-of-the-art learning algorithm that fixes the decision
function based on structural risk minimization instead of minimization of the misclassification on the training set to avoid overfitting problems. SVM performs binary
classification by finding maximal margin hyperplanes in terms of a subset of the input
data (support vectors) between different classes (Du and Sun, 2004). Furthermore, system robustness, real-time capability, sample handling, and standardization are among
the issues that remain to be addressed. System robustness or reliability requires further
in-depth research, as it is still a major challenge to design a system that has sufficient flexibility and adaptability to handle the biological variations in meat products.
However, the issues seem not to be insurmountable, although they do require further
research and development.
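As an illustration of the clustering approach mentioned above, the following sketch separates pixels into two groups with a basic k-means procedure in RGB space. It is a simplified example under stated assumptions: the pixel values are hypothetical, the initial centers are taken from the darkest and brightest pixels so that the result is deterministic, and no spatial information is used. It is not one of the segmentation algorithms of the cited studies.

```python
def dist2(a, b):
    """Squared Euclidean distance between two colors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def nearest(pixel, centers):
    """Index of the cluster center closest to the pixel."""
    return min(range(len(centers)), key=lambda c: dist2(pixel, centers[c]))


def kmeans(pixels, k=2, iters=20):
    """Basic k-means (k >= 2) on (R, G, B) tuples."""
    # Deterministic start: centers spread from darkest to brightest pixel
    ordered = sorted(pixels, key=sum)
    centers = [ordered[i * (len(ordered) - 1) // (k - 1)] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in pixels:
            groups[nearest(p, centers)].append(p)
        updated = [tuple(sum(p[j] for p in g) / len(g) for j in range(3))
                   if g else centers[i]
                   for i, g in enumerate(groups)]
        if updated == centers:  # assignments have stopped changing
            break
        centers = updated
    return centers, [nearest(p, centers) for p in pixels]


# Hypothetical pixels: reddish lean tissue and whitish fat, interleaved
pixels = [(150, 40, 50), (230, 220, 210), (160, 45, 55), (225, 215, 205),
          (155, 50, 52), (235, 225, 220), (148, 42, 48)]

centers, cluster_labels = kmeans(pixels)
print(centers, cluster_labels)
```

No labeled training images are needed, which is the attraction of unsupervised methods; the difficulty in practice is that real meat images rarely separate as cleanly as this example.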
For finding reliable quality indicators for fresh meats, using only conventional
indicators of color, marbling, and maturity is not sufficient to predict eating qualities such as tenderness. Therefore, many research opportunities exist to discover
new measurable fresh-meat characteristics that are predictors of cooked-meat quality.
Image texture is one of the most active research fields (Zheng et al., 2006b). Du and Sun
(2006a) have investigated the correlation of image textural features extracted by five
different methods with the tenderness of cooked pork ham, and have achieved some
success. Recently, a powerful signal filter, the Gabor filter, which analyzes images
by simulating human perception and processing of texture information corresponding
to orientation, spatial position, and periodicity, has been extensively and successfully
applied in textural image segmentation (Zheng et al., 2006b), and therefore the Gabor
filter could possibly be applied to analysis of image textural features for predicting,
grading, and inspecting fresh meat qualities. Furthermore, research can also focus on
capturing images with different wavelengths and even different modalities, which has
the potential to reveal additional quality information.
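The orientation and periodicity selectivity of the Gabor filter can be demonstrated with a short sketch. The kernel below uses a common real-valued form (a Gaussian envelope multiplied by a cosine carrier); the kernel size, wavelength, sigma, aspect ratio, and the synthetic striped image are all assumed values chosen for illustration rather than settings from the cited work.

```python
from math import cos, exp, pi, sin


def gabor_kernel(size, wavelength, theta, sigma, gamma=0.5, psi=0.0):
    """Real Gabor kernel: Gaussian envelope times a cosine carrier."""
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            xr = x * cos(theta) + y * sin(theta)    # along the carrier
            yr = -x * sin(theta) + y * cos(theta)   # across the carrier
            envelope = exp(-(xr * xr + gamma * gamma * yr * yr)
                           / (2.0 * sigma * sigma))
            row.append(envelope * cos(2.0 * pi * xr / wavelength + psi))
        kernel.append(row)
    return kernel


def response_energy(img, kernel):
    """Mean squared filter response over all fully overlapping positions."""
    k = len(kernel)
    h, w = len(img), len(img[0])
    total, count = 0.0, 0
    for y in range(h - k + 1):
        for x in range(w - k + 1):
            r = sum(kernel[j][i] * img[y + j][x + i]
                    for j in range(k) for i in range(k))
            total += r * r
            count += 1
    return total / count


# Synthetic texture: vertical stripes with a period of 4 pixels
stripes = [[cos(2.0 * pi * x / 4.0) for x in range(12)] for _ in range(12)]

matched = response_energy(stripes, gabor_kernel(7, 4.0, 0.0, 2.0))
orthogonal = response_energy(stripes, gabor_kernel(7, 4.0, pi / 2.0, 2.0))
print(matched, orthogonal)
```

The filter tuned to the orientation and period of the stripes responds far more strongly than the one rotated by 90 degrees. A bank of such filters at several orientations and wavelengths is the usual way of building texture features from Gabor responses.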
6 Conclusions
Computer vision technology based on color-image processing and analysis is a useful
tool for the evaluation of fresh meat, including beef, pork, and lamb. The image features
extracted can be used to effectively quantify and characterize quality attributes such as
muscle color, marbling, maturity, and muscle texture, and quality and yield grades and
cooked-beef tenderness can be predicted with satisfactory accuracy. Therefore, computer vision is a promising technology for objective meat quality grading. The extensive
research results published have formed a foundation for further investigation of the ability of computer vision systems to better provide quantitative information that is unobtainable subjectively, leading to the eventual replacement of human graders. Despite the
above progress and successes, many challenging issues still remain and require continued in-depth research – among these, developing effective methodologies for consistent
meat-image segmentation and discovering new quality indicators are the priorities.
References
Anada K, Sasaki Y (1992) Image analysis prediction of beef carcass composition from cross
sections. Animal Science & Technology, 63 (8), 846–854.
Anada K, Sasaki Y, Nakanishi N, Yamazaki T (1993) Image analysis prediction of beef
carcass composition from the cross section around the ribeye muscle in Japanese
Black steers. Animal Science & Technology, 64 (1), 38–44.
Anon (2000) SAS/INSIGHT User’s Guide, Version 8, SAS Institute Inc., Cary, NC, USA.
Ballerini L, Bocchi L (2001) A fractal approach to predict fat content in meat images. In
2nd International Symposium on Image and Signal Processing and Analysis, ISPA
2001, Pula, Croatia, 19–21 June 2001.
Brady AS, Belk KE, LeValley SB, Dalsted NL, Scanga JA, Tatum JD, Smith GC (2003)
An evaluation of the lamb vision system as a predictor of lamb carcass red meat yield
percentage. Journal of Animal Science, 81 (6), 1488–1498.
Branscheid W, Dobrowolski A (1996) Accuracy of video image analysis. Assessment of
joint value and lightness of pork. Fleischwirtschaft, 76 (12), 1228, 1230, 1233–1236,
1238, 1328.
Branscheid W, Dobrowolski A, Hoereth R (1995) Video image analysis. A method for online
determination of the joints’ value in pig carcasses. Fleischwirtschaft, 75 (5), 636–638,
641–642, 671.
Branscheid W, Dobrowolski A, Hoereth R (1996) Video image analysis. A method for the
on-line recording of the cut value of pig carcasses. Fleischwirtschaft, 76 (7), 721–724.
Branscheid W, Hoereth R, Baulain U, Tholen E, Dobrowolski A (2003) Estimation of the
carcass composition based on the combination of the video imaging analysis with other
grading systems. Mitteilungsblatt der Bundesanstalt für Fleischforschung, Kulmbach,
42 (161), 259–265.
Branscheid W, Hoereth R, Baulain U, Tholen E, Dobrowolski A (2004) Estimation of the
carcass composition based on the combination of the video imaging analysis with
other grading systems. Fleischwirtschaft, 84 (2), 98–101.
Brosnan T, Sun D-W (2002) Inspection and grading of agricultural and food products by
computer vision systems – a review. Computers and Electronics in Agriculture, 36
(2–3), 193–213.
Busk H, Olsen EV, Brødum J (1999) Determination of lean meat with the AutoFom
classification system. Meat Science, 52, 307–314.
Cannell RC, Belk KE, Tatum JD, Wise JW, Chapman PL, Scanga JA, Smith GC (2002)
Online evaluation of a commercial video image analysis system (Computer Vision
System) to predict beef carcass red meat yield and for augmenting the assignment of
USDA yield grades. Journal of Animal Science, 80 (5), 1195–1201.
Chandraratne MR, Kulasiri D, Frampton C, Samarasinghe S, Bickerstaffe R (2006a) Prediction of lamb carcass grades using features extracted from lamb chop images. Journal
of Food Engineering, 74 (1), 116–124.
Chandraratne MR, Samarasinghe S, Kulasiri D, Bickerstaffe R (2006b) Prediction of lamb
tenderness using image surface texture features. Journal of Food Engineering, 77 (3),
Chen YR, McDonald TP, Crouse JD (1989) Determining percent intra-muscular fat on
ribeye surface by image processing. 1989 ASAE Annual International Meeting, Paper
No. 893009, ASAE, St Joseph, MI, USA.
Cross HR, Gilliland DA, Durland PR, Seideman S (1983) Beef carcass evaluation by use
of a video image analysis system. Journal of Animal Science, 57 (4), 910–917.
Dasiewicz K, Slowinski M, Maczuga C (2002) Quality of meat from dairy and beef cattle
and the use of computer image analysis for marbling evaluation. Przemysl Spozywczy,
56 (7), 26–28.
Du CJ, Sun D-W (2004) Recent developments in the applications of image processing
techniques for food quality evaluation. Trends in Food Science & Technology, 15 (5),
Du CJ, Sun D-W (2005a) Comparison of three methods for classification of pizza topping
using different color space transformations. Journal of Food Engineering, 68 (3),
Du CJ, Sun D-W (2005b) Correlating shrinkage with yield, water content and texture of pork ham by computer vision. Journal of Food Process Engineering, 28,
Du CJ, Sun D-W (2006a) Correlating image texture features extracted by five different
methods with the tenderness of cooked pork ham: a feasibility study. Transactions of
the ASAE, 49 (2), 441–448.
Du CJ, Sun D-W (2006b) Automatic measurement of pores and porosity in pork ham and
their correlations with processing time, water content and texture. Meat Science, 72
(2), 294–302.
Du CJ, Sun D-W (2006c) Learning techniques used in computer vision for food quality
evaluation: a review. Journal of Food Engineering, 72 (1), 39–55.
Faucitano L, Huff P, Teuscher F, Gariepy C, Wegner J (2005) Application of computer image
analysis to measure pork marbling characteristics. Meat Science, 69 (3), 537–543.
Fisher AV (1990) New approaches to measuring fat in the carcasses of meat animals. In
Reducing Fat in Meat Animals (JD Wood, AV Fisher, eds). London: Elsevier Applied
Science, pp. 243–255.
Fortin A (1989) Electronic grading of pig carcasses: the Canadian experience. In New
techniques in Pig Carcass Evaluation (JF O’Grady, ed.). Proceedings of the European
Association of Animal Production–Symposium of the Commission on Pig Production.
Helsinki: EAAP Publication No 41, pp. 75–85.
Fortin A, Tong AKW, Robertson WM, Zawadski SM, Landry SJ, Robinson DJ, Liu T,
Mockford RJ (2003) A novel approach to grading pork carcasses: computer vision
and ultrasound. Meat Science, 63 (4), 451–462.
Gao X, Tan J, Gerrard DE (1995) Image segmentation in 3-dimensional color space. 1995
ASAE Annual International Meeting, Paper No. 953607, ASAE, St Joseph, MI, USA.
Haralick RM, Shanmugum K, Dinstein I (1973) Textural features for image classification.
IEEE Transactions of Systems, Man, and Cybernetics, SMC-3 (6), 610–621.
Hatem I, Tan J (1998) Determination of skeletal maturity by image processing. 1998 ASAE
Annual International Meeting, Paper No. 983019, ASAE, St Joseph, MI, USA.
Hatem I, Tan J (2000) Cartilage segmentation in vertebra images. 2000 ASAE Annual
International Meeting, Paper No. 003125, ASAE, St Joseph, MI, USA.
Hatem I, Tan J (2003) Cartilage and bone segmentation in vertebra images. Transactions
of the ASAE, 46 (5), 1429–1434.
Hatem I, Tan J, Gerrard DE (2003) Determination of animal skeletal maturity by image
processing. Meat Science, 65 (3), 999–1004.
Hopkins DL (1996) The relationship between muscularity, muscle: bone ratio and cut
dimensions in male and female lamb carcasses and measurement of muscularity using
image analysis. Meat Science, 43, 307–317.
Hopkins DL, Safari E, Thompson JM, Smith CR (2004) Video image analysis in the Australian meat industry – precision and accuracy of predicting lean meat yield in lamb
carcasses. Meat Science, 67 (2), 269–274.
Hwang H, Park B, Nguyen M, Chen Y-R (1997) Hybrid image processing for robust extraction of lean tissue on beef cut surfaces. Computers & Electronics in Agriculture,
17 (3), 281–294.
Ishii T, Cassens RG, Scheller KK, Arp SC, Schaefer DM (1992) Image analysis to determine
intramuscular fat in muscle. Food Structure, 11 (1), 55–60.
Karnuah AB, Moriya K, Sasaki Y (1994) Computer image analysis information extracted
from beef carcass cross-section and its precision. Animal Science & Technology,
65 (6), 515–524.
Karnuah AB, Moriya K, Sasaki Y (1999) Extraction of computer image analysis information
by desk top computer from beef carcass cross sections. Asian-Australasian Journal of
Animal Sciences, 12 (8), 1171–1176.
Karnuah AB, Moriya K, Nakanishi N, Nade T, Mitsuhashi T, Sasaki Y (2001) Computer
image analysis for prediction of carcass composition from cross-sections of Japanese
Black steers. Journal of Animal Science, 79 (11), 2851–2856.
Kuchida K, Suzuki K, Yamaki K, Shinohara H, Yamagishi T (1991) Prediction of chemical
composition of pork by personal computer color image analysis. Animal Science &
Technology, 69 (5), 477–479.
Kuchida K, Yamaki K, Yamagishi T, Mizuma Y (1992) Evaluation of meat quality in Japanese
beef cattle by computer image analysis. Animal Science & Technology, 63 (2), 121–127.
Kuchida K, Kurihara T, Suzuki M, Miyoshi S (1997a) Development of an accurate method
for measuring fat percentage on ribeye area by computer image analysis. Animal
Science & Technology, 68 (9), 853–859.
Kuchida K, Kurihara T, Suzuki M, Miyoshi S (1997b) Computer image analysis method for
evaluation of marbling of ribeye area. Animal Science & Technology, 68 (9), 878–882.
Kuchida K, Konishi K, Suzuki M, Miyoshi S (1998) Prediction of the crude fat contents in
ribeye muscle of beef using the fat area ratio calculated by computer image analysis.
Animal Science & Technology, 69 (6), 585–588.
Kuchida K, Kato K, Suzuki M, Miyoshi S (2000a) Utilization of the information from M.
semispinalis capitis and M. semispinalis dorsi by computer image analysis on BMS
number prediction. Animal Science Journal, 71 (9), J305–310.
Kuchida K, Kono S, Konishi K, Vleck LD van, Suzuki M, Miyoshi S (2000b) Prediction of
crude fat content of longissimus muscle of beef using the ratio of fat area calculated
from computer image analysis: comparison of regression equations for prediction
using different input devices at different stations. Journal of Animal Science, 78 (4),
Kuchida K, Hasegawa M, Suzuki M, Miyoshi S (2001a) Prediction of Beef Color Standard
number from digital image obtained by using photographing equipment for the cross
section of carcass. Animal Science Journal, 72 (9), J321–328.
Kuchida K, Suzuki M, Miyoshi S (2001b) Development of photographing equipment for
the cross-section of carcass and prediction of BMS number by using obtained image
from that equipment. Animal Science Journal, 72 (8), J224–231.
Kuchida K, Fujita K, Suzuki M, Miyoshi S (2001c) Investigation of the relationship between
season and BMS number assigned by grader using image analysis method. Animal
Science Journal, 72 (7), J6–12.
Lenhert DH, Gilliland DA (1985) The design and testing of an automated beef grader. 1985
ASAE International Meeting, Paper No. 853035, ASAE, St Joseph, MI, USA.
Li J, Tan J, Martz FA, Heymann H (1999) Image texture features as indicators of beef
tenderness. Meat Science, 53 (1), 17–22.
Li J, Tan J, Shatadal P (2001) Classification of tough and tender beef by image texture
analysis. Meat Science, 57 (4), 341–346.
Lu J (2002) Transforms for Multivariate Classification and Application to Tissue Image
Segmentation. PhD dissertation, University of Missouri, Columbia, MO, USA.
Lu J, Tan J (1998) Application of image segmentation to meat image processing. 1998
ASAE Annual International Meeting, Paper No. 983016, ASAE, St Joseph, MI, USA.
Lu W, Tan J (2004) Analysis of image-based measurements and USDA characteristics as
predictors of beef lean yield. Meat Science, 66 (2), 483–491.
Lu J, Tan J, Gerrard DE (1997) Pork quality evaluation by image processing. 1997 ASAE
Annual International Meeting, Paper No. 973125, ASAE, St Joseph, MI, USA.
Lu J, Tan J, Gao X, Gerrard GE (1998) USDA beef classification based on image processing. 1998 ASAE Mid-Central Conference, Paper No. MC98131, ASAE, St Joseph,
Lu J, Tan J, Shatadal P, Gerrard DE (2000) Evaluation of pork color by using computer
vision. Meat Science, 56 (1), 57–60.
McClure EK, Scanga JA, Belk KE, Smith GC (2003) Evaluation of the E+V video image
analysis system as a predictor of pork carcass meat yield. Journal of Animal Science,
81 (5), 1193–1201.
McDonald TP, Chen YR (1990a) Application of morphological image processing in
agriculture. Transactions of the ASAE, 33 (4), 1345–1352.
McDonald TP, Chen YR (1990b) Separating connected muscle tissues in images of beef
carcass ribeyes. Transactions of the ASAE, 33 (6), 2059–2065.
McDonald TP, Chen YR (1991) Visual characterization of marbling in beef ribeyes and its
relationship to taste parameters. Transactions of the ASAE, 34 (6), 2499–2504.
McDonald TP, Chen YR (1992) A geometric model of marbling in beef longissimus dorsi.
Transactions of the ASAE, 35 (3), 1057–1062.
Murray AC (1996) Digitized image analysis for assessment of pork quality. In Proceedings
of the 42nd International Congress of Meat Science and Technology, 1–6 September,
Lillehammer, Norway, p. 242.
Nade T, Karnuah AB, Masuda Y, Hirabara S, Fujita K (2001) Estimation of carcass composition from the cross-section at ribloin of Japanese Black steers by computer image
analysis. Animal Science Journal, 72 (9), J313–320.
Sather AP, Bailey DRC, Jones SDM (1996) Real-time ultrasound image analysis for the
estimation of carcass yield and pork quality. Canadian Journal of Animal Science,
76 (1), 55–62.
Scholz A, Paulke T, Eger H (1995a) Determining the degree of marbling in the pig. Use of
computer-supported video picture analysis. Fleischwirtschaft, 75 (11), 1322–1324.
Scholz A, Paulke T, Eger H (1995b) Degree of marbling in the porcine M. long. dorsi. Use
of computer-supported video picture analysis for determination. Fleischwirtschaft,
75 (3), 320–322.
Schwerdtfeger R, Krieter J, Kalm E (1993) Objective evaluation of the belly cut.
Fleischwirtschaft, 73 (1), 93–96.
Shackelford SD, Wheeler TL, Koohmaraie M (1995) Relationship between shear-force and
trained sensory panel tenderness ratings of 10 major muscles from Bos indicus and
Bos taurus cattle. Journal of Animal Science, 73, 3333–3340.
Shackelford SD, Wheeler TL, Koohmaraie M (2003) On-line prediction of yield grade,
longissimus muscle area, preliminary yield grade, adjusted preliminary yield grade,
and marbling score using the MARC beef carcass image analysis system. Journal of
Animal Science, 81 (1), 150–155.
Shiranita K, Miyajima T, Takiyama R (1998) Determination of meat quality by texture
analysis. Pattern Recognition Letters, 19, 1319–1324.
Soennichsen M, Dobrowolski A, Hoereth R (2001) Commercial valuation of pig carcasses by using Video Image Analysis. Mitteilungsblatt der Bundesanstalt fuer
Fleischforschung, Kulmbach, 40 (153), 223–230.
Soennichsen M, Dobrowolski A, Hoereth R, Branscheid W (2002) Commercial valuation
of pig carcasses by video image analysis. Fleischwirtschaft, 82 (1), 98–101.
Soennichsen M, Dobrowolski A, Spindler M, Brinkmann D, Branscheid W (2005) Video
image analysis of calf carcasses. Mitteilungsblatt der Fleischforschung Kulmbach,
44 (168), 99–106.
Soennichsen M, Dobrowolski A, Spindler M, Brinkmann D, Branscheid W (2006) Video
image analysis of calf carcasses. Fleischwirtschaft, 86 (5), 107–110.
Stanford K, Richmond RJ, Jones SDM, Robertson WM, Price MA, Gordon AJ (1998)
Video image analysis for on-line classification of lamb carcasses. Animal Science,
67 (2), 311–316.
Steiner R, Wyle AM, Vote DJ, Belk KE, Scanga JA, Wise JW, Tatum JD, Smith GC (2003)
Real-time augmentation of USDA yield grade application to beef carcasses using
video image analysis. Journal of Animal Science, 81 (9), 2239–2246.
Subbiah J, Ray N, Kranzler GA, Acton ST (2004) Computer vision segmentation of
the longissimus dorsi for beef quality grading. Transactions of the ASAE, 47 (4),
Sun D-W (2000) Inspecting pizza topping percentage and distribution by a computer vision
method. Journal of Food Engineering, 44 (4), 245–249.
Sun D-W (ed) (2004) Applications of computer vision in the food industry. Special issue
of Journal of Food Engineering, 61 (1), 1–142.
Sun D-W, Brosnan T (2003a) Pizza quality evaluation using computer vision – part 1. Pizza
base and sauce spread. Journal of Food Engineering, 57 (1), 81–89.
Sun D-W, Brosnan T (2003b) Pizza quality evaluation using computer vision – part 2. Pizza
topping analysis. Journal of Food Engineering, 57 (1), 91–95.
Sun D-W, Du CJ (2004) Segmentation of complex food images by stick growing and
merging algorithm. Journal of Food Engineering, 61 (1), 17–26.
Tan FJ, Morgan MT, Ludas LI, Forrest JC, Gerrard DE (2000) Assessment of fresh pork
color with color machine vision. Journal of Animal Science, 78 (12), 3078–3085.
Tan J (2004) Meat quality evaluation by computer vision. Journal of Food Engineering,
61 (1), 27–35.
Tan J, Gao X, Gerrard DE (1998) Application of fuzzy sets and neural networks in sensory
analysis. Journal of Sensory Studies, 14, 119–138.
Ushigaki T, Moriya K, Sasaki Y (1997) BMS number as a scale for evaluation of beef
marbling standard. Animal Science & Technology, 68 (12), 1146–1153.
Vote DJ, Belk KE, Tatum JD, Scanga JA, Smith GC (2003) Online prediction of beef
tenderness using a computer vision system equipped with a BeefCam module. Journal
of Animal Science, 81 (2), 457–465.
Wang HH, Sun D-W (2002a) Correlation between cheese meltability determined with a
computer vision method and with Arnott and Schreiber tests. Journal of Food Science,
67 (2), 745–749.
Wang HH, Sun D-W (2002b) Assessment of cheese browning affected by baking conditions
using computer vision. Journal of Food Engineering, 56 (4), 339–345.
Wang HH, Sun D-W (2004) Evaluation of the oiling off property of cheese with computer
vision: correlation with fat ring test. Journal of Food Engineering, 61 (1), 47–55.
Wassenberg RL, Allen DM, Kemp KE (1986) Video image analysis prediction of total
kilograms and percent primal lean and fat yield of beef carcasses. Journal of Animal
Science, 62 (6), 1609–1616.
Wheeler TL, Vote D, Leheska JM, Shackelford SD, Belk KE, Wulf DM, Gwartney BL,
Koohmaraie M (2002) The efficacy of three objective systems for identifying beef
cuts that can be guaranteed tender. Journal of Animal Science, 80 (12), 3315–3327.
Wyle AM, Vote DJ, Roeber DL, Cannell RC, Belk KE, Scanga JA, Goldberg M, Tatum JD,
Smith GC (2003) Effectiveness of the SmartMV prototype BeefCam system to sort
beef carcasses into expected palatability groups. Journal of Animal Science, 81 (2),
Zheng CX, Sun D-W, Zheng LY (2006a) Correlating color to moisture content of large
cooked beef joints by computer vision. Journal of Food Engineering, 77 (4), 858–863.
Zheng CX, Sun D-W, Zheng LY (2006b) Recent applications of image texture for evaluation
of food qualities – a review. Trends in Food Science and Technology, 17 (3), 113–128.
Zheng CX, Sun D-W, Zheng LY (2007) Recent developments and applications of image
features for food quality evaluation and inspection – a review. Trends in Food Science
and Technology, 17 (12), 642–655.
Quality Measurement of Cooked Meats
Cheng-Jin Du and Da-Wen Sun
Food Refrigeration and Computerised Food Technology,
University College Dublin, National University of Ireland,
Dublin 2, Ireland
1 Introduction
The processing system for cooked meat products is normally a cook-chill system, which
is based on thorough cooking of the meat followed by chilling. The manufacturing
procedures (i.e. cooking and cooling) are among the principal determinants of the
quality of the product. During processing, significant changes occur in the composition
and structure of such meats, influencing the quality accordingly. Cooking is one of the
most important factors affecting the quality of cooked meats: it triggers a series of
chemical and physical changes that produce characteristic textures and flavors while
simultaneously killing pathogens and keeping the food safe. To obtain high-quality,
safe cooked meats, the products should be cooled quickly after cooking.
Over the last decade, cooked meat products for delicatessens, catering, and industrial
ingredient usage have become more and more popular in the meat industry. The yearly
pork production in the EU countries is around 13 million tonnes of carcasses (Daudin
and Kuitche, 1996). It is obvious that there is a developing and increasing market for
cooked meat products (Anon, 1997), and therefore the meat industry has a huge interest
in improving the visual qualities and inspection efficiency of cooked meats by applying
automatic techniques such as computer vision.
Cooked meat can be considered from various physical and chemical aspects. From
the visual viewpoint, exteriorly it is a solid system with a shape, and interiorly it
has irregular pores, and varying color and texture. From the sensory viewpoint, it has
qualities of tenderness, springiness, cohesion, gumminess, and chewiness. Chemically,
it is a combination of diverse components such as water, protein, and fat. As these all
describe the same object from different aspects, it is reasonable to assume that there
are relationships between the visual characteristics and the physical attributes/chemical
components of cooked meats, and that, based on these relationships, the quality of
cooked meats could be evaluated through investigating the changes in their visual
properties, such as size, shape, color, and texture. Therefore, it is feasible to evaluate
the quality of cooked meats using computer vision. In the quality evaluation of cooked
meats as affected by cooking and cooling, computer vision has recently been shown
to have great potential for performing such a task by evaluating the physical changes
(shrinkage, pores, and porosity) during the manufacturing procedures, and their image
features (color and texture).
2 Shrinkage
Shrinkage of cooked meats is one of the most significant physical changes during the
cooking and cooling processes. In cooked meat systems, shrinkage is rarely negligible.
Considering the variable physical properties and the shrinkage of cooked meat during
cooling, Wang and Sun (2002a) developed a three-dimensional transient, simultaneous mass-and-heat-transfer finite element model for analyzing the vacuum cooling of
cooked meat joints. To predict temperature profiles in meat patties during double-sided
cooking, a mathematical model taking into account the two-dimensional cylindrical
geometry, radial shrinkage, and variation of thermal properties with temperature was
developed (Zorrilla and Singh, 2003). Using computer vision techniques, the shrinkage
measurement of cooked meats can be implemented automatically. Thus, the measurement efficiency of shrinkage can be improved in addition to maintaining consistency
and eliminating subjectivity.
2.1 Measurement of size and shape
Heating and cooling cause shrinkage of cooked meats, leading to decreases in all
dimensions, including perimeter, superficial area, and volume. Obviously, size and
shape measurements before and after processing are the basis for shrinkage estimation.
Using image-processing techniques, size and shape measurements of cooked pork ham
and beef joints have been implemented automatically in the work of Du and Sun (2006a)
and Zheng et al. (2006a), respectively.
The first stage for size- and shape-measurement using computer vision is to develop
methods to extract the contours of cooked meats. An image-processing algorithm of
three steps – i.e. image segmentation, noise reduction, and edge detection – can be
developed. Any image-segmentation method that can separate the
cooked meat product from the background can be applied, such as thresholding or
region-based segmentation methods (Du and Sun, 2006a). A simple filter, like a median
filter, can then be employed to reduce possible noise within the segmented image, especially around the edge area. Based on the de-noised image, the edge of cooked meats
can be detected by the Canny edge detector (Canny, 1986).
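The three steps above can be sketched in a few lines. This is a minimal illustration, not the implementation of Du and Sun (2006a): it assumes a grayscale image stored as a list of rows, uses a fixed global threshold for segmentation, and replaces the Canny detector with a simple test for object pixels that touch the background. The function names and the synthetic test image are assumptions made for the example.

```python
from statistics import median

def threshold(gray, t):
    """Step 1 (segmentation): 1 where a pixel is brighter than t, else 0."""
    return [[1 if v > t else 0 for v in row] for row in gray]

def median3x3(mask):
    """Step 2 (noise reduction): 3x3 median filter; border pixels are copied unchanged."""
    h, w = len(mask), len(mask[0])
    out = [row[:] for row in mask]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            window = [mask[y + dy][x + dx] for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            out[y][x] = median(window)
    return out

def boundary(mask):
    """Step 3 (stand-in for Canny): object pixels with a 4-neighbour outside the object."""
    h, w = len(mask), len(mask[0])
    edge = []
    for y in range(h):
        for x in range(w):
            if not mask[y][x]:
                continue
            for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                if not (0 <= ny < h and 0 <= nx < w) or not mask[ny][nx]:
                    edge.append((x, y))
                    break
    return edge

# Synthetic 11x11 image (illustrative only): a bright 5x5 "product" on a dark
# background, plus one bright noise pixel at row 1, column 1.
gray = [[200 if 3 <= y <= 7 and 3 <= x <= 7 else 20 for x in range(11)] for y in range(11)]
gray[1][1] = 255

mask = median3x3(threshold(gray, 128))
contour = boundary(mask)
```

In this example the median filter removes the isolated noise pixel and also rounds the four corners of the square, which is the usual side effect of median filtering on sharp corners.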
2.1.1 Average diameter, short axis, long axis, and perimeter
The measurements of average diameter, short axis, long axis, and perimeter are relatively
simple, as they can be estimated in two-dimensional space. The extracted contour of the
cooked meat product, as illustrated in Figure 6.1, can be expressed using the polar
coordinates $R$ (the radial coordinate) and $\theta$ (the polar angle), with the center of gravity of
the shape as the origin. The axes $l_i$ of the shape of the cooked meat can then be calculated by

$$l_i = R_i + R_{\tilde{i}}, \qquad i = 1, 2, \ldots, n/2$$

where $i$ and $\tilde{i}$ satisfy the condition $\theta_{\tilde{i}} - \theta_i = \pi$, and $n$ is the number of boundary points of
the contour. From the calculated axes, the average diameter can be computed as the average
value of the axes. The longest axis is taken as the long axis, and the shortest axis as the
short axis. The perimeter $PM$ can be obtained by summing the distances between adjacent
boundary points, each given by the law of cosines with the angle between the two adjacent radii:

$$PM = \sum_{i=1}^{n} PM_i, \qquad PM_i = \sqrt{R_i^2 + R_{i+1}^2 - 2 \times R_i \times R_{i+1} \times \cos(\theta_{i+1} - \theta_i)}$$

Figure 6.1 Shrinkage evaluation of pork ham (the long axis and short axis are marked).
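The axis and perimeter calculations can be sketched as follows. This is an illustrative sketch under two assumptions that are not stated in the text: the center of gravity is approximated by the mean of the boundary points, and the boundary is sampled so that point i + n/2 lies diametrically opposite point i (n even). The cosine term uses the angle between adjacent radii, that is, the difference of their polar angles. All function names are hypothetical.

```python
import math

def polar_contour(points):
    """Convert boundary points (x, y) to (R, theta) about their mean, sorted by theta."""
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    polar = [(math.hypot(x - cx, y - cy), math.atan2(y - cy, x - cx)) for x, y in points]
    return sorted(polar, key=lambda p: p[1])

def axes(polar):
    """l_i = R_i + R_(i + n/2): sum of the two radii pointing in opposite directions."""
    half = len(polar) // 2
    return [polar[i][0] + polar[i + half][0] for i in range(half)]

def perimeter(polar):
    """Sum of chord lengths between adjacent boundary points (law of cosines)."""
    n = len(polar)
    total = 0.0
    for i in range(n):
        r1, t1 = polar[i]
        r2, t2 = polar[(i + 1) % n]  # wrap around to close the contour
        total += math.sqrt(max(0.0, r1 * r1 + r2 * r2 - 2 * r1 * r2 * math.cos(t2 - t1)))
    return total

# Synthetic contour: an ellipse with semi-axes 3 and 2, sampled at 360 points.
n = 360
pts = [(3 * math.cos(2 * math.pi * k / n), 2 * math.sin(2 * math.pi * k / n)) for k in range(n)]
polar = polar_contour(pts)
lengths = axes(polar)
long_axis, short_axis = max(lengths), min(lengths)
average_diameter = sum(lengths) / len(lengths)
pm = perimeter(polar)
```

For a real contour traced from an image, the opposite point would normally be found by searching for the boundary point whose polar angle differs by π, rather than by index.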
Another way to estimate the average diameter, short axis, long axis, and perimeter of
a cooked meat is by fitting its irregular shape to a regular geometrical object. Since an
ellipse can not only represent the size of the object but also its orientation and deviation
from circularity, this was used to approximate the shape of cooked beef joints (Zheng
et al., 2006a). There are two approaches available in the literature for ellipse-fitting:
the boundary-based method and the region-based method. The former focuses on the
boundary of the object, and is employed when the shape of the object is regular. The
latter mainly deals with a region of the object (Mulchrone and Choudhury, 2004) and
is suitable for an object of very irregular shape. As cooked beef joints are normally far
from a perfect ellipse in shape, a region-based method developed by Russ (1999) was
adopted in the study of Zheng et al. (2006a). Based on the ellipse fitted, the average
diameter, short axis, long axis, and perimeter can be obtained.
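One common region-based way to fit an ellipse is to match the second-order central moments of the segmented region. The sketch below illustrates that idea only; it is not claimed to be the procedure of Russ (1999) or the one used by Zheng et al. (2006a). It assumes a binary mask stored as a list of rows, and the function name is hypothetical.

```python
import math

def fit_ellipse_from_region(mask):
    """Return (semi_major, semi_minor, angle) of the ellipse that has the same
    second-order central moments as the foreground pixels of a binary mask."""
    pts = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    n = len(pts)
    cx = sum(x for x, _ in pts) / n
    cy = sum(y for _, y in pts) / n
    mxx = sum((x - cx) ** 2 for x, _ in pts) / n
    myy = sum((y - cy) ** 2 for _, y in pts) / n
    mxy = sum((x - cx) * (y - cy) for x, y in pts) / n
    # Eigenvalues of the 2x2 covariance matrix [[mxx, mxy], [mxy, myy]].
    spread = math.sqrt(((mxx - myy) / 2) ** 2 + mxy ** 2)
    lam_major = (mxx + myy) / 2 + spread
    lam_minor = (mxx + myy) / 2 - spread
    # Orientation of the major axis relative to the x axis.
    angle = 0.5 * math.atan2(2 * mxy, mxx - myy)
    # A filled ellipse with semi-axis a has variance a^2 / 4 along that axis.
    return 2 * math.sqrt(lam_major), 2 * math.sqrt(lam_minor), angle

# Synthetic region: a filled ellipse with semi-axes 30 and 15 pixels.
mask = [[1 if ((x - 40) / 30) ** 2 + ((y - 30) / 15) ** 2 <= 1 else 0
         for x in range(81)] for y in range(61)]
semi_major, semi_minor, angle = fit_ellipse_from_region(mask)
```

Because every pixel of the region contributes to the moments, the fit is less sensitive to local boundary irregularities than a fit to boundary points alone, which is consistent with the preference for a region-based method for irregular joints.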
2.1.2 Surface area and volume
The surface area and volume are important physical characteristics for manufacturing
cooked meat joints. They play a significant role in tackling production problems such
as those involved in cooking and cooling. Previous research results confirm that the
surface area and volume of samples correlate highly with the cooling loss. Since vacuum
cooling produces the cooling effect through moisture evaporation from cooked beef
products, McDonald et al. (2000) reported that the efficiency of vacuum cooling was
dependent on the surface area to volume ratio.
Due to the irregularities and variation in the surface profile of cooked meats, it is very
difficult to measure accurately the actual surface area and volume. Two methods have
been developed to allow estimation of the surface area and volume of cooked meats
using computer vision techniques: the derived method and the partitioned method (Du
and Sun, 2006a). The derived method is based on the three principal dimensions, i.e.
length (L), width (W), and thickness (T), and the surface area and volume are derived
mathematically. From the three measured principal dimensions, the analytical volume
equation of an ellipsoid can be used to estimate the volume of cooked meat:
$$V = \frac{\pi}{6} \times L \times W \times T$$
Although there is no analytical equation available for calculating the surface area of an
ellipsoid, it can be estimated approximately using the expression proposed by Kumar
and Mathew (2003):
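$$
S = 4\int_0^{\pi/2} \sqrt{b^2\sin^2\varphi + c^2\cos^2\varphi} \times \sqrt{b^2\cos^2\varphi + c^2\sin^2\varphi}\; d\varphi
+ 4a\int_0^{\pi/2} \sqrt{b^2\sin^2\varphi + c^2\cos^2\varphi} \times
\left( \frac{\arcsin\sqrt{1 - \dfrac{b^2\cos^2\varphi + c^2\sin^2\varphi}{a^2}}}{\sqrt{1 - \dfrac{b^2\cos^2\varphi + c^2\sin^2\varphi}{a^2}}} \right) d\varphi
$$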
where a = L/2, b = W/2, c = T/2, and ϕ is the eccentric angle of a point on the surface of
the ham. The integral can be evaluated numerically using Simpson’s rule (Abramowitz and Stegun, 1972).
No arbitrary assumptions or approximations are involved in the method proposed by
Kumar and Mathew (2003), which is therefore expected to be more reliable and accurate in
estimating the surface area of ellipsoids.
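The derived method can be sketched numerically as below. The volume uses the standard ellipsoid formula V = πLWT/6. For the surface area, the sketch assumes the integrand has the form √(b² sin²ϕ + c² cos²ϕ) × [√(b² cos²ϕ + c² sin²ϕ) + a · arcsin(e)/e] with e² = 1 − (b² cos²ϕ + c² sin²ϕ)/a², integrated from 0 to π/2 and multiplied by 4. This reading is an assumption about the expression of Kumar and Mathew (2003); it does reduce to the known closed forms for a sphere and for a prolate spheroid. The sketch also assumes that L is the largest of the three dimensions. Function names are hypothetical.

```python
import math

def ellipsoid_volume(L, W, T):
    """Volume of an ellipsoid with principal dimensions (full axes) L, W, T."""
    return math.pi * L * W * T / 6.0

def simpson(f, lo, hi, n=200):
    """Composite Simpson's rule with n subintervals (n is rounded up to an even number)."""
    if n % 2:
        n += 1
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(lo + k * h)
    return total * h / 3.0

def ellipsoid_surface_area(L, W, T):
    """Surface area from the assumed integral form; requires L >= W and L >= T."""
    if L < W or L < T:
        raise ValueError("this sketch assumes L is the largest dimension")
    a, b, c = L / 2.0, W / 2.0, T / 2.0

    def integrand(phi):
        s2, c2 = math.sin(phi) ** 2, math.cos(phi) ** 2
        p = math.sqrt(b * b * s2 + c * c * c2)   # arc-length factor of the cross-section
        r2 = b * b * c2 + c * c * s2             # squared radius of the cross-section
        e2 = max(0.0, 1.0 - r2 / (a * a))
        if e2 < 1e-12:
            ratio = 1.0                          # limit of arcsin(e)/e as e -> 0
        else:
            e = math.sqrt(e2)
            ratio = math.asin(e) / e
        return p * (math.sqrt(r2) + a * ratio)

    return 4.0 * simpson(integrand, 0.0, math.pi / 2.0)

sphere_area = ellipsoid_surface_area(2.0, 2.0, 2.0)      # sphere of radius 1
spheroid_area = ellipsoid_surface_area(4.0, 2.0, 2.0)    # prolate spheroid, a = 2, b = c = 1
spheroid_volume = ellipsoid_volume(4.0, 2.0, 2.0)
```

The sphere and spheroid cases are useful sanity checks because their surface areas are known in closed form; for a general ellipsoid no such check is available and the result depends on the assumed form of the integrand.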
The partitioned method first divides the ham into a number of sections, and then sums
the surface areas and volumes of the sections to obtain the totals. As shown in
Figure 6.1, apart from the two end portions, the shape of the cooked meat is partitioned
into many thin discs, each with two plane faces and a curved edge, which are assumed to
be conical frustums for further computation. Unlike the middle discs,
the two end portions are treated as spherical caps. The entire surface area and
volume of the cooked meat are hence obtained by summing these values for all the discs
and the two end portions.
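The summation in the partitioned method can be illustrated with the standard formulas for a conical frustum and a spherical cap. The sketch assumes circular cross-sections described by a list of radii at the disc faces, a uniform disc thickness, and known cap heights; these inputs and the function names are assumptions for illustration, not the measured quantities of Du and Sun (2006a).

```python
import math

def frustum(r1, r2, h):
    """Volume and curved (lateral) area of a conical frustum with end radii r1, r2."""
    volume = math.pi * h * (r1 * r1 + r1 * r2 + r2 * r2) / 3.0
    lateral = math.pi * (r1 + r2) * math.sqrt((r1 - r2) ** 2 + h * h)
    return volume, lateral

def spherical_cap(base_radius, height):
    """Volume and curved area of a spherical cap with the given base radius and height."""
    volume = math.pi * height * (3 * base_radius ** 2 + height ** 2) / 6.0
    curved = math.pi * (base_radius ** 2 + height ** 2)
    return volume, curved

def partitioned_volume_area(radii, disc_thickness, cap_heights):
    """Sum frustums between successive radii plus one spherical cap at each end."""
    volume = area = 0.0
    for r1, r2 in zip(radii, radii[1:]):
        v, s = frustum(r1, r2, disc_thickness)
        volume += v
        area += s
    for r, h in ((radii[0], cap_heights[0]), (radii[-1], cap_heights[1])):
        v, s = spherical_cap(r, h)
        volume += v
        area += s
    return volume, area

# Check on a capsule: a cylinder of radius 1 and length 2 with hemispherical ends.
capsule_volume, capsule_area = partitioned_volume_area([1.0, 1.0, 1.0], 1.0, (1.0, 1.0))
```

Only the curved surfaces are summed, since the plane faces between neighbouring discs are internal to the product.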
Similarly, in order to calculate the volume and surface area of a cooked beef sample,
Zheng et al. (2006a) divided the beef sample into numerous cross-sections and assumed
each cross-section to be a cylindrical disc. The volume (V) and surface area (S) of
the cooked beef sample are the sums of these sections, using the following integral
approximations (Thomas and Finney, 1984):
$$V = \int A_x \, dx$$

$$S = \iint_D \sqrt{1 + f_x^2 + f_y^2}\; dx\, dy$$
where Ax is the area of each circular cross section perpendicular to the x axis, fx is the
gradient in the x direction, fy is the gradient in the y direction, and D is the projected
area of the beef sample for the integration.
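A discrete version of the two integrals can be sketched as follows: the volume is the trapezoidal sum of cross-sectional areas along x, and the surface area is accumulated cell by cell from finite-difference gradients of a height map. The uniform grid, the forward differences, and the function names are assumptions made for this illustration.

```python
import math

def volume_from_sections(areas, dx):
    """Trapezoidal approximation of V = integral of A_x dx for equally spaced sections."""
    return dx * (sum(areas) - 0.5 * (areas[0] + areas[-1]))

def surface_from_heights(z, dx, dy):
    """Approximate S = double integral of sqrt(1 + f_x^2 + f_y^2) dx dy over the grid,
    where z[j][i] is the surface height at column i and row j."""
    total = 0.0
    for j in range(len(z) - 1):
        for i in range(len(z[0]) - 1):
            fx = (z[j][i + 1] - z[j][i]) / dx   # gradient in the x direction
            fy = (z[j + 1][i] - z[j][i]) / dy   # gradient in the y direction
            total += math.sqrt(1.0 + fx * fx + fy * fy) * dx * dy
    return total

# Volume check: circular cross-sections of a unit sphere, A_x = pi * (1 - x^2).
xs = [-1.0 + k * 0.001 for k in range(2001)]
sphere_volume = volume_from_sections([math.pi * (1.0 - x * x) for x in xs], 0.001)

# Surface check: the plane z = x + 2y over the unit square has area sqrt(6).
plane = [[0.1 * i + 0.2 * j for i in range(11)] for j in range(11)]
plane_area = surface_from_heights(plane, 0.1, 0.1)
```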
2.2 Shrinkage determination and its relationship with yield,
water content, and texture
Shrinkage measurement of cooked meats is valuable not only from the viewpoint of
quality but also for economic reasons. The more water is lost, the greater the shrinkage
of cooked meats. A higher level of shrinkage generally implies greater cooking and
cooling losses and increased hardness, both of which have a negative effect on the
quality of cooked meats. In contrast, a lower level of shrinkage leads to the expectation
of a juicier and more tender cooked meat product. Since shrinkage both
influences the yield of cooked meats and gives a negative impression to the consumer,
it is also of great economic importance to the catering industry.
2.2.1 Determination of shrinkage
Based on the above size and shape characteristics obtained, three kinds of shrinkage of
cooked meats can be evaluated:
1. Shrinkage caused by cooking
2. Shrinkage caused by cooling
3. Total shrinkage during the entire cooking and cooling process.
The cooking and cooling shrinkages can be expressed as the percentage change in
the average diameter, short axis, long axis, perimeter, volume, and surface area of the
sample during the cooking and cooling processes, respectively. The total shrinkage of
the average diameter, short axis, long axis, perimeter, volume, and surface area can be
evaluated by the ratio between the initial values and the values after cooling.
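The three kinds of shrinkage can be computed from any one of the size measurements taken at the three stages. The sketch below assumes that each shrinkage is expressed as a percentage decrease relative to the value at the start of the corresponding stage (raw for cooking and total shrinkage, cooked for cooling shrinkage); the text does not state the reference value explicitly. The numbers in the example are made up for illustration.

```python
def shrinkage_percent(before, after):
    """Percentage decrease of a measurement from 'before' to 'after'."""
    return 100.0 * (before - after) / before

def shrinkage_report(raw, cooked, cooled):
    """Cooking, cooling and total shrinkage for one measurement (e.g. volume)."""
    return {
        "cooking": shrinkage_percent(raw, cooked),
        "cooling": shrinkage_percent(cooked, cooled),
        "total": shrinkage_percent(raw, cooled),
    }

# Hypothetical volumes in cm^3: raw, after cooking, after cooling.
report = shrinkage_report(1000.0, 900.0, 810.0)
```

Note that, with this definition, the total shrinkage (19 percent here) is not the sum of the cooking and cooling shrinkages (10 percent each), because the two stages use different reference values.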
Du and Sun (2006a) reported that the greatest shrinkage was that of volume during
cooking and cooling, at up to 9.36 and 12.65 percent, respectively. The long axis of
the sample is least affected by cooking and cooling, with decreases of 1.20 and 1.84
percent, respectively. Furthermore, all the measurements of cooking shrinkage are
somewhat lower than the corresponding measurements of cooling shrinkage; this can
be ascribed to external water making up some water loss while the joint is cooking in
a water bath.
Zheng et al. (2006a) reported that the cooling shrinkage of cooked beef joints could
be predicted by the developed model. The maximum and minimum axes, volume, and
surface area of the beef samples before and after cooling have a good linear relationship.
2.2.2 Correlations with yield, water content, and texture
The correlation analysis conducted by Du and Sun (2005) shows that the cooking
shrinkage in surface area is very significantly correlated with cooking loss (r = 0.95),
and that of volume is also significantly correlated with cooking loss (r = 0.91). However, no significant relationship has been found between cooking loss and the other four
cooking shrinkage measurements. During cooking of meat, shrinkage causes fluid to be
expelled from the meat, leading to loss of mass. Several stages of temperature-induced
shrinkage occur during cooking:
1. At around 40°C myosin begins to denature and precipitate, and transverse
shrinkage of the meat is observed
2. At approximately 55–60°C collagen shrinks due to denaturation
3. At around 60°C, longitudinal shrinkage of the meat is initiated (Bertram et al.,
Therefore, the shrinkage that takes place during cooking is multidimensional. Since
only one or two dimensions are considered when assessing the average diameter, short
axis, long axis, and perimeter, reductions in these measurements do not correlate well
with cooking loss; however, reductions in volume and surface area measurements have
a strong relationship with cooking loss.
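A correlation coefficient of the kind quoted above can be computed as follows, assuming that r denotes the Pearson product-moment coefficient (the text does not name the statistic). The data in the example are invented solely to show the calculation and are not measurements from the cited study.

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation coefficient of two equal-length samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Invented, exactly linear data: shrinkage in surface area (%) against cooking loss (%)
# and against water content (%).
area_shrinkage = [2.0, 3.5, 5.0, 6.5]
cooking_loss = [10.0, 13.0, 16.0, 19.0]
water_content = [74.0, 72.5, 71.0, 69.5]

r_loss = pearson_r(area_shrinkage, cooking_loss)
r_water = pearson_r(area_shrinkage, water_content)
```

Real measurements scatter around the trend, so the magnitude of r falls below 1; values such as 0.95 or 0.91 indicate a strong but not exact linear relationship.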
Regarding cooling loss and yield, there is no significant correlation between either
cooling loss or yield and any of the six shrinkage dimensions. The lower moisture
content of the external surface may induce the formation of a crust (Mayor and Sereno,
2004), which fixes the volume and complicates the relationship between cooling loss
and the subsequent shrinkage of the inner part of ham. Consequently, the relationship
between total shrinkage and yield is also complicated.
The various types of shrinkage are highly negatively correlated with water content, although the shrinkages in volume and surface area show the highest correlation
(r = −0.98). Shrinkage of cooked meats increases with the volume of water removed;
the more water is removed, the greater the pressure imbalance produced between the
interior and the exterior of the meat, which generates contracting stresses leading to
shrinkage and changes in its shape (Mayor and Sereno, 2004). Conversely, the shrinkage measurements correlate positively with the textural attributes; in particular,
the shrinkage in the long axis correlates significantly with hardness (P < 0.05). The
amount of water in the meat significantly affects its quality when cooked. As the water
content decreases owing to shrinkage, the shear force, hardness, cohesion, springiness,
gumminess, and chewiness increase. However, no significant correlation has been
found with Warner-Bratzler Shear (WBS) force, cohesion, springiness, gumminess,
and chewiness (P > 0.05).
Figure 6.2 presents a global indication of the relationships between the set of shrinkage variables and the set of quality variables by using principal component analysis
(PCA). As stated by Destefanis et al. (2000), in comparison to the classical correlations,
PCA proves to be a very useful method to point out quickly the relationships among
the variables themselves, and allows immediate identification of which variables are
correlated with which others, and in which direction. It can be observed that the variable
of water content is located towards the top left-hand area of the diagram, and all six total
shrinkages are towards the right-hand side; therefore, they are negatively correlated.
In particular, since the shrinkages in short axis, volume, and surface area are located
towards the bottom right-hand side of the diagram, correlations between them and
water content are significant. Similarly, the yield variable has negative relationships
with all six total shrinkages, especially in average diameter, long axis, and perimeter.
The texture is positively correlated with all six shrinkage variables located towards the
right of the loading plot. The WBS variable lies close to the shrinkages in the short axis,
the volume, and the surface area in the bottom right quadrant and is most strongly related
to them, while the hardness and cohesion variables are highly correlated with
shrinkage in the average diameter, long axis, and perimeter.
Figure 6.2 Plot of the first two loading vectors of principal component analysis.
3 Pores and porosity
In the literature, much research effort has been directed towards studying the pores and
porosity of cooked meats (McDonald and Sun, 2001a, 2001b; Kassama and Ngadi,
2005). Cooked meats can be considered to be multiphase systems, i.e. gas–liquid–
solid systems (Rahman et al., 1996), which are hygroscopic and capillary porous with
definite void structures that modulate mass transport during heat processing (Kassama and Ngadi, 2005). The pore formation in cooked meats is very complex; this
is a consequence not only of the meat itself, but also of the subsequent processing,
i.e. cooking and cooling. Using mercury porosimetry and helium pycnometry, pores
and porosity can be measured manually. However, manual methods cannot provide
sufficient information. In the experimental work of McDonald and Sun (2001a), the
difficulties in acquiring the exact porosity of cooked beef samples were realized from
the outset. Recently, a computer vision method has been developed for pore characterization of pork ham (Du and Sun, 2006b), and the results demonstrate the ability of
such a technique to characterize the pore structure of cooked meats.
3.1 Measurement of pores and porosity
To develop an automatic method for the pore-structure characterization of cooked
meats using computer vision, an image-processing algorithm of three stages can be
developed to segment pores from the images – i.e. cooked meat extraction, image
enhancement, and pore segmentation. After the cooked meat product has been extracted
from the background, the contrast-limited adaptive histogram equalization (CLAHE)
(Mathworks, 1998) method can be applied to enhance the image. CLAHE operates
on small regions in the image, called tiles. The contrast of each tile is enhanced, and the
enhancement can be limited, especially in homogeneous areas, to avoid amplifying noise.
To segment the pores correctly, it is important to account for the fact that the pores are
small, compact spots compared with the non-pore areas.
In the work of Du and Sun (2006b), an improved watershed algorithm was developed
to extract pores from the gray-level images of ham as precisely as possible. To overcome
the problem of over-segmentation using the traditional watershed algorithm, a method
called the “marker-controlled watershed” is applied (Meyer and Beucher, 1990). After
marker extraction, the gradient image of the ham is modified so that it has regional
minima only at the locations of the pore and background markers. Based on the modified
gradient image, the classical watershed transform can be used to obtain the desired
segmentation results. Figure 6.3 illustrates the results of pore segmentation.
From the segmented pores, the porosity, number of pores, pore size, and size distribution can be measured. Porosity, the term most commonly used in characterizing
pores (Rahman, 2001), can be calculated as the ratio between the total area of
pores and the area of cooked meats. The pore size can be computed as the area or the
equivalent diameter of the pore.
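Given a binary pore mask and a binary mask of the meat region, these measurements can be sketched as below. The sketch assumes 4-connectivity when grouping pixels into pores, a known pixel size, and an equivalent diameter defined as the diameter of a circle with the same area; the function names and the small test masks are illustrative assumptions.

```python
import math
from collections import deque

def label_pores(pore_mask):
    """Return the pixel count of each 4-connected pore region in a binary mask."""
    h, w = len(pore_mask), len(pore_mask[0])
    seen = [[False] * w for _ in range(h)]
    sizes = []
    for y in range(h):
        for x in range(w):
            if not pore_mask[y][x] or seen[y][x]:
                continue
            seen[y][x] = True
            queue, count = deque([(y, x)]), 0
            while queue:  # breadth-first flood fill of one pore
                cy, cx = queue.popleft()
                count += 1
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < h and 0 <= nx < w and pore_mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            sizes.append(count)
    return sizes

def pore_statistics(pore_mask, meat_mask, mm_per_pixel):
    """Porosity, number of pores, pore areas (mm^2) and equivalent diameters (mm)."""
    pixel_area = mm_per_pixel ** 2
    areas = [s * pixel_area for s in label_pores(pore_mask)]
    meat_area = sum(map(sum, meat_mask)) * pixel_area
    return {
        "porosity": sum(areas) / meat_area,
        "number_of_pores": len(areas),
        "areas_mm2": areas,
        "equivalent_diameters_mm": [2.0 * math.sqrt(a / math.pi) for a in areas],
    }

# Illustrative 6x6 masks: the whole image is meat; one 2x2 pore and one single-pixel pore.
meat = [[1] * 6 for _ in range(6)]
pores = [[0] * 6 for _ in range(6)]
for y, x in ((0, 0), (0, 1), (1, 0), (1, 1), (4, 4)):
    pores[y][x] = 1
stats = pore_statistics(pores, meat, mm_per_pixel=1.0)
```

A histogram of `areas_mm2` then gives the pore size distribution.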
The results obtained by Du and Sun (2006b) indicate that there is a wide range
of pore size within the samples. The statistical analysis shows that 79.81 percent of
pores have area sizes between 6.73 × 10⁻³ and 2.02 × 10⁻¹ mm². However, only 8.95
percent of pores have an area size of more than 4.04 × 10⁻¹ mm². This tendency of size
distribution is consistent with the reports from other researchers (Farkas and Singh,
1991; Kassama et al., 2003). The majority of small pores are likely to be the result
of cooking. During cooking, heating causes denaturation of protein, which may lead
to structural collapse and, allowing for the dehydration and shrinkage of the meat,
the formation of numerous pores. The porosity and pore sizes of samples tend to
decrease with cooking (Kassama and Ngadi, 2005), which can be attributed to the
physicochemical changes that trigger certain visco-elastic behavioral characteristics of
proteins. Intense heating may prompt meat protein gelation, a condition that causes
agglomeration of protein and shrinkage of the muscle, leading to alterations in pore
structure. Larger pores might be mainly attributed to the void space, while some of them
develop during cooling. McDonald and Sun (2001a) reported that the effect of cooling
on porosity is seen in the large increase in the porosity of samples throughout
processing. In their work, they also point out that development of porosity during cooling
of the cooked meat is dependent on the initial moisture content of the sample, as well as
its composition, muscle-fiber orientation, and available surface area, and its physical
properties such as thermal conductivity or thermal diffusivity.
Figure 6.3 Results of pore segmentation: (a) original image; (b) extracted image; (c) enhanced image;
(d) segmented image (Du and Sun, 2005).
3.2 Correlation with water content, processing time,
and texture
As a defect often observed in cooked meats, internal pore formation is normally unappealing to consumers and therefore has a negative effect on the meat industry
(Hullberg et al., 2005). Du and Sun (2006b) reported that the total number of pores (TNP)
was significantly negatively correlated with the water content of pork ham (P < 0.05). For
raw meat, the variation in total extracellular space is found to explain 39 percent of the
variation in early postmortem drip loss in pork (Schäfer et al., 2002). During cooking,
heat denaturation of myofibrillar proteins and collagen will create more pores, and at
the same time increase water loss (Ofstad et al., 1993). As a result, the more pores
there are, the greater the amount of water lost during processing. Similarly, the water
content is found to be highly negatively correlated with porosity (P < 0.05). The action
of cooking causes loss of water, and consequently decreases the water content and
increases the porosity of cooked meats. Water evaporation plays an important role in
energy exchange during cooling (Girard, 1992). To facilitate the cooling process, it
is necessary to remove a certain proportion of the sample mass in the form of water
vapor (McDonald and Sun, 2001a). In the meantime, as moisture transport is closely
related to the formation of pores (Rahman, 2001), the more water lost during cooling,
the higher the porosity achieved.
Both TNP and porosity are negatively correlated with the cooking time. A greater
TNP (r = −0.56) and higher porosity (r = −0.67) result in a shorter cooking time
(Du and Sun, 2006b). The cooking efficiency is affected by the thermal properties of
foods, which can be calculated from the thermal properties of each component of
a food. The main components of cooked meats are water, protein, and fat, while there are
also very small amounts of such compounds as salt and ash. The thermal conductivity
of protein and fat is considerably less than that of water (Mittal and Blaisdell, 1984),
and the typical thermal conductivity of meat increases with increasing water content.
Since pork ham is immersed in a water bath for cooking, the pores are filled with water
throughout the cooking procedure. A greater number of pores and higher porosity mean
that more water is held in the cooked meat, leading to a shorter cooking time.
Similar relationships between the cooling time and TNP and porosity were found
by Du and Sun (2006b). During the air-blast cooling process, the heat is transferred
from the core of the cooked meat to the surface by conduction and is released to the
cooling environment mainly by convection. The cooling rate of air-blast cooling is
governed by the thermal conductivity of the cooked meat (Wang and Sun, 2002b). For
the same reason, higher thermal conductivity of cooked meats with a greater TNP and
higher porosity will result in a shorter cooling time. However, as the cooling procedure
progresses, the thermal conductivity of cooked meats decreases with the decrease
in liquid water mass due to moisture loss and the generation of vapor in the pores.
Therefore, compared with the cooking time, the cooling time has a poorer linear relation
with the TNP and porosity. As the total processing time (TPT) is the sum of cooking
and cooling times, the TPT has negative relationships with the TNP and porosity.
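The correlation coefficients quoted above are ordinary linear (Pearson) coefficients. As a minimal sketch, assuming NumPy is available, the coefficient between a pore characteristic and a processing time could be computed as follows. The function name and all sample values are hypothetical and serve only to illustrate the calculation; they are not data from the cited studies.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson linear correlation coefficient between two equal-length samples."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()  # centre both variables
    yc = y - y.mean()
    return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))

# Hypothetical values, for illustration only (not measured data).
porosity = np.array([2.1, 2.8, 3.5, 4.0, 4.6])            # percent
cooking_time = np.array([410.0, 395.0, 400.0, 380.0, 370.0])  # minutes
cooling_time = np.array([250.0, 255.0, 240.0, 245.0, 235.0])  # minutes

# Total processing time is the sum of cooking and cooling times.
tpt = cooking_time + cooling_time

r_cooking = pearson_r(porosity, cooking_time)
r_cooling = pearson_r(porosity, cooling_time)
r_tpt = pearson_r(porosity, tpt)
```

A negative coefficient indicates that the processing time tends to fall as porosity rises; a value closer to zero for the cooling time would correspond to the poorer linear relation described in the text.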
For texture analysis, positive correlations are found between the pore characteristics and WBS, hardness, cohesion, and chewiness, while springiness and
gumminess are negatively related to the TNP and porosity (Du and Sun, 2006b). Measured by mechanical methods, the textural characteristics of food materials are profoundly affected by
their porous structure (Huang and Clayton, 1990). It has been demonstrated that both cooking and cooling can lead to an increase in the porosity of cooked
meats due to water loss: greater porosity indicates a higher water loss in cooked meats.
Water is not only a medium for reaction but also an active agent in the modification of
physical properties (Huang and Clayton, 1990). Loss of water might lead to the compression of muscle fibers and an increase in the concentration of the interstitial fluid,
thus enhancing the adhesive power and strength (McDonald et al., 2000). Therefore,
cooked meats with greater TNP and porosity will result in higher shear force values and
a reduction in tenderness, while there will be an increase in hardness, cohesion, and
chewiness. The decreasing trend of springiness and gumminess with increasing TNP
and porosity could be explained by stress–strain analysis. Structurally, the porosity
and number of cavities might have an influence on deformation – a meat sample with
larger porosity and more pores becomes weaker, and less mechanical stress is needed
to cause yielding and fracturing.
The relationships between pore characteristics and the quality attributes of cooked
meats are very complex in nature (Rahman and Sablani, 2003). Pore formation is dependent on the quality of the raw meat, pre-treatment, and processing, which influence
the pore size, geometry or shape, porosity, and size distribution of the meat matrix.
The variation in pore characteristics has various effects on the processing time, water
content, textural, and other quality attributes of the cooked meat. A well-structured
matrix and a fine, uniform structure with numerous small pores or open spaces will
probably result in a greater absorptive capacity and better retention of water compared
to coarse structures with large pores (Hermansson, 1985; DeFreitas et al., 1997), and
thus have a positive effect on the quality of cooked meats.
4 Color
Color is one of the main aspects involved in the visual qualities of cooked meats, and has
a significant impact on consumers’ appetite and judgment of their quality. Unfavorable
coloring will reduce the acceptability to the consumer and consequently decrease the
sale value, which is of great economic importance to the cooked meat industry. To
improve the color appearance, it is normal to inject sodium nitrite into raw meat before
tumbling to produce some cooked meat products, such as pork ham. The injection
level has a great effect on the reaction substances, including nitrite and myoglobin,
and thus on the color generation in cooked pork ham (pink nitrosyl myochromogen).
If injection levels are the same, there should be no difference in the content of nitrosyl
myochromogen under a certain cooking temperature. Therefore, it is not feasible for
computer vision systems to evaluate the quality of such kinds of cooked meat product
by using color features as indicators. However, for the non-injected cooked meats it is
possible to use computer vision for quality evaluation, as demonstrated in the work of
Zheng et al. (2005).
4.1 Color measurement
Of the color spaces used to characterize food products by computer vision, RGB (red,
green, and blue) is the most common because the digital images captured are normally
saved in this space. L*a*b* and HSI (hue, saturation, and intensity) have also shown
good performance in completing such a task. Two color spaces were applied by Zheng
et al. (2005) to characterize the color features of cooked beef, i.e. RGB and HSI.
For each color component in a color space, two measurements are usually performed;
the mean and standard deviation. The mean characterizes the average color properties
of cooked meats, while the standard deviation provides a measure of color variation.
For example, 12 color features were extracted, including the mean and the standard
deviation of each color component, in two color spaces (RGB and HSI) in the study
conducted by Zheng et al. (2005).
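The 12 color features described above (mean and standard deviation of each component in RGB and HSI) can be sketched as follows. This is an illustrative outline only, assuming NumPy, an RGB image scaled to [0, 1], and the common geometric RGB-to-HSI conversion; the function names and the optional `mask` argument are hypothetical, and the cited study may have used a different conversion or software.

```python
import numpy as np

def rgb_to_hsi(rgb):
    """Convert an RGB image (floats in [0, 1], shape H x W x 3) to HSI.

    Hue is returned in radians in [0, 2*pi); it is set to 0 for grey pixels,
    where hue is undefined.
    """
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    eps = 1e-12
    intensity = (r + g + b) / 3.0
    minimum = np.minimum(np.minimum(r, g), b)
    bright = intensity > eps
    saturation = np.where(
        bright, 1.0 - minimum / np.where(bright, intensity, 1.0), 0.0
    )
    num = 0.5 * ((r - g) + (r - b))
    den = np.sqrt(np.maximum((r - g) ** 2 + (r - b) * (g - b), 0.0))
    safe = den > eps
    ratio = np.where(safe, num / np.where(safe, den, 1.0), 1.0)
    theta = np.arccos(np.clip(ratio, -1.0, 1.0))
    hue = np.where(b <= g, theta, 2.0 * np.pi - theta)
    return np.stack([hue, saturation, intensity], axis=-1)

def color_features(rgb, mask=None):
    """Mean and standard deviation of R, G, B, H, S, I over the masked pixels."""
    rgb = np.asarray(rgb, dtype=float)
    channels = np.concatenate([rgb, rgb_to_hsi(rgb)], axis=-1)  # H x W x 6
    if mask is None:
        mask = np.ones(rgb.shape[:2], dtype=bool)  # use every pixel
    pixels = channels[mask]                         # N x 6
    features = {}
    for k, name in enumerate(["R", "G", "B", "H", "S", "I"]):
        # Note: hue is an angle, so its plain mean is only a rough summary.
        features["mean_" + name] = float(pixels[:, k].mean())
        features["std_" + name] = float(pixels[:, k].std())
    return features
```

In practice the mask would come from a prior segmentation step, so that background pixels do not contribute to the statistics.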
4.2 Correlation with water content
Meat color depends not only on the quantity of myoglobin present and the type of
myoglobin molecule, but also on its chemical state and the chemical and physical
conditions of other components in the meat (Lawrie, 1998). As one of the most important chemical components, the water content has been related with the color of cooked
beef using computer vision (Zheng et al., 2005). In their work, a partial least squares
regression (PLSR) model and a neural network (NN) model are proposed for correlating the color to the water content of the beef joints. Correlation coefficients (r²) of the
models were 0.56 and 0.75 for PLSR and NN, respectively.
Further analysis of the regression coefficients by Zheng et al. (2005) reveals that,
of the 12 color features, saturation makes the largest contribution to
the results of the prediction model. On the one hand, because saturation measures the
distance between red and pink (Russ, 1999), it can reflect the amount of myoglobin
denatured during heat-processing of meat (Lawrie, 1998). On the other hand, the
water content of beef has an effect on the denaturation of myoglobin (Khalil, 2000).
Consequently, the water content of cooked beef can be indicated by saturation. However,
without the other color features, saturation alone is not sufficient for building
a model that correlates meat color with water content.
5 Image texture
Image texture is one of the main features measured for food quality evaluation using
computer vision technology (Du and Sun, 2004). As a useful feature for area description, image texture can quantify some characteristics of the gray-level variation within
an object, such as fineness, coarseness, smoothness, and graininess. The concept of
texture as it is generally understood and used in the food industry refers to the manner
in which the food behaves in the mouth, and is characterized by parameters such as
hardness, cohesiveness, viscosity, elasticity, adhesiveness, brittleness, chewiness, and
gumminess. It has been demonstrated that the image texture features of cooked meats
have a good relationship with one of the most important food texture attributes, i.e.
tenderness (Du and Sun, 2006c), and can be used to classify cooked beef joints of
different degrees of tenderness (Zheng et al., 2006b).
5.1 Extraction of image texture features
A number of methods can be applied for extracting the image texture of cooked meats.
Of these, some are statistical, including the first-order gray-level statistics (FGLS), the
run length matrix (RLM) method, and the gray-level co-occurrence matrix (GLCM)
method. Moreover, several texture description methods are based on transform techniques, such as Gabor and wavelet transform (WT). Transform-based texture analysis
techniques determine the texture of an object by converting the image into a new form
using the spatial frequency properties of the pixel intensity variations. In addition, fractal dimension (FD) has also been employed to describe numerically the image texture
characteristics of cooked meats.
Various image texture features can be derived to characterize the images of cooked
meats based on the above methods. In the work of Du and Sun (2006c), these features included mean, variance, skewness, and kurtosis for the FGLS method; short run
emphasis, long run emphasis, gray-level non-uniformity, run length non-uniformity,
run length percentage, and low and high gray-level run emphases for the RLM method;
angular second moment, contrast, correlation, sum of squares, inverse difference
moment, sum average, sum variance, sum entropy, entropy, difference variance, difference entropy, two information measurements of correlation, cluster shade, and cluster
prominence for the GLCM method; the FD of the original image, FD of high gray-level
image, FD of low gray-level image, and multifractal of order two for the FD method;
and the energy of each sub-band for the WT-based method. Additionally, both wavelet
and Gabor energies were measured by Zheng et al. (2006b) to describe the image
texture features of cooked beef joints.
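To make two of the statistical methods above concrete, the following sketch computes the four first-order gray-level statistics and a few of the GLCM features (angular second moment, contrast, inverse difference moment, and entropy). It assumes NumPy and a gray-level image already quantized to integers in the range 0 to `levels - 1`; the function names, the single pixel offset, and the symmetric normalization are illustrative choices, not necessarily those of the cited work.

```python
import numpy as np

def first_order_stats(gray):
    """Mean, variance, skewness, and (non-excess) kurtosis of the gray levels."""
    x = np.asarray(gray, dtype=float).ravel()
    mean = x.mean()
    var = x.var()
    if var == 0.0:
        skew = kurt = 0.0  # higher moments are undefined for a constant image
    else:
        z = (x - mean) / np.sqrt(var)
        skew = float((z ** 3).mean())
        kurt = float((z ** 4).mean())
    return {"mean": float(mean), "variance": float(var),
            "skewness": skew, "kurtosis": kurt}

def glcm(gray, levels, dx=1, dy=0):
    """Symmetric, normalized co-occurrence matrix for a non-negative offset (dy, dx)."""
    g = np.asarray(gray, dtype=int)
    h, w = g.shape
    first = g[:h - dy, :w - dx]   # reference pixels
    second = g[dy:, dx:]          # neighbours at the given offset
    counts = np.zeros((levels, levels))
    np.add.at(counts, (first.ravel(), second.ravel()), 1.0)
    counts = counts + counts.T    # count each pair in both directions
    return counts / counts.sum()

def glcm_features(p):
    """A few Haralick-type features from a normalized co-occurrence matrix."""
    i, j = np.indices(p.shape)
    nonzero = p[p > 0]
    return {
        "angular_second_moment": float((p ** 2).sum()),
        "contrast": float((((i - j) ** 2) * p).sum()),
        "inverse_difference_moment": float((p / (1.0 + (i - j) ** 2)).sum()),
        "entropy": float(-(nonzero * np.log2(nonzero)).sum()),
    }
```

Features for several offsets and directions are often averaged to reduce the dependence on orientation; that step is omitted here for brevity.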
The scale of image is important for texture analysis because there might be several
different textures in the same image with different scales. However, the traditional
approaches for image texture analysis, such as RLM and GLCM methods, are limited
in that they are restricted to the analysis of an image over a single scale. The development of multi-scale analysis such as WT has been proven to be useful for characterizing
different scales of image textures of cooked meats effectively. WT not only has a solid
theoretical foundation in formal mathematical theory, but also gives a good empirical
performance for multi-scale image analysis. Nonetheless, common WT suffers from a
lack of translation-invariance, where a simple shift of the image will result in non-trivial
modifications of the values of wavelet coefficients. Therefore, the transforms that have
good reconstruction properties, e.g. translation-invariance and rotation-invariance,
should be applied for extracting the image texture features of cooked meats.
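The multi-scale idea and the shift sensitivity noted above can be illustrated with a small sketch. It assumes NumPy, uses the Haar wavelet purely for simplicity (the text does not state which wavelet the cited studies used), and takes the energy of a sub-band to be the mean of its squared coefficients, which is one common definition. Each level here yields three detail bands, so the indexing is hypothetical and does not reproduce the four orientation bands per level of the cited study.

```python
import numpy as np

def haar_level(img):
    """One level of a 2-D orthonormal Haar transform (even height and width required)."""
    a = img[0::2, 0::2]
    b = img[0::2, 1::2]
    c = img[1::2, 0::2]
    d = img[1::2, 1::2]
    approx = (a + b + c + d) / 2.0
    detail_1 = (a - b + c - d) / 2.0  # differences between neighbouring columns
    detail_2 = (a + b - c - d) / 2.0  # differences between neighbouring rows
    detail_3 = (a - b - c + d) / 2.0  # diagonal differences
    return approx, (detail_1, detail_2, detail_3)

def subband_energies(img, levels=3):
    """Mean squared coefficient of each detail band at each level.

    Keys are (level, band) tuples; the key "approx" holds the energy of the
    final approximation band. Image sides must be divisible by 2**levels.
    """
    approx = np.asarray(img, dtype=float)
    energies = {}
    for m in range(1, levels + 1):
        approx, details = haar_level(approx)
        for n, band in enumerate(details, start=1):
            energies[(m, n)] = float(np.mean(band ** 2))
    energies["approx"] = float(np.mean(approx ** 2))
    return energies

# Shift sensitivity: the same striped pattern, moved by one pixel,
# gives different level-1 energies under this decimated transform.
stripes = np.tile(np.array([1.0, 1.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0]), (8, 1))
original = subband_energies(stripes, levels=1)
shifted = subband_energies(np.roll(stripes, 1, axis=1), levels=1)
```

Comparing `original` and `shifted` shows why translation-invariant variants of the transform are preferred when the position of the sample in the image cannot be controlled.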
5.2 Correlations with tenderness
Tenderness is often regarded as one of the most important attributes affecting the
eating quality of cooked meat products (Morgan et al., 1991). Warner-Bratzler shear
(WBS) force measurement is the most widely used instrumental method to evaluate the
tenderness of cooked meat products in the meat industry. In the research conducted by
Du and Sun (2006c), it was reported that the image texture features extracted from the
images of cooked meats contained valuable information about tenderness, and were
useful indicators of its WBS value.
The correlation coefficients between WBS and the extracted image texture features
using five methods indicate that the image texture features of multi-scale representation have better relationships with tenderness (Du and Sun, 2006c). The image
texture features obtained using the WT-based method have the strongest correlation
with the tenderness of cooked meats. Among them, the energies of five sub-bands
(EL2B1, EL2B4, EL3B1, EL3B3, and EL3B4) are very significantly correlated with
WBS (P < 0.01), where the energy of the sub-band at the m-th pyramid level and
the n-th orientation band is denoted as ELmBn. Furthermore, the energies of four
sub-bands (EL1B1, EL2B2, EL2B3, and EL3B2) have significant relationships with
WBS (P < 0.05). For the four fractal texture features, only the correlation between the
multifractal FD and WBS reaches the significant level (P < 0.05).
Figure 6.4 Estimated regression coefficients for predicting Warner-Bratzler shear force (horizontal axis: image texture feature). FDM, multifractal
of order two; ELmBn, energy of the sub-band (the m-th pyramid level and the n-th orientation band).
However, no significant correlations have been found between WBS and the image
texture features extracted by the traditional methods, including FGLS, RLM, and
GLCM methods (P > 0.05), which indicates that these attributes are not linearly related
to tenderness. The variance extracted by the FGLS method, and the sum entropy,
entropy, and difference variance extracted by the GLCM method, are correlated more
with the WBS of cooked meats, but have not reached the significant level (P > 0.05).
The reason can be attributed to the fact that traditional methods are restricted to the
analysis of spatial interactions over relatively small neighborhoods on a single scale.
However, the scale is related to the size of textural elements, and should be considered
in investigating the relationship between image texture features and the tenderness of
cooked meats. With the property of preserving local texture complexity, WT can be
applied to extract local texture features and to detect multiresolution characteristics. The
local textural characteristics represented by the local variance of wavelet coefficients
are useful in differentiating two different regions in an image.
For further analysis of the relationships between the selected image texture features
and WBS, the partial least squares regression (PLSR) technique was applied in the
work of Du and Sun (2006c). As one of the techniques for multivariate regression
analysis, PLSR is a hybrid of multiple regression and principal component analysis
(PCA) (MacFie and Hedderley, 1993), and can be used to understand the relationship
between two data sets by predicting one data set (Y) from the other set (X) (Martens
and Martens, 2001). It not only provides solutions for both X and Y variables, but also
attempts to find the best solution of X to explain the variation of the Y variable set. The
estimated regression coefficients of the predicting model for WBS with three factors
(Figure 6.4) show that all of the selected image texture features are positively correlated
with WBS, thus having a negative impact on the tenderness of cooked meats (Du and
Sun, 2006c). Furthermore, EL2B1 and EL3B1 have the highest relationship with WBS,
followed by EL3B4, EL2B4, and EL3B3. The contributions of FDM, EL1B1, EL2B2,
EL2B3, and EL3B2 to the prediction of WBS are relatively smaller.
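For readers unfamiliar with PLSR, the sketch below shows how regression coefficients of the kind plotted in Figure 6.4 can be obtained for a single response variable. It assumes NumPy and uses a standard NIPALS-style PLS1 procedure; this is an illustration of the general technique under those assumptions, not the software or exact settings of the cited study, and the function names are hypothetical.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """Fit a single-response PLS regression with a NIPALS-style PLS1 procedure.

    Returns (coef, intercept) expressed in the original feature units.
    Assumes at least one component can be extracted.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean = X.mean(axis=0)
    y_mean = y.mean()
    E = X - x_mean          # centred predictors, deflated at each step
    f = y - y_mean          # centred response, deflated at each step
    weights, loadings, inner = [], [], []
    for _ in range(n_components):
        w = E.T @ f                      # direction of maximal covariance with y
        norm = np.linalg.norm(w)
        if norm < 1e-12:                 # nothing left to explain
            break
        w = w / norm
        t = E @ w                        # scores (the latent factor)
        tt = t @ t
        p = E.T @ t / tt                 # X loadings
        c = (f @ t) / tt                 # regression of y on the scores
        E = E - np.outer(t, p)           # remove the part explained by t
        f = f - c * t
        weights.append(w)
        loadings.append(p)
        inner.append(c)
    W = np.array(weights).T
    P = np.array(loadings).T
    q = np.array(inner)
    coef = W @ np.linalg.solve(P.T @ W, q)
    intercept = y_mean - x_mean @ coef
    return coef, intercept

def pls1_predict(X, coef, intercept):
    """Predict the response for new feature rows."""
    return np.asarray(X, dtype=float) @ coef + intercept
```

With `n_components=3` the model would use three factors, as in the analysis described above. The sign of each entry of `coef` then indicates whether the corresponding texture feature pushes the predicted WBS up or down; because the coefficients are in original units, features are usually standardized first when their magnitudes are to be compared.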
In another work, Zheng et al. (2006b) found that it was useful to apply multi-scale
approaches (Gabor and WT) for the classification of tough and tender cooked beef
joints by image texture analysis. Four different groups of image texture features, i.e.
wavelet features (WF), Gabor features (GF), wavelet Gabor features (WGF), and a
combination of wavelet features and Gabor features (CWG), were extracted from the
images of cooked beef. After reducing the dimensionality with principal component
analysis, the four groups of features were employed to classify the tough and tender
beef samples based on the clustering results using a linear discriminant function.
WGF was found to perform the best for the classification of beef tenderness, followed
by WF and CWG, while GF characterized the tenderness with the least confidence.
The error rate of WGF was 29.4 percent, indicating the potential of image texture for
determining cooked beef tenderness.
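The two-step procedure described above (dimensionality reduction by principal component analysis, followed by a linear discriminant function) can be sketched as follows. This is a simplified illustration assuming NumPy, two classes labeled 0 (tender) and 1 (tough), and a Fisher-type discriminant; the class labels are taken as given, whereas in the cited work they were derived from clustering. All function names are hypothetical.

```python
import numpy as np

def pca_fit(X, n_components):
    """Return the feature mean and the leading principal directions (as columns)."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components].T

def pca_transform(X, mean, components):
    """Project feature rows onto the principal directions."""
    return (np.asarray(X, dtype=float) - mean) @ components

def fisher_lda_fit(Z, labels):
    """Two-class Fisher discriminant: class 1 is predicted when Z @ w > b."""
    labels = np.asarray(labels)
    z0, z1 = Z[labels == 0], Z[labels == 1]
    m0, m1 = z0.mean(axis=0), z1.mean(axis=0)
    # Pooled within-class scatter matrix
    s0 = np.atleast_2d(np.cov(z0, rowvar=False)) * (len(z0) - 1)
    s1 = np.atleast_2d(np.cov(z1, rowvar=False)) * (len(z1) - 1)
    sw = s0 + s1 + 1e-9 * np.eye(Z.shape[1])  # small ridge for numerical stability
    w = np.linalg.solve(sw, m1 - m0)
    b = 0.5 * (m0 + m1) @ w                   # threshold midway between class means
    return w, b

def lda_predict(Z, w, b):
    """Assign each row to class 1 if its discriminant score exceeds the threshold."""
    return (Z @ w > b).astype(int)

def error_rate(predicted, labels):
    """Fraction of samples assigned to the wrong class."""
    return float(np.mean(np.asarray(predicted) != np.asarray(labels)))
```

An error rate such as the one quoted above should be estimated on samples that were not used to fit the projection or the discriminant, for example by cross-validation.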
6 Conclusions
Computer vision can provide an objective, consistent, and efficient way to evaluate
the quality of cooked meats as affected by their manufacturing procedures, including
shrinkage measurement, pore characterization, color, and image texture extraction.
Further research should investigate the microstructure of cooked meats using a camera
with higher magnification or modern microscopy techniques, and the internal structures
using ultrasound, magnetic resonance imaging, computed tomography, and electrical
tomography techniques. Based on the selected image features, a more powerful mathematical model or algorithm should be developed to predict the physical and chemical
quality of cooked meats.
Nomenclature
eccentric angle of a point on the surface of ham
polar angle
area of each circular cross section perpendicular to the x axis
half of the length L
half of the width W
half of the thickness T
projection area of sample for the integration
ELmBn energy of the sub-band at the m-th pyramid level and the
n-th orientation band
gradient in the x direction
gradient in the y direction
i, ĩ
axes of the cooked meat shape
number of boundary points of the contour
radial coordinate
CLAHE contrast-limited adaptive histogram equalization
CWG combination of wavelet features and Gabor features
FD fractal dimension
FGLS first-order gray-level statistics
GF Gabor features
GLCM gray-level co-occurrence matrix
HSI hue, saturation, and intensity
NN neural network
PCA principal component analysis
PLSR partial least squares regression
RGB red, green, and blue
RLM run length matrix
surface area
TNP total number of pores
TPT total processing time
WBS Warner-Bratzler shear
WF wavelet features
WGF wavelet Gabor features
WT wavelet transform
References
Abramowitz M, Stegun IA (eds) (1972) Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables. Dover Books on Advanced Mathematics,
New York: Dover.
Anon (1997) Western European Meat and Meat Products. London: Datamonitor Europe.
Bertram HC, Engelsen SB, Busk H, Karlsson AH, Andersen HJ (2004) Water properties
during cooking of pork studied by low-field NMR relaxation: effects of curing and
the RN− gene. Meat Science, 66 (2), 437–446.
Canny J (1986) A computational approach to edge detection. IEEE Transactions on Pattern
Analysis and Machine Intelligence, 8 (6), 679–698.
Daudin JD, Kuitche A (1996) Modelling of temperature and weight loss kinetics during meat
chilling for time variable conditions using an analytical based method – III. calculations
versus measurements on pork carcass hindquarters. Journal of Food Engineering, 29,
DeFreitas Z, Sebranek JG, Olson DG, Carr JM (1997) Carrageenan effects on salt soluble
meat proteins in model systems. Journal of Food Science, 62, 539–543.
Destefanis G, Barge MT, Brugiapaglia A, Tassone S (2000) The use of principal component
analysis (PCA) to characterize beef. Meat Science, 56 (3), 255–259.
Du C-J, Sun D-W (2004) Recent developments in the applications of image processing
techniques for food quality evaluation. Trends in Food Science & Technology, 15 (5),
Du C-J, Sun D-W (2005) Correlating shrinkage with yield, water content and texture of
pork ham by computer vision. Journal of Food Process Engineering, 28 (3), 219–232.
Du C-J, Sun D-W (2006a) Estimating the surface area and volume of ellipsoidal ham using
computer vision. Journal of Food Engineering, 73 (3), 260–268.
Du C-J, Sun D-W (2006b) Automatic measurement of pores and porosity in pork ham and
their correlations with processing time, water content and texture. Meat Science, 72
(2), 294–302.
Du C-J, Sun D-W (2006c) Correlating image texture features extracted by five different
methods with the tenderness of cooked pork ham: a feasibility study. Transactions of
the ASAE, 49 (2), 441–448.
Farkas BE, Singh RP (1991) Physical properties of air-dried and freeze-dried chicken white
meat. Journal of Food Science, 56 (3), 611–615.
Girard PJ (1992) Technology of Meat and Meat Products. London: Ellis Horwood.
Hermansson AM (1985) Water and fat holding. In Functional Properties of Food Macromolecules (Mitchell JR, Ledward DA, eds). London: Elsevier Applied Science,
pp. 273–314.
Huang CT, Clayton JT (1990) Relationships between mechanical properties and microstructure of porous foods: Part I. A review. In Engineering and Food. Vol. 1. Physical
Properties and Process Control (Spiess WEL, Schubert H, eds). London: Elsevier
Applied Science, pp. 352–360.
Hullberg A, Johansson L, Lundström K (2005) Effect of tumbling and RN genotype on sensory perception of cured-smoked pork loin. Meat Science, 69 (4),
Kassama LS, Ngadi MO (2005) Pore structure characterization of deep-fat-fried chicken
meat. Journal of Food Engineering, 66 (3), 369–375.
Kassama LS, Ngadi MO, Raghavan GSV (2003) Structural and instrumental textural properties of meat patties containing soy protein. International Journal of Food Properties,
6 (3), 519–529.
Khalil AH (2000) Quality characteristics of low-fat beef patties formulated with modified
corn starch and water. Food Chemistry, 68, 61–68.
Kumar VA, Mathew S (2003) A method for estimating the surface area of ellipsoidal food
materials. Biosystems Engineering, 85 (1), 1–5.
Lawrie RA (1998) Lawrie’s Meat Science. Cambridge: Woodhead Publishing.
MacFie HJH, Hedderley D (1993) Current practice in relating sensory perception to
instrumental measurements. Food Quality and Preference, 4 (1), 41–49.
Martens H, Martens M (2001) Analysis of two data tables X and Y: Partial Least
Squares Regression (PLSR). In Multivariate Analysis of Quality: an Introduction.
London: John Wiley & Sons, pp. 111–125.
Mathworks (1998) Matlab Reference Guide. Natick: The MathWorks, Inc.
Mayor L, Sereno AM (2004) Modelling shrinkage during convective drying of food
materials: a review. Journal of Food Engineering, 61 (3), 373–386.
McDonald K, Sun D-W (2001a) The formation of pores and their effects in a cooked beef
product on the efficiency of vacuum cooling. Journal of Food Engineering, 47 (3),
McDonald K, Sun D-W (2001b) Pore size distribution and structure of a cooked beef
product as affected by vacuum cooling. Journal of Food Process Engineering, 24,
McDonald K, Sun D-W, Kenny T (2000) Comparison of the quality of cooked beef products
cooled by vacuum cooling and by conventional cooling. Food Science and Technology –
Lebensmittel Wissenschaft und Technologie, 33, 21–29.
Meyer F, Beucher S (1990) Morphological segmentation. Journal of Visual Communication
and Image Representation, 1 (1), 21–46.
Mittal GS, Blaisdell JL (1984) Heat and mass transfer properties of meat emulsion. Food
Science and Technology – Lebensmittel Wissenschaft und Technologie, 17, 94–98.
Morgan JB, Savell JW, Hale DS, Miller RK, Griffin DB, Cross HR, Shakelford SD (1991)
National beef tenderness survey. Journal of Animal Science, 69, 3274–3283.
Mulchrone KF, Choudhury KR (2004) Fitting an ellipse to an arbitrary shape: implication
for strain analysis. Journal of Structural Geology, 26, 143–153.
Ofstad R, Kidman S, Myklebust R, Hermansson AM (1993) Liquid holding capacity and
structural changes during heating of fish muscle: cod (Gadus morhua L.) and salmon
(Salmo salar). Food Structure, 12, 163–174.
Rahman MS (2001) Toward prediction of porosity in foods during drying: a brief review.
Drying Technology, 19 (1), 1–13.
Rahman MS, Sablani SS (2003) Structural characteristics of freeze-dried abalone –
porosimetry and puncture test. Food and Bioproducts Processing, 81 (C4), 309–315.
Rahman MS, Perera CO, Chen XD, Driscoll RH, Potluri PL (1996) Density, shrinkage and
porosity of calamari mantle meat during air drying in a cabinet dryer as a function of
water content. Journal of Food Engineering, 30 (1–2), 135–145.
Russ JC (1999) Image Processing Handbook. Boca Raton: CRC Press.
Schäfer A, Rosenvold K, Purslow PP, Andersen HJ, Henckel P (2002) Physiological and
structural events postmortem of importance for drip loss in pork. Meat Science, 61,
Thomas GB Jr, Finney RL (1984) Calculus and Analytic Geometry. Boston: Addison-Wesley Publishing Company.
Wang LJ, Sun D-W (2002a) Modelling vacuum cooling process of cooked meat – part 2:
mass and heat transfer of cooked meat under vacuum pressure. International Journal
of Refrigeration, 25 (7), 862–871.
Wang LJ, Sun D-W (2002b) Evaluation of performance of slow air, air blast and water
immersion cooling methods in cooked meat industry by finite element method. Journal
of Food Engineering, 51, 329–340.
Zheng CX, Sun D-W, Zheng LY (2005) Correlating color to moisture content of large cooked
beef joints by computer vision. Journal of Food Engineering, 77 (4), 858–863.
Zheng CX, Sun D-W, Du C-J (2006a) Estimating shrinkage of large cooked beef joints
during air-blast cooling by computer vision. Journal of Food Engineering, 72 (1),
Zheng CX, Sun D-W, Zheng LY (2006b) Classification of tenderness of large cooked beef
joints using wavelet and Gabor textural features. Transactions of the ASAE, 49 (5),
Zorrilla SE, Singh RP (2003) Heat transfer in double-sided cooking of meat patties considering two-dimensional geometry and radial shrinkage. Journal of Food Engineering,
57 (1), 57–65.
Quality Inspection of
Poultry Carcasses
Bosoon Park
US Department of Agriculture, Agricultural Research Service,
Richard B. Russell Research Center, Athens, GA 30605, USA
1 Introduction
The Food Safety and Inspection Service (FSIS) has been mandated to inspect organoleptically each poultry carcass on the line at processing plants in the US. The development
of accurate and reliable instruments for on-line detection of unwholesome carcasses –
such as cadavers and those that are septicemic, bruised, tumorous, air-sacculitic, and
ascitic – is essential to improve the US federal poultry inspection program.
Major causes for condemnation of poultry during quality inspection include the following:
1. Cadaver, which is the carcass of a chicken that died from some cause other than
slaughter. The skin is reddish because either the animal was already dead at the
time of bleeding, or it was not accurately stuck and therefore did not properly
bleed out.
2. Septicemia, which is a systemic disease caused by pathogenic microorganisms
and/or their toxins in the blood. It may result in a variety of visible changes in
the carcass and viscera of an affected bird, including swollen, watery tissues,
hemorrhages throughout the animal, and a darkened red to bluish discoloration
of the skin.
3. Bruising, which is due to the accumulation of blood in tissues outside the vascular
system, resulting in discoloration of some parts of the skin and underlying tissues.
4. A tumor, which is a mass of swollen or enlarged tissue caused by uncontrolled
growth of new tissue that has no useful function.
5. Ascites, which is an accumulation of fluid in the peritoneal cavity of the abdomen.
6. Airsacculitis, which is inflammation of the air sacs (membrane-lined, air-filled
structures) with the accumulation of fluid or exudate within the cavities. Airsacculitis
may be caused by many different organisms (bacteria, mycoplasma, viruses, or
Computer Vision Technology for Food Quality Evaluation
ISBN: 978-0-12-373642-0
Copyright © 2008, Elsevier Inc.
All rights reserved
Poultry products have increased in popularity with US consumers in recent years.
The number of poultry slaughtered at federally inspected establishments has increased
from 6.9 billion birds in 1993 to 8.9 billion birds in 2004 (USDA, 2005).
Previous research showed that a machine vision system could separate wholesome
birds from unwholesome birds, including septicemic carcasses and cadavers, with high
classification accuracy (Park and Chen, 1994a). Thus, machine vision systems are useful for poultry industry applications, particularly in grading and inspection, because
inspection and classification of poultry carcasses is a tedious and repetitive procedure.
Daley et al. (1988) reported that machine vision would be feasible for grading poultry
production and for identifying parts of poultry carcasses at the processing line. In the
mid-1990s, a multispectral imaging system was developed to identify normal, bruised,
tumorous, and skin-torn carcasses for the purpose of poultry quality inspection, and
to develop a methodology for separating healthy from unwholesome carcasses (Park
et al., 1996). From this study, Park and colleagues determined the optimum wavelengths
for identifying bruised, tumorous, and skin-torn carcasses; developed software for the
processing and analysis of multispectral images in both spatial and frequency domains;
and developed a neural network model for classifying unwholesome carcasses. Thus,
machine vision with color and spectral imaging can be used successfully for poultry
quality inspection. Currently, individual carcasses are inspected by federal inspectors
at poultry processing lines, but this visual bird-by-bird inspection is labor-intensive and
prone to human error and variability. Development of high-speed and reliable inspection systems to ensure the safe production of poultry during post-harvest processing
has become an important issue, as the public is demanding assurance of better and
safer food.
Machine vision techniques are useful for the agricultural and food industries, particularly in grading and inspection (Sakar and Wolfe, 1985; Miller and Delwiche, 1989; Tao
et al., 1990; Precetti and Krutz, 1993; Daley et al., 1994). Machine vision is the technology that provides automated production processes with vision capabilities, which is particularly useful when the majority of inspection tasks are highly repetitive and extremely
boring, and their effectiveness depends on the efficiency of the human inspectors. Even
though machine vision has evolved into a promising technology for agricultural product applications, among the many factors to be considered in on-line application are
processing speed, reliability, and applicability for industrial environments.
2 Poultry quality inspection
The inspection and the grading of poultry are two separate programs within the US
Department of Agriculture (USDA). Inspection for wholesomeness is mandatory,
whereas grading for quality is voluntary. The grading service is requested by poultry producers
and processors.
American consumers can be confident that the FSIS ensures that poultry products
are safe, wholesome, and correctly labeled and packaged. Under the Federal Meat
Inspection Act and the Poultry Products Inspection Act, the FSIS inspects all raw meat
and poultry sold in interstate and foreign commerce, including imported products. It
also monitors meat and poultry products after they leave federally inspected plants. In
addition, the FSIS monitors state inspection programs, which inspect meat and poultry
products sold only within the state in which they were produced. The 1968 Wholesome Poultry Products Act requires state inspection programs to be equivalent to the
Federal inspection program. If states choose to end their inspection program or cannot maintain this standard, the FSIS must assume responsibility for inspection within
that state.
In its efforts to protect the safety and integrity of poultry products, the FSIS works
with many other agencies, both within and outside the USDA, including state inspection programs, the Food and Drug Administration of the US Department of Health and
Human Services, and the Environmental Protection Agency.
Since the Federal inspection program began, the poultry industry has grown and
changed significantly. In the early 1900s, most meat was slaughtered and used locally;
however, nowadays there is a wide variety of meat and poultry products on the market.
Meat is slaughtered and processed in sophisticated, high-volume plants, and is often
shipped great distances to reach consumers.
As the industry has changed, the FSIS has also changed the inspection program. In
its early days the primary concern of the inspectors was disease, and they relied almost
exclusively on visual inspection of animals, products, and plant operations. Since the
mid-1970s, FSIS has been modernizing inspection to reduce costs and make it more
scientifically-based. The requirements in the new final rule on Pathogen Reduction
and Hazard Analysis and Critical Control Points (HACCP) are designed to minimize
the likelihood of harmful bacteria being present in raw meat and poultry products.
However, some bacteria might still be present and may become a problem if meat and
poultry are not handled properly.
The FSIS inspector must have knowledge about the particular species inspected,
and the carcasses must fit with the available equipment in the plant. In modern
poultry plants, USDA-certified inspectors perform the whole inspection process.
Individual, high-speed visual inspection of birds (35 birds per minute) is both labor-intensive and prone to human error and variability. During the past decade, several
studies have reported on the developments of automated inspection systems for
poultry carcass inspection (Chen and Massie, 1993; Chen et al., 1996a; Park and
Chen, 1996).
3 Color imaging for quality inspection
3.1 Detection of splenomegaly
Poultry spleen size is an important indicator of whether the poultry should be condemned and must be further examined by human inspectors in processing plants.
According to poultry pathologists and veterinarians, if a chicken has an enlarged
spleen then the animal is diseased (Schat, 1981; Arp, 1982; Clarke et al., 1990).
Conversely, if a chicken is diseased, the spleen is likely to be enlarged. As a part of the
research on the inspection of poultry carcasses for internal diseases, inspecting spleens
was suggested as an initial step. This has been added to the further inspections for
other disease syndromes such as airsacculitis and inflammatory processes (Domermuth
et al., 1978).
Inspection of poultry carcasses for their wholesomeness is a complex process. An
automated machine vision inspection system must incorporate human knowledge into
a computer system with machine intelligence. The vision system development is often
a progressive process, with problems conquered one at a time. Substantial progress
has been made regarding the machine vision inspection of poultry carcasses (Chen
et al., 1998a; Park et al., 1996). An on-line vision system was developed for inspecting tumors, diseases, and skin damage. Using multispectral imaging and fiber-optics,
external chicken surfaces were analyzed. The system seemed highly promising for
detecting specific poultry disease problems, and was a step forward in the technology of automated poultry inspection. Through the research, imaging techniques were
developed for inspecting the internal organs of poultry to identify abnormalities of
the spleen. At the same time, the new knowledge developed through this research was
contributing to the understanding and development of future advanced technologies in
machine vision-based poultry inspection. A spectral imaging method was developed
to identify poultry spleen from its surrounding viscera, such as liver and intestine;
and an image-processing algorithm that recognizes the spleen in an image and detects
splenomegaly (enlargement of the spleen) was developed.
As splenomegaly is one indication that processed poultry may not be acceptable
for human consumption, because of diseases such as tumors or septicemia, the study
explored the possibility of detecting splenomegaly with an imaging system that would
assist human inspectors in food safety inspections. Images of internal viscera from
45-day-old commercial turkeys were taken with fluorescent and ultraviolet lighting
systems. Image-processing algorithms using linear transformation, morphological filtering, and statistical classification were developed to distinguish the spleen from
its background surroundings, and then to detect abnormalities. Experimental results
demonstrated that the imaging method could effectively distinguish the spleen from
other organs and intestines. The system had 95 percent classification accuracy for
the detection of spleen abnormality. The methods indicated the feasibility of using
automated machine vision systems to inspect internal organs as an indication of the
wholesomeness of poultry carcasses.
3.2 Inspection of the viscera
A practical application of food microbiology in poultry processing and marketing might
be to ensure clean, wholesome products. However, under commercial production,
processing, handling, and marketing conditions, it is not feasible to run microbiological
counts (Mountney, 1987) to determine the presence of pathogens on slaughtered birds.
For this reason, the current practice of poultry inspection in the processing plant is
based on postmortem pathology correlation – i.e. observing signs of abnormalities or
diseases from the carcass exterior, body cavity, and viscera. Previous studies (Chen
et al., 1998b, 1998c, 1998d; Park et al., 1998a, 1998b) have shown that the systems
can separate normal poultry carcasses from abnormal carcasses. The system, however,
may not be able to discriminate individual abnormal carcasses perfectly. In addition,
procedures that depend only on images of the carcass exterior are insufficient to detect
some condemnable conditions, such as airsacculitis and ascites. Therefore, there is
a need to acquire additional information, using machine vision, from post-mortem
poultry at different locations (such as the body cavity) and/or from different internal
organs (including the liver and heart).
Color is an important attribute for food inspection (Daley et al., 1994; Tao et al.,
1995). With the availability of improved hardware for acquiring color images, and
advances in image-processing software (Jang, 1993; Nauck and Kruse, 1995), there
is now the capability for development of color-vision systems for poultry inspection.
Therefore, Chao et al. (1999) have studied color imaging in identifying individual
condemnable conditions from poultry viscera. From the study, they determined features
for discriminating condemned conditions of poultry viscera and developed neuro-fuzzy models for identifying individual poultry viscera condemnations.
Poultry viscera of liver and heart were separated into four classes depending on their
symptoms, including normal, airsacculitis, cadaver, and septicemia. These images in
RGB color space were segmented, and statistical analysis was performed for feature
selection. The neuro-fuzzy system utilizes hybrid paradigms of the fuzzy inference
system and neural networks to enhance the robustness of the classification processes.
The accuracy in separating normal from abnormal livers was between 87 and 92 percent
when two classes of validation data were used. For two-class classification of chicken
hearts, the accuracy was between 93 and 97 percent. However, when neuro-fuzzy models were employed to separate chicken livers into three classes (normal, airsacculitis,
and cadaver), the accuracy was only 83 percent. Combining the features of chicken
liver and heart, a generalized neuro-fuzzy model was designed to classify poultry viscera into four classes (normal, airsacculitis, cadaver, and septicemia). In this case, a
classification accuracy of 82 percent was obtained.
3.3 Characterizing wholesomeness
For poultry quality and safety inspection, scientifically-based innovative inspection
technologies are needed that can allow poultry plants to meet government food safety
regulations efficiently and also increase competitiveness and profitability to meet consumer demand. Due to successful food safety and quality monitoring applications in
other food processing and production agriculture industries, researchers have been
developing spectral imaging methods suited to the poultry processing industry.
In particular, visible/near-infrared (Vis/NIR) spectroscopic technologies have shown
the capability of distinguishing between wholesome and unwholesome poultry carcasses, and detecting fecal contamination on poultry carcasses, by differences in skin
and tissue composition. Chen and Massie (1993) used Vis/NIR measurements taken
by a photodiode array spectrophotometer to classify wholesome and unwholesome
chicken carcasses, and selected wavelengths at 570, 543, 641, and 847 nm based on
linear regression for classification.
Using Vis/NIR measurements of fecal contamination of poultry carcasses, Windham
et al. (2003a) identified four key wavelengths via principal component analysis at
434, 517, 565, and 628 nm. Through single-term linear regression (STLR), an optimal
ratio of 574 nm/588 nm was determined and used to achieve 100 percent detection
of contaminants (Windham et al., 2003b). Chao et al. (2003) developed an on-line
inspection system to measure the reflectance spectra of poultry carcasses in the visible
to near-infrared regions between 431 and 944 nm. The instrument measured the spectra
of poultry carcasses at speeds of 140 or 180 birds per minute.
The Vis/NIR system can clearly be used to differentiate wholesome and unwholesome
poultry carcasses at high speed. These studies include significant findings for the use
of spectral reflectance in the visible region, but have not utilized methods of analysis
for sample color as perceived through human vision. The International Commission on
Illumination (CIE) has established a colorimetry system for identifying and specifying
colors, and for defining color standards. Following the establishment of the CIE 1924
luminous efficiency function, the system of colorimetry was developed based on the
principles of trichromacy and Grassmann’s laws of additive color mixture (Fairchild,
1998). The concept of colorimetry is that any color can be matched by an additive
mixture of three primary colors: red, green, and blue.
Because there are three different types of color receptor cones in the eye, all the
colors that humans see can be described by coordinates in a three-dimensional color
space, which measures the relative stimulations to each type of cone. These coordinates
are called tristimulus values, and can be measured in color-matching experiments. The
tristimulus values are the amounts of the three primary colors used to achieve a match.
A system using broad-band primaries was formalized in 1931 by the CIE. Wavelength-by-wavelength measurement of tristimulus values for the visible spectrum produces
the color-matching functions. The tristimulus values for a particular color are labeled
(X, Y, Z) in the CIE 1931 system, and are extended such that they can be obtained for
any given stimulus, defined by a spectral power distribution (SPD) (Williamson and
Cummins, 1983). The SPD can be measured by a spectrophotometer. From the SPD
both the luminance and the chromaticity of a color are derived to describe precisely
the color in the CIE system.
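The path from a measured SPD to luminance and chromaticity can be sketched as follows. This is a minimal illustration, not the procedure of any cited study: the function names, the uniform wavelength step, and the scaling constant `k` are assumptions, and the color-matching function samples must be supplied by the reader (for example, from published CIE 1931 tables). In the CIE 1931 system the Y value carries the luminance, and the chromaticity coordinates are x = X/(X+Y+Z) and y = Y/(X+Y+Z).

```python
def tristimulus(spd, xbar, ybar, zbar, step_nm, k=1.0):
    """Approximate X, Y, Z as sums of SPD times each color-matching function.

    spd, xbar, ybar, zbar are equal-length lists sampled at the same
    wavelengths with a uniform spacing of step_nm (an assumption here).
    k is a normalizing constant chosen by the caller.
    """
    X = k * step_nm * sum(s * x for s, x in zip(spd, xbar))
    Y = k * step_nm * sum(s * y for s, y in zip(spd, ybar))
    Z = k * step_nm * sum(s * z for s, z in zip(spd, zbar))
    return X, Y, Z


def chromaticity(X, Y, Z):
    """Return the CIE 1931 chromaticity coordinates (x, y)."""
    total = X + Y + Z
    if total == 0:
        raise ValueError("chromaticity is undefined for a zero stimulus")
    return X / total, Y / total
```

For instance, `chromaticity(95.047, 100.0, 108.883)` gives approximately (0.3127, 0.3290), the chromaticity commonly quoted for the D65 white point.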
Chao et al. (2005) investigated a quantitative, color-based method suitable for rapid
automated on-line sorting of wholesome and unwholesome chickens. They characterized wholesome and unwholesome chicken color in CIE color coordinates. According
to their studies, the color-based sensing technique has the potential for rapid automated
inspection for wholesomeness of poultry in the visible region. Spectra in the range of
400–867 nm are suitable for poultry carcass inspection on a high-speed kill line using
a visible/near-infrared spectrophotometer.
CIELUV color difference was calculated with a simple distance formula and used to classify
wholesome and unwholesome poultry carcass samples. They found that the greatest
color differences occurred at different combinations of wavelengths: at 508 nm and
426 nm; at 560 nm and 426 nm; and at 640 nm and 420 nm. Full-spectrum classification
achieved accuracy of 85 percent in identifying wholesome carcasses. Using the 560-nm and 426-nm wavelengths, approximately 90 percent classification accuracy was
obtained for wholesome carcasses.
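As a rough sketch of what a distance-based CIELUV classification might look like, the code below converts tristimulus values to CIE 1976 L*u*v* coordinates and assigns a sample to the nearest reference color. The D65 reference white, the function names, and the nearest-reference rule are assumptions made for illustration; the source does not state which white point or decision rule was used.

```python
import math

# Assumed reference white (D65, Y normalized to 100); not stated in the source.
D65_WHITE = (95.047, 100.0, 108.883)


def _uv_prime(X, Y, Z):
    """CIE 1976 u', v' chromaticity coordinates."""
    d = X + 15 * Y + 3 * Z
    return (4 * X / d, 9 * Y / d) if d else (0.0, 0.0)


def xyz_to_luv(X, Y, Z, white=D65_WHITE):
    """Convert tristimulus values to CIELUV (L*, u*, v*)."""
    Xn, Yn, Zn = white
    yr = Y / Yn
    if yr > (6 / 29) ** 3:
        L = 116 * yr ** (1 / 3) - 16
    else:
        L = (29 / 3) ** 3 * yr
    u, v = _uv_prime(X, Y, Z)
    un, vn = _uv_prime(Xn, Yn, Zn)
    return L, 13 * L * (u - un), 13 * L * (v - vn)


def delta_e_uv(a, b):
    """Euclidean distance between two (L*, u*, v*) triples."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def nearest_class(sample_luv, references):
    """Return the label of the closest reference color (hypothetical rule)."""
    return min(references, key=lambda label: delta_e_uv(sample_luv, references[label]))
```

The reference colors passed to `nearest_class` would have to come from measured carcass spectra; none are supplied here.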
4 Spectral imaging
4.1 Quality characterization
A technique for recognizing global or systemic defects on poultry carcasses by using
a color-imaging system has been reported. The goals of this study were to process
images at speeds of about 180 birds per minute and to use a neural network-based
classifier for classification. Color-image-processing procedures involve three steps:
background removal, HSI (hue, saturation, intensity) conversion, and histogram calculation. Features of three histograms (hue, saturation, intensity) were used as inputs
of the neural network for detecting large-scale defects (e.g. septicemic carcasses, or
cadavers). Also, a color-image processing system to detect skin tears, feathers, and
bruising was developed by Daley et al. (1994).
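The three processing steps named above (background removal, HSI conversion, histogram calculation) can be sketched as below. This is an illustration under stated assumptions, not the reported system: pixels are RGB triples scaled to [0, 1], the background is assumed to be darker than the carcass, and the threshold, bin count, and function names are hypothetical. The conversion uses the common geometric HSI formulas.

```python
import math


def rgb_to_hsi(r, g, b):
    """Convert one RGB pixel (values in [0, 1]) to hue (degrees), saturation, intensity."""
    i = (r + g + b) / 3
    if i == 0:
        return 0.0, 0.0, 0.0
    s = 1 - min(r, g, b) / i
    num = 0.5 * ((r - g) + (r - b))
    den = math.sqrt((r - g) ** 2 + (r - b) * (g - b))
    if den == 0:
        return 0.0, s, i  # gray pixel: hue is undefined, report 0
    h = math.degrees(math.acos(max(-1.0, min(1.0, num / den))))
    if b > g:
        h = 360 - h
    return h % 360, s, i


def remove_background(pixels, min_intensity=0.1):
    """Keep pixels brighter than an assumed dark background."""
    return [p for p in pixels if sum(p) / 3 > min_intensity]


def histogram(values, bins, lo, hi):
    """Count values into equal-width bins over [lo, hi]."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        counts[min(int((v - lo) / width), bins - 1)] += 1
    return counts


def hsi_histogram_features(pixels, bins=8):
    """Concatenate normalized hue, saturation, and intensity histograms."""
    hsi = [rgb_to_hsi(*p) for p in pixels]
    n = len(hsi)
    features = []
    for channel, upper in ((0, 360.0), (1, 1.0), (2, 1.0)):
        counts = histogram([p[channel] for p in hsi], bins, 0.0, upper)
        features.extend(c / n for c in counts)
    return features
```

The resulting feature vector (3 × `bins` values) is the kind of input a neural network classifier could take; the network itself is outside the scope of this sketch.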
The HSI representation could be more useful for poultry carcass identification than the RGB
and XYZ color processing techniques (Daley and Rao, 1990). However, color machine
vision for poultry carcass classification was conducted by using a CCD camera which
enables the detection of only broadband, visible (400–700-nm) information in the
spatial domain.
Park and Chen (1994b) developed a multispectral imaging system to detect abnormal poultry carcasses. The machine vision inspection system they developed provides
spectral information regarding the object, as well as the spatial information in the visible and near-infrared spectral regions. Using multispectral images, they characterized
several different abnormal poultry carcasses, including bruised, tumorous, and skin-torn carcasses. From the study, they determined the optimum wavelength for optical
filter selection for discriminating such carcasses.
4.1.1 Spectral characterization of poultry carcasses
Multispectral imaging provides image information in the spectral domain as well
as in the spatial domain. Specifically, the intensified multispectral imaging system
was found to improve sensitivity and to control exposure automatically, and had the
capability to calibrate image intensity. The multispectral camera with selected optical
filters provided more spectral characteristics of poultry carcasses. The response of the
reflectance intensity of each carcass was sensitive to the wavelength of the filter. Based
on the six different wavelengths (542, 570, 641, 700, 720, and 847 nm) with 10-nm
bandwidth, which were selected by spectrophotometry of the poultry carcasses (Chen
and Massie, 1993), the characteristics of the poultry carcasses were distinguishable
when interference-filter wavelengths of 542 and 700 nm were installed in the camera.
Figure 7.1 shows the spectral response in normal and abnormal carcasses. The
reflectance intensity of normal carcasses was not sensitive to the wavelength of the
filter. As shown in Figures 7.1a and 7.1b, there was little difference of reflectance
intensity between 542- and 700-nm wavelengths. For normal carcass images, the dark
area of the body was a shadow of the image. In the case of bruised carcasses, the
reflectance intensity with a 542-nm wavelength was much darker than the body intensity when using a 700-nm wavelength (Figures 7.1c and 7.1d). In Figure 7.1c, the dark
area on the back was bruised and the right portion of the left leg was skin-torn. Thus,
the tissues of the poultry carcasses can be characterized by spectral imaging using
different wavelengths.
Figure 7.1 Intensified multispectral images of poultry carcasses: (a) normal at 542 nm; (b) normal at
700 nm; (c) bruising at 542 nm; (d) bruising at 700 nm; (e) tumor at 542 nm; (f) tumor at 700 nm; (g)
skin-tear at 542 nm; (h) skin-tear at 700 nm.
Figure 7.2 Gray-level intensity distribution of poultry carcasses scanned with filter wavelength of 542 nm:
(a) normal; (b) bruising.
Multispectral imaging had the potential to differentiate tumorous carcasses from
normal carcasses. As shown in Figure 7.1e, the dark area at the center of the body
was actually a tumor; however, other dark spots were blood clots; thus a wavelength of
542 nm was not effective at distinguishing tumorous carcasses. However, this problem
was solved by using a filter of 700 nm – Figure 7.1f clearly shows that the tumorous
spectral image at 700 nm was different from that at 542 nm. The combination of these
two different wavelengths enabled differentiation of tumorous carcasses. For a skin-torn carcass, the reflectance intensity of the muscle was darker than the intensity of
the skin when a 542-nm wavelength was used (Figure 7.1g); on the other hand, the
reflectance intensity of the muscle (skin-torn area) with a 700-nm wavelength was
high (see Figure 7.1h). Thus, the reflectance image intensity provided the capability of
differentiating bruised, tumorous, and skin-torn carcasses.
The gray-level image intensity of each carcass was compared, to differentiate abnormal carcasses. Figure 7.2 shows the three-dimensional distribution of gray-level image
intensity in the spatial domain. The image intensity of the bruised carcass varied much
more than the intensity of the normal carcass. Thus, the variation of reflectance image
intensity could be a significant feature in distinguishing between normal and bruised
poultry carcasses.
4.2 Detection of skin tumors
Currently, each chicken intended for sale to US consumers must by law be inspected
post-mortem, by the Food Safety and Inspection Service, for wholesomeness (USDA,
1984). Inspectors visually and manually inspect poultry carcasses and viscera on-line
at processing plants. The FSIS uses about 2200 poultry inspectors to inspect more than
8 billion poultry per year in 310 poultry slaughter plants nationwide, and this volume
is growing. Each inspector is limited to a maximum of 35 birds per minute. Inspectors working at least 8 hours per day in these conditions have a tendency to develop
repetitive strain injuries and attention and fatigue problems (OSHA, 1999).
Poultry inspection is a complex process. FSIS inspectors are trained to recognize
infectious conditions and diseases, dressing defects, fecal and digestive content contamination, and conditions that are related to many other consumer protection concerns.
In general, diseases and defects that occur in the processing of poultry can be placed
into several categories. There are diseases/defects that are localized in nature, and those
that are generalized or systemic (i.e. affect the whole biological system of the bird).
Systemic problems include septicemia and toxemia. Studies using visible/NIR spectroscopy (Chen et al., 2000) and reflectance imaging (Park and Chen, 1994b; Chao
et al., 2000) have shown good results in inspecting for systemic diseases of poultry;
however, localized problems are difficult to detect, and require the use of not only
spectral but also spatial information. Examples of localized poultry diseases/defects
include skin tumors and inflammatory process. An automated system to inspect for
diseases/defects of poultry must be able to measure these attributes and eliminate
unwholesome carcasses. Chicken skin tumors are round, ulcerous lesions that are surrounded by a rim of thickened skin and dermis (Calnek et al., 1991). For high-speed
inspection a machine vision system is a solution, but advanced sensing capabilities are
necessary in order to deal with the variability of a biological product.
Multispectral imaging is a good tool in these advanced techniques. Several investigations (Throop and Aneshansley, 1995; Park and Chen, 1996; Park et al., 1996; Wen
and Tao, 1998) have shown that the presence of defects is often more easily detected
by imaging at one or more specific wavelengths where the reflectivity of good tissue
is notably different from that of damaged tissue. For example, skin tumors in poultry
are less reflective in the NIR than good tissue (Park et al., 1996). The measurable
indication may be amplified, and therefore more easily detected, when more than one
wavelength is imaged and the difference or ratio of the images is measured. Chao
et al. (2002a) investigated the selection of wavelengths for a multispectral imaging
system to facilitate the analysis of chicken skin tumors, to process and identify features from multispectral images, and to design classifiers for identifying tumors from
normal chicken skin tissue. According to their findings, spectral imaging techniques
were used to detect chicken skin tumors. Hyperspectral images of tumorous chickens
were taken in the spectral range 420–850 nm. Principal component analysis (PCA) was
applied to select useful wavelength bands (465, 575 and 705 nm) from the tumorous
chicken images. Then, multispectral image analysis was performed to generate ratioed
images, which were divided into regions of interest (ROIs) classified as either tumorous or normal. Image features for each ROI (coefficient of variation, skewness, and
kurtosis) were extracted and used as inputs for fuzzy classifiers. The fuzzy classifiers
were able to separate normal from tumorous skin with increasing accuracy as more
features were used. In particular, use of all three features gave successful detection
rates of 91 and 86 percent for normal and tumorous tissue, respectively.
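The feature-extraction stage described above can be sketched as follows: form a ratio image from two bands, tile it into ROIs, and compute the coefficient of variation, skewness, and kurtosis of each ROI. The function names, the square tiling, the small `eps` guard against division by zero, and the use of population moments are assumptions; the fuzzy classifier that consumed these features in the study is not reproduced here.

```python
import math


def ratio_image(band_a, band_b, eps=1e-6):
    """Pixel-wise ratio of two co-registered band images (2-D lists)."""
    return [[a / (b + eps) for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(band_a, band_b)]


def tile_rois(image, size):
    """Split an image into non-overlapping size x size ROIs (flattened)."""
    rows, cols = len(image), len(image[0])
    rois = []
    for r0 in range(0, rows - size + 1, size):
        for c0 in range(0, cols - size + 1, size):
            rois.append([image[r][c]
                         for r in range(r0, r0 + size)
                         for c in range(c0, c0 + size)])
    return rois


def roi_features(values):
    """Return (coefficient of variation, skewness, kurtosis) of an ROI."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    if m2 == 0 or mean == 0:
        return 0.0, 0.0, 0.0  # flat or zero-mean ROI: features undefined
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return math.sqrt(m2) / mean, m3 / m2 ** 1.5, m4 / m2 ** 2
```

Each ROI then yields a three-element feature vector that a classifier of the reader's choice could take as input.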
4.3 Detection of systemic disease
Regarding machine vision application for poultry quality and safety inspection, several
studies have been conducted over recent decades to develop automated poultry inspection systems using multispectral visible/near-infrared (Vis/Nir) imaging algorithms
(Swatland, 1989; Chen and Massie, 1993; Liu and Chen, 2001; Hsieh et al., 2002; Park
et al., 2002; Chao et al., 2003; Liu et al., 2003; Windham et al., 2003a). From these
studies key wavelengths were selected from redundant Vis/Nir spectra (Chao et al.,
2003), because selection of key wavelengths enabled simplification of data processing
methods for accurate detection of defective carcasses. A multi-channel filter corresponding to the selected wavelengths can be implemented within the imaging system.
The modern common-aperture camera with multi-channel filters can take multispectral images with a single shot, and this ability is essential to a real-time automatic
inspection system (Park et al., 2003).
However, key wavelengths may vary from disease to disease, as well as with the poultry’s environment. After selecting the key wavelengths, image-processing algorithms
are developed to correct, analyze, and classify the images. With an appropriate image-processing procedure, some features can be extracted from multispectral image data
to more suitably represent the classification target and thus increase the classification accuracy.
Yang et al. (2004) also developed multispectral image-processing algorithms for differentiating wholesome carcasses from systemically diseased ones, specifically those
that are septicemic. The multispectral imaging system included a common-aperture
camera and a spectrometer with four-channel filters in the visible wavelength range. An
image-processing algorithm defined the ROI for accurate differentiation. According to
their study, a multispectral imaging system can successfully differentiate wholesome
and septicemic carcasses automatically. From Vis/Nir reflectance spectra of poultry
carcasses, average CIELAB L∗ (lightness), a∗ (redness), and b∗ (yellowness) values
were analyzed. The difference of lightness between wholesome and septicemic carcasses was significant. The multispectral imaging system included four narrow-band
interference filters for 488-, 540-, 580-, and 610-nm wavelengths. The 16-bit multispectral images of poultry carcasses were collected for image processing and analysis.
Image-processing algorithms, including image registration, flat-field correction, image
segmentation, region of interest identification, feature measurement, and symptom
recognition, were developed to differentiate septicemic from wholesome carcasses.
For the image processing, a 610-nm wavelength was used to create a mask to extract
chicken images from the background. The average reflectance intensities at 488, 540,
580, and 610 nm from different parts of the front of the carcass were calculated. Moreover, four normalization and differentiation methods between two wavelengths were
also calculated for comparison. Decision trees were applied for generating thresholds
for differentiating septicemic carcasses from wholesome ones. The results showed that,
using an average intensity at 580 nm in the region of interest, 98 percent of septicemic
carcasses and 96 percent of wholesome carcasses were correctly identified.
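The masking and thresholding steps can be illustrated with the sketch below. It is not the algorithm of Yang et al. (2004): a single-split decision stump stands in for the decision tree, and the function names, the mask threshold, and the assumption that the carcass is brighter than the background at 610 nm are all hypothetical.

```python
def make_mask(band_610, threshold):
    """Mark pixels brighter than the threshold as carcass (assumed bright)."""
    return [[v > threshold for v in row] for row in band_610]


def masked_mean(band, mask):
    """Average intensity of a band over the masked (carcass) pixels."""
    vals = [v for row, mrow in zip(band, mask) for v, m in zip(row, mrow) if m]
    return sum(vals) / len(vals) if vals else 0.0


def fit_threshold(wholesome, septicemic):
    """Pick the cut (and side) that best separates two 1-D training samples.

    Returns (cut, wholesome_above); cut is None if no split is possible.
    """
    values = sorted(set(wholesome) | set(septicemic))
    best_cut, best_side, best_correct = None, True, -1
    for lo, hi in zip(values, values[1:]):
        cut = (lo + hi) / 2
        for wholesome_above in (True, False):
            correct = sum((w > cut) == wholesome_above for w in wholesome)
            correct += sum((s > cut) != wholesome_above for s in septicemic)
            if correct > best_correct:
                best_cut, best_side, best_correct = cut, wholesome_above, correct
    return best_cut, best_side


def classify(mean_intensity, cut, wholesome_above):
    """Label a carcass from its average ROI intensity at one wavelength."""
    return "wholesome" if (mean_intensity > cut) == wholesome_above else "septicemic"
```

Because the stump learns which side of the cut is wholesome from the training samples, it makes no assumption about whether septicemic carcasses appear darker or lighter at the chosen wavelength.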
4.4 Detection of heart disease
Visual inspection of poultry viscera is one of the tasks currently performed by human
inspectors at poultry slaughter plants searching for discrepancies resulting from diseases. Because of the significance of poultry viscera in the poultry inspection process,
full automation of poultry inspection requires the development of techniques that can
effectively identify individually contaminated conditions of poultry viscera. Studies on
the development of methods for automated inspection of poultry viscera have focused
on morphological measurements of internal organs. Using UV light to segregate the
spleen from other internal organs, Tao et al. (1998) used spleen enlargement measurements to classify wholesome and unwholesome poultry carcasses. In classifying
poultry diseases from liver and heart images, Chao et al. (1999) reported that RGB
color information could be effectively used for differentiating normal livers from airsacculitis and cadaver livers. However, the RGB color images of chicken hearts could
not be effectively used for the separation of diseased poultry carcasses.
Instead, narrow-band (rather than broadband RGB) images of chicken hearts
were effective for the separation of systemically diseased poultry carcasses. High-resolution images, rather than simple monochromatic data, were gathered to give more
flexibility in applications – such as generating size and morphological information, or
detecting more localized conditions.
Spectral imaging measures the intensity of diffusely reflected light from a surface at
one or more wavelengths with narrow band-passes. The resulting data for each carcass
are three-dimensional (two spatial dimensions and one spectral dimension). Because of
the potentially large size of these data sets, spectral imaging often involves three steps:
measuring the spectra of whole samples at many wavelengths, selection of optimal
wavelengths, and collection of images at selected wavelengths (Muir, 1993; Favier
et al., 1998).
In general, a Vis/Nir spectrophotometer is chosen to measure the spectra because
of its previous success in providing useful information about chicken carcasses (Chen
et al., 1996b). From a set of relatively contiguous spectra, it is possible to characterize
spectral features with a potential to differentiate diseases.
Several methods of wavelength selection have been reported (Saputra et al., 1992;
Chen and Massie, 1993). These include combination of spectra, prior knowledge of
spectral characteristics, and mathematical selection based on the spectral difference or
statistical correlation of the reflection with diseased status. Chao et al. (2001) utilized
discriminant analysis on a subset of the available wavelengths.
A multispectral image acquisition system could be implemented in several ways –
by using a filter wheel, a liquid-crystal tunable filter (LCTF), an acousto-optic tunable filter (AOTF), several cameras with different filters, or a single camera with a
beamsplitter. A critical issue that should be considered in real-time (at least 35 birds per
minute, which equates to the speed of a human inspector) operation of these devices
is the amount of time between sequentially acquired images at different wavelengths.
This is a function of both the image acquisition speed and the band-switching speed.
Electromechanical filter wheels are limited in the speed of switching filters. Improvements in LCTF technology make an LCTF system superior to electromechanical filter
wheels in both speed and flexibility of spectral selection (Evans et al., 1997). The time
required for the LCTF to switch into the next wavelength is approximately 50 ms (Mao
and Heitschmidt, 1998). However, this still makes the system unsuitable for synchronization with moving objects, which is necessary for high-speed inspection. Recent
advances in optical design make the four-band imager, based on stationary filters and
a beamsplitter, a promising technique for real-time operation. It has the advantage of
no moving parts and the simultaneous capture of images at four different wavelengths
with good image registration.
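A back-of-envelope calculation helps put the switching time in context. The sketch below uses the line speeds and the 50-ms LCTF switching time quoted in this chapter; the function names and the simplification that only switching time is counted (exposure and readout are ignored) are assumptions.

```python
def seconds_per_bird(birds_per_minute):
    """Time available for each bird at a given line speed."""
    return 60.0 / birds_per_minute


def sequential_switch_time(n_bands, switch_s):
    """Total filter-switching time to step through n_bands sequentially."""
    return (n_bands - 1) * switch_s


for speed in (35, 140, 180):
    budget = seconds_per_bird(speed)
    switching = sequential_switch_time(4, 0.050)
    print(f"{speed} birds/min: {budget:.3f} s per bird, "
          f"{switching / budget:.0%} spent switching four bands")
```

At 35 birds per minute each bird has about 1.7 s, while at 140 birds per minute the budget falls to about 0.43 s, of which 0.15 s would go to switching alone. The bird also keeps moving between sequential exposures, which is why simultaneous capture through a beamsplitter is attractive.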
Using this system, Chao et al. (2001) investigated optical spectral reflectance and
multi-spectral image-analysis techniques to characterize chicken hearts for real-time
disease detection. Spectral signatures of five categories of chicken hearts (airsacculitis, ascites, normal, cadaver, and septicemia) were obtained from optical reflectance
measurements taken with a Vis/Nir spectroscopic system in the range of 473–974 nm.
Multivariate statistical analysis was applied to select the most significant wavelengths
from the chicken-heart reflectance spectra. By optimizing the selection of key wavelengths for different poultry diseases, four wavelengths were selected (495, 535, 585,
and 605 nm). Figure 7.3 shows the detection of poultry systemic disease using multispectral heart images at 495, 535, 585, and 605 nm. The multispectral imaging system
utilizes four narrow-band filters to provide four spectrally discrete images on a single
CCD focal plane. Using the filters at the wavelengths selected from the reflectance
spectra, it was straightforward to implement multispectral arithmetic operations for disease detection. Based on statistical analysis of spectral image data, the multispectral
imaging method could potentially differentiate individual diseases in chicken hearts in
real-time. All categories except cadaver were separable with accuracy greater than 92
percent by discrimination algorithms involving differences of average image intensities.
Figure 7.3 Detection of poultry systemic disease using multispectral heart images at 495, 535, 585, and
605 nm.
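As an illustration of features built from differences of average image intensities, the sketch below computes the mean of each band image and every pairwise difference between band means. The function names and the choice of all pairwise differences are assumptions; the source does not specify which differences the discrimination algorithms used.

```python
from itertools import combinations


def band_means(images):
    """Map each wavelength (nm) to the mean intensity of its 2-D image."""
    means = {}
    for nm, image in images.items():
        pixels = [v for row in image for v in row]
        means[nm] = sum(pixels) / len(pixels)
    return means


def pairwise_differences(means):
    """Difference of mean intensity for every pair of bands (shorter minus longer key order)."""
    return {(a, b): means[a] - means[b]
            for a, b in combinations(sorted(means), 2)}
```

With the four bands at 495, 535, 585, and 605 nm this yields six difference features per heart image, which could then be passed to a discriminant rule.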
4.5 Identification of systemic disease
According to the Food Safety and Inspection Service (FSIS) of the USDA, performance standards are set at zero tolerance for two Food Safety categories (i.e. fecal
contamination, and infectious condition such as septicemia and toxemia). For poultry plants to meet federal food safety regulations and satisfy consumer demand while
maintaining their competitiveness, the FSIS has recognized the need for new inspection technologies (USDA, 1985), such as automated machine-vision based inspection.
Recent research has investigated the development of automated poultry inspection
techniques based on spectral imaging. Chao et al. (2002) developed a multispectral
imaging system using 540- and 700-nm wavelengths, and obtained accuracies of 94
percent for wholesome and 87 percent for unwholesome poultry carcasses. With hyperspectral imaging, Park et al. (2002) achieved 97–100 percent accuracy in identifying
fecal and ingesta contamination on the surface of poultry carcasses using images at the
434-, 517-, 565-, and 628-nm wavelengths. They found that spectral images present
spectral and spatial information from the surface of broiler carcasses, which is essential
for efficient identification of contaminated and systemically diseased broilers. Not only
can multispectral imaging achieve high classification accuracies, this non-destructive
method also shows potential for on-line inspections at high-speed processing plants.
Based on Vis/Nir spectroscopic analysis (Hruschka, 1987), previous studies have also
shown that key wavelengths are particularly useful for the identification of diseased,
contaminated, or defective poultry carcasses (Chao et al., 2003; Windham et al., 2003a).
After selection of key wavelengths, filters corresponding to those wavelengths can be
implemented for multispectral image acquisition. Image-processing algorithms are
then developed to enhance and analyze the images. With appropriate image-processing
procedures, some features can be extracted from multispectral images to more suitably
represent the classification target and increase the classification accuracy.
Yang et al. (2006) have developed a simple method for differentiating wholesome
carcasses from systemically diseased carcasses using signatures of Vis/Nir multispectral images. Image-processing algorithms extract image features that can be used to
determine thresholds for identifying systemically diseased chickens.
According to their study, color differences between wholesome and systemically diseased chickens can be used to select interference filters at 488, 540, 580, and 610 nm
for the multispectral imaging system. An image-processing algorithm to locate the
region of interest was developed in order to define four classification areas on each
image, including whole carcass, region of interest, upper region, and lower region.
Three feature types – average intensity, average normalization, and average difference
normalization – were defined using several wavelengths for a total of 12 classification features. A decision-tree algorithm was used to determine threshold values for
each of the 12 classification features in each of the 4 classification areas. The feature
“average intensity” can be used to identify wholesome and systemically diseased chickens better than other features. Classification by average intensity in the region of interest
using 540- and 580-nm wavelengths resulted in the accuracies of 96 and 97 percent
for the classification of wholesome and systemically diseased chickens at 540 nm,
respectively. This simple differentiation method shows potential for automated on-line
chicken inspection.
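The threshold step can be illustrated with a one-level decision tree (a "decision stump") on a single feature. This is a minimal sketch, not the procedure of Yang et al. (2006): the function name `best_threshold`, the 0/1 label coding, and the intensity values are all hypothetical and chosen only for illustration.

```python
import numpy as np

def best_threshold(values, labels):
    """Return (threshold, accuracy) of the single cut that best separates two classes.

    values : one feature per carcass (e.g. average ROI intensity)
    labels : 0/1 class codes (hypothetical coding: 1 = wholesome, 0 = diseased)
    A sample is predicted as class 1 when its value exceeds the threshold.
    """
    values = np.asarray(values, dtype=float)
    labels = np.asarray(labels, dtype=int)
    distinct = np.unique(values)  # sorted distinct feature values
    # Candidate cuts lie midway between consecutive distinct values.
    candidates = (distinct[:-1] + distinct[1:]) / 2.0
    best_t, best_acc = None, -1.0
    for t in candidates:
        acc = np.mean((values > t).astype(int) == labels)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t, best_acc

# Made-up intensities for illustration only (not data from the study).
intensity = [52.0, 55.0, 61.0, 58.0, 80.0, 84.0, 79.0, 90.0]
label = [0, 0, 0, 0, 1, 1, 1, 1]
threshold, accuracy = best_threshold(intensity, label)
```

A full decision tree repeats this search over every feature and classification area and keeps the most discriminating cuts.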
4.6 Quality inspection by dual-band spectral imaging
Over the past three decades, poultry production has greatly increased and the processing
speed at slaughter plants has tripled (USDA, 1996a). Due to the massive production of
poultry and the inherent variability and complexity in individual birds, there are great
challenges for further improvement of the existing organoleptic inspection methods.
To design an effective machine vision system for on-line applications, vision hardware
functionality needs to be considered during the development of software (Park et al.,
A spectral imaging system measures the intensity of diffusely reflected light from
a surface at several wavelengths. The reflected light contains information regarding
the area close to the skin surface of broiler carcasses. Using intensities at six different
spectral wavelengths (540, 570, 641, 700, 720, and 847 nm), several spectral image
algorithms to differentiate wholesome carcasses from unwholesome carcasses have
been developed (Park and Chen 1996; Park et al., 1996). In this case, comparison
of images at two or more wavelengths provides robustness for classifying spectral
images. Since the process of analyzing a digital image to identify certain objects is
inherently computationally intensive, it is advantageous to pre-process the image
optically, extracting only those wavelengths which provide useful information. A pilot-scale facility has been constructed specifically for developing the machine-vision based
systems for on-line poultry inspection. The facility has been utilized for evaluating
individual vision components and testing the workability of spectral imaging algorithms
(Park and Chen, 1998).
Chao et al. (2000) designed a real-time machine vision system, including vision hardware and software components integration, which can be adapted to on-line processing
at poultry slaughter plants. Object-oriented analysis was employed to identify the system’s responsibility for individual components. A real-time machine vision inspection
system was implemented in the pilot-scale facility. The system’s performance was optimized for on-line classification of normal and abnormal poultry carcasses. According
to their studies, two sets of dual-camera systems were applicable for on-line inspection
of poultry carcasses: one to image the front of the bird and the other to image the
back. Each system consisted of two identical CCD cameras equipped with interference
filters of 540 nm and 700 nm with 10-nm bandwidth. The first set of dual-cameras captured the spectral images simultaneously, followed by the second set of dual-cameras.
Object-oriented analysis was performed to identify the attributes of individual software components and the relationships between these software components. These
individual software components were then organized by the object patterns to form
a software architectural framework for on-line image capture, off-line development
of classification models, and on-line classification of carcasses into wholesome and
unwholesome categories. For the model development, the accuracies to differentiate
between wholesome and unwholesome carcasses were 96 and 88 percent at 540 and
700 nm, respectively, for the front images; and 95 and 85 percent at 540 and 700 nm,
respectively, for the back images. According to the on-line classification using neural
network models, the imaging system used for scanning the fronts of carcasses performed well, with accuracies of 91, 98 and 95 percent for normal, abnormal, and
combined carcasses, respectively. For the back images, however, the system
produced accuracies of 84, 100 and 92 percent for normal, abnormal, and
combined carcasses, respectively. Thus, a dual-camera-based spectral imaging system with selective
wavelength filters can be effectively used for on-line poultry quality inspection.
5 Poultry image classifications
5.1 Airsac classification by learning vector quantization
Since it was recognized that computer imaging would greatly improve the inspection
procedures, much work has been devoted to automatic inspection for wholesomeness in
chicken carcasses. Most of the research is based on different optical techniques, mainly
spectroscopy for the classification of wholesome, septicemic, and cadaver carcasses.
Chen and Hruschka (1998) performed on-line trials of a system for chicken carcass
external inspection, based on Vis/NIR reflectance. The system successfully
identified 95 percent of the carcasses at a speed of 70 birds per minute. Fiber-optic
spectroscopy was also used for the classification of diseases in slaughtered poultry
carcasses (Park et al., 1998a). Park et al. (1998b) also proposed the combination of
multispectral imaging and neural network classification models. In that research, two
cameras with interference filters at 540 nm and 700 nm and a back-propagation neural
network algorithm were used for the inspection of wholesomeness in poultry carcasses.
As for the detection of lesions commonly observed in the body cavity, Chao et al.
(1998) analyzed the size and coloration of liver in infected poultry. In related research
(Tao et al., 2000), the size and color features of infected enlarged spleen in turkeys
were studied. Both studies were performed under laboratory conditions, with the viscera
prepared prior to the experiments.
Color processing is also very competent for identifying agricultural problems. Ibarra
et al. (2002) developed a method for the classification of airsacculitis lesions in chicken
carcasses induced by secondary infection with Escherichia coli. They established a procedure for controlled induction of airsacculitis as well as RGB color transformation
for optimal classification. In addition, neural network classification was implemented
for color features from airsacculitis, using the learning vector quantization (LVQ) technique. According to their research, the variation in color features observed during the
evolution of airsacculitis in chicken carcasses can be exploited to classify the disease
using digital imaging and neural networks. For the supervised classification, a knowledge base set of normalized RGB values (corresponding to negative, mildly, and severely
infected airsac images) was obtained. Statistical data exploration indicated no significant difference between the color features of mildly and severely infected airsacs;
a significant difference, however, was found between infected and negative tissues.
A neural network using the learning vector quantization algorithm classified the data
from infected and negative categories. Resubstitution and hold-out errors were calculated, giving an overall classification accuracy of 96 percent. The method developed in
this research has potential for integration into a computer-assisted inspection system
for wholesomeness at the poultry processing plants.
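The combination of normalized RGB features and learning vector quantization can be sketched as follows. This is an illustrative LVQ1 implementation with one prototype per class, not the network of Ibarra et al. (2002); the function names, the learning-rate schedule, and the synthetic RGB values are assumptions made for the example.

```python
import numpy as np

def normalize_rgb(rgb):
    """Divide each channel by R+G+B so the features are insensitive to brightness."""
    rgb = np.asarray(rgb, dtype=float)
    total = rgb.sum(axis=1, keepdims=True)
    return rgb / np.where(total == 0, 1.0, total)

def train_lvq1(x, y, n_epochs=30, lr=0.1, seed=0):
    """LVQ1 with one prototype per class, initialised at the class means."""
    rng = np.random.default_rng(seed)
    classes = np.unique(y)
    protos = np.array([x[y == c].mean(axis=0) for c in classes])
    for epoch in range(n_epochs):
        rate = lr * (1.0 - epoch / n_epochs)  # linearly decaying learning rate
        for i in rng.permutation(len(x)):
            k = np.argmin(np.linalg.norm(protos - x[i], axis=1))
            # Pull the winning prototype toward a correctly labelled sample,
            # push it away from an incorrectly labelled one.
            sign = 1.0 if classes[k] == y[i] else -1.0
            protos[k] += sign * rate * (x[i] - protos[k])
    return classes, protos

def predict_lvq(x, classes, protos):
    """Assign each sample to the class of its nearest prototype."""
    d = np.linalg.norm(x[:, None, :] - protos[None, :, :], axis=2)
    return classes[np.argmin(d, axis=1)]

# Synthetic RGB samples (illustrative only): class 0 = negative, class 1 = infected.
rng = np.random.default_rng(1)
negative = rng.normal([180, 150, 140], 5, size=(40, 3))
infected = rng.normal([200, 120, 90], 5, size=(40, 3))
features = normalize_rgb(np.vstack([negative, infected]))
labels = np.array([0] * 40 + [1] * 40)
classes, protos = train_lvq1(features, labels)
accuracy = np.mean(predict_lvq(features, classes, protos) == labels)
```

The accuracy computed here is a resubstitution estimate because the same samples are used for training and testing; a hold-out estimate would score samples that were kept out of training.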
5.2 Quality classification by texture analysis
The features to be extracted from intensity information were mean, variance, and histogram of intensity. Even though the reflectance intensity measurement of the spectral
images provided useful information in the spatial domain for differentiating poultry
carcasses, these features were too sensitive to the variation in light intensity and spatial
dependency. Textural analysis methods, specifically Fourier power spectra analysis and
fractal analysis in the frequency domain, on the other hand, only depend on the spectral
frequency distribution on the image surface. This textural information is invariant to
variations in light intensity, and it depends on spectral rather than spatial characteristics.
Texture is the term used to characterize the tonal or gray-level variation in an image.
Texture is an important discriminating surface characteristic which can aid in segmentation and classification of the region. Regions in an image cannot be classified until the
image has been segmented, but segmentation requires knowledge of region boundaries.
Hence, most methods of texture analysis operate on sub-images when the composition
of the image is unknown. This leads to a compromise between classification accuracy
and resolution. A smaller sub-image would not be a good representative, while a larger
sub-image would result in poor segmentation resolution. Therefore, the sub-images
need to be selected based on the consideration of carcass image size. Fourier power
spectrum analysis and fractal analysis were introduced for multispectral image
classification of poultry carcasses.
5.2.1 Spectral poultry image classification in the frequency domain
For fast Fourier transform (FFT) analyses, all images were transformed by
equation (7.1):
$$F(u, v) = \frac{1}{MN} \sum_{m=0}^{M-1} \sum_{n=0}^{N-1} f(m, n) \exp\left[-j 2\pi \left(\frac{mu}{M} + \frac{nv}{N}\right)\right] \tag{7.1}$$
To increase computational speed, the FFT algorithm was used. The input image was
recursively reordered to the form suitable for FFT calculation. Each spectral component
was calculated by using factor numbered look-up tables to optimize speed at the expense
of memory requirement.
Since many image frequency spectra decrease rapidly with increasing frequency,
their high-frequency terms have a tendency to become obscured when displayed in the
frequency domain. Therefore, the equation below was used for Fourier power spectrum
representation instead of |F(u, v)|:
$$D(u, v) = 50 \log\left(1 + |F(u, v)|\right)$$
Also, to display the full size of the Fourier power spectrum, the origin of the image in
the frequency domain was shifted to the coordinate of (N/2, N/2). Since only the Fourier
spectrum of the image was preserved, it was impossible to use the inverse FFT (IFFT) to
get back to the original image. Therefore, users should save it to a different file if they
wish to retain the original image.
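The transform, the origin shift, and the logarithmic display scaling described above can be sketched with NumPy. The function name `fourier_display` and the striped test pattern are hypothetical, and the natural logarithm is an assumption because the text does not state the base of the logarithm.

```python
import numpy as np

def fourier_display(image):
    """Return (F, D): the normalised 2-D DFT and its log-scaled, centred magnitude."""
    image = np.asarray(image, dtype=float)
    m, n = image.shape
    # np.fft.fft2 omits the 1/(MN) factor of equation (7.1), so apply it here.
    f = np.fft.fft2(image) / (m * n)
    # Move the zero-frequency term from the corner to the centre of the array.
    f_centred = np.fft.fftshift(f)
    # Assumption: natural logarithm; the text does not state the base.
    d = 50.0 * np.log(1.0 + np.abs(f_centred))
    return f, d

# Hypothetical 128 x 128 test pattern: horizontal stripes with a 16-pixel period.
rows = np.arange(128).reshape(-1, 1)
image = 100.0 + 50.0 * np.cos(2 * np.pi * rows / 16.0) * np.ones((1, 128))
f, d = fourier_display(image)
```

With the 1/(MN) factor, the zero-frequency term of F equals the mean gray level of the image, and after the shift it sits at (N/2, N/2), which is where the display peaks for this pattern.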
The radial distributions of values in |F|² are sensitive to textural coarseness. A coarse
texture will have high values of |F|² concentrated near the origin, while a smoother
texture will have more spread (i.e. like a ring). Similarly, angular distributions of the
values of |F|² are sensitive to the directionality of the texture. Thus, a directional texture
will have high values concentrated around the perpendicular lines (like wedges).
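One common way to summarize these radial and angular distributions is to sum |F|² over concentric rings and over angular wedges of the centred spectrum. The sketch below is an illustrative choice and not necessarily what the cited software computes; the function name, the number of rings and wedges, and the test patterns are assumptions.

```python
import numpy as np

def ring_wedge_energy(image, n_rings=4, n_wedges=4):
    """Sum |F|^2 over concentric rings and over angular wedges of the centred spectrum."""
    image = np.asarray(image, dtype=float)
    n_rows, n_cols = image.shape
    power = np.abs(np.fft.fftshift(np.fft.fft2(image))) ** 2
    v, u = np.indices(power.shape)
    du, dv = u - n_cols // 2, v - n_rows // 2
    radius = np.hypot(du, dv)
    # Fold angles into [0, pi): the power spectrum of a real image is symmetric.
    angle = np.mod(np.arctan2(dv, du), np.pi)
    valid = radius > 0  # ignore the zero-frequency term (mean brightness)
    r_max = min(n_rows, n_cols) / 2.0
    # Frequencies beyond r_max (the corners) are counted in the outermost ring.
    ring_idx = np.minimum((radius / r_max * n_rings).astype(int), n_rings - 1)
    wedge_idx = np.minimum((angle / np.pi * n_wedges).astype(int), n_wedges - 1)
    rings = np.bincount(ring_idx[valid], weights=power[valid], minlength=n_rings)
    wedges = np.bincount(wedge_idx[valid], weights=power[valid], minlength=n_wedges)
    return rings, wedges

# Hypothetical test patterns: vertical stripes with a long and a short period.
cols = np.arange(128)
coarse = np.tile(np.cos(2 * np.pi * cols / 32.0), (128, 1))
fine = np.tile(np.cos(2 * np.pi * cols / 4.0), (128, 1))
rings_coarse, wedges_coarse = ring_wedge_energy(coarse)
rings_fine, wedges_fine = ring_wedge_energy(fine)
```

For these patterns the coarse stripes put their energy in the innermost ring and the fine stripes in a ring further out, while both concentrate in a single wedge because the stripes share one direction.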
5.2.2 Fast power spectra of spectral images
The Fourier power spectra provide the coarseness of the texture of spectral images.
The 128 × 128 (16 384 pixels) image was cropped out of the whole body to generate
the power spectrum (Park et al., 1996a). Figure 7.4 shows the region of interest in
wholesome and unwholesome (bruised, skin-torn, and tumorous) carcass images and
corresponding FFT at different wavelengths of 542 and 700 nm. The Fourier spectrum
of wholesome carcasses is distinguishable from that of unwholesome carcasses.
As shown in each spectrum, there was little difference in the power spectrum of the
spectral image between 542 nm and 700 nm, except in the skin-torn carcass image. For
normal carcasses, the power spectrum was spread around the x-axis and concentrated
around horizontal lines. Thus, the textural feature of normal carcasses in the frequency
domain had a more directional distribution. On the other hand, the power spectra of
bruised, tumorous, and skin-torn carcasses concentrated near the origin. The features
in the frequency domain provided the texture coarseness. Since the radial distributions
of values in the Fourier power spectrum were sensitive to the texture coarseness of the
image in the spatial domain, a coarse texture had the high values of the power spectrum
concentrated near the origin, while smoother textures had more spread. Therefore, the
Fourier power spectrum was useful to differentiate normal carcasses from abnormal
carcasses (bruised, tumorous, and skin-torn carcasses) because it provides spectral
information and the features in the frequency domain are spatial-independent.
5.2.3 Fractal analysis
“Fractal” is a term used to describe the shape and appearance of objects which have
the properties of self-similarity and scale-invariance. The fractal dimension is a scale-independent measure of the degree of boundary irregularity or surface roughness (Park
et al., 1996a).
Figure 7.4 Region of interest of poultry carcass images (128 × 128 pixels) and corresponding FFT at
different wavelengths: (a) normal at 542 nm; (b) normal at 700 nm; (c) bruising at 542 nm; (d) bruising at
700 nm; (e) skin-tear at 542 nm; (f) skin-tear at 700 nm; (g) tumor at 542 nm; (h) tumor at 700 nm.
Table 7.1 Fractal features of poultry carcasses in the frequency domain.
Assume that the intensity I of a square image of size N × N is given by
$$I = I(x, y)$$
where 0 ≤ x, y ≤ N − 1, and a displacement vector is defined as w = (Δx, Δy), where
Δx and Δy are integers. The integer restriction on Δx and Δy results from the discrete
nature of the image storage system. Minimum non-zero displacements are thus one
picture element horizontally or vertically. Finally, the difference of the image intensity
at a point (x, y) for a specific displacement vector w is defined by the following equation:
$$\Delta I_w(x, y) = I(x, y) - I(x + \Delta x, y + \Delta y)$$
The above equation gives the difference of the image intensity along
a specific displacement vector w, whose beginning is at a point (x, y) and whose
end is at a point (x + Δx, y + Δy). For example, if w = (1, 0), then for each point (x, y)
we can construct the difference of the image intensities simply by calculating
I(x, y) − I(x + 1, y) for all combinations of x and y. In practice, the maximum value of x or
y would be limited to N − 2 to remain within the boundaries of the image.
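The intensity difference for a given displacement vector can be computed for all valid pixels at once with array slicing. This is a minimal sketch: the function name is hypothetical, the displacements are restricted to non-negative integers for simplicity, and the 3 × 3 array is made-up data.

```python
import numpy as np

def intensity_difference(image, dx, dy):
    """Return I(x, y) - I(x + dx, y + dy) for every pixel where both points exist.

    dx, dy : non-negative integer displacements (a simplifying assumption).
    The array is indexed as image[y, x].
    """
    image = np.asarray(image, dtype=float)
    rows, cols = image.shape
    start = image[: rows - dy, : cols - dx]  # points (x, y)
    end = image[dy:, dx:]                    # points (x + dx, y + dy)
    return start - end

# Made-up 3 x 3 image, displacement w = (1, 0).
image = np.array([[1, 2, 4],
                  [3, 5, 9],
                  [6, 7, 8]])
diff = intensity_difference(image, 1, 0)
```

For w = (1, 0) the result has one column fewer than the image, which reflects the limit of N − 2 on x mentioned above.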
The in-house software developed by ARS scientists, Image Processing Tools for Windows,
first computes the FFT of the image in the active window; the fractal features of the image
are then calculated from the FFT result. The results of the fractal analyses displayed
in the window were saved to the files FRACTAL.DAT and POWER.DAT. The
fractal dimension D and roughness parameter H were calculated by:
−Slope = 1 + 2H = 7 − 2D
Roughness parameter H ranges from 0 to 1. When H is close to 0, the surface is the
roughest. When the value of H is close to 1, the surface is relatively smooth. From
these results, the surface roughness of an image can be quantified.
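Rearranging the relation −Slope = 1 + 2H = 7 − 2D gives H = (−Slope − 1)/2 and D = (7 + Slope)/2, so that D = 3 − H. The sketch below applies these expressions; estimating the slope by a straight-line fit of log power against log frequency is an assumption about how the slope is obtained, and the power-law data are synthetic.

```python
import numpy as np

def fractal_from_slope(slope):
    """Return (H, D) from the spectral slope using -slope = 1 + 2H = 7 - 2D."""
    h = (-slope - 1.0) / 2.0
    d = (7.0 + slope) / 2.0
    return h, d

def loglog_slope(frequency, power):
    """Least-squares slope of log(power) against log(frequency)."""
    slope, _intercept = np.polyfit(np.log(frequency), np.log(power), 1)
    return slope

# Synthetic spectrum that falls off as frequency to the power -2 (illustration only).
freq = np.arange(1, 65, dtype=float)
power = freq ** -2.0
slope = loglog_slope(freq, power)
h, d = fractal_from_slope(slope)
```

A steeper (more negative) slope gives a larger H and a smaller D, that is, a smoother surface.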
Fractal dimension, roughness, and slope of intensity changes were calculated from
the Fourier spectra of each carcass. Table 7.1 shows the fractal values of normal, tumorous, bruised, and skin-torn carcasses at the wavelengths of 542 and 700 nm. Based on
the spectral images scanned by the 542-nm wavelength, the fractal dimension of normal carcasses was smaller than that of abnormal carcasses. However, the roughness
and slope of the normal carcass were larger than those fractal features of the tumorous,
bruised, and skin-torn carcasses. The fractal dimension of the bruised carcasses was
much the same as that of skin-torn carcasses, which was even larger than the fractal
dimension of tumorous carcasses. The roughness and slope values of the bruised carcasses were similar to those values of skin-torn carcasses, but lower than in tumorous
carcasses. However, the fractal features of the spectral images scanned by the 700-nm
wavelength were not consistent compared with the results of the 542-nm wavelength –
i.e., the fractal dimension of bruised carcasses was lower than that of tumorous carcasses, and the roughness and the slope values of the bruised carcasses were higher
than those of tumorous carcasses. Thus, the fractal features of the poultry carcasses
varied with the wavelength of spectral images. Finally, the fractal dimension of the
normal carcasses was lower and the roughness and the slope of the normal carcasses
were higher than in abnormal carcasses in the spectral images of 700-nm wavelength.
5.2.4 Neural network models
A feed-forward backpropagation neural network algorithm was used for classifying
poultry carcasses. For prediction-related problems, the feed-forward network
structure is suitable for handling non-linear relationships between input and output
variables, and backpropagation is the training method most frequently used for
feed-forward networks. The mathematical description of the backpropagation algorithm
used for classification has been reported elsewhere (Park et al., 1994).
The network has an input layer with 256 input nodes, an output layer with 2 output
nodes, and a hidden layer with 6 hidden nodes. Each layer was fully connected to the
succeeding layer. During learning, information was also propagated back through the
network and used to update the connection weights. The aim of the learning process is
to minimize the global error of the system by modifying the weights. Given the current
set of weights, it is necessary to determine how to increase or decrease them in order
to decrease the global error.
For the backpropagation algorithm, it is important to set an appropriate learning
rate. Changing the weights as a linear function of the partial derivative of the global
error assumes that the error surface is locally linear, where “locally” is defined by the
size of the learning coefficient. To avoid divergent behavior of the network model, it is important
to keep the learning coefficient low. However, a small learning coefficient can lead to
very slow learning. The “momentum” was implemented to resolve this dichotomy. The
delta weight equation was modified so that a portion of the previous delta weight was
fed through to the current delta weight. The momentum term allows a low learning
coefficient but fast learning.
Spectral poultry image data for neural network models
The region of interest (ROI) of the image to be analyzed was 128 × 128 (= 16 384)
pixels; however, because of the limitation of the number of the neural network input
nodes, the size of ROI was reduced to 16 × 16 (= 256) pixels, which was used for the
neural network models as input data. These input data were generated by averaging
8 × 8 image pixels of each chicken gray-intensity image. Figure 7.5 shows the image
data generated in the spatial domain and spectral domain for neural network models.
Figure 7.5 Multispectral images (16 × 16 pixels) at 542-nm wavelength, for neural network model:
(a) gray intensity of tumorous carcass; (b) FFT of tumorous carcass; (c) gray intensity of normal carcass;
(d) FFT of normal carcass.
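The reduction from a 128 × 128 ROI to 256 network inputs by averaging 8 × 8 pixel blocks can be sketched with a reshape. The function name `block_average` and the ramp image are hypothetical.

```python
import numpy as np

def block_average(image, block=8):
    """Average non-overlapping block x block tiles of a 2-D image."""
    image = np.asarray(image, dtype=float)
    rows, cols = image.shape
    if rows % block or cols % block:
        raise ValueError("image size must be a multiple of the block size")
    # Split each axis into (tile index, position within tile), then average within tiles.
    tiles = image.reshape(rows // block, block, cols // block, block)
    return tiles.mean(axis=(1, 3))

# Hypothetical 128 x 128 ROI filled with a ramp of gray values.
roi = np.arange(128 * 128, dtype=float).reshape(128, 128)
inputs = block_average(roi).ravel()  # 16 x 16 = 256 values, one per input node
```

Because every tile has the same size, the mean of the 256 inputs equals the mean gray level of the original ROI.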
Neural network pattern classification
The neural network (NN) classifiers were developed and validated to differentiate
tumorous carcasses from normal carcasses based on the image data generated by the
Neural Network Image Data Generation Tool included in the in-house software. The NN
model had 256 input nodes, a hidden layer with 16 hidden nodes, and 2 outputs. Based
on the testing results, using a total of 216 carcass images including 108 normal and
108 tumorous, the classification accuracy of neural network models for separating
tumorous carcasses from normal ones was 91 percent.
When two spectral images (542- and 700-nm wavelengths) were combined and used
as input data for the NN model to reduce the variability of intensity distribution (considering the position of the tumor on the body) in the spatial domain, the classification
model performed perfectly. None of the tumorous carcasses were classified as normal carcasses. Thus, the combined information gained from different spectral images
improved the performance of neural network models in classifying tumorous from
normal carcasses.
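The network structure and training rule of Section 5.2.4 can be sketched as a small NumPy implementation: 256 inputs, one fully connected hidden layer, 2 outputs, and batch backpropagation in which part of the previous weight change is carried forward as momentum. This is an illustration under stated assumptions, not the authors' software: the sigmoid activation, the learning coefficient, the momentum value, and the synthetic, mean-centred data are all hypothetical choices.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_mlp(x, t, n_hidden=16, lr=0.3, momentum=0.9, n_epochs=300, seed=0):
    """Train a one-hidden-layer network by batch backpropagation with momentum.

    x : (samples, inputs) array; t : (samples, outputs) one-hot targets.
    Returns the weights and the mean squared error recorded at each epoch.
    """
    rng = np.random.default_rng(seed)
    w1 = rng.normal(0.0, 0.1, (x.shape[1] + 1, n_hidden))  # +1 row for the bias
    w2 = rng.normal(0.0, 0.1, (n_hidden + 1, t.shape[1]))
    dw1, dw2 = np.zeros_like(w1), np.zeros_like(w2)
    ones = np.ones((len(x), 1))
    xb = np.hstack([x, ones])
    errors = []
    for _ in range(n_epochs):
        # Forward pass.
        h = sigmoid(xb @ w1)
        hb = np.hstack([h, ones])
        y = sigmoid(hb @ w2)
        errors.append(np.mean((y - t) ** 2))
        # Backward pass: propagate the output error back through the network.
        delta2 = (y - t) * y * (1.0 - y)
        delta1 = (delta2 @ w2[:-1].T) * h * (1.0 - h)
        grad2 = hb.T @ delta2 / len(x)
        grad1 = xb.T @ delta1 / len(x)
        # Momentum: a portion of the previous delta weight feeds the current one.
        dw2 = momentum * dw2 - lr * grad2
        dw1 = momentum * dw1 - lr * grad1
        w2 += dw2
        w1 += dw1
    return (w1, w2), errors

def predict(x, weights):
    w1, w2 = weights
    ones = np.ones((len(x), 1))
    hb = np.hstack([sigmoid(np.hstack([x, ones]) @ w1), ones])
    return np.argmax(sigmoid(hb @ w2), axis=1)

# Synthetic mean-centred 16 x 16 inputs (illustration only): class 1 has a darker patch.
rng = np.random.default_rng(1)
normal = rng.normal(0.0, 0.05, (40, 256))
tumorous = rng.normal(0.0, 0.05, (40, 256))
tumorous[:, :64] -= 0.5
x = np.vstack([normal, tumorous])
labels = np.array([0] * 40 + [1] * 40)
targets = np.eye(2)[labels]  # one-hot targets for the 2 output nodes
weights, errors = train_mlp(x, targets)
accuracy = np.mean(predict(x, weights) == labels)
```

Setting `momentum=0.0` reduces the update to plain gradient descent, which makes it easy to compare how quickly the recorded error falls with and without the momentum term.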
5.3 Supervised algorithms for hyperspectral image
In addition to infectious conditions of poultry carcasses, the FSIS is also concerned
about fecal contamination; in their food safety performance standards, there is zero
tolerance (USDA, 1996b).
In order to select the optimum classifier for identifying surface contaminants of
poultry carcasses, the performance of six different supervised classification algorithms
was investigated and compared. A pushbroom line-scan hyperspectral imager was used
for hyperspectral image acquisition with 512 narrow bands between 400 and 900 nm
in wavelength. Feces from three different parts of the digestive tract (duodenum, ceca,
colon) and ingesta were considered as contaminants. These contaminants were collected
from broiler carcasses fed with corn, milo, and wheat with soybean mixture.
5.3.1 Hyperspectral imaging system
A hyperspectral imaging system (Park et al., 2002) was used to collect spectral images
of contaminated and uncontaminated poultry carcasses. A transportable imaging cart
was designed to provide both portability and flexibility in positioning both the lights
and the camera system. The cart also contained a computer, power supplies, and other
equipment for hypercube data collection. Lighting requirements were evaluated and
adjusted for quality image acquisition. The imaging system consisted of an imaging
spectrograph with a 25-µm slit width and an effective slit length of 8.8 mm – Grating
Type I (ImSpector V9, PixelVision, Beaverton, Oregon); a high resolution CCD camera (SensiCam Model 370KL, Cooke Corporation, Auburn Hills, MI); a 1.4/23-mm
compact C-mount focusing lens (Xenoplan, Schneider, Hauppauge, NY) and associated optical hardware; motor for lens motion control (Model RSP-2T, Newport Corp.,
Irvine, CA); a frame-grabber (12-bit PCI interface board, Cooke Co, Auburn Hills, MI);
and a computer (Pentium II, 500 MHz). The prism-grating-prism spectrograph had a
nominal spectral range of 430–900 nm with a 6.6-mm axis, and was attached to the camera
for generating line-scan images. The spectrograph had a nominal spectral resolution of
2.5 nm, and was connected to a 2/3-inch silicon-based CCD sensor with 1280 × 1024
pixel resolution. The camera was thermoelectrically cooled and had a spectral response
from 290 to 1000 nm with a maximum readout rate of 8 fps. For consistent illumination of poultry carcasses, the lighting system consisted of a 150-watt quartz halogen
DC stabilized fiber-optic illuminator (Fiber-Lite A240, Dolan-Jenner, Inc., Lawrence,
MA), a lamp assembly, fiber-optic cables, and 10-inch quartz
halogen line lights (QF5048, Dolan-Jenner, Inc., Lawrence, MA).
5.3.2 Classification methods
Six supervised classification methods were examined in this study for selecting
optimum classifiers to identify contaminants on the surface of broiler carcasses: parallelepiped, minimum distance, Mahalanobis distance, maximum likelihood, spectral
angle mapper, and binary encoding classifier.
Parallelepiped classification uses a simple decision rule to classify hyperspectral data.
The decision boundaries form an n-dimensional parallelepiped in the image data space.
The dimensions of the parallelepiped are defined based upon a standard deviation
threshold from the mean of each selected class. If a pixel value lies above the low
threshold and below the high threshold for all n bands being classified, it is assigned to that class.
The minimum distance method uses the mean vectors of each endmember and calculates
the Euclidean distance from each unknown pixel to the mean vector for each class.
All pixels are classified to the nearest class unless a standard deviation or distance
threshold is specified, in which case some pixels may be unclassified if they do not
meet the selected criteria.
Maximum likelihood classification assumes that the statistics for each class in each
band are normally distributed, and calculates the probability that a given pixel belongs
to a specific class. Each pixel is assigned to the class that has the highest probability;
unless a probability threshold is selected, all pixels are classified.
The Mahalanobis distance classification is a direction-sensitive distance classifier that
uses statistics for each class. It is similar to the maximum likelihood classification, but
it assumes that all class co-variances are equal and therefore processing time is faster.
All pixels are classified to the closest region of interest (ROI) class unless a distance
threshold is specified, in which case some pixels may be unclassified if they do not
meet the threshold. For more details about these classification algorithms, readers are
referred to Richards and Jia (1999).
The spectral angle mapper (SAM) is a physically-based spectral classification
that uses an n-dimensional angle to match pixels to reference spectra. The algorithm
determines the spectral similarity between two spectra by calculating the angle between
the spectra, treating them as vectors in a space with dimensionality equal to the number
of bands. SAM compares the angle between the endmember spectrum vector and
each pixel vector in n-dimensional space. Smaller angles represent closer matches to
the reference spectrum. More details are presented in Kruse et al. (1993).
The binary encoding classification method encodes the data and endmember spectra
into 0s and 1s based on whether a band falls below or above the spectrum mean.
An exclusive OR function is used to compare each encoded reference spectrum with
the encoded data spectra, and a classification image is produced. For more details about
the binary encoding classification algorithm, see Mazer et al. (1988).
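Three of these decision rules, namely minimum distance, spectral angle mapper, and binary encoding, can be sketched in a few lines each because they need only a reference spectrum per class. The function names and the five-band spectra below are hypothetical, thresholds for leaving pixels unclassified are omitted, and the statistical classifiers (parallelepiped, maximum likelihood, Mahalanobis distance) are not shown.

```python
import numpy as np

def minimum_distance(pixels, means):
    """Index of the class mean with the smallest Euclidean distance to each pixel."""
    d = np.linalg.norm(pixels[:, None, :] - means[None, :, :], axis=2)
    return np.argmin(d, axis=1)

def spectral_angle(pixels, references):
    """Angle (radians) between every pixel spectrum and every reference spectrum."""
    dot = pixels @ references.T
    norms = (np.linalg.norm(pixels, axis=1)[:, None]
             * np.linalg.norm(references, axis=1)[None, :])
    return np.arccos(np.clip(dot / norms, -1.0, 1.0))

def spectral_angle_mapper(pixels, references):
    """Index of the reference spectrum with the smallest spectral angle."""
    return np.argmin(spectral_angle(pixels, references), axis=1)

def binary_encode(spectra):
    """Code each band as True when it lies above the mean of its own spectrum."""
    return spectra > spectra.mean(axis=1, keepdims=True)

def binary_encoding(pixels, references):
    """Index of the reference whose binary code differs in the fewest bands."""
    # XOR marks the bands where the two codes disagree.
    mismatches = np.logical_xor(binary_encode(pixels)[:, None, :],
                                binary_encode(references)[None, :, :]).sum(axis=2)
    return np.argmin(mismatches, axis=1)

# Made-up five-band spectra (illustration only).
references = np.array([[10.0, 20.0, 30.0, 40.0, 50.0],   # class 0: rising spectrum
                       [40.0, 45.0, 50.0, 45.0, 40.0]])  # class 1: flat, peaked spectrum
pixels = np.array([1.2 * references[0],                  # brighter copy of class 0
                   0.5 * references[1]])                 # darker copy of class 1
by_distance = minimum_distance(pixels, references)
by_angle = spectral_angle_mapper(pixels, references)
by_code = binary_encoding(pixels, references)
```

In this example the darker copy of class 1 is assigned to class 0 by the minimum distance rule but to class 1 by the other two rules, because the spectral angle and the binary code depend on the shape of a spectrum rather than on its overall brightness.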
After all supervised classification methods had been applied to the hyperspectral
ROI data, a post-classification method (a confusion matrix in this case) was applied
to select the optimum classification method for identifying fecal and ingesta contaminants.
For the assessment of classification accuracy, a confusion matrix was analyzed to
determine the accuracy of a classification result by comparing it
with ground truth ROI information. The kappa coefficient was also calculated to compare the accuracy of different classifiers. The kappa coefficient is an indicator of overall
agreement of a matrix and accounts for all the elements in a confusion matrix. The
kappa coefficient (κ) can be obtained by:
$$\kappa = \frac{N \sum_{k} \chi_{kk} - \sum_{k} \chi_{k\Sigma}\, \chi_{\Sigma k}}{N^{2} - \sum_{k} \chi_{k\Sigma}\, \chi_{\Sigma k}}$$
where $N$ = total number of pixels in all ground truth classes, $\sum_{k} \chi_{kk}$ = sum of confusion
matrix diagonals, $\chi_{k\Sigma}$ = sum of ground truth pixels in a class, and $\chi_{\Sigma k}$ = sum of
classified pixels in that class.
The kappa coefficient is always less than or equal to 1. A value of 1 implies perfect
agreement, and values less than 1 imply less than perfect agreement.
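The kappa coefficient and the overall accuracy can both be computed directly from a confusion matrix. The sketch below assumes a square matrix of pixel counts; the function names and the two-class matrix are made up for illustration.

```python
import numpy as np

def kappa_coefficient(confusion):
    """Kappa from a square confusion matrix of pixel counts."""
    confusion = np.asarray(confusion, dtype=float)
    n = confusion.sum()                 # total number of pixels
    diagonal = np.trace(confusion)      # correctly classified pixels
    # Sum over classes of (ground truth total) x (classified total).
    chance = np.sum(confusion.sum(axis=0) * confusion.sum(axis=1))
    return (n * diagonal - chance) / (n ** 2 - chance)

def overall_accuracy(confusion):
    """Fraction of pixels on the diagonal of the confusion matrix."""
    confusion = np.asarray(confusion, dtype=float)
    return np.trace(confusion) / confusion.sum()

# Made-up two-class confusion matrix (illustration only).
matrix = [[45, 5],
          [10, 40]]
kappa = kappa_coefficient(matrix)
accuracy = overall_accuracy(matrix)
```

Here 85 of the 100 pixels lie on the diagonal, but kappa is lower than the overall accuracy because it discounts the agreement expected from the row and column totals alone. The result does not depend on whether rows or columns hold the ground truth, since the two totals enter only as a product.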
5.3.3 Hyperspectral image characteristics for classification
In order to select the optimum classification method for fecal and ingesta contaminant identification on poultry broiler carcasses, six different supervised classification
methods were investigated and the results were compared. Figure 7.6 shows a typical
hyperspectral image of uncontaminated (Figure 7.6a) and surface contaminated ROIs
(Figure 7.6b). In this sample, 25 pixels were observed as being duodenum, 27 pixels
as ceca, 78 pixels as colon, 93 pixels as ingesta, and 195 pixels as skin. Actually, the
skin included breast, thigh, and wing for classification.
Figure 7.6 ROI of a corn-fed poultry carcass: (a) clean (uncontaminated); (b) fecal contaminant.
ROI: duodenum (25 pixels), ceca (27 pixels), colon (78 pixels), ingesta (93 pixels), skin (195 pixels).
Figure 7.7 Mean spectra of fecal and ingesta contaminant ROIs from corn-fed poultry broiler carcass. Axes: wavelength (nm) against reflectance (percent).
Figure 7.7 shows the corresponding mean spectrum for each ROI: duodenum, ceca,
colon, ingesta, thigh, breast, and wing. Typically, the
spectra from contaminants gradually increased with wavelength from 420 to 730 nm,
whereas the reflectance spectra of skin increased to about 520 nm but decreased and
then increased again from about 550 nm upwards. The reflectance spectra of skin were
much higher than those of the contaminants.
5.3.4 Comparison of classification methods
Figure 7.8 shows six different classification maps that allow visualization of results
of each classification method tested to identify fecal and ingesta contaminants on the surface of broiler carcasses. The parallelepiped classifier identified duodenum, ceca, and
colon with high accuracy. However, many ingesta pixels were misclassified as duodenum (Figure 7.8a). Most duodenum, cecal, and colon contaminants, except ingesta,
were also classified correctly by minimum distance classifier (Figure 7.8b). The Mahalanobis distance classifier also classified fecal contaminants with high accuracy, yet
most ingesta contaminants were misclassified as duodenum and uncontaminated skin
surfaces were also misclassified as duodenum (false positive) (Figure 7.8c). The results
of the maximum likelihood classifier were similar to those of the Mahalanobis distance classifier. The duodenum, cecal, and colon contaminants were classified with
a minimal misclassification rate. The misclassification of ingesta was much lower
than with the Mahalanobis distance classifier; however, many false positive pixels for
uncontaminated skin were found (Figure 7.8d). The spectral angle mapper classifier
also identified most fecal and ingesta contaminants with high classification accuracy.
However, with this classifier many pixels on the skin and vent area were misclassified as duodenum (Figure 7.8e). Even though its classification accuracy was not high
enough, the binary coding classifier also classified most fecal contaminants and ingesta.
For this classifier, many pixels on skin were misclassified as colon contaminants
(Figure 7.8f).
Figure 7.8 Classification maps from mean spectra of surface contaminant ROI from corn-fed poultry
carcasses: (a) parallelepiped classifier; (b) minimum distance classifier; (c) Mahalanobis distance classifier;
(d) maximum likelihood classifier; (e) spectral angle mapper classifier; (f) binary coding classifier. Each color
map represents duodenum (first row from top), ceca (second row), colon (third row), ingesta (fourth
row), skin (white), and unclassified or background (black).
5.3.5 Accuracy of classifiers for contaminant identification
Six different supervised classification methods were applied for the broiler carcasses
fed with three different feeds to compare the accuracy of classification methods for
selecting a robust classifier regardless of the diet fed to the poultry.
Table 7.2 shows the overall mean accuracies of each classification method as applied
to differently-fed broiler carcasses. The maximum likelihood and spectral angle
mapper classifiers identified fecal and ingesta contaminants with higher accuracy
than the other classifiers for all of the differently-fed broiler carcasses. For
the corn-fed carcasses, the classification accuracies ranged from 64.7 (parallelepiped)
to 92.3 (spectral angle mapper) percent. The mean accuracy of classifiers for milo-fed
carcasses was slightly lower than for corn-fed carcasses, with the accuracy ranging from
62.9 (binary coding) to 88.0 (maximum likelihood) percent. For wheat-fed carcasses,
the highest mean classification accuracy (91.2 percent) was again obtained from the
maximum likelihood classifier. Of the six supervised classification methods, the best
classifier for classifying fecal and ingesta contaminants was the maximum likelihood
method (90.2 percent), followed by the spectral angle mapper method (89.4 percent),
the minimum distance method (79.6 percent), the Mahalanobis distance method (70.3
percent), the parallelepiped method (66.0 percent), and the binary coding method (64.5 percent).
Table 7.2 Mean accuracy (percent) of classification methods for classifying feces and ingesta contaminants in
three differently-fed (corn, milo, and wheat) broiler carcasses.
Classifier               Corn             Milo             Wheat            Mean
Parallelepiped           64.70 (0.590)a   66.48 (0.612)    66.86 (0.615)    66.01 (0.606)
Minimum distance         79.73 (0.760)    78.75 (0.747)    80.41 (0.767)    79.63 (0.758)
Mahalanobis distance     69.21 (0.634)    70.41 (0.649)    71.33 (0.659)    70.32 (0.647)
Maximum likelihood       91.44 (0.899)    88.02 (0.859)    91.16 (0.895)    90.21 (0.884)
Spectral angle mapper    92.27 (0.908)    87.34 (0.849)    88.65 (0.865)    89.42 (0.874)
Binary coding            66.83 (0.607)    62.94 (0.563)    63.80 (0.574)    64.52 (0.581)
a Kappa coefficient values are given in parentheses.
The kappa coefficients in Table 7.2 indicate the overall agreement of a confusion matrix and
account for all of its elements; the same confusion matrix is used to calculate the overall
accuracy in the table. A kappa coefficient of 1 reflects perfect agreement between classification and ground truth data. The kappa coefficients confirmed that the optimum
classifiers were the spectral angle mapper classifier (0.908) for corn, and the maximum likelihood classifier for
both milo (0.859) and wheat (0.895), which indicated that those classifiers had very good
agreement in identifying each contaminant from different diets.
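The relationship between overall accuracy and the kappa coefficient can be illustrated with a short sketch. It assumes the usual Cohen's kappa definition, with rows of the confusion matrix holding ground-truth classes and columns holding assigned classes; the function name and the two-class example matrix are hypothetical and are not data from Table 7.2.

```python
def accuracy_and_kappa(confusion):
    """Return (overall accuracy, kappa) for a square confusion matrix.

    confusion[i][j] is the number of pixels whose ground-truth class is i
    and whose assigned class is j (an assumed layout).
    """
    n = len(confusion)
    if any(len(row) != n for row in confusion):
        raise ValueError("confusion matrix must be square")
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("confusion matrix has no samples")
    # Observed agreement: fraction of pixels on the diagonal (overall accuracy).
    observed = sum(confusion[i][i] for i in range(n)) / total
    # Chance agreement: built from every row and column total,
    # so all elements of the matrix contribute.
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(row[j] for row in confusion) for j in range(n)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    if expected == 1.0:
        raise ValueError("kappa is undefined when only one class is present")
    return observed, (observed - expected) / (1.0 - expected)


# Hypothetical two-class example (not data from Table 7.2).
example = [[45, 5],
           [10, 40]]
accuracy, kappa = accuracy_and_kappa(example)
print(f"accuracy = {accuracy:.2f}, kappa = {kappa:.2f}")
```

Because the chance-agreement term is subtracted, kappa is lower than the overall accuracy whenever some agreement would be expected by chance, which matches the pattern of the values in parentheses in Table 7.2.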
6 Conclusions
Food safety is an important issue for public health, because reducing the potential health
risks to consumers from human pathogens in food is the most important public concern. The Food Safety and Inspection Service (FSIS) of the USDA sets zero-tolerance
performance standards for two food safety categories during poultry processing: fecal contamination,
and infectious conditions such as septicemia and toxemia.
Along with global food safety issues, the FSIS is charged with protecting consumers
by ensuring safe and wholesome poultry and poultry products. The FSIS is pursuing
a broad and long-term scientifically-based strategy to improve the safety of poultry
and poultry products to better protect public health. For poultry plants to meet federal
food safety regulations and satisfy consumer demand while maintaining their competitiveness, the FSIS has recognized the need for new inspection technologies, such as
automated machine-vision based inspection systems.
Several different machine vision systems, including color, multispectral, and hyperspectral
imaging, have been developed and tested for poultry quality and safety inspection.
Machine vision is a solution for high-speed inspection; however, the complexity of
poultry carcasses requires advanced sensing capabilities. Multispectral imaging
is a good tool among these advanced techniques because it can detect both
unwholesomeness and contamination using two or more specific wavelengths that
reflect the condition of poultry carcasses. Along with selective image processing and
analysis software, a multispectral imaging system can be effectively implemented
in real time on poultry processing lines, at the speed the industry requires (currently
140 birds per minute).
Hyperspectral imaging is also an extremely useful tool to analyze thoroughly the
spectra of the surface of poultry carcasses, because hyperspectral image data contain
a wide range of spectral and spatial information. A hyperspectral imaging system
with simple image-processing algorithms could be effectively used for the detection of
both contaminants and infectious disease on the surface of broiler carcasses. Further
analyses of hyperspectral imagery enable identification of the type and sources of
various contaminants and systemic diseases, which can determine critical control points
to improve HACCP for the federal poultry safety program. Because the concerns of
today’s inspectors are broader, and include unseen hazards such as microbiological
and chemical contamination, hyperspectral imaging techniques will be widely used for
poultry quality and safety inspection.
Arp JH (1982) Pathology of spleen and liver in turkeys inoculated with Escherichia Coli.
Avian Pathology, 11, 263–279.
Calnek BW, Barnes HJ, Beard CW, Reid WM, Yoder HW (1991) Diseases of Poultry. Iowa
State University Press, pp. 386–484.
Chao K, Gates RS, Anderson RG (1998) Knowledge-based control systems for single stem
rose production – Part I: systems analysis and design. Transactions ASAE, 41 (4),
Chao K, Chen YR, Early H, Park B (1999) Color image classification systems for poultry
viscera inspection. Applied Engineering in Agriculture, 15 (4), 363–369.
Chao K, Park B, Chen YR, Hruschka WR, Wheaton FW (2000) Design of a dual-camera
system for poultry carcasses inspection. Applied Engineering in Agriculture, 16 (5),
Chao K, Chen YR, Hruschka WR, Park B (2001) Chicken heart disease characterization
by multispectral imaging. Applied Engineering in Agriculture, 17 (1), 99–106.
Chao K, Mehl PM, Chen YR (2002) Use of hyper- and multi-spectral imaging for detection
of chicken skin tumors. Applied Engineering in Agriculture, 18 (1), 113–119.
Chao K, Chen YR, Chan DE (2003) Analysis of Vis/NIR spectral variations of wholesome,
septicemia, and cadaver chicken samples. Applied Engineering in Agriculture, 19 (4),
Chao K, Chen YR, Ding F, Chan DE (2005) Characterizing wholesome and unwholesome
chickens by CIELUV color difference. Applied Engineering in Agriculture, 21 (4),
Chen YR, Hruschka WR (1998) On-line trials of a chicken carcass inspection system
using visible/near-infrared reflectance. ASAE Paper No. 983047, ASAE, St Joseph,
Chen YR, Massie DR (1993) Visible/near-infrared reflectance and interactance spectroscopy for detection of abnormal poultry carcasses. Transactions ASAE, 36 (3),
184 Quality Inspection of Poultry Carcasses
Chen YR, Huffman RW, Park B. (1996a) Changes in the visible/NIR spectra of chicken
carcasses in storage. Journal of Food Process Engineering, 19, 121–134.
Chen YR, Huffman RW, Park B, Nguyen M (1996b) Transportable spectrophotometer
system for on-line classification of poultry carcasses. Applied Spectroscopy, 50 (7),
Chen YR, Nguyen N, Park B, Chao K (1998a) Intelligent on-line training and updating of
automated poultry inspection system. ASAE Paper No. 983047, ASAE, St Joseph,
ChenYR, Nguyen M, Park B (1998b) Neural network with principal component analysis for
poultry carcass classification. Journal of Food Process Engineering, 21 (5), 351–367.
Chen YR, Hruschka WR, Early H (1998c) On-line inspection of poultry carcasses using
visible/near-infrared spectrophotometer. Proceedings of the SPIE, The International
Society of Optical Engineering, 3544, 146–155.
Chen YR, Park B, Huffman RW, Nguyen M (1998d) Classification of on-line poultry carcasses with back propagation neural networks. Journal of Food Process Engineering,
21, 33–48.
Chen YR, Hruschka WR, Early H (2000) A chicken carcass inspection system using
visible/near-infrared reflectance: in plant trials. Journal of Food Process Engineering,
23 (2), 89–99.
Clarke JK, Allan GM, Bryson DG, Willians W, Todd D, Mackie DP, McFerran JB (1990)
Big liver and spleen disease of broiler breeders. Avian Pathology, 19, 41–50.
Daley W, Rao T (1990) Color vision for industrial inspection. In Proceedings of Machine
Vision Association of Society of Manufacturing Engineers, MS90-600.
Daley W, Soulakos C, Thomson C, Millet R (1988) A novel application: machine vision
inspection, grading, and identification of chicken parts. In Proceedings of Robotics
and Vision ’88, Society of Manufacturing Engineers, Dearborn, Michigan, USA.
Daley W, Carey R, Thompson C (1994) Real-time color grading and defect detection of food
products. Proceedings of the SPIE, The International Society of Optical Engineering,
2345, 403–411.
Domermuth CH, Harris JR, Gross WB, DuBose RT (1978) A naturally occurring infection
of chickens with a hemorrhagic enteritis/marble spleen disease type of virus. Avian
Diseases, 23 (2), 479–484.
Evans MD, Thai CN, Grant JC (1997) Computer control and calibration of a liquid crystal
tunable filter for crop stress imaging. ASAE Paper No. 973141, ASAE, St Joseph, MI,
Fairchild MD (1998) Color Appearance Models. Reading: Addison Wesley.
Favier J, Ross DW, Tsheko R, Kennedy DD, Muir AY, Fleming J (1998) Discrimination of
weeds in brassica crops using optical spectral reflectance and leaf texture analysis. Proceedings of the SPIE,The International Society of Optical Engineering 3543, 311–318.
Hruschka WR (1987) Data analysis: wavelength selection methods. In Near-Infrared Technology in Agricultural and Food Industries (Williams P, Norris. K, eds). St Paul:
American Association of Cereal Chemists, pp. 35–55.
Hsieh C, Chen YR, Dey BP, Chan DE (2002) Separating septicemic and normal chicken
livers by visible/near-infrared spectroscopy and back-propagation neural networks.
Transactions ASAE, 45 (2), 459–469.
References 185
Ibarra JG, Tao Y, Newberry L, Chen YR (2002) Learning vector quantization for color
classification of diseased air sacs in chicken carcasses. Transactions ASAE, 45 (5),
Jang JR (1993) ANFIS: Adaptive-Network-based Fuzzy Inference System. IEEE Transactions on Systems, Man, & Cybernetics, 23(3), 665–683.
Kurse FA, Lefkoff AB, Boardman JB, Heidebrecht KB, Shapiro AT, Barloon PJ, Goetz AFH
(1993) The spectral image processing system (SIPS) – interactive visualization and
analysis of imaging spectrometer data. Remote Sensing of Environment, 44 (1),
LiuY, ChenYR (2001)Analysis of visible reflectance spectra of stored, cooked, and diseased
chicken meats. Meat Science, 58 (4), 395–401.
Liu Y, Fan X, Chen YR, Thayer DW (2003) Changes in structure and color characteristics
of irradiated chicken breasts as a function of dosage and storage time. Meat Science,
63 (3), 301–307.
Mao C, Heitschmidt J (1998) Hyperspectral imaging with liquid crystal tunable filter for
biological and agricultural assessment. Proceedings of the SPIE, The International
Society of Optical Engineering, 3543, 172–181.
Mazer AS, Martin M, Lee M, Solomon JE (1988) Image processing software for imaging
spectrometry analysis. Remote Sensing of Environment, 24 (1), 201–210.
Miller BK, Delwiche MJ (1989) A color vision system for peach grading. Transactions
ASAE, 34 (4), 1484–1490.
Mountney M (1987) US Department of Agriculture standards for processed poultry and
poultry products. In The Microbiology of Poultry and Meat Products (Cunningham
FE, Cox NA, eds). New York: Academic Press, Ch. 6.
Muir AY (1993) Machine vision and spectral imaging. Agricultural Engineering, 48 (4):
Nauck D, Kruse R. (1995) NEFCLASS – a neuro-fuzzy approach for the classification of
data. Proceedings of the Association for Computing Machinery Symposium On Applied
Computing, Nashville, 26–28 Feb. New York: ACM Press.
OSHA (1999) Chicken disassembly – ergonomic considerations. http://www.oshaslc.gov/SLTC/poultryprocessing. US Department of Labor, Washington, DC.
Park B, Chen YR (1994a) Intensified multi-spectral imaging system for poultry carcass
inspection. Transactions ASAE, 37 (6), 1983–1988.
Park B, Chen YR (1994b) Multispectral image textural analysis for poultry carcasses
inspection. ASAE Paper No. 946027, ASAE, St Joseph, MI, USA.
Park B, Chen YR (1998) Real-time multispectral image processing for poultry inspection.
ASAE Paper No. 983070, ASAE, St Joseph, MI, USA.
Park B, Chen YR, Whittaker AD, Miller RK, Hale DS (1994) Neural network modeling for
beef sensory evaluation. Transactions ASAE, 37 (5), 1547–1553.
Park B, Chen YR, Huffman RW (1995) Integration of visible/NIR spectroscopy and
multispectral imaging for poultry carcass inspection. Proceedings of the SPIE, The
International Society of Optical Engineering, 2345, 162–171.
Park B, Chen YR, Nguyen M, Hwang H (1996a) Characterizing multispectral images of
tumorous, bruised, skin-torn, and wholesome poultry carcasses. Transactions ASAE,
39 (5), 1933–1941.
186 Quality Inspection of Poultry Carcasses
Park B, Chen YR (1996b) Multispectral image co-occurrence matrix analysis for poultry
carcasses inspection. Transactions ASAE, 39 (4), 1485–1491.
Park B, Chen YR, Chao K (1998a) Multispectral imaging for detecting contamination
in poultry carcasses. Proceedings of the SPIE, The International Society of Optical
Engineering, 3544, 110–120.
Park B, Chen YR, Nguyen M (1998b) Multi-spectral image analysis using neural network
algorithm for the inspection of poultry carcasses. Journal of Agricultural Engineering
Research, 69, 351–363.
Park B, Lawrence KC, Windham WR, Buhr RJ (2002) Hyperspectral imaging for detecting fecal and ingesta contaminants on poultry carcasses. Transactions ASAE, 45 (6),
Park B, Lawrence KC, Windham WR, Smith DP, Feldner PW (2003). Machine vision for
detecting internal fecal contaminants of broiler carcasses. ASAE Paper No. 033051,
ASAE, St Joseph, MI, USA.
Precetti CJ, Krutz GW (1993) Real-time color classification system. ASAE Paper No.
933002, ASAE, St Joseph, MI, USA.
Richards JA, Jia X (1999) Remote Sensing Digital Image Analysis. Berlin: Springer-Verlag.
Sakar N, Wolfe RR (1985) Feature extraction techniques for sorting tomatoes by computer
vision. Transactions ASAE, 28 (3), 970–979.
Saputra D, Payne FA, Lodder RA, Shearer SA (1992) Selection of near-infrared wavelengths
for monitoring milk coagulation using principle component analysis. Transactions
ASAE, 35 (5), 1597–1605.
Schat KA (1981) Role of the spleen in the pathogenesis of Marek’s disease. Avian Pathology,
10, 171–182.
Swatland HJ (1989) A review of meat spectrophotometry (300 to 800 nm). Canadian
Institute of Food Science and Technology Journal, 22 (4), 390–402.
Tao Y, Morrow CT, Heinemann PH, Sommer JH (1990) Automated machine vision
inspection of potatoes. ASAE Paper No. 903531, ASAE, St Joseph, MI, USA.
Tao Y, Heinemann PH, Varghese Z, Morrow CT, Sommer III HJ (1995) Machine vision for
color inspection of potatoes and apples. Transactions ASAE, 38 (5), 1555–1561.
Tao Y, Shao J, Skeeles JK, Chen YR (1998) Spleen enlargement detection of eviscerated
turkey by computer vision. Proceedings of the SPIE, The International Society of
Optical Engineering, 3544, 138–145.
Tao Y, Shao J, Skeeles K, Chen YR (2000) Detection of splenomegaly in poultry carcasses
by UV and color imaging. Transactions ASAE, 43 (2), 469–474.
Throop JA, Aneshansley DJ (1995) Detection of internal browning in apples by light transmittance. Proceedings of the SPIE, The International Society of Optical Engineering,
2345, 152–165.
Tsuta M, Sugiyama J, Sagara Y (2002) Near-infrared imaging spectroscopy based on sugar
absorption for melons. Journal Agricultural Food Chemistry, 50 (1), 48–52.
USDA (1984) A review of the slaughter regulations under the Poultry Products Inspection
Act. Regulations Office, Policy and Program Planning, FSIS, USDA, Washington, DC.
USDA (1985) Meat and Poultry Inspection. Committee on the Scientific Basis of the
Nation’s Meat and Poultry Inspection Program. Washington, DC: National Academy
References 187
USDA (1996a) Key Facts: Economic impact analysis. USDA, FSIS, HACCP Rule-Economic
Analysis. Washington, DC: USDA/FSIS.
USDA (1996b) Pathogen Reduction; Hazard Analysis and Critical Control Point (HACCP)
Systems, Final Rule. Fed. Reg. 61: 38805–38855.
USDA (2005) Agricultural Statistics, National Agricultural Statistics Service,
Washington, DC.
Wen Z, Tao Y (1998) Fuzzy-based determination of model and parameters of dualwavelength vision system for on-line apple sorting. Optical Engineering, 37 (1),
Williamson SJ, Cummins HZ (1993) Light and Color in Nature and Art. New York: Wiley.
Windham WR, Lawrence KC, Park B, Buhr RJ (2003a) Visible/NIR spectroscopy for
characterizing fecal contamination of chicken carcasses. Transactions ASAE, 46 (3),
Windham WR, Smith DP, Park B, Lawrence KC, Feldner PW (2003b) Algorithm development with visible/near-infrared spectra for detection of poultry feces and ingesta.
Transactions ASAE, 46 (6), 1733–1738.
Yang CC, Chao K, Chen YR, Kim MS (2004) Application of multispectral imaging for
identification of systemically diseased chicken. ASABE Paper No. 043034.
Yang CC, Chao K, ChenYR, Kim MS, Early HL (2006) Simple multispectral image analysis
for systemically diseased chicken identification. Transactions ASAE, 49 (1), 245–257.
Quality Evaluation of Seafood
Murat O. Balaban¹, Asli Z. Odabaşi¹, Sibel Damar¹ and Alexandra C.M. Oliveira²
¹ University of Florida, Food Science and Human Nutrition Department,
PO Box 110370, Gainesville, FL 32611-0370, USA
² Fishery Industrial Technology Center, University of Alaska Fairbanks,
Kodiak, AK 99615, USA
1 Introduction
Quality attributes of seafood include appearance (size, shape, color), smell, taste,
nutritional aspects, and safety-related properties. Machine vision (MV) can potentially
evaluate all these attributes. Smell and taste are the most difficult to evaluate with MV,
although volatile attributes can be related to color for analysis (Rakow and Suslick,
2000; Suslick and Rakow, 2001) by inducing color changes in an array of dyes and
permitting visual identification. Nutrition can also be evaluated as far as some proximate composition components are concerned (such as moisture content and fat) using,
for example, near infrared (Wold and Isakkson, 1997). Pinbones, shell fragments, and
other undesirable matter can also be recognized by MV (Graves, 2003). Direct measurement of safety (microbial, chemical, metal fragments, etc.) is currently difficult
using MV.
Visual attributes of seafood will be discussed in this chapter. They include size,
shape, and color. Brief literature examples will be given for each, and some of the
research performed in our laboratory will be presented.
2 Visual quality of seafood
2.1 Size
2.1.1 Literature
Arnarson (1991) describes a system to sort fish and fish products by machine vision.
The difficulties of sorting fish are listed as: fast speed requirements, number of species,
variation of the size and shape of each species, variation of the optical characteristics of
each fish, the elastic nature of fish, and the harsh environment in factories. The length
measurement of cod was accomplished by measuring the distance between the middle
of the tail and the top of the head of a straight fish. The accuracy of length measurement
was ±0.9 cm, independent of size and orientation. Bent fish were allowed. The algorithm
detected each fish, drawing a rectangle around it. It then detected the head, the tail,
and the belly, and drew a line from the middle of the tail to the top of the head, taking
into account the position of the belly. Each fish required an analysis time of between
0.6 and 0.8 seconds.
Computer Vision Technology for Food Quality Evaluation
ISBN: 978-0-12-373642-0
Copyright © 2008, Elsevier Inc.
All rights reserved
The most accurate method to measure free-swimming fish is to use a stereo video
system (Batty, 1989; Champalbert and Direach-Boursier, 1998), where the space in
front of two cameras can be calibrated in three dimensions using landmark points. Free-swimming fish can then be viewed from different angles and orientations. However,
these sophisticated systems require a great deal of computing power, and work better
on relatively large fish.
Martinez-Palacios et al. (2002) used a video camera and recorder to measure the
length of larval and juvenile white fish (Chirostoma estor estor). Juvenile fish were
placed in a Petri dish with water, and images were taken from the top with a calibration
grid of 1-mm lines. The dry weight of the fish was correlated to length using a logarithmic equation with r² = 0.99. The average weight estimation error was 2 percent.
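A correlation of this kind can be sketched as a least-squares fit on logarithms. The exact form of the logarithmic equation is not given here, so the power-law form ln(weight) = a + b ln(length) below is an assumption, and the function name and the sample values are hypothetical.

```python
import math


def fit_log_log(lengths, weights):
    """Fit ln(weight) = a + b * ln(length) by ordinary least squares.

    Returns (a, b, r_squared). The log-log form is an assumed model.
    """
    if len(lengths) != len(weights) or len(lengths) < 2:
        raise ValueError("need at least two paired observations")
    xs = [math.log(v) for v in lengths]
    ys = [math.log(v) for v in weights]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("all lengths are identical")
    b = sxy / sxx
    a = mean_y - b * mean_x
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r_squared = 1.0 - ss_res / ss_tot if ss_tot > 0 else 1.0
    return a, b, r_squared


# Hypothetical lengths (mm) and dry weights (mg), for illustration only.
a, b, r2 = fit_log_log([8.0, 12.0, 20.0, 35.0], [0.6, 2.1, 9.5, 52.0])
print(f"ln(weight) = {a:.3f} + {b:.3f} ln(length), r2 = {r2:.3f}")
```

Once fitted, a weight estimate for a measured length follows as exp(a + b ln(length)).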
Oysters are generally sold by volume. Accurate oyster grading is critical for pricing,
and oysters are typically graded and sorted by humans before and after shucking. This
is a labor-intensive, time-consuming, and subjective process. Oysters do not have a
regular shape (Diehl et al., 1990), and the dimensions and the overall thickness of
oysters, the thickness of their shells, and the amount of meat vary depending on the
species, location, age, pre- or post-spawning, and individual oyster (Li, 1990; Hamaoka
and Sasaki, 1992). It is clearly desirable to accurately predict the volume or weight of
oysters for sorting and pricing. Machine vision, being a fast, accurate, and non-contact
method of grading various food products, can also be applied to oysters.
Parr et al. (1994) developed a machine-vision based sorting and grading system
for oyster meat. In a continuous system, one oyster could be graded into one of three
grades in 2 seconds. Tojeiro and Wheaton (1991) developed a system, based on a
black-and-white video camera and a mirror, to obtain the top and side views of an
oyster simultaneously. Further, they also developed software to determine the ratio
of thicknesses about 1.5 cm from each end to decide on the hinge side. The method
oriented 233 oysters with a correct rate of 98.2 percent. Li and Wheaton (1992) used
a Wheaton shucking machine to trim the hinge-ends of oysters and obtained images
using a video camera. A pattern recognition technique was used to locate oyster hinge
lines, with an error rate of 2.5 percent. In 2002, So and Wheaton published results of
their latest software development efforts to automate oyster hinge-line detection using
machine vision. This time a color camera was used. The software calculated circularity,
rectangularity, aspect ratio, and Euclidean distance to distinguish the hinge from other
dark objects on the hinge end of the oyster. Lee et al. (2001) used a laser-line based
method to predict the volume of oyster meat. The thickness information was gathered
by the shape of the laser line on the meat. The predicted volume was compared to the
experimentally determined volume, where the correlation coefficient was 0.955.
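The shape features named above for hinge recognition (circularity, rectangularity, and aspect ratio) can be computed from an object outline. The sketch below uses common textbook definitions, which are assumptions because the definitions used by So and Wheaton are not given here: circularity as 4πA/P², rectangularity as object area over axis-aligned bounding-box area, and aspect ratio as the longer box side over the shorter one. The function name is hypothetical.

```python
import math


def shape_descriptors(vertices):
    """Return circularity, rectangularity, and aspect ratio of a polygon.

    vertices is a list of (x, y) outline points in order around the object.
    """
    n = len(vertices)
    if n < 3:
        raise ValueError("a polygon needs at least three vertices")
    twice_area = 0.0
    perimeter = 0.0
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        twice_area += x0 * y1 - x1 * y0          # shoelace formula term
        perimeter += math.hypot(x1 - x0, y1 - y0)
    area = abs(twice_area) / 2.0
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    width = max(xs) - min(xs)
    height = max(ys) - min(ys)
    if area == 0 or width == 0 or height == 0:
        raise ValueError("degenerate outline")
    return {
        "circularity": 4.0 * math.pi * area / perimeter ** 2,
        "rectangularity": area / (width * height),
        "aspect_ratio": max(width, height) / min(width, height),
    }


# A 2 x 1 rectangle: rectangularity 1, aspect ratio 2, circularity below 1.
print(shape_descriptors([(0, 0), (2, 0), (2, 1), (0, 1)]))
```

A feature vector built from such descriptors could then be compared with a reference hinge vector by Euclidean distance, although how the distance was applied in the cited work is not stated here.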
Figure 8.1 Weight of different white shrimp forms vs view area obtained from a machine vision system. Axes: weight (g) vs view area (thousands of pixels); labeled series include head off and tail off.
2.1.2 Determination of shrimp weight, count, and uniformity ratio
Quality evaluation practice for shrimp involves a trained inspector who weighs a shrimp
sample, counts the number of shrimp, and calculates the count (number/unit weight)
and uniformity ratio UR (a weight ratio of the largest 10 percent of shrimp to the smallest
10 percent of shrimp). The inspector looks for visible defects such as melanosis (black
spots formed by enzymatic activity), foreign material, shell parts, and broken pieces.
This subjective practice can be automated. Luzuriaga et al. (1997) developed calibration
relationships of the view area obtained by MV vs the weight of intact, headless, peeled
tail-on, and peeled tail-off white shrimp (Penaeus setiferus). Head-on, non-frozen
shrimp were placed individually in the light box described by Balaban et al. (1994),
and an image was acquired. The view area of the shrimp in pixels was measured. The
shrimp was then weighed; 100 shrimp were processed in this way. The same procedure
was repeated three times for the same 100 shrimp after they had been deheaded, peeled,
and the tail removed, respectively. The results are shown in Figure 8.1.
Several equations were tested to correlate the weight to the view area in pixels. The
best fits are shown in Table 8.1. It is evident that for a shrimp that is not split, weight
can be accurately predicted by view area. Once weight is determined, then count and
uniformity ratio are easy to calculate.
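The calculation of count and uniformity ratio from estimated weights can be sketched as follows. The fit form y = a + bx^c follows Table 8.1, but the coefficient values used in the example, the function names, and the choice of 1000 g as the unit weight are assumptions made for illustration.

```python
def weight_from_view_area(area_px, a, b, c=1.5):
    """Estimate weight (g) from view area (pixels) using y = a + b * x**c."""
    return a + b * area_px ** c


def count_and_uniformity_ratio(weights_g, unit_weight_g=1000.0):
    """Return (total weight, count per unit weight, uniformity ratio).

    The uniformity ratio is the weight of the largest 10 percent of shrimp
    divided by the weight of the smallest 10 percent.
    """
    n = len(weights_g)
    if n == 0:
        raise ValueError("no shrimp weights given")
    ordered = sorted(weights_g)
    k = max(1, n // 10)                 # number of shrimp in each 10 percent tail
    total = sum(ordered)
    count = n / total * unit_weight_g   # number of shrimp per unit weight
    uniformity_ratio = sum(ordered[-k:]) / sum(ordered[:k])
    return total, count, uniformity_ratio


# Hypothetical view areas (pixels) and hypothetical fit coefficients.
areas = [9000, 9500, 10000, 10500, 11000, 11500, 12000, 12500, 13000, 13500]
weights = [weight_from_view_area(x, a=0.0, b=2.0e-5) for x in areas]
total, count, ur = count_and_uniformity_ratio(weights)
print(f"total = {total:.1f} g, count = {count:.1f} per kg, UR = {ur:.2f}")
```

A uniformity ratio close to 1 indicates shrimp of similar size; larger values indicate a wider spread between the largest and smallest individuals.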
In industrial practice, one issue would be whether shrimp were touching or partially
blocking each other. There are applications of estimating the weight of shrimp by
machine vision in the industry – for example, the Marel (Reykjavik, Iceland) Model
L-10 “Vision Weigher” for shrimp processing. Once calibrated, the system estimates
the weight of a shrimp from its view area.
2.1.3 Oyster volume
Damar et al. (2006) experimentally measured volumes (overall, shell, meat) of oysters
from Florida and Texas (Crassostrea virginica) and from Alaska (Crassostrea gigas) using the
Archimedes principle. Using a machine vision system, the top and side view images
of whole oysters were captured (Figure 8.2), and the actual view areas were calculated
by calibrating pixel area with that of a known square.
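The calibration step amounts to a simple proportion between pixel counts and the area of the reference square. The sketch below assumes that the square and the oyster are imaged at the same distance and scale; the function name and the numbers are hypothetical.

```python
def view_area_cm2(object_pixels, square_pixels, square_area_cm2):
    """Convert an object's pixel count to cm^2 using a reference square.

    square_pixels is the pixel count of a square of known area imaged
    under the same conditions (assumed same distance and magnification).
    """
    if square_pixels <= 0:
        raise ValueError("reference square must cover at least one pixel")
    area_per_pixel = square_area_cm2 / square_pixels
    return object_pixels * area_per_pixel


# Hypothetical example: a 4 cm^2 square covers 1600 pixels,
# and the oyster top view covers 20000 pixels.
print(view_area_cm2(20000, 1600, 4.0))
```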
Table 8.1 Experimentally determined and estimated total weight, count, and uniformity ratio values
for different forms of white shrimp. y = weight (g), x = view area (pixels).
Form (sample size)      (n = 97)       (n = 99)       (n = 101)      Tail off (n = 100)
Fit: y = a + bx^c       1.96 × 10⁻⁷    3.40 × 10⁻⁵    2.05 × 10⁻⁶    6.15 × 10⁻⁹
Fit: y = a + bx^1.5     1.57 × 10⁻⁵    1.94 × 10⁻⁵    2.21 × 10⁻⁵    2.63 × 10⁻⁵
(The table also lists R², total weight (g), and uniformity ratio for each fit; those values are missing here.)
The oyster image was divided into an even number of volume slices of equal thickness
between points p1 and p2. The sum of all the volume slices would give the total volume.
Coordinates of points a, b, c, and d were determined from the image. The distance
between a and b along the X axis, and the distance between c and d along the Y axis,
are shown in Figure 8.2. Points c and d were assumed to be at the midpoint of points a
and b along the X axis. Therefore:
\[ c_x = \frac{a_x + b_x}{2}, \qquad d_x = c_x \]
The cross-sectional area at each volume slice (shown in Figure 8.2) was found by fitting
a cubic spline to points a, c, and b, and another to points a, d, and b. The cross-sectional
area formed by these two curves was calculated as:
\[ \text{Cross-sectional area} = \frac{5}{8}\,(b_x - a_x)(d_y - c_y) \]
Figure 8.2 Determination of oyster volume using cubic splines.
These cross-sectional areas were integrated along the Z axis using Simpson’s method:
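\[ \text{Volume} = \frac{h}{3}\left[\left(\sum_{i=1,\ i\ \text{even}}^{n} 4\,\text{area}_i\right) + \left(\sum_{i=1,\ i\ \text{odd}}^{n} 2\,\text{area}_i\right)\right] \]
where n is the number of cross sections, and \( h = (p_{2z} - p_{1z})/n \).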
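The slice-and-integrate procedure can be sketched in a few lines. Two assumptions are made here. First, the cross-sectional area uses the factor 5/8, which is what two natural cubic splines through three equally spaced points enclose. Second, the integration uses the standard composite Simpson weights (1, 4, 2, ..., 4, 1) with zero-based indexing and an even number of slices, which may index the cross sections differently from the printed formula. Function names and the sample measurements are hypothetical.

```python
def cross_section_area(ax, bx, cy, dy):
    """Area enclosed by two natural cubic splines through a, c, b and a, d, b.

    Assumes c and d lie at the midpoint of a and b along X, so the enclosed
    area is 5/8 of the bounding rectangle (width times height).
    """
    return 5.0 / 8.0 * (bx - ax) * (dy - cy)


def simpson_volume(areas, z_start, z_end):
    """Integrate equally spaced cross-sectional areas with Simpson's rule.

    areas holds the areas at both end points and at every slice boundary,
    so len(areas) - 1 is the (even) number of slices.
    """
    slices = len(areas) - 1
    if slices < 2 or slices % 2 != 0:
        raise ValueError("Simpson's rule needs an even number of slices")
    h = (z_end - z_start) / slices
    total = areas[0] + areas[-1]
    total += 4.0 * sum(areas[1:-1:2])   # odd interior indices
    total += 2.0 * sum(areas[2:-1:2])   # even interior indices
    return h / 3.0 * total


# Hypothetical measurements (cm) at five positions between p1 and p2:
# (ax, bx, cy, dy) for each cross section; the ends taper to zero area.
sections = [(0, 0, 0, 0), (0, 4, 0, 2), (0, 6, 0, 3), (0, 4, 0, 2), (0, 0, 0, 0)]
areas = [cross_section_area(*s) for s in sections]
print(f"volume = {simpson_volume(areas, 0.0, 8.0):.2f} cm3")
```

Because the oyster tapers to points at p1 and p2, the end areas are zero in this example and only the interior slices contribute.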
Texas oysters had an average experimental volume of 66.52 ± 18.90 cm³ and an
average calculated volume of 67.33 ± 19.70 cm³. Figure 8.3 shows the predicted and
experimental volumes of Texas oysters (r² = 0.93). Total oyster volume, meat volume,
and meat weight were also correlated with the view areas. However, more research
is needed in this area to validate this method in oysters from different locations and
seasons, and with different spawning status. This method can potentially be used to
sort oysters on a conveyor line.
2.2 Shape
2.2.1 Literature
Williams et al. (2006) used underwater video from fish-farm cages to detect salmon
in the collected images. The 256 gray-level images were contrast-enhanced, and the background was removed by segmentation. An active shape model technique was applied:
a collection of labeled points is determined to define the boundaries of a specified shape.
Figure 8.3 Comparison of oyster volume calculated by cubic splines and measured experimentally. Axes: calculated volume (cm³) vs real volume (cm³).
During training, statistical variation between the points is determined. A model representing the average appearance in the training set is obtained from the mean values
of each point. This results in a point distribution model with a number of parameters
that can be altered during a search to identify a shape even when it is deformed. From
125 initial fish images, 65 (or 52 percent) were correctly matched using the salmon
shape model. Shadows, and fish swimming towards or away from the camera, created
problems, as well as segmentation inaccuracies.
In the fisheries and seafood area, prawns can be automatically graded and packaged
into a single layer with the same orientation by combining machine vision and robotics
(Kassler et al., 1993).
Morphological and spectral features of shrimp can be determined to find the optimum
location for removal of shrimp heads (Ling and Searcy, 1989).
Fish species can be sorted according to the shape, length, and orientation of the fish
in a processing line (Strachan et al., 1990; Strachan, 1993). Digital image processing
of fall chum salmon was used to find an objective criterion to predict the flesh redness
from the spawning coloration (Hatano et al., 1989).
2.2.2 Evaluation of rigor mortis in sturgeon
A new method to determine the onset and resolution of rigor in cultured Gulf sturgeon (Acipenser oxyrinchus desotoi) was developed using analysis of video images
(Oliveira et al., 2004). Insight into the progress of rigor through the fish body was
provided. At 10 different time intervals, from 0 to 67 hours after death, fish were temporarily secured to the edge of a table by the head, with the body free to droop, and
video images were taken (Figure 8.4).
The extent of deflection of various points along the body length was analyzed. New
parameters based on maximum deflection and integral deflections were developed. The
displacements in the horizontal and vertical directions of various points along the spine
were measured by this method. Therefore, the times at which a particular point entered
rigor, reached maximum rigor, and left rigor could be observed (Figures 8.5
and 8.6). For example, a point along the spine 66 percent of the total distance from the
head entered rigor 27 hours after death, maximum rigor was reached after 31 hours,
and rigor was resolved at the sixty-seventh hour after death (Figure 8.5). The tail also
entered rigor 27 hours after death; however, maximum rigor was reached after 46 hours,
while resolution was again at the sixty-seventh hour (Figure 8.6).
Figure 8.4 Experimental set-up for the measurement of rigor mortis in Gulf sturgeon. Labeled dimensions: 85 cm, 60 cm, 120 cm, and 50 cm; origin at point (0,0).
Figure 8.5 Movement of a point on the spine 66% of the length of the fish from the head, over time. Axes: X values (inches) vs Y values (inches); labeled times: 13, 21, 36, 46, 52, and 59 h, with onset at 27 h, rigor maximum at 31 h, and resolution at 67 h.
2.3 Color
Color is one of the most important visual quality attributes of foods. The first purchasing
decision regarding acceptance or rejection of a food generally depends on its color.
Figure 8.6 Movement of the tail of the fish, over time. Axes: X values (in) vs Y values (in); labeled times: 13, 21, 31, 36, 52, and 59 h, with onset at 27 h, rigor maximum at 46 h, and resolution at 67 h.
Machine vision has unique capabilities in measuring color, especially of foods of nonuniform color and surface characteristics. This section provides brief examples from
the literature regarding color evaluation of seafood.
2.3.1 Color space
Color machine vision systems generally capture images in the red, green, blue (RGB)
color system as 24-bit images. Each color axis is allocated 8 bits, resulting in 256 different values. This gives 16.7 million possible color combinations (256 × 256 × 256).
Because 16.7 million colors are difficult to handle, the number of colors in the color
space was reduced by dividing each color axis (red, green,
blue) into 4, 8, or 16 parts (Luzuriaga et al., 1997). In the three-dimensional color space this
resulted in 64, 512, or 4096 “color blocks” (Figure 8.7). Any color in the block was
represented by the center color of that block.
This effectively reduced the number of colors from 16.7 million to a manageable
number. It was expected that some loss of information would occur. Indeed, for the
64 color-block system, some “averaging” of the colors occurred (Figure 8.8), resulting
in patchy and artificial colors. However, the 4096 color-block system was visually
indistinguishable from the real image.
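The block reduction described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used by Luzuriaga et al. (1997); the function names `block_center` and `block_index` and the integer numbering of the blocks are assumptions introduced here for the example.

```python
def block_center(rgb, divisions):
    """Map an 8-bit (R, G, B) triple to the center color of its color block."""
    width = 256 // divisions               # gray levels covered by one block per axis
    return tuple((c // width) * width + width // 2 for c in rgb)


def block_index(rgb, divisions):
    """Return an integer id in 0 .. divisions**3 - 1 (numbering is an assumption)."""
    width = 256 // divisions
    r, g, b = (c // width for c in rgb)
    return (r * divisions + g) * divisions + b


if __name__ == "__main__":
    for d in (4, 8, 16):
        print(d, "divisions per axis give", d ** 3, "color blocks")
    # Center color and block id of one arbitrary pixel in the 64-block system
    print(block_center((200, 30, 77), 4), block_index((200, 30, 77), 4))
```

With 4 divisions each block spans 64 levels per axis, so every pixel in a block is replaced by a color up to 32 levels away on each axis, which is consistent with the patchy appearance reported for the 64-block system.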
The software for most machine vision applications can also convert the color
information from one system to another. Typical color systems include XYZ,
Hue–Saturation–Lightness, Munsell, RGB, and L*a*b*.
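As an illustration of such a conversion, the sketch below turns an 8-bit RGB value into L*a*b* values. The chapter does not state which RGB primaries or reference white the cited systems used, so the sRGB transfer function and the D65 white point are assumptions made here; a calibrated camera would need its own transformation.

```python
def srgb_to_lab(rgb):
    """Convert an 8-bit sRGB triple to CIE L*a*b* (assumes sRGB primaries, D65 white)."""

    def linearize(c):
        # Undo the sRGB transfer function
        c = c / 255.0
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    r, g, b = (linearize(c) for c in rgb)

    # Linear RGB to CIE XYZ (sRGB matrix, D65)
    x = 0.4124564 * r + 0.3575761 * g + 0.1804375 * b
    y = 0.2126729 * r + 0.7151522 * g + 0.0721750 * b
    z = 0.0193339 * r + 0.1191920 * g + 0.9503041 * b

    xn, yn, zn = 0.95047, 1.0, 1.08883     # D65 reference white

    def f(t):
        delta = 6 / 29
        return t ** (1 / 3) if t > delta ** 3 else t / (3 * delta ** 2) + 4 / 29

    fx, fy, fz = f(x / xn), f(y / yn), f(z / zn)
    return 116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz)


if __name__ == "__main__":
    print(srgb_to_lab((255, 255, 255)))    # white: L* close to 100, a* and b* close to 0
```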
Figure 8.7 Formation of “color blocks” by dividing the RGB axes into different parts. (A color version can be viewed at http://books.elsevier.com/companions/9780123736420)
2.3.2 Shrimp color
Luzuriaga et al. (1997) objectively measured melanosis levels in white shrimp (Penaeus setiferus) stored on ice for up to 17 days, in order to correlate these levels with the grades assigned by a human expert and to quantify the color changes occurring during storage. The inspector visually evaluated each sample and graded it for melanosis on a scale from 0 = none to 10 = high (Otwell and Marshall, 1986), assigning a range of values such as 1–2 or 5–6. As soon as the inspector’s evaluation was complete, the shrimp were analyzed by the MV system to measure the percentage area with melanosis. Images of both sides of the shrimp were analyzed, and the melanosis values were averaged. The trained inspector identified the black and the dark areas as melanotic, and the RGB colors of these areas were determined. The corresponding dark color blocks were added to the melanosis analysis: six color blocks from the 64 used in this system were chosen as candidates to be included in melanosis calculations. Table 8.2 shows the specifications of these colors.
When these colors were included in the analysis of melanosis by MV, the correlation
with the inspector’s evaluation of the shrimp was r² = 0.68 (Figure 8.9). The change
of the melanotic colors over storage time is shown in Figure 8.10.
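The percentage-area measurement described for the shrimp can be sketched as follows. The ids in `MELANOTIC_BLOCKS` are hypothetical placeholders, since the actual R, G, and B specifications of the six melanotic blocks are not reproduced here; the block numbering and the function names are likewise assumptions for the example.

```python
# Hypothetical ids of six "melanotic" blocks in a 64-block system (placeholders only)
MELANOTIC_BLOCKS = {0, 1, 16, 17, 20, 21}


def block_index(rgb, divisions=4):
    """Integer id of the color block containing an 8-bit RGB pixel."""
    width = 256 // divisions
    r, g, b = (c // width for c in rgb)
    return (r * divisions + g) * divisions + b


def percent_melanosis(pixels, melanotic=MELANOTIC_BLOCKS):
    """Percentage of shrimp pixels (background already removed) in melanotic blocks."""
    pixels = list(pixels)
    dark = sum(1 for p in pixels if block_index(p) in melanotic)
    return 100.0 * dark / len(pixels)


def shrimp_melanosis(side_a, side_b):
    """Average the melanosis percentage over the images of both sides."""
    return (percent_melanosis(side_a) + percent_melanosis(side_b)) / 2


if __name__ == "__main__":
    # Made-up pixels: a few very dark ones among light flesh-colored ones
    side_a = [(10, 10, 10)] * 3 + [(200, 180, 170)] * 7
    side_b = [(10, 10, 10)] * 1 + [(200, 180, 170)] * 9
    print(shrimp_melanosis(side_a, side_b))
```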
Figure 8.8 Comparison of 64, 512, and 4096 color blocks with the original image. (Panels: original picture, 64 color blocks, 512 color blocks, 4096 color blocks. A color version can be viewed at http://books.elsevier.com/companions/9780123736420)
Table 8.2 Specification of melanotic and light color blocks. (Columns: color block no., R value, G value, and B value, given separately for melanotic colors and light colors.)
Figure 8.9 Comparison of melanosis grade from human inspector, MV system, and the melanotic areas of shrimp stored on ice. (Axes: melanosis grade and % area as melanosis vs. storage time on ice (days); series: human expert grade, MV grade, % area.)
Figure 8.10 Change of “dark” (melanotic) and “light” colors of shrimp over storage time. (Axes: % area vs. color block and storage (days); storage days 0, 3, 7, 9, 13, 15, and 17.)
2.3.3 Color evaluation of carbon-monoxide treated seafood
Carbon monoxide (CO) is known to bind to hemoglobin and myoglobin, resulting in a cherry-red color (Kristinsson et al., 2006). Excessive use of CO may result in an unnatural color in seafood products such as tuna, mahi mahi, tilapia, and catfish. Balaban et al. (2006) studied the possibility of “improving” the color of tuna and mahi mahi, starting with inferior-quality fish. An MV system was used to quantify the color of the fish before and after treatments, and during refrigerated storage. Since tuna has a fairly uniform color, a colorimeter can be used to quantify its color. The advantage of the MV system becomes evident when analyzing the color of mahi mahi, which has a dark muscle strip surrounded by light muscle. When using a colorimeter, the number and position of the measurement locations and the aperture size of the colorimeter affect the accuracy of the average color values calculated. The MV system can analyze all the pixels of the sample, eliminating the limitations of sampling frequency, size, and location mentioned above. The authors found that it was possible to start with inferior-quality fish, treat it with 100 percent CO, and “improve” its color to be comparable to or better than that of fish of good quality. An example of the images and resulting L*, a*, and b* values for mahi mahi is shown in Figure 8.11.
Figure 8.11 Quantification of color changes in original, refrigerated (1 week), and refrigerated (1 week) then CO-treated mahi mahi, by MV. (Panel title: Mahi Mahi (enhancement with CO); labels: 1 week on ice, 100% CO.) (A color version can be viewed at http://books.elsevier.com/
2.3.4 Sorting whole salmon by skin color
Pink salmon (Oncorhynchus gorbuscha) is sorted visually and grades are assigned to
determine price. Typical fish with the grades from A to F (decreasing value) are shown
in Figure 8.12. The typical indices of quality are the brightness of the skin color, and
the lack of “watermarking” on the skin of the fish. Since fatigue and resulting errors
occur on human inspection lines, it is desirable to apply MV sorting of the intact fish
by skin appearance.
A study was initiated in 2003 into the accurate sorting of pink salmon (Oliveira et al., 2006a). An expert evaluated 94 fish and assigned a grade to each. This initial grading resulted in 21 fish of AB grade, 23 fish of CD, 8 fish of DE, 32 fish of E, and 10 fish of F grade. After grading, each fish was placed in a light box, similar to that described in Luzuriaga et al. (1997) but with larger dimensions to accommodate whole fish, and an image was acquired with MV. The first approach was to calculate the average L*, a*, and b* of each image by averaging these values over all pixels of each fish. The average L* values were compared to the grades assigned by the expert. Figure 8.13 shows that there was too much variation in the average L* value of the whole fish to allow for accurate prediction of the grade.
Figure 8.12 Grading of pink salmon by skin color with emphasis on “watermarking”. (A color version can be viewed at http://books.elsevier.com/companions/9780123736420)
Figure 8.13 L* values averaged over the whole surface of pink salmon with different human expert assigned grades. (Axes: average L* value, whole fish vs. experimental grade.)
The next trial was to quantify the percentage of the total fish surface with an L* value greater than a threshold value between 60 and 90 (Balaban et al., 2005). This was accomplished using LensEye software (Gainesville, FL). Averages were taken for each grade (Figure 8.14). A threshold level of L* > 85 was chosen since it had the smoothest line. The percentage surface area with L* > 85 for each fish was plotted against the experimental grade (Figure 8.15). The objective was to analyze the image of a fish obtained from MV, calculate the percentage surface with L* > 85, locate this value on the Y axis of Figure 8.15, and predict the grade using the regression line. However, the scatter of the data clearly precluded accurate prediction of the grade.
Figure 8.14 Percentage of the total surface area of whole pink salmon assigned different grades by human expert. Different threshold results are shown. (Axes: % surface area with L* threshold vs. experimental grade.)
Figure 8.15 Linear regression of percentage surface area with L* > 85 vs human evaluation of experimental grade. (Axes: % surface with L* 85 vs. experimental grade; printed regression coefficients 54.38 and 4.09, R² = 0.61.)
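The threshold-and-regression approach can be sketched as below. This is only an illustration under stated assumptions: the grades are coded as the numbers 1 to 5 (a coding chosen here, not taken from the study), the sample percentages are invented, and `percent_above`, `fit_line`, and `predict_grade` are hypothetical helper names. It is not a reproduction of the LensEye analysis.

```python
def percent_above(l_values, threshold):
    """Percentage of pixels whose L* value exceeds the threshold."""
    l_values = list(l_values)
    return 100.0 * sum(1 for v in l_values if v > threshold) / len(l_values)


def fit_line(xs, ys):
    """Ordinary least squares fit y = intercept + slope * x; also returns R^2."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return intercept, slope, 1 - ss_res / ss_tot


def predict_grade(percent, intercept, slope):
    """Read a (numeric) grade off the regression line for a measured percentage."""
    return (percent - intercept) / slope


if __name__ == "__main__":
    grades = [1, 2, 3, 4, 5]                    # assumed numeric coding of the grades
    percents = [50.0, 47.0, 41.0, 39.0, 33.0]   # invented % surface with L* > 85
    intercept, slope, r2 = fit_line(grades, percents)
    print(intercept, slope, r2)
    print(predict_grade(45.0, intercept, slope))
```

A low R², such as the 0.61 reported in Figure 8.15, means that the grade read off the line for a single fish carries a large uncertainty, which is why the whole-fish approach was abandoned.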
Therefore, it was decided to select a region of interest (ROI) on each fish. This was chosen as a rectangular area bounded by the lateral line at the top, behind the gill plate towards the head, the pectoral fin at the bottom, and the end of the dorsal fin towards the tail (Figure 8.12). It was expected that this area would have less variation and more significance regarding watermarking. The percentage surface area of the ROI having pixels with L* values lower than a threshold (between 60 and 90) is shown in Figure 8.16. The threshold value of L* < 80 was selected because it had the least amount of variation (error bars in Figure 8.16). The average L*, a*, and b* values of the ROIs for each grade are shown in Figure 8.17.
Figure 8.16 Different threshold values of salmon grades based on percentage area < threshold L* value. (Axis: percent of ROI area; series: L* thresholds of 60, 65, 70, 75, 80, 85, and 90.)
Figure 8.17 Average L*, a*, b* values, determined by machine vision, of each grade of salmon in the regions of interest shown in Figure 8.12. (Series: L* average, a* average, b* average.)
The correlation of the average L* value of the ROI and the percentage surface of
ROI with L* < 80 is shown in Figure 8.18. The latter parameter was chosen, since it
had a larger spread.
Figure 8.18 Relationship between average L* of the regions of interest (ROI) shown in Figure 8.12 for different grades of salmon, and the percentage of the ROI with L* < 80. (Axes: average L* of ROI area vs. percent of ROI area with L* < 80.)
Finally, an iterative procedure was performed in which the average percentage ROI surface with L* < 80 was taken for each grade. Between grades AB and CD, the difference in these averages was divided in half, and this midpoint was taken as the separation level between the grades AB and CD. The same was applied to the other grades. Next, each fish was reclassified according to its percentage of ROI surface with L* < 80 by moving it into the appropriate grade range. This was repeated until no fish moved between grade ranges. The result is shown in Table 8.3. It is important to note that, based on the results in Table 8.3, many fish were misclassified by the human expert in the mid-grade DE. For very high-grade (AB) or very low-grade (F) fish, the human and MV estimations of grade were similarly accurate. The misclassified fish were re-examined by the human expert, and the new grades assigned by the MV system were confirmed. These results are encouraging for the use of MV in efficiently grading whole salmon, on a conveyor belt, by skin color.
Table 8.3 Human expert and machine vision classification of whole pink salmon by skin (Rows: human grade with total fish, AB (21), CD (23), DE (8), E (32), F (10), total (94); columns: MV prediction.)
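The iterative reclassification can be sketched as follows. This is a minimal interpretation of the procedure described in the text, under several assumptions: the grades are listed from best to worst, the percentage of dark ROI surface rises from the best to the worst grade, a grade that loses all its fish is simply dropped, and the sample values are invented. The names `assign` and `iterative_regrade` are hypothetical.

```python
from statistics import mean


def assign(value, active, cuts):
    """Place a value in the first grade whose upper separation level exceeds it."""
    for grade, cut in zip(active, cuts):
        if value < cut:
            return grade
    return active[-1]


def iterative_regrade(values, labels, grades, max_iter=100):
    """Repeat: average per grade, midpoints as separation levels, reassign fish.

    values : % of ROI surface with L* < 80 for each fish
    labels : initial (human) grade of each fish
    grades : grade names ordered from best to worst (assumption)
    """
    labels = list(labels)
    cuts = []
    for _ in range(max_iter):
        active = [g for g in grades if g in labels]        # grades that still have fish
        means = [mean(v for v, l in zip(values, labels) if l == g) for g in active]
        cuts = [(m0 + m1) / 2 for m0, m1 in zip(means, means[1:])]
        new_labels = [assign(v, active, cuts) for v in values]
        if new_labels == labels:                           # no fish moved: stop
            break
        labels = new_labels
    return labels, cuts


if __name__ == "__main__":
    values = [5, 8, 12, 30, 35, 60, 65]                    # invented percentages
    human = ["AB", "AB", "CD", "CD", "CD", "F", "F"]
    print(iterative_regrade(values, human, ["AB", "CD", "F"]))
```

The procedure resembles a one-dimensional clustering that starts from the human grades, so its result depends on that starting point and on the order assumed for the grades.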
2.3.5 Comparison of MV and colorimeter evaluation of sturgeon color
Figure 8.19 Determination of color of sturgeon fillets. Location of colorimeter measurements (above), and the machine vision region. (Diagram labels: fillet length, fillet width W, machine vision center slice.) (A color version can be viewed at http://books.elsevier.com/
Oliveira and Balaban (2006b) compared the color readings of a hand-held colorimeter with those of an MV system in measuring the color of Gulf of Mexico sturgeon fillets from fish fed different diets, and refrigerated for up to 15 days (Figure 8.19). The L*a*b* values were measured at days 0, 5, 10, and 15 using both instruments, and ΔE values were calculated to allow comparison of results. The ΔE value measures the “total” color change, described as:
$$\Delta E = \sqrt{(L_o - L_i)^2 + (a_o - a_i)^2 + (b_o - b_i)^2}$$
where the subscript o refers to the values at time 0, and i refers to values at 5, 10, or 15 days.
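As a small worked example, the total color change can be computed as the Euclidean distance between two L*a*b* readings. The square root follows the usual definition of the color difference; the function name `delta_e` and the sample readings are invented for illustration.

```python
import math


def delta_e(lab_0, lab_i):
    """Total color change between the reading at time 0 and the reading at day i."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(lab_0, lab_i)))


if __name__ == "__main__":
    day_0 = (50.0, 10.0, 10.0)     # invented (L*, a*, b*) at day 0
    day_5 = (47.0, 6.0, 10.0)      # invented (L*, a*, b*) at day 5
    print(delta_e(day_0, day_5))   # sqrt(3**2 + 4**2 + 0**2)
```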
Statistical analysis indicated that there were no significant differences in ΔE values from the hand-held colorimeter or machine vision between either treatments or storage days (P < 0.05). ΔE values were significantly different (P < 0.05) between instruments, except for day 0. The large differences in ΔE for the colorimeter between day 0 and day 1 did not reflect the mild color changes over time visually observed from pictures. The authors concluded that machine vision had the ability to measure color with high spatial resolution, and thus could outperform colorimeters when recording and estimating subtle and non-uniform color changes in foods.
2.3.6 Combining color with other quality parameters
Evaluating colors not as an average but as discrete values allows different types of analyses, such as discriminant function and neural network methods.
In a storage study, Korel et al. (2001a) used a color machine vision (MV) system to monitor the changes in the color of tilapia (Oreochromis niloticus) fillets dipped in sodium lactate solutions (0%, 4%, 8% (v/v)). The use of MV allowed the percentage of each of the color blocks in a 64-color block system to be calculated, in addition to the average L*a*b* values. The authors selected those color blocks that represented the color of areas that made up at least 5 percent of the fillet surface. The twenty color blocks selected were used in a discriminant function analysis to classify the observations into one of the lactate treatment groups. The corresponding overall correct classification rate was 82 percent (Figure 8.20). For each lactate treatment, the color block data were classified into storage time groups, and correct classification rates between 56 and 80 percent were observed. These rates improved significantly when electronic nose data were combined with the color block data: 100 percent of the observations were correctly classified into their respective storage time group. The authors recommended such an approach, in which MV measurements of color and electronic nose data are combined, to identify the group (defined by storage time) of a tilapia sample whose storage history may be unknown.
Figure 8.20 Discriminant function analysis of tilapia color for all treatments at 1.7 °C, based on color data. (Axes: discriminant function 1 vs. discriminant function 2; groups include lactate 4% and lactate 8%; ellipses show the 95% confidence area.)
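The selection of color blocks covering at least 5 percent of the fillet surface can be sketched as a histogram over block ids. The block numbering, the helper names `block_percentages` and `dominant_blocks`, and the sample pixels are assumptions for this example; the discriminant function analysis itself is not shown.

```python
from collections import Counter


def block_percentages(pixels, divisions=4):
    """Percentage of fillet pixels falling in each color block (64 blocks by default)."""
    width = 256 // divisions
    counts = Counter(
        ((r // width) * divisions + g // width) * divisions + b // width
        for r, g, b in pixels
    )
    total = sum(counts.values())
    return {block: 100.0 * n / total for block, n in counts.items()}


def dominant_blocks(percentages, minimum=5.0):
    """Ids of the blocks that cover at least `minimum` percent of the surface."""
    return sorted(b for b, pct in percentages.items() if pct >= minimum)


if __name__ == "__main__":
    # Invented fillet pixels: mostly light, some mid-tone, a few very dark
    pixels = [(250, 200, 190)] * 90 + [(120, 100, 90)] * 7 + [(10, 10, 10)] * 3
    pct = block_percentages(pixels)
    print(dominant_blocks(pct))    # blocks kept as features for a classifier
```

The percentages of the retained blocks form one feature vector per fillet, which is the kind of input a discriminant function or neural network method would then use.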
In another study (Korel et al., 2001b), raw and cooked catfish (Ictalurus punctatus) fillets were evaluated with MV and electronic nose throughout storage. Similar
to the tilapia study previously described, correct classification was obtained for all
observations when discriminant function analysis was performed on color block and
electronic nose data to group samples with respect to storage time (Figure 8.21). It was
concluded that MV data, especially when combined with another tool like electronic
nose, provide an improvement towards the determination of overall food quality. A
similar study with oyster color and e-nose data, analyzed by discriminant function,
resulted in similar conclusions (Tokusoglu and Balaban, 2004).
Figure 8.21 Discriminant function analysis of catfish color based on sensory scores. (Axes: discriminant function 1 vs. discriminant function 2; groups: sensory scores.)
3 Conclusions
Seafood is a food commodity that has great variation in shape, size, color and other visual properties when it comes to expected quality attributes. Non-uniform sizes, shapes, surfaces, and colors are common. This constitutes a challenge to the evaluation of parameters by traditional instruments or methods. The visual quality of seafood can be measured by machine vision accurately, in a non-contact, non-destructive, and continuous manner. As data from more research accumulate, and as hardware becomes faster and more affordable, it is expected that MV will find more real-world applications in the quality evaluation of seafood. Combination of machine vision data with other sources, such as electronic nose or near-infrared analysis, will synergistically improve quality evaluation.
References
Arnarson H (1991) Fish and fish product sorting. In Fish Quality Control by Machine Vision
(Pau LF, Olafsson R, eds). New York: Marcel Dekker, pp. 245–261.
Balaban MO, Yeralan S, Bergmann Y (1994) Determination of count and uniformity ratio
of shrimp by machine vision. Journal of Aquatic Food Product Technology, 3 (3),
Balaban MO, Kristinsson HG, Otwell WS (2005) Evaluation of color parameters in a
machine vision analysis of carbon monoxide-treated fresh tuna. Journal of Aquatic
Food Product Technology, 14 (2), 5–24.
Balaban MO, Kristinsson HG, Otwell WS (2006) Color enhancement and potential fraud
in using CO. In Modified Atmosphere Processing and Packaging of Fish: Filtered
Smokes, Carbon Monoxide & Reduced Oxygen Packaging (Otwell WS, Balaban MO,
Kristinsson HG, eds). Ames: Blackwell Publishing, pp. 127–140.
Batty RS (1989) Escape responses of herring larvae to visual stimuli. Journal of Marine
Biological Association of the United Kingdom 69 (3), 647–654.
Champalbert G, Direach-Boursier LL (1998) Influence of light and feeding conditions on
swimming activity rhythms of larval and juvenile turbot: an experimental study. Journal
of Sea Research, 40 (3–4), 333–345.
Damar S, Yagiz Y, Balaban MO, Ural S, Oliveira ACM, Crapo CA (2006) Prediction of
oyster volume and weight using machine vision. Journal of Aquatic Food Product
Technology, 15(4), 5–17.
Diehl KC, Awa TW, Byler RK, van Gelder MF, Koslav M, Hackney CR (1990) Geometric
and physical properties of raw oyster meat as related to grading. Transactions of the
ASAE, 33, 1270–1274.
Graves M (2003) X-ray bone detection in further processed poultry production. In
Machine Vision for the Inspection of Natural Products (Graves M, Batchelor B,
eds). New York: Springer-Verlag, pp. 421–448.
Hamaoka T, Sasaki K (1992) Development for a system for judging the freshness of raw
oysters from Hiroshima using fuzzy reasoning. Japanese Journal of Fuzzy Theory
and Systems, 4 (1), 65–73.
Hatano M, Takahashi K, Onishi A, Kameyama Y (1989) Quality standardization of fall chum
salmon by digital image processor. Nippon Suisan Gakkaishi, 55 (8), 1427–1433.
Kassler M, Corke P, Wong P (1993) Automatic grading and packing of prawns. Computers
and Electronics in Agriculture, 9, 319–333.
Korel F, Luzuriaga DA, Balaban MO (2001a) Objective quality assessment of raw tilapia
(Oreochromis niloticus) fillets using electronic nose and machine vision. Journal of
Food Science, 66 (7), 1018–1024.
Korel F, Luzuriaga DA, Balaban MO (2001b) Quality evaluation of raw and cooked catfish
(Ictalurus punctatus) using electronic nose and machine vision. Journal of Aquatic
Food Product Technology, 10 (1), 3–18.
Kristinsson HG, Balaban MO, Otwell WS (2006) The influence of carbon monoxide and
filtered wood smoke on fish muscle color. In Modified Atmosphere Processing and
Packaging of Fish: Filtered Smokes, Carbon Monoxide & Reduced Oxygen Packaging (Otwell WS, Balaban MO, Kristinsson HG, eds). Ames: Blackwell Publishing,
pp. 29–53.
Lee DJ, Lane RM, Chang GH (2001) Three-dimensional reconstruction for high speed
volume measurement. Proceedings of SPIE, 4189, 258–267.
Li J (1990) Oyster hinge line detection using digital image processing. Presented during
the 1990 International Summer Meeting of the ASAE, June 24–27, Columbus, OH.
Li J, Wheaton FW (1992) Image processing and pattern recognition for oyster hinge line
detection. Aquacultural Engineering, 11, 231–250.
Ling PP, Searcy SW (1989) Feature extraction for a vision based shrimp deheader. Presented
during the 1989 International Winter Meeting of the ASAE, December 12–15, New
Orleans, LA.
Luzuriaga D, Balaban MO, Yeralan S (1997) Analysis of visual quality attributes of white
shrimp by machine vision. Journal of Food Science, 62 (1), 1–7.
Martinez-Palacios CA, Tovar EB, Taylor JF, Duran GR, Ross LG (2002) Effect of temperature on growth and survival of Chirostoma estor estor, Jordan 1879, monitored using
a simple video technique for remote measurement of length and mass of juvenile
fishes. Aquaculture, 209, 369–377.
Oliveira ACM, O’Keefe SF, Balaban MO (2004) Video analysis to monitor rigor mortis in
cultured Gulf of Mexico sturgeon (Acipenser oxyrynchus desotoi). Journal of Food
Science, 69 (8), E392–397.
Oliveira ACM, Crapo C, Balaban MO (2006a) Grading of pink salmon skin watermarking using a machine vision system. Second Joint Transatlantic Fisheries Technology
Conference. October 29–November 1, 2006, Quebec City, Quebec, Canada. P-46,
p. 138.
Oliveira ACM, Balaban MO (2006b) Comparison of a colorimeter with a computer vision
system in measuring color of Gulf of Mexico sturgeon fillets. Applied Engineering in
Agriculture, 22 (4), 538–587.
Otwell S, Marshall M (1986) Studies on the use of sulfites to control shrimp melanosis
(blackspot). Florida Sea Grant College, Technical Paper No. 46, Gainesville, FL, USA.
Parr MB, Byler RK, Diehl KC, Hackney CR (1994) Machine vision based oyster meat
grading and sorting machine. Journal of Aquatic Food Product Technology, 3 (4),
Rakow NA, Suslick KS (2000) A colorimetric sensor array for odor visualization. Nature,
406, 710–713.
So JD, Wheaton FW (2002) Detection of Crassostrea virginica hinge lines with machine
vision: software development. Aquacultural Engineering, 26, 171–190.
Strachan NJC (1993) Length measurements of fish by computer vision. Computers and
Electronics in Agriculture, 8, 93–104.
Strachan NJC, Nesvadba P, Allen AR (1990) Fish species recognition by shape analysis of
images. Pattern Recognition, 23 (5), 539–544.
Suslick KS, Rakow NA (2001) A colorimetric nose: “smell-seeing”. In Artificial Chemical Sensing: Olfaction and the Electronic Nose (Stetter JR, Penrose WR, eds).
Pennington, NJ: Electrochemical Society, pp. 8–14.
Tojeiro P, Wheaton F (1991) Oyster orientation using computer vision. Transactions of the
ASAE, 34 (2), 689–693.
Tokusoglu O, Balaban MO (2004) Correlation of odor and color profiles of oysters (Crassostrea virginica) with electronic nose and color machine vision. Journal of Shellfish
Research, 23 (1), 143–148.
Williams RN, Lambert TJ, Kelsall AF, Pauly T (2006) Detecting marine animals in underwater video: let’s start with salmon. Proceedings of the 12th Americas Conference on
Information Systems, August 4–6, Acapulco, Mexico, pp. 1482–1490.
Wold JP, Isaksson T (1997) Non-destructive determination of fat and moisture in whole
Atlantic salmon by near-infrared diffuse spectroscopy. Journal of Food Science, 62
(4), 734–736.
Quality Evaluation of Apples
Vincent Leemans and Olivier Kleynen
Gembloux Agricultural University, Department of Mechanical
Engineering, Passage des Déportés 2, B-5030, Gembloux, Belgium
1 Introduction
The apple is a fruit that is produced and consumed worldwide. Its production was
estimated at over 60 × 10⁹ kg in 2005, with the most important producers being the
People’s Republic of China (25 × 10⁹ kg), the European Community (25 countries,
7.5 × 10⁹ kg), the United States of America (4.25 × 10⁹ kg), Turkey (2.55 × 10⁹ kg)
and Iran (2.4 × 10⁹ kg). The number of cultivars is estimated to be over 7500, but only
a few of these are subject to mass production and appear on supermarket shelves.
The quality of apples is strictly regulated, and they are classified into categories
Extra, I, and II by standards established by international organizations such as the
OECD (International Standard on Fruits and Vegetables – Apples and Pears, 1970).
(The category names may vary between countries; a category III theoretically exists
but, to the knowledge of the authors, is not used.) The fruits not complying with the
minimal requirements of the lowest class are excluded from the fresh market and used
by the food industry (stewed apples, juice, or cider) or for animal feeding (the cull).
The quality encompasses different aspects, the most important of which concerns the
presence of defects and the size tolerated within each class. The shape of the fruit is
also expressed in those standards. National and distribution “standards” usually specify
size, grade and color classes.
The quality of the fruits presented to the fresh market has a major influence on their
price. The distributors demand batches of homogeneous quality, while the intrinsic
quality of these biological products varies widely, from fruit to fruit, from one orchard
to another, and in time. The grading is thus an essential step; however, it is a tedious
job, and it is difficult for the graders to maintain constant vigilance. If this task could be
performed by machine vision, the results would be more objective; it would also save
labor and enhance output. This chapter presents recent developments in this domain.
The grading of an apple by using computer vision begins by acquiring an image and
finishes with evaluation of the fruit’s quality. Meanwhile, the information contained in
Computer Vision Technology for Food Quality Evaluation
ISBN: 978-0-12-373642-0
Copyright © 2008, Elsevier Inc.
All rights reserved
Figure 9.1 Diagram showing the path of the information from image acquisition to evaluation of quality. (Recoverable steps: image acquisition, fruit localization, shape evaluation, calyx & stalk-end, object recognition/defects identification, apple grading.)
the image(s) is processed following, more or less, the diagram presented in Figure 9.1.
Not every step is encountered in every study, but this seems to be a reasonable guideline,
and the organization of this chapter follows this scheme. The first step consists of
acquiring the images, and this is briefly described in section 2. The first treatment
consists of localization of the fruit in the image, and the determination of its boundary.
Boundary analysis may be used to parameterize the shape information, which can be
fed directly to the fruit quality classifier; this is discussed in section 3. The boundary is
also used to determine the shape of the region of interest (ROI), including the pixels to
be taken into account for subsequent procedures. The color of the fruit is then used
for color grading and for detection of defects (sections 4 and 5). The two poles of the apple, i.e. the
calyx and the stalk-end (or stem-end), look quite different from the rest of the fruit.
They usually appear as darker areas whose pixels are often classified as defects by most
segmentation algorithms. Their identification is necessary, and dedicated algorithms are often used
(section 6). The segmentation results are used to grade the fruit, either at a low level with
minimal treatment, or after the different objects have been characterized by features
such as their shape or their color in order to recognize the defects (section 7). Finally,
the fruit’s color, its shape, and the presence of defects, their nature, and their area
contribute to the quality assessment.
The quality of apples may also include other aspects related to “internal” properties such as the chemical composition (e.g. sugar content and acidity), physical
characteristics (hardness, juiciness, mealiness) and internal defects. Though color may
somehow be related to the maturity and thereby to the above properties, accurate evaluation requires other techniques (such as near infra-red spectroscopy) which will not
be discussed in this chapter. Internal breakdown (such as brownheart) is not visible
from outside the fruit and is thus out of the scope of this chapter, while defects such as
bruising and bitter pit which are visible through the skin will be considered.
2 Material
The most immediate task of an apple-grading machine is transporting the fruit. Indeed,
as apples are fragile, it is a challenge to ensure that the task can be carried out at rates
of up to 10 fruits per second while presenting all facets to a camera under adequately
controlled lighting. In a grading line, a distance of about 0.11 m between the centers
of two fruits seems to be the minimum. In other words, the fruits should be carried at a
speed of about 1.1 m/s. In order to avoid blurred images, the integration time should not
exceed 0.25 ms. Furthermore, the lighting should be powerful enough (around 80 W of
lighting tubes per line) to be adapted to the chosen spectral bands and with adequate
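The figures quoted in this paragraph can be checked with a short calculation. The throughput, spacing, and integration time come from the text; the function names are assumptions, and converting the blur to pixels would additionally require the image scale (mm per pixel), which is not given here.

```python
def belt_speed(fruits_per_second, pitch_m):
    """Conveying speed (m/s) needed for a given throughput and fruit spacing."""
    return fruits_per_second * pitch_m


def motion_blur_mm(speed_m_per_s, integration_s):
    """Distance (mm) travelled by the fruit while the sensor integrates light."""
    return speed_m_per_s * integration_s * 1000.0


if __name__ == "__main__":
    speed = belt_speed(10, 0.11)              # 10 fruits/s at 0.11 m between centers
    print(speed)                              # about 1.1 m/s
    print(motion_blur_mm(speed, 0.25e-3))     # about 0.275 mm for a 0.25 ms integration
```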
2.1 Manipulation and presentation of the fruits
Apples are near-spherical objects, and thus their surface cannot be presented on
a plane. Consequently, there is no theoretical way to assemble different images of an
apple to represent its whole surface without distortions and compromises. Figure 9.2
shows some of the possibilities, while Table 9.1 summarizes the main ways of presenting
the apple surface.
Figure 9.2 Apple-image acquisition diagram.
To ensure that the whole surface of the apple is visible, several devices are used. The earliest but still most commonly used method is to place the apples on bi-conical rollers so that they rotate while moving under the camera. With the apple placed on rollers and moved perpendicularly to the optical axis of the camera, about two-thirds of the surface is visible; this may be enough to estimate its ground color and the blush area ratio, but not for defect detection. The rotational poles cannot be seen from above, and thus mirrors are added to the side of the sorting line. By assembling successive images from a matrix camera, it is possible to obtain a near-cylindrical projection of the surface. The fruit is placed on “rollers” that have a given angular speed. If the fruit does not slip on the rollers, the tangential speed at the contact points is the same, and the angular speed of the fruit depends directly on its diameter, as do the dimensions of the ROI to be taken into account. Because of the lack of stability at a high rotational speed, this method is limited to a grading rate of around three apples per second. In a similar but more complex system designed by Throop et al. (2005), a kind of cup and small wheels orientate the stalk–calyx axis vertically during transport. The fruit is then tipped by 45° on to rollers and presented perpendicularly to the optical axis of a camera. A rectangular ROI is used, and a single image is reconstructed (the calyx and stalk poles are then ignored). The number and the width of the ROIs are functions of the diameter.
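The no-slip relation between roller and fruit can be written out as a short sketch. Equal tangential speeds at the contact point give ω_fruit · r_fruit = ω_roller · r_roller. The numerical values, the frame rate, and the function names below are illustrative assumptions, not parameters from the cited systems.

```python
import math


def fruit_angular_speed(roller_omega, roller_radius, fruit_diameter):
    """Angular speed (rad/s) of a fruit rolling without slipping on a roller."""
    return roller_omega * roller_radius / (fruit_diameter / 2)


def views_per_turn(fruit_omega, frame_rate):
    """Number of images acquired while the fruit completes one full turn."""
    return frame_rate * 2 * math.pi / fruit_omega


if __name__ == "__main__":
    # Hypothetical roller: 10 rad/s, radius 0.02 m; fruits of 60 mm and 80 mm
    for diameter in (0.06, 0.08):
        omega = fruit_angular_speed(10.0, 0.02, diameter)
        print(diameter, omega, views_per_turn(omega, frame_rate=25.0))
```

A larger fruit turns more slowly for the same roller speed, so more images are taken per turn and each image only needs to contribute a narrower strip, which is one way to read the statement that the number and width of the ROIs depend on the diameter.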
Table 9.1 Main devices proposed to present the whole surface of the fruit to the camera. (Columns include optical device and surface ratio observed (%); recoverable entries: one single camera; one camera +, near cylindrical projection + rotational poles views; two cameras, near bi-conical projection; robot arm; one camera, near cylindrical projection; tetrahedral projection.)
In another device, two cameras inspect the line(s) with their optical axes at an angle of around 45° to the vertical (Leemans, 1999). If only one line is inspected, the distance from the line to both cameras is the same and thus all the apples are viewed at the same scale. The apple is modeled as a sphere rotating without slipping on the rollers. Two ROIs are considered. The shape of the smaller ROI is computed as the projection on the camera sensor (the charge-coupled device, CCD) of a spherical triangle delimiting the portion of the fruit assigned to each image. One apex of this triangle is at the rotational pole and the two others are at the “equator.” Their positions are determined taking into account the diameter of the fruit. The larger ROI surrounds the triangle by at least five pixels. All the pixels in this area are classified as defects or healthy tissue. On each view, every object (defect, calyx, stalk-end) is characterized by a number of features, including the position of its center of gravity. In order to evaluate the quality correctly, each defect has to be counted once and once only, although it may appear on several images. To solve this, only the defects with their center of gravity within the “triangle” are considered. If the same defect appears in another image, its center of gravity should be outside the corresponding triangle. The apples are then graded according to the entire set of attributes of all the retained defects.
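The rule of keeping only the defects whose center of gravity falls inside the triangle of their own view can be sketched as follows. This is a deliberate simplification: the projected spherical triangle is replaced by a plain triangle in image coordinates, and the data layout (a list of views, each holding a triangle and a list of defects with `centroid` and `area` keys) is an assumption for the example.

```python
def inside_triangle(p, a, b, c):
    """True if point p lies inside or on the edge of the planar triangle a, b, c."""

    def cross(o, u, v):
        return (u[0] - o[0]) * (v[1] - o[1]) - (u[1] - o[1]) * (v[0] - o[0])

    d1, d2, d3 = cross(a, b, p), cross(b, c, p), cross(c, a, p)
    has_neg = d1 < 0 or d2 < 0 or d3 < 0
    has_pos = d1 > 0 or d2 > 0 or d3 > 0
    return not (has_neg and has_pos)


def retained_defects(views):
    """Keep each defect only in the view whose triangle contains its centroid."""
    kept = []
    for triangle, defects in views:
        kept.extend(d for d in defects if inside_triangle(d["centroid"], *triangle))
    return kept


if __name__ == "__main__":
    # Two invented views; the 30-unit defect is visible in both images
    view_1 = (((0, 0), (10, 0), (0, 10)),
              [{"centroid": (2, 2), "area": 12}, {"centroid": (9, 9), "area": 30}])
    view_2 = (((10, 10), (0, 10), (10, 0)),
              [{"centroid": (8, 8), "area": 30}])
    kept = retained_defects([view_1, view_2])
    print(len(kept), sum(d["area"] for d in kept))
```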
These devices share the same drawback in that the assumption is made that the apples
spin without slipping or tilting (i.e. the rotational axis remains the same during one
turn). To overcome this, Moltó et al. (1998) manipulated the fruit by two robot arms
but at a low rate of about one fruit per second. In the study by Guedalia (1997), apples
were conveyed on ropes while images were acquired by three cameras; however, a
small part of the apple surface was blocked by the ropes.
The geometrical relationship between the different images of the fruit is not obvious,
and thus many researchers work on separate images. The blush area ratio of a fruit is
computed using the whole set of views. For the defects, there are various possibilities: for
example, the defects can be evaluated in each view and each view graded individually, the
rating of the fruit being the one given by the worst view; or the defects can be evaluated in
each view and combined into global parameters such as the total area of defects or the area
of the largest defect.
The support for the fruit constitutes the surrounding area of the fruit in the image,
and it should obviously contrast strongly with the fruit. Figure 9.3 shows
a bi-color apple placed on two different backgrounds, one bright and one dark. The
contrast is sufficient in the red channel for the dark background (and also in the NIR
wavelength bands, not shown), and in the blue channel for a bright background (or in both,
using a blue background, for example). A dark background seems to be used most
often, but bright blue and white can be encountered. A bright background may present
shadows and is more subject to unevenness. When the fruit is well contrasted against
the background, fruit localization is undertaken by classical supervised or unsupervised
thresholding techniques.
Figure 9.3 Bi-color apple (green ground color left, red blush right) placed on a part white and part black
background. From left to right, these are the red, green, and blue channels of an RGB color image.
2.2 Lighting
The aim of the lighting system is to provide irradiance that, once reflected by the fruit,
carries the most relevant information about apple quality. Two major aspects should be
considered: the spatial distribution of the light (i.e. its geometry) and its spectral content.
2.2.1 Lighting geometry
The apple surface presents different degrees of glossiness, depending on the variety and
the maturity. Specular reflection seems unavoidable, but diffuse lighting can minimize
its effects. The geometry of the lighting should either make the image of the fruit as
uniform as possible (provided that its reflectance is uniform), or give it known variations. In an attempt to fulfill the former requirement, hemispherical lighting chambers
have been designed. The fruit is placed at the center of the chamber, and the light sources are
placed below the fruit and illuminate the inner surface of the chamber, which is painted
flat white to provide a diffuse and uniform light (Moltó et al., 1998). This device is
used with a robot arm manipulator, but in practice it is not suitable for roller sorting
machines. Cylindrical lighting tunnels based on the same principles have therefore been built, allowing the fruit to pass through them. Figure 9.4 illustrates some of the designs.
In such devices (Miller and Drouillard, 1997; Leemans, 1999; Throop et al., 2005)
uniform lighting is possible in the direction perpendicular to the traveling direction,
but is difficult to achieve in the direction of travel because the apples are quite close
to one another. In some cases additional lighting devices are added at the ends of
the lighting chamber. For other devices, only part of the image can be used; however,
as the fruit is rotating, there are ways to observe the whole surface under the correct
conditions. Lighting forming a horizontal layer above the apples was used by Wen
and Tao (1998); this had the advantage of covering several lines, but the drawback of
presenting strong illumination variations from the center of the fruit to its border.
Figure 9.4 Different image-acquisition designs. (a) The fruit is placed on a conveyor system (here a belt is
schematized) and illuminated by the diffuse reflection of the light provided by lamps (here lighting tubes and
bulbs) placed beneath the level of the fruit. The camera observes the apple from above through a hole in
the reflector. (b) Cross-section of a lighting tunnel where the fruit is placed on rollers and observed by two
cameras through the reflector. (c) A view of a two-line grading prototype based on the former concept.
2.2.2 Spectral composition
The spectral composition of the incident light depends mainly on the lighting sources,
and cannot be easily tuned. Fluorescent lighting tubes are mainly used for image
acquisition in the visible part of the spectrum, while incandescent bulb lamps are
generally used for inspection in the NIR part. Some researchers combine both to extract
spectral information at different wavelength bands to enhance the defect detection
(Kleynen et al., 2004), while others use this method to extract two different kinds
of information at the same time. Yang (1993) used the visible spectrum for defect
localization and the NIR region for fruit curvature analysis, while Penman (2001) used
the green to NIR part of the spectrum for defect localization and the blue region for
curvature analysis. Light-emitting diodes have also been used recently (Throop et al.,
2005). These have the advantage of emitting in a narrow bandwidth.
2.3 Image acquisition devices
The spectral sensitivity of the image acquisition devices and the number of “channels”
acquired depend on the development of technology. As a guide, in the 1980s and earlier,
monochrome cameras were used; in the 1990s, color cameras were considered. More
recently, Mehl et al. (2004) have used a hyperspectral imaging system to detect apple
surface defects. Since this imaging technique produces a large amount of data, which
takes a great deal of time to acquire and to process, it cannot be transferred to an
industrial machine. Taking into account practical considerations, Kleynen et al. (2003,
2004) selected four wavelength bands in the visible and NIR spectra and developed
a four-band multispectral vision system dedicated to defect detection on apples. The
system had the potential for industrial application. Mid-infrared cameras were also
employed in order to recognize the stalk-ends and calyxes (Cheng et al., 2003), but
their high price prevents their use in commercial grading machines for the moment.
2.4 The image database
The grading machines rely on a model of an ideal fruit, and data from the observed apple
are compared with those of the model. This model is built from a database. However,
the important question is how many fruits should be considered while building such
a database. This depends, of course, on the variability of the parameters. The quantities
given here may be considered a general guideline.
Regarding color, it is most important to have fruits representative of the color variability in space and time (the color changes according to the picking date and time of
storage). A hundred fruits, being representative of the variability at a particular moment
(and thus including the extremes), and four samplings a year (thus 400 apples) seems
Regarding shape, a variety presenting a well-defined form (such as Golden Delicious) is easily measured, and a hundred apples would be sufficient. For varieties with
a more variable shape, the number of samples should be increased accordingly.
For defects, the simple answer is, as many as possible. Since variability of the
blemishes is extremely large (see Figure 9.5), their detection and the fruit grading
usually require advanced algorithms and the estimation of many parameters. The ideal
system should be able to expand the database in time. At the very least, several hundred
apples should be considered; a thousand or even more is preferable. Since one year is
different from another, this database should be built across several years. A particular
blemish may represent an important proportion of defects for one year or for one
location, but might not be encountered for several years afterwards. It is thus
important to vary the origin of the apples with regard to space and time, to take into
account the “inter-orchard,” “in-year,” and “inter-year” variability.
Figure 9.5 Different kinds of defects, showing the variability in color, size, and texture. Left to right:
(a) fungal attack, old mechanical damage, recent bruise (within the dotted line), old bruise; (b) russet,
attack by apple fruitminer (Marmara pomonella), bitter pit, old scar; (c) reticular russet, reticular russet, aphid
attack (Dysaphis plantaginea, leaving no “patch” but a textured surface), frost damage; (d) four healthy fruits.
3 Shape grading
The shape of the fruit is defined in the standards in words such as “typical of the
variety” for class Extra, “showing slight misshapeness” for class I, “showing strong
misshapeness” for class II, or to be rejected. For each variety, the typical shape is
again expressed as spherical, elongated, conical, flattened, with marked ribs, and so
on, which are impractical definitions for image analysis. Most researchers ask experts
to grade fruits into shape classes and then look for suitable shape parameters. Different
kinds have been used, ranging from shape indexes such as circularity, eccentricity, and
Hu’s invariant moments (Hu, 1962) to fractals or Fourier descriptors. In the latter, the
distance from the center of gravity of the fruit to its boundary is expressed as a function
of the angle from the horizontal (or any other reference). The amplitudes of the first
few harmonics (computed using a fast Fourier transform) can be used to grade Golden
Delicious apples with an error rate of 6 percent using a linear discriminant analysis
(Leemans, 1999). Other varieties such as Jonagold, which is a cross between a rather
elongated variety (Golden Delicious) and a flat one (Jonathan), present highly variable shapes,
and can show the shape of either of their parents. In this case a “misshapen” fruit
resulting from a pollination problem might be more complicated to detect.
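The Fourier-descriptor idea mentioned above can be illustrated with a short sketch. It assumes the radial signature (distance from the center of gravity to the boundary) has already been sampled at equally spaced angles. A plain discrete Fourier transform is used here instead of an FFT for brevity, and the normalization by the mean radius is an illustrative choice, not necessarily the one used by Leemans (1999).

```python
import cmath
import math
from typing import List


def harmonic_amplitudes(radii: List[float], n_harmonics: int = 5) -> List[float]:
    """Amplitudes of harmonics 1..n_harmonics of a radial signature.

    radii: centroid-to-boundary distances at equally spaced angles.
    The amplitudes are divided by the mean radius so that they do not
    depend on fruit size (an assumption made for this sketch).
    """
    n = len(radii)
    mean_r = sum(radii) / n
    amplitudes = []
    for k in range(1, n_harmonics + 1):
        # k-th DFT coefficient of the signature
        coeff = sum(r * cmath.exp(-2j * math.pi * k * i / n)
                    for i, r in enumerate(radii))
        amplitudes.append(2 * abs(coeff) / (n * mean_r))
    return amplitudes
```

A perfectly circular outline gives amplitudes close to zero, whereas an elongated outline shows up mainly in the second harmonic. The resulting vector could then be passed to a classifier such as a linear discriminant analysis.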
The main drawback is that the fruit have to be presented to the camera with their
stalk–calyx axis perpendicular to the optical axis, which requires a mechanism such
as the one proposed by Throop et al. (2005). However, orientation still fails for
2.3 percent of the fruit.
4 Color grading
Apples usually present two colors: the ground color, varying from green to yellow with
the degree of ripeness, and the blush, varying from pink to deep red. Many varieties,
such as Boskoop, Braeburn, Gala and Jonagold, present both colors, while others
present mainly one – for example, Granny Smiths are normally light green, Gingergold
and Transparent are whitish green, Golden Delicious are green to yellow-green but may
show a slight pinkish blush, Fuji are generally red, and Red Delicious are deep red. The
color criteria given by international standards, such as the European Community (EC)
no. 1799/2001, are often complemented by national or auction standards. It should
also be noted that many varieties of apples also present varietal russet, which will be
discussed in the next section.
Early studies concerning apple color (Lespinasse, 1971) are at the root of the picking
color charts, using the color space available at that time. The relationships between the
ground color at harvest and colors during storage were studied (at that time ultra-low
oxygen storage facilities were not common and the fruit matured much more quickly
during storage than is the case nowadays). As a result, the picking date could be chosen
taking into account the ground color and the expected storage duration. Others (Ferré
et al., 1987; Shrevens and Raeymakers, 1992) studied the relationship between the
L*a*b* space and the maturity or the ground color standards. It should be noted that
the color spaces used for human-based color assessment (such as L*a*b*) are
not intrinsically the most suitable for computer grading. Figure 9.6 shows the relative
frequency distributions for the luminance of the red channel vs the green channel
for pixels of bi-color Jonagold apples of different ripeness levels. Images used were
acquired with a three-CCD color camera. The ground color (shown in the upper right
of each of the diagrams in Figure 9.6) varies with the maturity, while the blush (bottom
left) does not. The color picking and grading charts are representative of two facts:
apples presenting a predominant ground color are graded according to their color into
classes from green (associated with freshness and chosen by people who prefer acidic
fruits) to yellow (associated with maturity and sweetness); and for apples showing a
distinct blush the proportion of blush area is important.
Figure 9.6 Relative frequency diagrams, projection on the plane determined by the R and G axes. Distribution computed on 100 fruits
of different maturity classes, from the least (left) to the most (right) ripe.
From the image-analysis point of view, this means that the pixels belonging to the
ground color should first be separated from those composing the blush area. As can
be seen in the frequency-distribution diagrams in Figure 9.6, the frequencies between
the two modes corresponding to the blush and the ground color are quite low. This
suggests that the transition (the pigment change) is quite fast. Because
of the non-Gaussian distribution of both colors, the pixels are best classified using
neural networks into either ground color and blush (Leemans, 1999) or different color
classes (“normal red,” “poor color red,” “vine,” “upper and lower background color”)
and injured (Nakano, 1997).
Evaluation of the proportion of the blush area is straightforward. The attribution of
a ground color class for the fruit is based on the mean or, better, on the median ground
color, since the latter is less influenced by the asymmetry of the distribution. Figure
9.7 shows scatter diagrams, in the green-red and blue-red planes, of the median color
of 80 Golden Delicious apples graded by an auction expert into four ground-color
classes. The dispersion of the median points is similar for each class, while the class
means lie close to a straight line. The first canonical variate can be used to
discriminate the medians into the color classes with an error rate of 9 percent relative
to the expert grading. (The first canonical variate maximizes the ratio of the variance
between the classes to the variance within the classes. It is given by the first
eigenvector of the matrix $A = FE^{-1}$, where $F$ is the factorial sum of products of deviates
matrix, and $E$ is the residual sum of products of deviates matrix.) It can be seen from
Figure 9.7 that part of the error may be attributed to the experts. The hue parameter h
is also very effective, although, being a non-linear combination of the red, green, and
blue values, it requires more computations.
Figure 9.7 Scatter diagrams of the median color of 80 Golden Delicious apples, graded by an expert into
four color classes, from the greenest to the yellowest: A++, A+ (♦), A, and Ar (x). (a) Red (R) and
green (G) plane; (b) red (R) and blue (B) plane; (c) the red (R), green (G), blue (B) space perpendicularly to
the first canonical variate (the discriminant functions are visible).
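The three quantities discussed above (blush area proportion, median ground color, and hue) can be sketched as follows. The sketch assumes that each fruit pixel has already been labelled "blush" or "ground" by some classifier. The per-channel median is a simplification of the "median color", and the function names are illustrative only.

```python
import colorsys
from statistics import median
from typing import Sequence, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each in 0..255


def blush_ratio(labels: Sequence[str]) -> float:
    """Fraction of fruit pixels labelled 'blush' (labels: 'blush' or 'ground')."""
    return sum(1 for lab in labels if lab == "blush") / len(labels)


def median_ground_color(pixels: Sequence[Pixel], labels: Sequence[str]) -> tuple:
    """Per-channel median of the pixels labelled 'ground'."""
    ground = [p for p, lab in zip(pixels, labels) if lab == "ground"]
    return tuple(median(channel) for channel in zip(*ground))


def hue_degrees(rgb: Pixel) -> float:
    """Hue angle in degrees (0 = red, 120 = green), via the stdlib HSV conversion."""
    r, g, b = (v / 255 for v in rgb)
    h, _, _ = colorsys.rgb_to_hsv(r, g, b)
    return 360 * h
```

The hue of the median ground color gives a single number that moves from green towards yellow as the fruit ripens, which is why it can serve as a compact grading parameter.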
An RGB image contains all the information necessary to grade fruits according to their
color. When a dedicated wavelength imaging device is used for apple defect recognition
(Kleynen et al., 2004), the selected wavelengths are primarily chosen to enhance the
defect detection. These wavelengths are not well suited for ground color vs blush segmentation, and a supplementary wavelength band located in the green visible spectrum
(500–600 nm) should be used. Indeed, as illustrated in Figure 9.8, in that wavelength
band the reflectance differences between the ground color and the blush are highest.
5 Evaluation of surface defects
External defects have many different origins, including fungal attack, insect or bird
bites, various mechanical wounds, and physiological factors such as frost damage
and sunburn. As presented in Figure 9.5, these are expressed by variable colors, textures, boundaries (sharp or diffuse), shapes (circular or irregular), and dimensions.
Furthermore, healthy tissue also has its own variability and texture. Each fruit presents
two areas, the calyx and the stalk-end, which are not defects but may have a similar
appearance. Russet is produced by the fruit itself and is not regarded as a defect as long as its
size and distribution are “typical of the variety.” This complicates defect recognition and
rules out the use of simple methods such as the measurement of global parameters for
the whole area of the fruit, as presented (amongst others) by Heineman et al. (1995).
Figure 9.8 Spectral reflectance of the ground color and the blush of Jonagold apples (normalized reflectance, %, against wavelength, nm).
Defects can be observed because their luminance differs from that of the
surrounding sound tissue. Yang (1994) described the appearance of a mono-color apple
and its defects as they might be seen in a monochrome image. The fruit appeared
light green, with the mean luminance depending on the fruit color. Apples presented
lenticels, creating small variations comparable to noise. It was also noted that the
reflection factor decreased from the center to the boundary. The defects were usually
darker than the healthy tissue, but their contrasts, sizes, and shapes might vary strongly.
For these reasons, the author assumed that simple techniques such as “thresholding”
or background subtraction gave poor results. Consequently, researchers pretreated the
images by removing the outer parts, which were observed under an unfavorable angle
(Leemans, 1999; Unay and Gosselin, 2005). It was also considered beneficial to compensate for the non-uniformities algorithmically, either with a flat-field correction in which
a correction coefficient is computed for each pixel as a function of its distance to the center of the fruit
(Wen and Tao, 1998), or with a background correction obtained from a flat white spherical object
of equivalent size (Throop et al., 2005). The images were then segmented by applying a threshold, set empirically or algorithmically (Ridler and Calvard, 1978; Otsu,
1979; Kapur et al., 1985). Yang and Marchant (1996) presented a method based on a
topological algorithm (called flooding) followed by a snake algorithm for the detection
of “patch like defects” which did not require the flat-field correction.
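As an illustration of an algorithmically set threshold, the sketch below follows the between-class variance criterion of Otsu (1979), applied to a gray-level histogram of the fruit pixels. It is a minimal version that assumes the image has already been corrected for non-uniform illumination; the function name is illustrative.

```python
from typing import List


def otsu_threshold(histogram: List[int]) -> int:
    """Gray level t that maximizes the between-class variance.

    Pixels with a level <= t form the dark class (candidate defects),
    the others the bright class (candidate healthy tissue).
    """
    total = sum(histogram)
    sum_all = sum(level * count for level, count in enumerate(histogram))
    w0 = 0        # number of pixels in the dark class
    sum0 = 0.0    # sum of the gray levels in the dark class
    best_t, best_var = 0, -1.0
    for t, count in enumerate(histogram):
        w0 += count
        sum0 += t * count
        w1 = total - w0
        if w0 == 0:
            continue
        if w1 == 0:
            break
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t
```

Because defects are usually darker than healthy tissue, pixels at or below the returned level would be flagged as candidate defects.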
It is unlikely that these methods would work on monochrome images of bi-color
apples acquired in the visible part of the spectrum up to 650 nm, because the variation
in the reflectance between the ground color and the blush is far too large (Figures
9.3, 9.8). However, they remain valuable for monochrome images acquired in the NIR
wavelength bands or for mono-color green fruits.
In color and multispectral imaging, defect detection can be carried out in several
ways. The algorithms applied to process both kinds of image, and the data
derived from them, may be similar. The term “color” (in quotation marks) will hereafter
be used for both color and multispectral images.
In multispectral imaging, detection may be performed separately for each wavelength
band and the data may be fused afterwards (Throop et al., 2005).
More efficient methods take into account the simultaneous variations of the different
spectral components. Working on Golden Delicious apples (mono-color fruits), Leemans (1999) evaluated the difference between the color of each pixel and the average
of the fruit by the Mahalanobis distance $d_M^2$:
$$d_M^2 = (x - \bar{x})^{T} \, \Sigma^{-1} \, (x - \bar{x})$$
with $x$ being the color vector $[r, g, b]$ of the pixel, $\bar{x}$ the mean color vector of the
fruit, and $\Sigma$ the covariance matrix of the color. This is in fact the generalization of
a confidence interval. When the distance is lower than a threshold, the corresponding
pixel is considered as healthy tissue; otherwise, it is assigned to a defect. Samples
of segmentation results are presented in Figure 9.9. Slight under-segmentation may
be observed for a low-contrast defect (the russet), while a part of the fruit boundary is
erroneously segmented as defect; this is not a problem because it lies outside the
ROI. This kind of algorithm has the advantage of being unsupervised. The dispersion
parameters of the color distribution have to be known before segmentation, but they
can be measured once, off-line, on healthy fruits that are selected to be representative
of fruit color. Moreover, since each pixel color is compared to the mean color, if small
disturbances occur (for example, a change in the illuminant that shifts both the mean and each
pixel value), the distances are not much affected and the algorithm remains robust.
Nevertheless, it works only if the probability density function (PDF) of the fruit color
is, at least approximately, a Gaussian distribution, which is the case for mono-color
fruit such as Golden Delicious.
Figure 9.9 Examples of defects on Golden Delicious apples (top) and segmented images using (middle)
the Mahalanobis distance and (bottom) after the second step. Origin of the defects: (a) russet resulting from an
insect bite; (b) scab; (c) diffuse russetting; (d) bruising.
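The Mahalanobis segmentation rule can be sketched in a few functions. This is a minimal illustration rather than the implementation of Leemans (1999): the covariance is estimated from reference healthy pixels, the matrix inverse uses a plain Gauss-Jordan elimination, and the default threshold (11.345, the 99 percent chi-square quantile for three degrees of freedom) is an assumption chosen only for the example.

```python
from typing import List, Sequence, Tuple

Vec = Sequence[float]
Mat = List[List[float]]


def mean_and_cov(samples: List[Vec]) -> Tuple[List[float], Mat]:
    """Mean vector and unbiased covariance matrix of reference healthy pixels."""
    n, d = len(samples), len(samples[0])
    mean = [sum(s[i] for s in samples) / n for i in range(d)]
    cov = [[sum((s[i] - mean[i]) * (s[j] - mean[j]) for s in samples) / (n - 1)
            for j in range(d)] for i in range(d)]
    return mean, cov


def invert(m: Mat) -> Mat:
    """Matrix inverse by Gauss-Jordan elimination with partial pivoting."""
    n = len(m)
    a = [list(map(float, row)) + [float(i == j) for j in range(n)]
         for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("singular covariance matrix")
        a[col], a[pivot] = a[pivot], a[col]
        pv = a[col][col]
        a[col] = [v / pv for v in a[col]]
        for r in range(n):
            if r != col:
                f = a[r][col]
                a[r] = [vr - f * vc for vr, vc in zip(a[r], a[col])]
    return [row[n:] for row in a]


def mahalanobis_sq(x: Vec, mean: Vec, cov_inv: Mat) -> float:
    """Squared Mahalanobis distance (x - mean)^T cov_inv (x - mean)."""
    d = [xi - mi for xi, mi in zip(x, mean)]
    return sum(d[i] * cov_inv[i][j] * d[j]
               for i in range(len(d)) for j in range(len(d)))


def is_defect(x: Vec, mean: Vec, cov_inv: Mat, threshold: float = 11.345) -> bool:
    """A pixel is assigned to a defect when its distance exceeds the threshold."""
    return mahalanobis_sq(x, mean, cov_inv) > threshold
```

In use, `mean_and_cov` and `invert` would be run once, off-line, on healthy reference pixels, and only `mahalanobis_sq` would be evaluated per pixel during grading.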
For bi-color fruits in the RGB space, this assumption is far from being fulfilled.
As can be observed for Jonagold apples in Figures 9.6 and 9.10, these distributions
are multimodal. The different modes correspond to the ground color and the blush
for the healthy tissue, and to the different origins of the defects. Moreover, the distributions are close to each other. However, discrimination between the defects and the
healthy tissue is possible using the a posteriori classification probabilities computed by
Bayes’ theorem. It is necessary to estimate the PDFs of the color of the healthy tissue
and the defects. Taking into account the complexity of the distributions, Leemans
(1999) proposed a numerical model. In this case, defects had to be previously marked
on images by an operator to obtain their color frequency distribution. In order to segment the images on-line, the PDFs were estimated using the kernel method and the
probability that a pixel of a given color belonged to the healthy fruit or to a defect
was computed off-line and stored in a table. The model was tested with color
coded on six and on seven bits per channel. The results were similar, and
six-bit coding was consequently chosen to reduce the size of the table. Figure 9.11 presents
the a posteriori healthy tissue classification probabilities of sample images (high
probability of healthy tissue is shown in white).
Figure 9.10 Relative frequency diagrams of healthy Jonagold apples (top) and of defects (bottom). Left:
projection on the plane determined by the R and G axes; right: projection on the plane determined by the
B and G axes.
Figure 9.11 Sample images of Jonagold (top): (a) ripe, healthy fruit; (b) healthy fruit; (c) poorly
contrasted rotten fruit; (d) russet; (e) scab. The second row gives the a posteriori classification probabilities
(high probability of healthy tissue is shown as white). The third row shows results of the segmentation after
the second step; the background is black, the blush is dark gray, the ground color is light gray, and the
defects are in white.
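The look-up table approach can be sketched as follows. Note the simplification: the class-conditional PDFs are estimated here from raw histogram counts of quantized colors, not from the kernel method used in the original work, and the prior probability of healthy tissue is a hypothetical parameter.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each in 0..255
BITS = 6                      # bits kept per channel, as retained in the text


def quantize(p: Pixel, bits: int = BITS) -> Pixel:
    """Reduce each 8-bit channel to `bits` bits."""
    shift = 8 - bits
    return (p[0] >> shift, p[1] >> shift, p[2] >> shift)


def build_table(healthy: Iterable[Pixel], defect: Iterable[Pixel],
                prior_healthy: float = 0.5) -> Dict[Pixel, float]:
    """Off-line step: P(healthy | color) for every color seen in the training pixels."""
    h = Counter(quantize(p) for p in healthy)
    d = Counter(quantize(p) for p in defect)
    nh, nd = sum(h.values()), sum(d.values())
    table = {}
    for color in set(h) | set(d):
        ph = prior_healthy * h[color] / nh        # P(color | healthy) P(healthy)
        pd = (1 - prior_healthy) * d[color] / nd  # P(color | defect) P(defect)
        table[color] = ph / (ph + pd)             # Bayes' theorem
    return table


def p_healthy(p: Pixel, table: Dict[Pixel, float], default: float = 0.5) -> float:
    """On-line step: a single table look-up per pixel."""
    return table.get(quantize(p), default)
```

With six bits per channel a full table has 2^18 = 262,144 entries, against 2^21 = 2,097,152 with seven bits, which illustrates why the coarser coding is attractive when the results are similar.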
In order to segment defects on San Fuji apples, Nakano (1997) used a backpropagated neural network with two layers to classify pixels into six color classes by
pixel features including position and the mean color (in RGB). Five of the classes were
representative of the colors of healthy tissue, while the other was for defects. The same
kind of neural network was used by Unay and Gosselin (2005) on four wavelength-band
multispectral images of Jonagold apples acquired with the imaging device developed
by Kleynen et al. (2004).
Both methods (Bayes’ theorem and back-propagated neural networks) need preliminary supervised classification of the pixels, which makes them sensitive to a change in
the illuminant. To solve this major drawback, Kleynen and Destain (2004) proposed an
unsupervised defect segmentation method to process multispectral images of Jonagold
apples. This method did not depend on parameters previously computed on sample
images, and was based on the analysis of the probability density distribution of the
spectral components of the image. The modes and the valleys of the distribution were
detected by a hill-climbing method using a density gradient estimate derived from the
“mean shift” procedure (Comaniciu and Meer, 2002), whose variations were correlated
to local maxima of the PDF. This procedure led to a variable number of clusters. In
order to obtain only two tissue classes (defect and healthy tissue), the Bhattacharyya
distance (a generalization of the Mahalanobis distance to populations whose covariance
matrices are not assumed to be equal) was used to identify the two most distant clusters of the
distribution. Starting from these two seed clusters, the probability density distribution
was then divided into two main clusters by regrouping the other clusters according to
the nearest neighbor method. Figure 9.12 presents the segmentation results regarding
two kinds of defects, which are generally poorly segmented with supervised methods
and classical color-imaging devices.
Figure 9.12 Result of the unsupervised segmentation of multispectral images of defects (ringed) which
are typically poorly segmented with standard color cameras and supervised segmentation. (a) Hail damage
without skin perforation; (b) scald. Top: green visible spectral band (centered on 500 nm); bottom: result of
segmentation (dark = defective tissue, white = healthy tissue).
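The reduction to two tissue classes can be illustrated in one dimension. The sketch below assumes that each cluster found by the mode-seeking step is summarized by a mean and a variance along a single spectral component, and it attaches every remaining cluster to the nearer of the two seed clusters, a simplified reading of the regrouping step. The multivariate case used on real multispectral data would need covariance matrices.

```python
import math
from itertools import combinations
from typing import List, Tuple

Cluster = Tuple[float, float]  # (mean, variance) of a one-dimensional cluster


def bhattacharyya(c1: Cluster, c2: Cluster) -> float:
    """Bhattacharyya distance between two univariate Gaussian clusters."""
    (m1, v1), (m2, v2) = c1, c2
    return (0.25 * (m1 - m2) ** 2 / (v1 + v2)
            + 0.5 * math.log((v1 + v2) / (2 * math.sqrt(v1 * v2))))


def two_class_grouping(clusters: List[Cluster]) -> List[int]:
    """Label every cluster 0 or 1 after choosing the two most distant ones as seeds."""
    i, j = max(combinations(range(len(clusters)), 2),
               key=lambda ij: bhattacharyya(clusters[ij[0]], clusters[ij[1]]))
    return [0 if bhattacharyya(c, clusters[i]) <= bhattacharyya(c, clusters[j]) else 1
            for c in clusters]
```

The first term of the distance reduces to a Mahalanobis-like term when the variances are equal, while the second term penalizes differences in dispersion, which is what makes the measure usable for clusters of unequal spread.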
When the image has been segmented, several researchers have considered that refinements might be possible. Yang and Marchant (1996) used the snake algorithm, an active
contour model. The limits of the objects were modeled as a string attached to the
initially segmented position by a spring, attracted by the dark area and presenting
a certain rigidity (inducing a bending moment). The boundary was reshaped by minimizing the total energy of the system. Three parameters were fitted: the weight, the
spring, and the boundary rigidities. This assumed an initial over-segmentation, which was
usually the case with the flooding algorithm. Leemans (1999) considered, for mono-color fruits, a second segmentation step. After the first step, the mean colors of the
defects and of the healthy tissue were computed, and, for each pixel, the distances
to each mean color were computed. The pixel was reassigned as healthy tissue or as
a defect according to the closest mean. The examples given in Figure 9.9 show the
segmentation enhancement of lower-contrast defects. For bi-color apples, researchers
proceeded in a similar way but in a local area (Figure 9.11).
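The second segmentation step for mono-color fruit can be sketched as a nearest-mean reassignment. The Euclidean distance in RGB is an assumption made for this illustration, since the text does not state which distance was used.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[float, float, float]


def mean_color(pixels: Sequence[Pixel]) -> Pixel:
    """Per-channel mean of a set of pixels."""
    n = len(pixels)
    return tuple(sum(p[i] for p in pixels) / n for i in range(3))


def sq_dist(a: Pixel, b: Pixel) -> float:
    """Squared Euclidean distance between two colors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def refine(pixels: Sequence[Pixel], is_defect: Sequence[bool]) -> List[bool]:
    """Reassign each pixel to the class (defect or healthy) with the closest mean color."""
    defect = [p for p, flag in zip(pixels, is_defect) if flag]
    healthy = [p for p, flag in zip(pixels, is_defect) if not flag]
    if not defect or not healthy:
        return list(is_defect)  # nothing to refine
    mean_d, mean_h = mean_color(defect), mean_color(healthy)
    return [sq_dist(p, mean_d) < sq_dist(p, mean_h) for p in pixels]
```

A low-contrast pixel that the first step left in the healthy class is moved to the defect class when its color is closer to the defect mean, which corresponds to the enhancement visible for low-contrast defects.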
Wavelengths in the red and NIR parts of the spectrum are mostly encountered for
defect segmentation. As can be observed in Figure 9.8, the reflectance in the blue part is
low (0.1) and it is highly variable in the green and yellow part. However, as demonstrated
by Kleynen et al. (2003) while testing the whole set of three or four wavelength bands,
these parts of the spectrum also contain valuable information, because the corresponding
standard deviations are also low.
6 Calyx and stalk-end recognition
The calyxes and stalk-ends are “defect-like” objects, and are usually spotted by classical
defect segmentation algorithms. Consequently, these have to be recognized, either
before or after segmentation.
The calyxes and stalk-ends present an aspect far less variable than defects, even
though many factors may influence it. The russet in the stalk-end and around it is often
a varietal characteristic, and as such should not be considered as a defect unless it is
overgrown. The stalk-end and calyx may be positioned centrally on the fruit image, or
at its periphery. Nevertheless, they remain circular objects that are dark in the center
and have fuzzy boundaries.
In order to locate these cavities, pattern matching is a simple and useful
method. The principle is to match a known image or a pattern (Figure 9.13) with
another by computing the cross-correlation and finding its maximum. To reduce the
sensitivity to a particular model, a mean image computed from five stalk-end images was
used by Leemans (1999). The author, working on RGB images, also showed that the
green and the red channels gave similar results for mono-color fruits such as Golden
Delicious and for bi-color fruits such as Jonagold. When the maximum value of the
correlation coefficient was used to distinguish the cavities from defects having a similar appearance (mainly
circular defects), the error rate was 3 percent. The calyxes and the stalk-ends were well
recognized, but some defects (such as circular defects and misshapenness owing to
insect bites) were misclassified.
Figure 9.13 (a) Stem-end and (b) calyx patterns.
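Pattern matching by cross-correlation can be sketched with a normalized correlation coefficient computed at every position of a single-channel image. This brute-force version is meant only to show the principle; it assumes small images stored as lists of rows, and the names are illustrative.

```python
import math
from typing import List, Tuple

Image = List[List[float]]


def ncc(patch: Image, template: Image) -> float:
    """Normalized cross-correlation coefficient between two equally sized images."""
    a = [v for row in patch for v in row]
    b = [v for row in template for v in row]
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in a)
                    * sum((y - mean_b) ** 2 for y in b))
    return num / den if den > 0 else 0.0  # uniform patches carry no pattern


def best_match(image: Image, template: Image) -> Tuple[float, int, int]:
    """Return (maximum coefficient, row, column) of the best template position."""
    th, tw = len(template), len(template[0])
    best = (-2.0, 0, 0)
    for r in range(len(image) - th + 1):
        for c in range(len(image[0]) - tw + 1):
            patch = [row[c:c + tw] for row in image[r:r + th]]
            score = ncc(patch, template)
            if score > best[0]:
                best = (score, r, c)
    return best
```

The maximum coefficient returned by `best_match` is the quantity that would be compared with a decision threshold to separate cavities from circular defects.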
Yang (1993) and Penman (2001) both used structured lighting in the NIR or in the
blue spectral bands to reveal the different curvature of the fruit around the cavities, and
detected the defects in another part of the spectrum.
Cheng et al. (2003) showed that a pair of NIR and MIR cameras was useful in
revealing the calyxes and stalk-ends. Unfortunately, the high cost of such equipment
is prohibitive.
Unay and Gosselin (2007) proposed a technique based on the segmentation of one channel (750 nm) of the multispectral images, followed by object classification. More than 35
parameters regarding “color” (in each of the four channels used), texture, and shape
were extracted from each object. After selection of the most relevant parameters and the
most discriminant method, the authors showed that just nine parameters were enough,
and that the support vector machine gave the best result (using k-fold cross-validation)
with an error rate near zero for the calyxes and stalk-ends and of around 13 percent
for defects. Guedalia (1997) employed a set of parameters measured for each object to
determine whether the object was a calyx, a stalk-end, or a defect.
When the cavities have been located, some researchers simply remove the corresponding pixels from the apple surface while others process them during defect
recognition (discussed in the next section). Figure 9.14 presents the results of the
flood-filling method used by Kleynen and Destain (2004) for segmenting the calyxes
and stalk-ends on the basis of a seed pixel corresponding to the maximum value of the
7 Defect recognition and fruit classification
Once the image has been segmented, information is extracted in order to grade the
fruit. The size, “color,” shape, and texture of the object, as well as the distance from
center of gravity of the object to the calyx or to the stalk-end, may be evaluated for
each object. The number of objects detected in the segmented image may vary from
none in the ideal case of a healthy fruit correctly segmented, to 100 for some kinds
of russet. As classifiers require a fixed number of input parameters, this information
has to be summarized. The different approaches consist of extracting global statistical
parameters on the whole set of pixels, characterizing each object, and grading the
fruit on the worst one. The latter two can also be referred to as recognizing the defect
individually and grading the fruit according to the standards of examples.
Figure 9.14 Samples of results of calyx/stem-end segmentation by a flood filling algorithm. The center of
the white cross is the seed pixel of the algorithm and the white contour line is the boundary of the filled area.
For most of these methods, the grading is based on the information coming from one
image. As several images are required to observe a whole fruit, we can suppose that
each image is graded separately, and the grade given to the whole fruit is the lowest
7.1 Features
The most evident and commonly used size and shape parameter is the area. It may be
computed from each object or from the whole fruit by the sum of the effective pixels.
In the latter case, it can be used directly (or as defect area ratio, i.e. the ratio of the total
defect area to the fruit area) to grade the apple. The distance from the center of gravity
of the object to the center of gravity of the fruit is also used as a global or object feature.
The perimeter, the major inertia moment, and the ratio of the inertia moments are also
used to evaluate the shape of defects individually.
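As an illustration of the size and shape parameters listed above, the following minimal sketch computes the area, the defect area ratio, the distance between centers of gravity, and the two principal inertia moments of an object. The mask layout (lists of 0/1 rows), the function names, and the choice of the minor-to-major ratio are assumptions made for this example; the cited studies do not specify them.

```python
import math

def pixels(mask):
    """Coordinates (row, col) of the non-zero pixels of a binary mask."""
    return [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]

def centroid(points):
    """Center of gravity of a set of pixel coordinates."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def inertia_moments(points):
    """Major and minor principal moments of the normalized second-order central moments."""
    n = len(points)
    r0, c0 = centroid(points)
    mu_rr = sum((r - r0) ** 2 for r, _ in points) / n
    mu_cc = sum((c - c0) ** 2 for _, c in points) / n
    mu_rc = sum((r - r0) * (c - c0) for r, c in points) / n
    mean = (mu_rr + mu_cc) / 2
    spread = math.hypot((mu_rr - mu_cc) / 2, mu_rc)
    return mean + spread, mean - spread

def shape_features(fruit_mask, object_mask):
    fruit_px, object_px = pixels(fruit_mask), pixels(object_mask)
    major, minor = inertia_moments(object_px)
    return {
        "area": len(object_px),                          # in pixels
        "area_ratio": len(object_px) / len(fruit_px),    # defect area ratio
        "centroid_distance": math.dist(centroid(object_px), centroid(fruit_px)),
        "major_inertia": major,
        # Assumed convention: minor/major, 1.0 for a single-pixel object.
        "inertia_ratio": minor / major if major > 0 else 1.0,
    }

# Toy data: a 6 x 6 "fruit" with a 2 x 2 object in one corner.
fruit = [[1] * 6 for _ in range(6)]
defect = [[0] * 6 for _ in range(6)]
defect[0][0] = defect[0][1] = defect[1][0] = defect[1][1] = 1
features = shape_features(fruit, defect)
print(features)
```

Summing the `area` values of all objects found on a fruit gives the total defect area used for the global grading mentioned above.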
The most encountered “color” parameters are the mean value of each channel, or
a distance from the mean “color” of the object to the mean “color” of the fruit – i.e.
its contrast. This latter distance may be computed for each channel (one parameter per
channel, usually the absolute differences) or in the color space (i.e. one parameter, the
Euclidean or the Mahalanobis distance).
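The three contrast measures can be sketched as follows. This is an illustrative example only: the covariance matrix is assumed here to be that of the healthy-tissue color, which the text does not specify, and the small linear solver and the numerical values are hypothetical.

```python
import math

def solve(matrix, vector):
    """Solve matrix * x = vector by Gauss-Jordan elimination with partial pivoting.

    Assumes a non-singular matrix (a singular one raises ZeroDivisionError).
    """
    n = len(vector)
    aug = [list(row) + [vector[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def color_contrast(object_mean, fruit_mean, fruit_cov):
    """Per-channel, Euclidean and Mahalanobis contrast of an object against the fruit."""
    diff = [o - f for o, f in zip(object_mean, fruit_mean)]
    per_channel = [abs(d) for d in diff]              # one parameter per channel
    euclidean = math.sqrt(sum(d * d for d in diff))   # one parameter in color space
    weighted = solve(fruit_cov, diff)                 # inverse(cov) * diff
    mahalanobis = math.sqrt(sum(d * w for d, w in zip(diff, weighted)))
    return per_channel, euclidean, mahalanobis

# Made-up RGB means and a diagonal covariance, for illustration only.
per_channel, euclidean, mahalanobis = color_contrast(
    object_mean=[90.0, 60.0, 40.0],
    fruit_mean=[150.0, 120.0, 40.0],
    fruit_cov=[[400.0, 0.0, 0.0], [0.0, 900.0, 0.0], [0.0, 0.0, 100.0]],
)
print(per_channel, euclidean, mahalanobis)
```

Unlike the Euclidean distance, the Mahalanobis distance weights each channel by the variability of the reference color, so the same absolute difference counts for more in a channel where the fruit color is very stable.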
The texture may be evaluated by the standard deviation in each color channel and
by the mean and standard deviation of the image gradient for a particular channel.
Invariant moments computed on the co-occurrence matrix are also used, although
a greater computational load is experienced.
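The simpler texture parameters (standard deviation of a channel, and mean and standard deviation of its gradient) may be sketched as below. The example assumes a rectangular patch of one channel and uses central differences for the gradient; the operator actually used in a given study (for instance a Sobel filter) may differ, and the function name is hypothetical.

```python
import math
import statistics

def texture_features(channel):
    """Texture parameters of one color channel given as a list of rows of intensities."""
    values = [v for row in channel for v in row]
    rows, cols = len(channel), len(channel[0])
    magnitudes = []
    # Central differences on interior pixels only (border pixels are skipped).
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = (channel[r][c + 1] - channel[r][c - 1]) / 2
            gy = (channel[r + 1][c] - channel[r - 1][c]) / 2
            magnitudes.append(math.hypot(gx, gy))
    return {
        "std": statistics.pstdev(values),
        "gradient_mean": statistics.fmean(magnitudes),
        "gradient_std": statistics.pstdev(magnitudes),
    }

# A horizontal intensity ramp: constant gradient, hence zero gradient deviation.
ramp = [[10 * c for c in range(5)] for _ in range(5)]
texture = texture_features(ramp)
print(texture)
```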
A step-wise process with the classification error rate used as a criterion is usually proposed for parameter selection. Normally 12 to 15 parameters are retained, representing
the different categories (shape, color, and texture).
7.2 Global statistical parameters
Some parameters are extracted directly at pixel level: the total area; the defect area
ratio; and the mean, the median, and the standard deviation values of each spectral
Several researchers have considered the area of the largest defect. In most cases, each
image was processed separately and the fruit was graded according to the worst case.
Throop et al. (2005) used the total area of the defect in an image representing two-thirds
of the fruit surface. The apples were graded according to the USDA standards, with an
error rate of 12 percent. The fruit being mechanically oriented, the calyx and stalk-end
were, however, not inspected.
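Grading a fruit on the worst of its images amounts to a simple maximum over an ordered list of grades. The grade labels below are hypothetical placeholders and not those of any particular standard.

```python
# Grades ordered from best to worst (hypothetical labels).
GRADE_ORDER = ["Extra", "I", "II", "Reject"]

def grade_fruit(image_grades):
    """Return the worst grade obtained over all the images of one fruit."""
    return max(image_grades, key=GRADE_ORDER.index)

fruit_grade = grade_fruit(["Extra", "II", "I"])
print(fruit_grade)  # II
```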
In order to grade Jonagold apples by multispectral images, Kleynen et al. (2004)
employed the mean, the median, and the standard deviation values of the 450-, 750-,
and 800-nm spectral components plus the defect area ratio. The authors achieved an
error rate of 11 percent with linear discriminant analysis. The calyx and stalk areas
were detected and segmented prior to defect detection, and the corresponding areas
were ignored. Unay and Gosselin (2005) proposed a similar set of parameters, and
obtained similar results by using a support vector machine.
Another technique developed by Guedalia (1997) is to perform a principal component
analysis on the whole object feature set before using a supervised grading (error rate
of 33 percent).
7.3 Hierarchical grading based on object supervised
The basic idea is to recognize a defect’s origin by means of supervised defect classification. The standard separates the defects into flesh defects (unacceptable, whatever
size) and skin defects (which degrade the fruit according to their size as presented in
Table 9.2). It should be noted that bruises are flesh defects, and any fruit presenting
a bruise should be rejected.
The steps to achieve fruit grading are:
1. Compute shape, color, and texture features of each object in the image
2. Classify the object into one defect category
3. Grade the fruit according to the standards.
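The three steps above can be sketched as follows. Everything specific in this example is an assumption: the area limits are invented placeholders and not the values of the standard, the object classifier is a stub standing in for a trained supervised classifier, and summing the skin-defect areas is only one possible reading of the size rule.

```python
# Hypothetical maximal skin-defect areas (mm2) per category, best category first.
# These numbers are placeholders, NOT the limits of the standard.
SKIN_DEFECT_LIMITS_MM2 = [("Extra", 0.0), ("I", 100.0), ("II", 250.0)]
FLESH_DEFECTS = {"bruise"}  # flesh defects are unacceptable whatever their size

def classify_object(features):
    """Step 2 (stub): a trained supervised classifier would be called here."""
    return features["label"]

def grade_fruit(objects):
    """Step 3: grade one fruit from the features of its segmented objects (step 1)."""
    skin_area = 0.0
    for features in objects:
        kind = classify_object(features)
        if kind in FLESH_DEFECTS:
            return "Reject"
        if kind != "healthy":
            skin_area += features["area_mm2"]  # assumed rule: cumulated area
    for category, limit in SKIN_DEFECT_LIMITS_MM2:
        if skin_area <= limit:
            return category
    return "Reject"

print(grade_fruit([{"label": "russet", "area_mm2": 80.0}]))  # I
print(grade_fruit([{"label": "bruise", "area_mm2": 5.0}]))   # Reject
```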
This procedure is well suited to blobs or patch-like defects, but, as can be observed in
Figure 9.8, the reality is more complex. Some defects present a more scattered aspect,
such as the diffuse russet, while others (bruises and russet, mainly) have a color very
close to that of the healthy tissue. Scattered or reticular russet is often segmented as
a myriad of small objects. This is sometimes also the case in a diffuse transitional area
between the ground color and the blush. For this reason, small objects have to be
considered too. On the other hand, most of the shape or texture parameters, for example,
cannot be evaluated accurately on such small objects. It is thus wise to apply a simplified
grading system to small objects.
Table 9.2 Maximal dimensions for defects accepted in each category, according to OECD.
Columns: three columns of limits in mm2. Rows: minor defects; apples belonging to the
class below (%); fruits with worms; size; global.
Leemans (1999) proposed a method by which small objects consisting of less than 10
pixels (1.75 pixels/mm) were classified as healthy tissue, russet, or defect, using five
features. The larger objects were classified using the full set of parameters in two steps –
first into healthy tissue or defects, and then the defects were graded as slight defects,
scab or major defects. For small objects, the smallest error rate was 36 percent (by
quadratic discriminant analysis, QDA), while for larger ones it was 14 percent for the
healthy tissue versus the defects (by back-propagated neural networks; BPNN), and 29
percent between the defects (BPNN). By applying the standards, the global error rate of
the fruit images was 39 percent when four classes were considered and 18 percent when
two classes (i.e. accepted (at least Category I) and rejected) were taken into account.
The main drawback of this method is that the defects have to be classified in
a supervised way before the fruit is graded. This can only be achieved under laboratory conditions, and must include all the possible defects. However, it is well known
that the defects present in one season or in one orchard may be different from those in
another year or in another orchard. This requires classification of many defects – up to
3600 for Golden Delicious and 4000 for Jonagold (Leemans, 1999), during a minimal
period. Each time any part of a former algorithm evolves (the segmentation one mainly),
and even if the image database is conserved over the years, this tedious work has to be
done again.
7.4 Hierarchical grading based on object clustering and
fruit classification
This method is similar to the previous one, except that the defects are classified in
an unsupervised way, while the fruit grading is supervised. During a training session,
a set of representative apples, including healthy fruits and defects, are presented to a
machine. The segmentation process is applied on the images of the apples and a database
containing features of the segmented objects is built. One part of this database is used for
learning while the rest is kept for validation. The objects are clustered (unsupervised
classification, off-line) on the basis of their features. Each fruit is then characterized by
the number of objects of each kind. These data are then used to train a supervised classification of the fruit (using quadratic discriminant analysis or back-propagated neural
network, off-line). When this process has been completed, the machine is ready to
run. The apples to be graded are presented to the machine, images are acquired and
segmented in the same way, the objects found are classified in the previously defined
clusters, and the fruits are graded according to the number of objects in each cluster.
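The unsupervised part of this workflow can be sketched as below: the objects of the training database are clustered, and each fruit is then described by the number of its objects falling in each cluster. This is a minimal illustration with a plain k-means (Lloyd) iteration seeded with the first k points, two made-up features per object, and hypothetical function names; the supervised fruit classifier that would be trained on the resulting counts is not shown.

```python
import math

def nearest(point, centroids):
    """Index of the centroid closest to the point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda k: math.dist(point, centroids[k]))

def kmeans(points, k, iterations=20):
    """Plain Lloyd iteration; the first k points seed the centroids (a simplification)."""
    centroids = [list(p) for p in points[:k]]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        for j, group in enumerate(groups):
            if group:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(axis) / len(group) for axis in zip(*group)]
    return centroids

def fruit_signature(objects, centroids):
    """Number of segmented objects of each kind (cluster) found on one fruit."""
    counts = [0] * len(centroids)
    for obj in objects:
        counts[nearest(obj, centroids)] += 1
    return counts

# Off-line training: cluster the objects of the learning database (made-up features).
training_objects = [(1.0, 1.0), (9.0, 9.0), (1.2, 0.8), (8.8, 9.1), (0.9, 1.1), (9.2, 8.9)]
centroids = kmeans(training_objects, k=2)

# On-line: describe a new fruit by its object counts per cluster.
signature = fruit_signature([(1.1, 1.0), (9.0, 9.0), (8.9, 9.2)], centroids)
print(signature)  # [1, 2]
```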
For the final user, this process is transparent. A batch of healthy and defective fruits
is simply presented to the machine for learning, and the machine will reproduce their
grading afterwards. If a new kind of defect arises, samples have to be shown to the
machine in order to expand the database and allow the machine to “relearn”. The only
parameter that may be adjusted is sensitivity of the defect detection, i.e. the pixel
grading a posteriori probabilities. With this fruit-grading method the “training” only
takes a few seconds, while it would take days or weeks for the method presented in the
previous section.
Leemans and Destain (2004) used the k-means clustering algorithm to establish the
clusters on the basis of the object features (four color parameters, five for the texture,
five for the shape, four for the proximity to the calyx and stalk-end). The distribution
of the clusters depended on the way the points were dispersed in the n-dimensional
feature space (n being the number of features). The parameters related to the size (i.e. the area
and the perimeter) presented a very asymmetric distribution, with many small objects.
Using the fourth root of the area and the square root of the perimeter instead
gave a better distribution of the clusters. The optimal number of classes was evaluated as 12 for
Golden Delicious and 16 for Jonagold. Figure 9.15 represents the clusters on the scatter
diagram for several parameters. Some clusters presented a clear distinction for one or
two parameters – for example, cluster 2 grouped the largest objects, while the brightest
were in cluster 4. The link between the groups and the defects was not always clear.
This can be easily understood if it is remembered that the aspects of many defects vary
widely with the age of the defect. Scab is typically circular and dark when “young,” but
for an old scab the fruit has meanwhile grown and the defect has become fragmented,
showing mainly scar tissue (Figures 9.5, 9.9, 9.11). Quite logically, both objects were
found in different groups according to their age. Oddly enough, one group did not
include, typically, a defect, but mostly the healthy tissue surrounding it. This group
was not observed in healthy fruits, which can only mean that the color of the skin is
influenced by the presence of a defect.
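The effect of the root transforms on the size parameters mentioned above can be checked on a toy sample. The areas below are invented and only mimic a distribution with many small objects and a few large ones; the skewness function is the usual third standardized moment.

```python
import statistics

def skewness(values):
    """Third standardized moment; 0 for a symmetric distribution."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return sum(((v - mean) / std) ** 3 for v in values) / len(values)

# Made-up object areas in pixels: many small objects, two large ones.
areas = [4, 5, 6, 8, 9, 12, 16, 25, 400, 1600]
root4_areas = [a ** 0.25 for a in areas]

print(skewness(areas), skewness(root4_areas))
```

On this sample the fourth root reduces the asymmetry, which keeps a few very large objects from dominating the distances computed during clustering.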
In order to achieve the grading of the fruit, the defects had to be enumerated. The
proposed method was to use the sum and the standard deviation of the a posteriori
classification probabilities of the objects in the clusters. The fruits were then graded
into a class by a supervised classification. Taking into account the number of clusters,
there are around 30 discriminant parameters for an apple fruit. Moreover, as most
clusters comprise only defects, which are far from being encountered in every image,
Figure 9.15 Scatter diagrams showing the clusters of objects segmented on the fruit images for (a) the
first two principal components and (b) the fourth root of the area (R4Area), the mean value of the red
channel (R), and the mean value of the gradient of the red channel (SobR).
most sums and standard deviations are null. In order to avoid mathematical problems
and to decrease the amount of data, a principal component analysis was carried out and
the m first components, representing 97 percent of the total variability (m was about
14 or 15), were used with quadratic discriminant analysis. For the neural networks, all
the components were fed to the network.
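The enumeration step described above can be sketched as follows: for each cluster, the a posteriori probabilities of all the objects of a fruit are summed and their standard deviation is computed, which gives two parameters per cluster (24 for 12 clusters and 32 for 16, consistent with the "around 30" quoted above). The use of the population standard deviation and the zero value for fewer than two objects are assumptions of this example; the principal component analysis that follows is not shown.

```python
import statistics

def fruit_parameters(posteriors, n_clusters):
    """Per-cluster sums followed by per-cluster standard deviations.

    posteriors holds one list of cluster membership probabilities per object.
    """
    sums, stds = [], []
    for k in range(n_clusters):
        column = [p[k] for p in posteriors]
        sums.append(sum(column))
        # Assumed convention: zero deviation when fewer than two objects exist.
        stds.append(statistics.pstdev(column) if len(column) > 1 else 0.0)
    return sums + stds

# Two objects and three clusters (made-up probabilities).
objects = [[0.9, 0.1, 0.0], [0.2, 0.8, 0.0]]
params = fruit_parameters(objects, n_clusters=3)
print(params)

# A healthy fruit without any segmented object yields only null parameters.
print(fruit_parameters([], n_clusters=3))
```

The second call illustrates why most parameters are null in practice: a cluster that is absent from the image contributes a zero sum and a zero deviation.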
The classification results are presented in Table 9.3. The multilayer back-propagated
neural network usually gave similar results to the quadratic discriminant analysis. When
four classes were considered, apples without defects were classified best. As their proportion in the database was quite low compared with what is encountered at the production
facilities, the global error rate in practice should thus be lower. All defects encountered during the
last four years were taken into account to build the database. Some of these, such as
aphid disease, did not produce any contrasted object, but merely a three-dimensional
change in texture and discoloration of the blush (Figure 9.5). These changes were barely visible and could hardly be expected to be detected, but were nevertheless included. Recent bruises
(Figure 9.5), less than 2 hours old, particularly those under the blush, were also not
visible. These defects were quite difficult to grade correctly.
Table 9.3 Global error rate (%) in cross validation (one image per fruit).
Columns: error rates for four classes and for two classes, per variety. Rows: quadratic
discriminant analysis; back-propagated neural network.
The method proposed by Leemans and Destain (2004) was implemented on a two-line grading prototype (Figure 9.4) with two cameras observing the lines at different
angles, each of them observing several apples. Each time an apple was completely
visible in the image, a region of interest (ROI) was delimited (see section 2.1).
The ROIs of the different views were calculated so as to cover the whole surface of the
fruit. The pixels within these ROIs were taken into account for the fruit color grading,
while the objects with their center of gravity within the ROIs were added to the fruit
database and processed using the object clustering and fruit classification method.
The parameters regarding the calyx and stalk-end were, however, different. Here, the
positions of each object and each cavity were transformed from the image coordinates to the fruit spherical coordinates. Based on the position of the cavities (resulting
from pattern matching), a change of coordinates brought the poles of the new coordinates system into the cavities. The “co-latitude” (the angular distance from the pole
to the object) was then the only proximity parameter between each object and the
The prototype was tested on Jonagold apples. For two classes (accepted and rejected),
the error rate was 5 percent in resubstitution and 27 percent in validation if the expert
pre-graded the fruits, and 14 percent if the expert post-graded the fruits (the expert had
to confirm the machine grading). From the latter, it can be deduced that around half of
these errors may be considered as benign.
The machine was tested in “real” conditions. The learning was conducted with apples
from a particular orchard, and evaluation of the grading performances was carried out
afterwards on other apples from the same orchard. Under specific conditions, 100 fruits
were necessary for learning, in order to achieve optimal grading.
The difference in the error rates with the same method applied on a single image
per static fruit is attributed to the fact that taking into account several images is more
complex, and to the acquisition system. It was more difficult to obtain uniform lighting over two grading lines, and the segmentation was consequently coarser. Fruits
with defects difficult to segment, such as russet and bruising, were accordingly less
accurately graded.
8 Conclusions
Evaluation of the external quality of apples covers different aspects – the shape, the
intensity of the ground color, the proportion of the blush area, and the presence of
defects together with their importance.
Repartition of the incident light on the apple surface is of primary importance, and
many researchers used flat-field correction prior to image analysis.
The shape can be evaluated by Fourier descriptors, with which the fruits are graded
with an error rate of 6 percent using linear discriminant analysis. This fulfills the
standard requirements, but requires presentation of the fruit with its stalk–calyx axis
perpendicular to the optical axis of the camera, which can be achieved by mechanical
devices at the highest success rate of 98 percent.
To grade bi-color fruits according to their color, the blush has to be distinguished
from the ground color. Since the probability density distribution of the red, green, and
blue components is far from a Gaussian distribution, back-propagated neural networks
performed better than linear discriminant analysis. The fruit can be graded into ground-color classes using the hue value or the first canonical variate (calculated from
RGB data) with linear discriminant analysis. The misclassification rate is 9 percent,
which is about the same as the best human graders.
Assessment of blemishes remains the most difficult part of quality evaluation.
The calyx and the stalk-end have to be separated from the defects. This is usually performed by using image segmentation techniques, pattern-matching methods,
or three-dimensional apple surface-curvature evaluation. The different methods may
achieve low error rates of around 3 percent. Calyxes and stalk-ends are always correctly
recognized, while some circular dark defects are misclassified.
With regard to defect detection, unsupervised segmentation methods (thresholding
and flooding algorithms on monochrome images and Mahalanobis distance in the
RGB space) give correct results for mono-color fruits. For bi-color fruits, supervised
techniques, including back-propagated neural networks or a posteriori probabilities
computed by Bayes’ theorem and numerical models of the color probability density
function, are usually used. An unsupervised method derived from the “mean-shift”
procedure is also proposed, which seems more robust as it is, for example, not rigidly
linked to the illuminant.
The study of image segmentation for defect detection demonstrates the proximity
between some defects, mainly russet and bruises, and the healthy tissue, especially
in the RGB space. Selected wavebands give better segmentation, but the results are
nevertheless not perfect.
When the calyx or the stalk-end have been removed from the segmentation by either
mechanical or image analysis, evaluation of apple quality can be undertaken directly
during the defect segmentation process using the defective pixel information, usually
computing global statistical “color” parameters (mean, median, and standard deviation
values) and/or the area of the defective part. A more complex system grades the fruit
hierarchically by considering the entire information obtained from each segmented
object. First, the objects are classified into clusters in an unsupervised learning step;
secondly, the fruit are graded based on the information relative to each cluster.
Comparing the different methods is hazardous because the sample sizes (from several
hundreds to several thousands), the sampling period (one picking season or during
several years), the kind of defects encountered, and the number of classes considered
(from two to four) vary widely from one study to another. The best results, obtained
from one image taking into account all the encountered defects and the calyx and stalk-end, achieve a correct classification rate (two classes are considered) of 90 percent
with global statistical parameters on images acquired at selected wavelength bands,
and of 92 to 95 percent with a hierarchical method according to the variety. Taking into
account the ratio of defective to healthy fruits in a real batch of apples (about 10–15
percent), both methods are within the tolerances of the standards, but can of course be
improved to limit the pay losses. The hierarchical grading method is implemented in an
industrial color-grading machine, which acquires several images of each fruit in order
to cover its whole surface. In these conditions, the error rate grows to 27 percent for
bi-color Jonagold apples, although half these errors are not considered to be serious.
This illustrates that uniform lighting and robust segmentation methods are of primary
importance in maintaining good grading accuracy when laboratory conditions are no
longer being met.
Nomenclature
CCD  charge-coupled device (camera sensor)
BPNN  back-propagated neural networks
LDA  linear discriminant analysis
NIR  near infra-red; in CCD vision, the part of the electromagnetic spectrum
from the limit of the visible (780 nm) to the limit of the CCD sensitivity
(if not stated otherwise, around 1000 nm for silicon devices)
MIR  mid infra-red (5 µm–40 µm)
PDF(s)  probability density function(s)
QDA  quadratic discriminant analysis
RGB  red, green, and blue (color space)
ROI  region of interest, i.e. the part of the fruit image taken into account for the
ground color measurement or the defect detection
References
Cheng X, Tao Y, Chen YR, Luo Y (2003) NIR/MIR dual-sensor machine vision system for
on-line apple stem-end/calyx recognition. Transactions of the ASAE, 46 (2), 551–558.
Comaniciu D, Meer P (2002) Mean shift: a robust approach toward feature space analysis.
IEEE Transactions on Pattern Analysis and Machine Intelligence, 24 (5), 1–18.
Ferré G, Massol G, Le Fur G, Villeneuve F (1987) Couleur des pommes et maturité. Utilisation d’un colorimètre: perspectives. Infos-Centre Technique Interprofessionnel des
Fruits et Légumes, 30, 19–24.
Guedalia ID (1997) Visual based quality sorting of fruit. In Proceedings of the Sensors for
Nondestructive Testing International Conference and Tour, Orlando, Florida, USA,
18–21 February.
Heinemann PH, Varghese ZA, Morrow CT, Sommer III HJ, Crasweller RM (1995) Machine
vision inspection of Golden Delicious apples. Applied Engineering in Agriculture, 11
(6), 901–906.
Hu MK (1962) Visual pattern recognition by moment invariants. IRE Transactions on
Information Theory, 8 (2), 179–187.
Kapur JN, Sahoo PK, Wong AKC (1985) A new method for gray-level picture thresholding
using the entropy of the histogram. Graphical Models and Image Processing, 29,
Kleynen O, Destain M-F (2004) Detection of defects on fruits by machine vision and
unsupervised segmentation. In Proceedings of the International Conference on
Agricultural Engineering AgEng2004, Leuven, Belgium, Paper WS3–310.
Kleynen O, Leemans V, Destain M-F (2003) Selection of the most efficient wavelength
bands for “Jonagold” apple sorting. Post-Harvest Biology and Technology, 30,
Kleynen O, Leemans V, Destain M-F (2004) Development of a multi-spectral vision system
for the detection of defects on apples. Journal of Food Engineering, 69, 41–49.
Leemans V (1999) Contribution au classement des fruits par analyse d’images numériques.
Application au tri en ligne des pommes Golden Delicious et Jonagold. PhD Thesis,
Gembloux Agricultural University, Gembloux, Belgium.
Leemans V, Destain M-F (2004) A real-time grading method of apples based on features
extracted from defects. Journal of Food Engineering, 61, 83–89.
Lespinasse J-M (1971) Quelques observations sur la coloration de l’épiderme de la variété
de pommier Golden delicious. L’arboriculture fruitière, 214, 26–29.
Mehl PM, Chen YR, Kim MS, Chan DE (2004) Development of hyperspectral imaging
technique for the detection of apple surface defects and contaminations. Journal of
Food Engineering, 61, 67–81.
Miller WM, Drouillard GP (1997) On-line blemish, color and shape analysis for Florida
citrus. In Proceedings from the Sensors for Non destructive Testing International
Conference and Tour, Orlando, USA, pp. 249–260.
Moltó E, Blasco J, Benloch JV (1998) Computer vision for automatic inspection of agricultural produces. Proceedings of the SPIE’s 1998 Symposium on Intelligent Systems
and Advanced Manufacturing – Precision Agriculture and Biological Quality, 3543,
Nakano K (1997) Application of neural networks to the color grading of apples. Computers
and Electronics in Agriculture, 18, 105–116.
Otsu N (1979) A threshold selection method from gray-level histograms. IEEE Transactions
on Systems, Man and Cybernetics, 9 (1), 62–66.
Penman DW (2001) Determination of stem and calyx location on apples using automatic
visual inspection. Computers and Electronics in Agriculture, 33, 7–18.
Ridler TW, Calvard A (1978) Picture thresholding using an iterative selection method.
IEEE Transactions on Systems, Man and Cybernetics, 8, 630–632.
Schrevens E, Raeymakers L (1992) Color characterisation of Golden Delicious apples using
digital image processing. Acta Horticulturae, 304, 159–166.
Throop JA, Aneshansley DJ, Anger WC, Peterson DL (2005) Quality evaluation of apples
based on surface defects: development of an automated inspection system. Postharvest
Biology and Technology, 36, 281–290.
Unay D, Gosselin B (2005) Artificial neural network-based segmentation and apple grading
by machine vision. In: Proceedings of the IEEE International Conference on Image
Processing, Genova, Italy, 2, II-630-3.
Unay D, Gosselin B (2007) Stem and calyx recognition on “Jonagold” apples by pattern
recognition. Journal of Food Engineering, 78 (2), 597–605.
Wen Z, Tao Y (1998) Brightness-invariant image segmentation for on-line fruit defect
detection. Optical Engineering, 37 (11), 2948–2952.
Yang Q (1993) Finding stalk and calyx of apples using structured lighting. Computers and
Electronics in Agriculture, 8, 31–42.
Yang Q (1994) An approach to apple surface feature detection by machine vision. Computers
and Electronics in Agriculture, 11, 249–264.
Yang Q, Marchant JA (1996) Accurate blemish detection with active contour models.
Computers and Electronics in Agriculture, 14, 77–89.
Quality Evaluation of
Citrus Fruits
Enrique Moltó and José Blasco
Instituto Valenciano de Investigaciones Agrarias, Centro de
Agroingeniería, Cra. Moncada-Náquera km 5, 46113 Moncada
(Valencia), Spain
1 Introduction
1.1 Economic importance of citrus production
Citrus fruits are the primary fruit crop in international trade in terms of value. Commercially, several species are considered under the term citrus, including lemons (varieties
grown from the species Citrus limon), limes (Citrus latifolia and its hybrids), mandarins
(Citrus reticulata Blanco), satsumas (Citrus unshiu Marcow), clementines (Citrus
clementina Hort. ex Tanaka), common mandarins (Citrus deliciosa Ten) and tangerines (Citrus tangerina Hort. ex Tanaka), oranges (Citrus sinensis L. Osbeck), grapefruit
(Citrus paradisi Macfad. and its hybrids) and pummelos (Citrus maxima Burm. Merr.
and their hybrids) (UNECE, 2004).
There are two clearly differentiated markets in the citrus sector: the fresh citrus fruit
market, with a predominance of oranges, and the processed citrus products market,
consisting mainly of orange juice. Current annual worldwide citrus production is estimated at over 105 million tonnes, with more than half of this being oranges. About
one-third of citrus fruit production goes for processing, and more than 80 percent of
this is for the production of orange juice.
Citrus fruits are grown all over the world. According to the FAO (FAOSTAT, 2006),
there are 140 citrus-producing countries. Around 70 percent of the world’s total citrus
output is grown in the northern hemisphere, in particular in Brazil, countries around
the Mediterranean, and the United States. The greatest producer in Europe is Spain,
which accounts for more than 55 percent of the European citrus output.
1.2 Physiological and physicochemical characteristics of
citrus fruits that affect inspection
Citrus are non-climacteric. Traditionally, according to the respiratory
pattern, fruits are classified as climacteric or non-climacteric (Biale, 1964).
Computer Vision Technology for Food Quality Evaluation
ISBN: 978-0-12-373642-0
Copyright © 2008, Elsevier Inc.
All rights reserved
Non-climacteric fruits are those that do not exhibit increases in ethylene and
respiration, but rather undergo a gradual decline in respiration during ripening.
Citrus are very sensitive to chilling. Excessive cold storage or chilling in the field
has an important effect on citrus, not only on the external appearance but also
Citrus withstand manipulation in the packing house very well due to their thick
peel. Major damage in the packing lines may occur due to friction on some parts
of the conveyor belts, but not to collisions with parts of the machines or with other
fruit. This facilitates their handling, but interferes with their internal inspection
when NIR is employed to estimate sugar content.
Citrus generally show rounded surfaces, which produce a particular light
reflection pattern.
Citrus usually present a uniform color. However, early varieties of citrus in many
countries usually meet legal maturity standards before the peel attains the characteristic varietal color, and therefore require de-greening. Occasionally, later-maturing varieties may similarly require de-greening. Early-season citrus fruits
may reach an acceptable degree of internal maturity and sugar-to-acid ratio before
the peel attains a marketable yellow or orange color, and ethylene used in citrus
de-greening rooms will degrade the green-pigmented chloroplasts present in the
peel. Automatic inspection of early fruit has to be able to cope with the possibility
of finding specimens with different levels of green colors on their surface.
1.3 Quality features to be inspected in citrus fruits
Quality standards of citrus for fresh consumption are mainly based on the absence of
bruises and rots, as well as an adequate shape, color, and size. Maturity of citrus fruits is
defined by parameters, specified for each species, concerning minimum juice content,
minimum total soluble solids content (sugar content), sugar-to-acid ratio, and coloring.
Size is determined by the maximum diameter of the equatorial section of the fruit.
The following parameters related with the taste and maturity are commonly estimated
in packing houses and in the field in order to assess quality:
The color index (CI), defined as CI = 1000a/(L b), with L, a, b being the Hunter
Lab color space coordinates. Negative values of CI indicate a dark green/green
color; values around zero, a green-yellow color (intermediate); small positive
values, a yellow color; and large positive values, red-orange coloration. This
index is used to determine the harvesting date, but it also plays an important role
in establishing the duration of the de-greening treatment.
The maturity index (MI), which is a ratio between the soluble solid contents and
the acidity expressed in terms of citric acid. Fruits with an MI value of more than
a certain threshold (which depends on the species and variety) are considered to
be ripe and to have a taste that makes them suitable for commercialization.
Consumer preferences, which depend on cultural habits and many other socioeconomic characteristics. A study performed by Campbell et al. (2004) identified
three segments of consumers: the no-blemish segment, the price-sensitive
segment, and the no-seeds segment. These could be the most valued attributes of
fresh citrus in the market.
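The two indices defined above reduce to simple ratios, as the following sketch shows. The function names and the numerical values are illustrative assumptions; in particular, the threshold on the maturity index depends on the species and variety and is not given here.

```python
def color_index(L, a, b):
    """Citrus color index CI = 1000 * a / (L * b) from Hunter Lab coordinates."""
    return 1000.0 * a / (L * b)

def maturity_index(soluble_solids, acidity):
    """Ratio of the soluble solids content to the acidity (expressed as citric acid)."""
    return soluble_solids / acidity

# Made-up Hunter Lab readings for a green and an orange peel.
ci_green = color_index(L=50.0, a=-15.0, b=30.0)
ci_orange = color_index(L=65.0, a=30.0, b=60.0)
mi = maturity_index(soluble_solids=11.0, acidity=1.0)

print(ci_green, ci_orange, mi)
```

With these readings the green peel gives a negative index and the orange peel a positive one, in line with the interpretation of the sign of CI given above.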
1.4 Major defects found in citrus
As is the case with other commodities, citrus fruits can present bruises or rots caused
in the field or during handling and processing in the packing house. However, not all of
these defects have the same economic importance. Some types of damage do not evolve,
but others do – especially that related to fungal infections. It is extremely important to
detect such damage in the earlier stages. If this does not occur, infection may be spread
to other fruit, or the damage may be invisible during inspection but then appear at the
destination market place, thus causing the consignment to be rejected by the buyer.
Types of damage that do not evolve or that only cause appearance-related problems include:
1. Mechanical damage during growth or at harvest – scratches (Figure 10.1a)
and blemishes due to hail, wind, loss of stem (Figure 10.1b), stem puncture
(Figure 10.1c), etc.
2. Incorrect agricultural practices (fertilization, irrigation or pest control) that
produce physiological damage (Figure 10.1d).
3. Damage caused by pests:
• thrips, which puncture epidermal cells, leaving scabby, grayish and silvery scars on
the peel (Figure 10.1e)
• scales (Figure 10.1f), which include Aonidiella aurantii Mask. (California red
scale), M. beckii (purple scale), and M. gloverii (Glover scale), are small,
round insects that attach themselves to the peel and are difficult to remove.
4. Sooty mould (Capnodium elaeophilum Prill.), which consists of a group of fungi
that form a thick, black coating. Sometimes it can be removed simply by rinsing
the fruit (Figure 10.1g).
5. Incorrect de-greening, which produces brown blemishes of different sizes.
6. Damage caused during handling and conservation in the packing house:
• mechanical injuries caused by direct impact with sharp or blunt objects
• zebra skin, caused in highly turgid mandarin peel by mechanical abrasion
• age-related breakdown, due to cell weakening and dehydration of mature fruit
• chilling injuries, peel discoloration or collapse due to low storage temperatures
(Figure 10.1h)
• post-harvest pitting, caused by oil-gland collapse associated with mechanical
damage or reduced gas exchange
• oleocellosis – rupture of oil glands that “burns” the peel (Figure 10.1i)
• brush burn, which is damage caused to the peel by abrasion.
Amongst the kinds of damage that may evolve and spread to other fruit, the main
fungal infestations are produced by (EPPO, 2004):
Penicillium digitatum (green mould, Figure 10.1j) and P. italicum (blue mould),
which are the major causes of rottenness. They produce large areas of green or
blue color.
Figure 10.1 Different external damage on oranges: (a) scarring; (b) loss of stem; (c) stem puncture;
(d) phytotoxicity; (e) thrips; (f) scales; (g) sooty mould; (h) chilling injury; (i) oleocellosis; (j) Penicillium sp.;
(k) anthracnose; (l) medfly.
Aspergillus spp., which produces light-colored, soft areas that appear on the fruit
and the fruit tissues. Afterwards, rot generates a powdery mass of spores.
Alternaria citri, which affects mainly navel oranges and lemons. In navel oranges
the disease is also called black rot, and results in firm, dark-brown to black spots
at the stylar end or in the navel. The disease is also a problem for the processing
industry because of the bitter taste of infected fruit.
Galactomyces citri-aurantii, which produces so-called sour rot. Lesions first
appear as light- to dark-yellow water-soaked areas. As the rot progresses, the rind
and juice vesicles degrade, causing the fruit to disintegrate into a watery mass.
Colletotrichum gloeosporioides, which causes anthracnose (Figure 10.1k) and
produces rind collapses. Early-season fruit is especially susceptible to anthracnose, and disease severity is greatly increased by exposure to high levels of
ethylene during de-greening.
Incorrect harvesting practices sometimes produce skin breakages that favor fungal
infections. Moreover, some pests, such as the medfly (Ceratitis capitata Wied.), lay their
eggs under the peel, thus opening a pathway for further fungal infections (Figure 10.1l).
1.5 The citrus inspection line
Citrus are intensively inspected and sorted when destined for consumption as fresh
fruit. The process starts when field boxes, usually made of plastic, are dumped on the
line. Since fruit comes from the field in 200–300-kg bins, the bin dumper usually has
an unstacker. The first inspection is made on a roller elevator that transports the fruit
to the washing machine. At this point, rotten, cut, and very poor-quality specimens are
removed. The fruit are then cleaned by passing them over a series of nylon brushes, and
go through a detergent applicator curtain where fungicide may also be applied if needed.
Immediately afterwards, they are rinsed in a water drencher and drained by being passed
over rollers. In the next step, the fruit are carried on a roller conveyor, usually made
of PVC, and undergo a manual selection process. Second-quality pieces of fruit are
removed, and top-quality fruit are left to continue. The fruit come to a pre-drying tunnel
for a later wax application. The waxing machine consists of a series of brushes and
nozzles that dispense wax for polishing and protection; normally, at the same time a
fungicide is added to the wax. Following this step, the fruit are dried in a tunnel. A drop-roll sizer sorts the fruit, usually into eight sizes, according to the equatorial diameter,
and a set of conveyors distribute the sorted fruit to the packing tables and other packing
machines. The citrus packing line is usually adapted to the specific characteristics of
each user, and its throughput is between 10 and 20 tonnes per hour. Often there are
two or three parallel lines in packing houses, which can handle up to 60 tonnes per hour.
As a complement to the process, citrus pregrading lines are sometimes included. On
these lines, the fruit are sorted by size and color when they arrive from the field and
are put in boxes or bins. Then they are transported to de-greening rooms if necessary,
depending on their color index, before going directly into cool chambers to be packed
using the standard line mentioned above. This is achieved by using electronic sizers
and a color-sorting device. Inspection of fruit by means of machine vision involves a
series of difficulties that need to be overcome:
• citrus fruits present rounded surfaces that make it difficult to inspect the boundary of each fruit in the images and to perform an accurate measurement of the
commercial size of the fruit based on the “equatorial diameter.”
• defects found in the images have different economic importance, and for this
reason it is crucial to discriminate them. Fungal infections must be identified as
early as possible.
• shape and color are important features by which to differentiate the defects, but
by themselves are not adequate for this task. Although most citrus fruits have
uniform peel colors, green areas may appear in some specific varieties, which
tend to be early varieties that usually have a great economic value.
• fruit can reach the inspection chamber waxed and wet, and thus their skins may
cause specular reflectance of the lighting source.
2 Image analysis in the visible spectrum
Studies of the reflectance properties of citrus fruit (Gaffney, 1973) can be considered
as the basis for subsequent work on image analysis for citrus inspection. These studies
determined the visible and near-infrared wavelengths at which greater contrast between
the peel and major defects can be achieved.
However, because most of the citrus production costs are related to harvesting, a
great deal of effort was expended in the mid-1980s and the 1990s on developing robots
for harvesting. This required the development of machine vision systems to detect fruit
on trees in outdoor conditions. Initial works were based on exploiting the differences in
reflectance of leaves and fruit (Moltó et al., 1992), while others (Slaughter and Harrell,
1989) used Bayesian approaches such as discriminant analysis as a tool for segmenting
color images. The sphericity of fruit was also used as a means to detect green fruit
surrounded by leaves (Pla et al., 1993).
2.1 Scene lighting
In all inspection systems that make use of machine vision, illumination is one of the
conditioning factors that most seriously affects the final outcome. Unsuitable lighting
can lead to unexpected or incorrect results. One of the problems that arise when illuminating citrus fruits inside the inspection chamber of a sorting machine is that they
are transported very close together and may even be in contact with one another. In a
typical case, the scene captured by the camera covers three or four pieces of fruit that
are traveling along the line. This leads to the appearance of shadows cast by the various
pieces of fruit. Diffuse frontal light creates a uniform illuminated area where there are
no shadows and minimizes the negative effects of specular reflection, which makes it
easier to analyze the images. Commonly, light is diffused by focusing the light source
on a diffusing surface, which also increases the area that is illuminated. Diffusion of
the light in the scene that is being illuminated must be as homogeneous as possible, so
that no part of the scene (such as the center) is more brightly lit than others.
Indeed, one of the main problems that arise when illuminating citrus fruit is that
bright spots on the peel are caused by reflection from lighting that is too direct. The
electrical and mechanical specifications of the system often make it difficult to produce
illumination that is diffuse enough to prevent the appearance of bright spots on the skin
of the fruit. The presence of water or waxes on the peel further increases this effect.
These bright spots can affect the estimation of the color of the fruit (Klinker et al.,
1987) and sometimes conceal the marks or blemishes that it is important to detect.
Specular reflections on the surface of objects can be eliminated using polarized light.
This can be accomplished by placing a polarizing filter at the light source and another
on the camera lens. The total elimination of bright spots is achieved when the filter
on the camera polarizes the light in a direction perpendicular to that of the filter at the light source. Moltó et al. (2000)
used such a system in a machine for inspecting citrus fruits. However, filters absorb
light, and thus the intensity of the light source has to be increased.
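The effect of the crossed filters can be summarized with Malus's law. This is standard optics rather than something stated in the text, and it assumes ideal filters, a specular reflection that keeps the polarization of the source, and a diffuse reflection from the peel that is fully depolarized. The symbols I_0 (specular intensity reaching the camera filter), I_d (diffuse intensity reaching the camera filter) and θ (angle between the two filter axes) are introduced here for illustration only.

```latex
I_{\mathrm{specular}}(\theta) = I_0 \cos^2\theta
\quad\Longrightarrow\quad
I_{\mathrm{specular}}(90^\circ) = 0,
\qquad
I_{\mathrm{diffuse}} \approx \tfrac{1}{2}\, I_d
```

Under these assumptions the bright spots vanish at θ = 90°, while only about half of the diffuse light passes the camera filter (and the filter at the source already removes about half of the lamp output), which is consistent with the need for a more intense light source.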
The problems involved in illuminating spherical fruit being transported automatically, such as the stability of the light source and the spectrum of the illuminant, were
studied by Affeldt and Heck (1995). Incandescent bulbs do not suffer from pronounced
flickering but, owing to the lower color temperature they produce, certain colors cannot
be seen properly. Conversely, common fluorescent lamps offer very good chromatic
reproduction but do not give out continuous light. Instead, they produce a flicker that
varies according to the frequency of the current, thus producing both well-lit and
dimly illuminated images. The solution is to fit the lamps with electronic ballast that
Figure 10.2 (a) RGB image of a rotten fruit illuminated with fluorescent tubes; (b) monochromatic
(560-nm) image of the same fruit with the same illumination; (c) monochromatic image of the same fruit
using UV lighting. The rotten areas are the only ones that clearly differ from the background.
increases the frequency of the current to a value that is much higher than the rate of
image acquisition.
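The benefit of a high-frequency ballast can be illustrated numerically. The following is a minimal sketch, assuming a sinusoidal lamp model and hypothetical values (a 100-Hz flicker, as produced by a lamp on a 50-Hz supply, a 40-kHz electronic ballast, and a 2-ms exposure); none of these figures come from the text.

```python
import numpy as np

def exposure_spread(flicker_hz, exposure_s, n_phases=200, n_samples=2000):
    """Relative spread (max - min) / mean of the light collected by exposures
    that start at different moments of the flicker cycle."""
    starts = np.linspace(0.0, 1.0 / flicker_hz, n_phases, endpoint=False)
    t = np.linspace(0.0, exposure_s, n_samples)
    # Assumed lamp model: intensity oscillates between 0 and 2 around a mean of 1.
    intensity = 1.0 + np.cos(2.0 * np.pi * flicker_hz * (starts[:, None] + t[None, :]))
    collected = intensity.mean(axis=1)  # proportional to the light integrated per exposure
    return (collected.max() - collected.min()) / collected.mean()

slow = exposure_spread(flicker_hz=100.0, exposure_s=0.002)     # conventional ballast (assumed)
fast = exposure_spread(flicker_hz=40_000.0, exposure_s=0.002)  # electronic ballast (assumed)
print(f"spread at 100 Hz: {slow:.2f}, spread at 40 kHz: {fast:.4f}")
```

With the slow flicker, the brightness of an image depends strongly on when the exposure starts; with the fast flicker, every exposure averages over many cycles and the images are evenly lit.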
Mention must also be made of special lighting systems, such as those used to produce fluorescence by exciting a molecule with high-energy (short-wavelength) light
and the subsequent instantaneous relaxation when lower-energy (longer-wavelength)
light is emitted. The use of such light allows certain types of external damage to be
observed. Ultraviolet (UV) sources (350–380 nm), which may be either fluorescent
tubes or mercury vapor lamps, can induce visible fluorescence (500–550 nm) of essential oils present on the skin due to cell breakage (Figure 10.2). Uozumi et al. (1987)
detected mechanical damage present in satsuma mandarins using this technique. The
fluorescence of the chlorophyll was employed by Obenland and Neipp (2005) to locate
incipient peel injury caused by a hot-water treatment in lemons.
2.2 Acquisition of images of the peel
Color cameras are usually employed in quality inspection of fruit because the color
of the peel is one of the features that is considered in the quality standards. On-line
systems should be able to cope with the movement of the fruit under the camera and
capture an image that is large enough to allow the inspection of several fruit at the same
time. Image resolution must be high enough to allow the detection of small marks on
the skin. By contrast, studies of static fruit can be conducted with more conventional
cameras and at a higher image resolution.
Progressive scan cameras are used in on-line systems, since they make it possible to
freeze the image in a single frame, without the shifts between fields that movement produces in interlaced
cameras. The exposure time, regulated by the shutter, must be set properly. Larger
image sizes (higher resolutions) entail longer transfer times between the camera and
the computer, as well as longer image-processing times.
Because fruit are round and images are flat, commercial inspection lines rotate the
fruit under the camera, recording several images in order to maximize the surface being
inspected. In this case it is crucial to regulate the rotation speed of the fruit, both to prevent skin from overlapping in different images when the speed is too low and to ensure
Figure 10.3 Sequence of images of central strips of a fruit (labels in the figure: 600 × 180 pixels; 600 fruits/min; strips of each fruit; 100 ms; 300 ms).
that the whole surface is inspected without leaving any areas unexamined. Carrión
et al. (1998) studied different feed speeds and rotation rates for fruit on a roller-based
transport system in order to optimize the image acquisition of oranges and mandarins.
Two different approaches are used to maximize the surface area that is inspected. The
first consists of taking several images of the whole surface of the skin exposed to the
camera, and analyzing the images separately to obtain a single result. The other consists
of acquiring a larger number of images of the central part of the fruit (“strips”) and then
joining them together to create a sort of map of the surface of the fruit (Aleixos et al.,
1998), as shown in Figure 10.3. Here, up to seven images of the central part of the fruit
are acquired while it rotates. The results of such a system can be seen in Figure 10.4.
In this case, the size and shape of the fruit are inspected using complementary images
of the whole fruit.
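The relation between strip size and the number of strips needed for a full revolution can be sketched as follows. This is an illustration under stated assumptions, not the method of Aleixos et al. (1998): the fruit is treated as a sphere, the strip is centered on the fruit and perpendicular to the rotation axis, and the pixel values in the example are hypothetical.

```python
import math

def strips_for_full_turn(strip_height_px, fruit_radius_px):
    """Number of central strips needed to cover one full revolution of a
    spherical fruit, and the rotation (degrees) to apply between shots."""
    if not 0 < strip_height_px <= 2 * fruit_radius_px:
        raise ValueError("strip must fit inside the fruit diameter")
    # Angle of the fruit surface (radians) seen inside one strip.
    angle = 2.0 * math.asin(strip_height_px / (2.0 * fruit_radius_px))
    count = math.ceil(2.0 * math.pi / angle - 1e-9)  # small tolerance against rounding
    return count, math.degrees(2.0 * math.pi / count)

# Hypothetical values: a 64-pixel strip on a fruit with a 70-pixel radius.
count, step_deg = strips_for_full_turn(strip_height_px=64, fruit_radius_px=70)
print(count, round(step_deg, 1))
```

A rotation between shots larger than the returned step would leave parts of the peel unexamined, while a much smaller one would record the same skin in several strips.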
2.3 Analysis of the images
Different techniques are used to analyze the images, depending on the purpose of the
analysis and the time constraints. The images are pre-processed or not, according to
the segmentation technique used. Images may display some noise due to inappropriate
lighting, the presence of dust and dirt in the transport system, and electromagnetic
artefacts produced by the electrical devices installed close to the cameras, wires, and
computers. Low- and mid-pass filters are frequently applied to remove this type of
noise. Some researchers reduce the number of colors in the image to homogenize the
different regions and highlight the contours (Blasco et al., 2007a).
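As an illustration of this kind of pre-processing, the sketch below applies a 3 × 3 median filter with NumPy. The median filter is chosen here as one common smoothing filter; the text does not specify which filters the cited systems use, and the test image is synthetic.

```python
import numpy as np

def median_filter3(image):
    """3x3 median filter for a 2-D grayscale array (edges handled by replication)."""
    padded = np.pad(image, 1, mode="edge")
    h, w = image.shape
    # Stack the nine shifted copies of the image and take the median across them.
    windows = np.stack([padded[r:r + h, c:c + w] for r in range(3) for c in range(3)])
    return np.median(windows, axis=0)

image = np.full((5, 5), 100.0)
image[2, 2] = 255.0              # isolated bright pixel, e.g. a speck of dust
cleaned = median_filter3(image)
print(cleaned[2, 2])
```

The isolated pixel is replaced by the value of its surroundings, whereas a genuine blemish covering many pixels would survive the filter.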
Figure 10.4 Reconstruction of the skin of the fruit by joining the central strips together; the images are of
oranges (left and center) and a lemon (right).
Segmentation techniques reported in the literature can be classified as either pixel-oriented or region-oriented techniques. Basically, they differ in the information that
they use to perform the segmentation and the time they require to process the images.
Pixel-oriented techniques are based on the information provided by one single pixel,
without taking into account the elements that surround that pixel. They have been
widely used because of their simplicity and lower computational costs. The information
used to classify a pixel is normally based on color coordinates and is suitable for the
detection of clearly differentiated marks and the stem. A simple technique was proposed
by Cerruto et al. (1996), which segments blemishes in oranges using histograms of
the three components of the pixels in HSI color space. To estimate the maturity of
citrus, Ying et al. (2004) used a dynamic threshold in the blue component to segment
between fruit and background. They then used neural networks to distinguish between
mature and immature fruit. Neural networks were also employed by Recce et al. (1998)
to segment images by using the histogram in the red and green components of RGB
images. They aimed to classify fruit while taking into account the presence of damage,
with special attention being paid to the detection of the stem. The model employed by
these authors is based on back-propagation training of a multilayer perceptron with ten
inputs (features of the red and green histograms) and two outputs (no defect/defect, and
defect/stem). One of the major disadvantages of neural networks is the long time taken
to process the images, especially when the inspection is intended for real-time systems.
Aleixos et al. (1999) modified the linear technique developed by Harrell (1991) and
used a quadratic Bayesian classifier to classify pixels in images of oranges as belonging
Figure 10.5 Sequence of segmentation performed by a region growing technique. (a) Original image;
(b) homogeneity map; (c) original set of seeds; (d)–(g) growing region iteration; (h) final result of the
region growing; (i) segmentation after the region merging.
to the background, the stem, and damaged or sound skin. The success rate using this
technique was 98 percent.
Many pixel-oriented segmentation techniques require prior training. An operator
manually selects representative samples of pixels from all the regions into which the
fruit is going to be segmented. The process is repeated with different images in order
to obtain representative statistics to construct classification models based on Bayesian
theory. The great advantage of these methods is their versatility, which allows the
inspection system to work very fast with different region classes. However, one disadvantage is that re-training has to be repeated several times in order to cope with the
changes of color that citrus fruit display throughout the harvesting season.
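The training and classification steps described above can be sketched with a Gaussian (quadratic) Bayesian classifier. This is a minimal illustration, not the implementation of Aleixos et al. (1999): the class names, the RGB values of the synthetic training pixels, and the function names `fit_classes` and `classify` are all assumptions.

```python
import numpy as np

def fit_classes(samples):
    """samples: dict mapping class name -> (n, 3) array of training pixel colors."""
    model = {}
    total = sum(len(x) for x in samples.values())
    for name, x in samples.items():
        x = np.asarray(x, dtype=float)
        cov = np.cov(x, rowvar=False) + 1e-6 * np.eye(x.shape[1])  # small ridge for stability
        model[name] = (x.mean(axis=0), np.linalg.inv(cov),
                       np.linalg.slogdet(cov)[1], np.log(len(x) / total))
    return model

def classify(pixels, model):
    """Assign each pixel to the class with the highest Gaussian log-posterior;
    class-specific covariances give quadratic decision boundaries."""
    pixels = np.asarray(pixels, dtype=float)
    names = list(model)
    scores = []
    for name in names:
        mean, inv_cov, log_det, log_prior = model[name]
        d = pixels - mean
        maha = np.einsum("ij,jk,ik->i", d, inv_cov, d)  # squared Mahalanobis distance
        scores.append(-0.5 * (maha + log_det) + log_prior)
    return [names[i] for i in np.argmax(scores, axis=0)]

# Synthetic "manually selected" training pixels (hypothetical RGB values).
rng = np.random.default_rng(0)
training = {
    "sound": rng.normal([230, 140, 30], 10, size=(200, 3)),
    "damaged": rng.normal([110, 70, 40], 15, size=(200, 3)),
    "background": rng.normal([20, 20, 25], 5, size=(200, 3)),
}
model = fit_classes(training)
labels = classify([[225, 135, 35], [100, 75, 45], [18, 22, 24]], model)
print(labels)
```

Re-training for a new point in the season amounts to calling `fit_classes` again with freshly selected pixels; in a real-time system the per-pixel decision would typically be precomputed into a look-up table, which is a design choice not described in the text.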
Conversely, region-oriented segmentation techniques take into account the information supplied by each pixel and its neighbors, thus including data about the homogeneity
of the surroundings, the presence of borders, and other types of information. The main
Figure 10.6 Anomalies in the transport of the fruit: fruit that are in contact (a) or on top of one another
(b) are detected by analyzing changes in the X-direction (horizontal). Fruit travelling on the same rollers
(c) are detected by analyzing changes in the Y-direction (vertical).
advantage of these techniques is their robustness when dealing with changes of color;
their major drawback is high computing costs.
Because citrus fruits have a homogeneous skin color when they are ripe, grouping
techniques such as those based on iterative region-growing can be successfully applied
(Blasco et al., 2007a). The a priori knowledge of the homogeneity of the color of the
peel is used to detect the areas on the skin with more uniform colors, and to determine
the pixel seeds (Figure 10.5). A specific adaptation for detecting defects in citrus fruit,
and especially the smaller ones, consists of creating new seeds to allow new regions to
appear during the iterative process of region growth. In a second step, several criteria
may be used to join similar regions together. Carrión et al. (1999) segmented images of
oranges using a grouping technique based on color contrast between different regions
in order to discriminate sound and damaged peel.
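The basic idea of region growing can be sketched as follows. This is a simplified, single-seed version on a grayscale image, assuming 4-connectivity and a fixed tolerance; it omits the homogeneity map, the creation of new seeds, and the region-merging step of the algorithm by Blasco et al. (2007a).

```python
from collections import deque
import numpy as np

def grow_region(image, seed, tolerance):
    """Grow a region from `seed` (row, col): a 4-connected neighbor joins when its
    value is within `tolerance` of the running mean of the region."""
    h, w = image.shape
    mask = np.zeros((h, w), dtype=bool)
    mask[seed] = True
    total, count = float(image[seed]), 1
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < h and 0 <= nc < w and not mask[nr, nc]:
                if abs(float(image[nr, nc]) - total / count) <= tolerance:
                    mask[nr, nc] = True
                    total += float(image[nr, nc])
                    count += 1
                    queue.append((nr, nc))
    return mask

peel = np.full((6, 6), 200.0)
peel[2:4, 2:4] = 90.0                                   # a dark 2x2 blemish
sound = grow_region(peel, seed=(0, 0), tolerance=20)    # seed on sound peel
blemish = grow_region(peel, seed=(2, 2), tolerance=20)  # extra seed inside the blemish
print(int(sound.sum()), int(blemish.sum()))
```

The second call mirrors the idea of adding new seeds: a small defect that the sound-peel region does not absorb becomes a region of its own.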
As a result of the segmentation process, each pixel is classified as belonging to a
region of interest. In images of citrus fruits, these may be regions corresponding to
different types of sound peel (which may be green, yellow, light, dark, etc.), stalk,
calyx, and areas with different types of damage.
Because several pieces of fruit may be present in the image, it is necessary to
distinguish pixels that belong to different specimens. Many errors result from the
contact of large pieces of fruit (Figure 10.6a), which may lead the system to consider
two pieces of fruit as being a single object and thus cause a mistaken estimation of the
size. Errors are also produced by fruit being located in the wrong place – for example,
one on top of two others (Figure 10.6b) – and by the presence of more than one fruit on
a single roller (Figure 10.6c). A system for on-line inspection of the quality of citrus
fruits must be capable of detecting these errors because the results will otherwise be
incorrect (Aleixos et al., 2002).
Prior knowledge of the shape of the fruit (for instance, oranges are almost spherical, mandarins and grapefruit are more flattened at the poles, and lemons have a
characteristic shape) helps in this task. These shapes are predominantly convex, yet
when errors occur during transport, the objects detected as being several fruit joined
together display a number of concavities. For example, identification can be assisted
by searching for sudden changes in the direction taken by the contour of the object
(Yu, 2003).
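One simple way to quantify concavities is the ratio between the area of the object and the area of its convex hull (often called solidity). The sketch below uses this measure as an illustrative alternative to tracking changes in contour direction; it is not the method of Yu (2003), and the two polygons are hypothetical contours.

```python
def polygon_area(points):
    """Shoelace formula for a closed polygon given as a list of (x, y) vertices."""
    return abs(sum(x1 * y2 - x2 * y1
                   for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]))) / 2.0

def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))

    def half(seq):
        out = []
        for p in seq:
            # Pop the last vertex while it does not make a counter-clockwise turn.
            while len(out) >= 2 and ((out[-1][0] - out[-2][0]) * (p[1] - out[-2][1])
                                     - (out[-1][1] - out[-2][1]) * (p[0] - out[-2][0])) <= 0:
                out.pop()
            out.append(p)
        return out[:-1]

    return half(pts) + half(reversed(pts))

def solidity(contour):
    """Object area divided by convex-hull area: 1.0 for a convex shape."""
    return polygon_area(contour) / polygon_area(convex_hull(contour))

single = [(0, 0), (3, 0), (3, 3), (0, 3)]                      # convex outline
joined = [(0, 0), (2, 0), (2, 1), (3, 1), (3, 0), (5, 0),      # two blobs linked
          (5, 3), (3, 3), (3, 2), (2, 2), (2, 3), (0, 3)]      # by a narrow bridge
print(solidity(single), round(solidity(joined), 3))
```

An object whose solidity falls clearly below 1 is a candidate for two or more fruit in contact; the threshold would have to be tuned for each variety, since lemons are less convex than oranges.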
Once the fruit has been segmented and the contour determined, it is analyzed in
order to estimate the size. A system in which the fruit is not properly oriented cannot
determine where the equatorial region of each piece of fruit is. A criterion must therefore
be established for calculating the diameter. This can be taken as being the maximum or
average diameter of the contour of the fruit, which means that a number of diameters
must be calculated in order to determine which is the longest or the average. It can also
be the diameter that is calculated by the main axis of inertia, which is more important
for mandarins and lemons, owing to their more irregular shape, than for oranges.
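The criteria mentioned above can be compared on a synthetic contour. This is a sketch under assumptions: the contour is an ellipse in pixel units, and the principal axes are computed from the covariance of the contour points, which only approximates the axes of inertia of the filled region.

```python
import numpy as np

def diameters(contour):
    """Return (maximum diameter, extent along the main principal axis,
    extent along the perpendicular axis) for an (n, 2) array of contour points."""
    pts = np.asarray(contour, dtype=float)
    diffs = pts[:, None, :] - pts[None, :, :]
    max_diameter = np.sqrt((diffs ** 2).sum(axis=2)).max()  # longest point-to-point distance
    centered = pts - pts.mean(axis=0)
    # Eigenvectors of the covariance matrix give the principal axes of the point set;
    # eigh returns them in order of increasing eigenvalue (minor axis first).
    _, vectors = np.linalg.eigh(np.cov(centered, rowvar=False))
    projected = centered @ vectors
    extents = projected.max(axis=0) - projected.min(axis=0)
    return max_diameter, extents[1], extents[0]

angles = np.linspace(0.0, 2.0 * np.pi, 360, endpoint=False)
lemon_like = np.column_stack([40.0 * np.cos(angles), 30.0 * np.sin(angles)])
max_d, major, minor = diameters(lemon_like)
print(round(max_d, 1), round(major, 1), round(minor, 1))
```

For an elongated fruit the maximum diameter and the extent across the main axis differ considerably, which is why the choice of criterion matters more for lemons and mandarins than for nearly spherical oranges.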
Unlike in apples, the presence of a stalk represents a lack of quality in citrus due to the
fact that a long stalk can cause damage to other fruit during storage. Plá and Juste (1995)
developed a thinning algorithm to detect the presence of stalks in citrus fruits. Ruiz
et al. (1996) proposed another approach and analyzed the curvature of the perimeter
of the profiles of oranges for the same purpose.
Damage to citrus fruits can be of different shapes, colors, textures, and sizes. Generally,
the color of blemishes and marks differs from the characteristic orange color
of oranges and mandarins, and from the yellow color of lemons. This was the underlying principle used by Blasco et al. (2007a) to obtain 95 percent correct detection
of blemishes in oranges and mandarins by segmenting the images with discriminant
Neural networks were used by Fierro et al. (2004) to classify citrus according to visual
characteristics such as size, color and external defects, and by Miller and Drouillard
(2001) to classify oranges, grapefruit, and tangerines. They compared several models
of neural networks with a success rate of over 98 percent in discriminating fruit by the
presence of blemishes.
2.3.1 Identification of damage
Following detection of a blemish, the next step is its identification. It is essential to
know whether a blemish only affects the appearance of the fruit, or whether it can
progress to the point where it damages the whole piece or contaminates other fruit. The
identification of damage can be used to determine the final destination of the fruit, and
this will maximize the commercial benefits. Many types of damage consistently have
a similar aspect in different fruits, such as scales, fly bites, rotting, and certain forms
of damage caused during harvesting. In these cases, damaged areas can be identified
by analyzing their contour. The fast Fourier transform is one of the most commonly
used methods for determining the shape of objects, and has sometimes been employed
in the case of agro-food products. Tao et al. (1995) and Blasco and Moltó (2001) used
Fourier descriptors of the perimeter from the Polar signature to discriminate between
different types of damage to fruits and vegetables. Blasco et al. (2007b) employed
different color coordinates to distinguish one type of damage from another.
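Fourier descriptors of a polar signature can be computed as in the following sketch. It assumes a star-shaped contour (every ray from the centroid crosses the boundary once); the number of samples, the number of descriptors, and the normalization by the first coefficient are illustrative choices, not those of the cited works.

```python
import numpy as np

def fourier_descriptors(contour, n_samples=64, n_descriptors=8):
    """Scale- and rotation-invariant descriptors from the polar signature r(theta)."""
    pts = np.asarray(contour, dtype=float)
    centered = pts - pts.mean(axis=0)
    theta = np.arctan2(centered[:, 1], centered[:, 0])
    radius = np.hypot(centered[:, 0], centered[:, 1])
    order = np.argsort(theta)
    # Resample the signature on a regular angular grid (periodic interpolation).
    grid = np.linspace(-np.pi, np.pi, n_samples, endpoint=False)
    signature = np.interp(grid, theta[order], radius[order], period=2.0 * np.pi)
    spectrum = np.abs(np.fft.fft(signature))            # magnitudes ignore the starting angle
    return spectrum[1:n_descriptors + 1] / spectrum[0]  # divide by the mean term for scale

angles = np.linspace(0.0, 2.0 * np.pi, 360, endpoint=False)
circle = np.column_stack([10.0 * np.cos(angles), 10.0 * np.sin(angles)])
ellipse = np.column_stack([40.0 * np.cos(angles), 30.0 * np.sin(angles)])
circle_fd = fourier_descriptors(circle)
ellipse_fd = fourier_descriptors(ellipse)
print(np.round(ellipse_fd, 3))
```

A round mark yields descriptors close to zero, while an elongated one concentrates its energy in the second harmonic; a classifier can use such vectors to separate damage types with characteristic outlines.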
3 Quality inspection in the non-visible spectrum
Unlike the human eye, machine vision inspection is not limited to the visible (400- to
700-nm) region of the electromagnetic spectrum. The sensitivity of cameras based on
a charge-coupled device (CCD) allows inspection in the range between approximately
400 and 1000 nm. Sound peel and different types of blemishes display distinct spectral
responses (Gaffney, 1973). Fujita and Tono (1985) demonstrated that some external
damage to citrus fruits could be detected within the range of 265–325 nm using ultraviolet spectrometry. Blasco et al. (2007b) proposed a multispectral system based on
four inspection imaging systems (visible, NIR, UV, and fluorescence) to identify the
cause of 11 different blemishes, caused by damage, in oranges and mandarins.
Nowadays, commercial citrus inspection systems are often equipped with two different types of camera – one sensitive to visible light and the other capable of detecting
infrared radiation. While the camera sensitive to visible light is used to estimate the
color of the fruit and to detect the presence of damage, the second camera is employed
to distinguish the shape of the fruit from the background of the image and thereby
to determine the size of the fruit.
Many packing lines use the UV-induced fluorescence of essential oils as a means
to detect invisible damage and some fungal infections. This task is currently performed manually, but operators must wear protective glasses and gloves because
prolonged exposure to ultraviolet radiation can be dangerous for their skin. Nevertheless, automatic inspection in these chambers is expected to become a reality in the
near future.
3.1 Hyperspectral vision
Use of interferometric filters coupled to cameras is a well-known technique for increasing the contrast of specific blemishes in many kinds of fruit (Mehl et al., 2004).
However, these methods are associated with a number of difficulties because it is necessary to change the lens filter on the camera to obtain images at different wavelengths.
A step forward in this regard is the use of hyperspectral imaging systems such as those
employed by Mehl et al. (2002), Ariana et al. (2006), and Gómez et al. (2006). These
systems are equipped with tunable filters, allowing the wavelengths to be changed
using software. Consequently, it becomes possible to acquire images of the same scene
at different wavelengths. Figure 10.7 shows a sequence of images acquired every 10 nm
in the visible spectrum at wavelengths between 410 and 660 nm.
The excessive amount of time required to acquire the images and the large amount
of data generated are the two major drawbacks that currently make these methods
unsuitable for on-line inspection systems. Nevertheless, they do offer information that
can be extremely useful when designing such systems.
Gómez et al. (2006) selected the most suitable wavelength for detecting infection
by Penicillium digitatum in mandarins. The vast amount of information generated
from images was reduced by using statistical techniques such as principal components
analysis (PCA) and partial least squares (PLS).
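The data reduction step can be illustrated with a principal components analysis of a hyperspectral cube. This is a minimal sketch on synthetic data, assuming 26 bands (one every 10 nm from 410 to 660 nm) and a cube layout of rows × columns × bands; it does not reproduce the analysis of Gómez et al. (2006).

```python
import numpy as np

def pca_scores(cube, n_components=3):
    """cube: (rows, cols, bands) hyperspectral image. Returns the score images
    (rows, cols, n_components) and the fraction of variance each component explains."""
    rows, cols, bands = cube.shape
    spectra = cube.reshape(-1, bands).astype(float)   # one spectrum per pixel
    centered = spectra - spectra.mean(axis=0)
    # SVD of the centered pixel-by-band matrix; rows of vt are the principal directions.
    _, singular, vt = np.linalg.svd(centered, full_matrices=False)
    explained = singular ** 2 / np.sum(singular ** 2)
    scores = centered @ vt[:n_components].T
    return scores.reshape(rows, cols, n_components), explained[:n_components]

rng = np.random.default_rng(1)
base = rng.random((20, 20, 1))                        # one underlying factor per pixel
cube = base * np.linspace(0.5, 1.5, 26) + 0.01 * rng.normal(size=(20, 20, 26))
score_images, explained = pca_scores(cube)
print(score_images.shape, np.round(explained, 4))
```

Here a single component captures almost all of the variance, so the 26 bands collapse into one informative score image; with real fruit, the loadings in `vt` indicate which wavelengths contribute most and can guide the selection of a few bands for an on-line system.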
Figure 10.7 Images of the same fruit illuminated with white light but acquired at different wavelengths (panels labeled from 700 nm down to 420 nm in 10-nm steps).
The image in the upper left corner was acquired using a standard B/W camera.
4 Internal quality inspection
Internal quality inspection of citrus has traditionally been performed statistically by
means of destructive testing of samples. The main drawbacks are that the sampled
fruit is destroyed, and that it is impossible to guarantee the quality on an individual
basis. In order to solve these problems, machine vision has also been employed to
obtain internal features. Kondo and colleagues (Kondo, 1995; Kondo et al., 1995,
2000) introduced morphological features of oranges (such as shape, texture, and size),
which were calculated using image analysis, as inputs in a neural network to predict
the sugar content and the pH of the fruit. They concluded that it was possible to obtain
these internal features from the appearance of the fruits. However, magnetic resonance
imaging (MRI) is now being studied as a tool to obtain internal images of the fruit to
assess its quality.
MRI is based on the concept that protons are positively charged and their movement
induces a magnetic field, which makes them behave like little magnets. In the absence
of an external magnetic field, the protons in a tissue sample are oriented randomly in
all directions. When exposed to the influence of an external magnetic field, such as that
produced by a powerful magnet, they then align themselves with the magnetic field.
The frequency at which the protons precess about the direction of the field is called the precession frequency.
Figure 10.8 MR images of a mandarin with seeds obtained using different pulse sequences: (a) spin-echo
TE = 18 ms and TR = 1500 ms; (b) spin-echo TE = 80 ms and TR = 1500 ms; (c) spin-echo TE = 120 ms and
TR = 1500 ms; (d) gradient-echo TE = 90 ms and TR = 14 ms.
Figure 10.9 (a) Magnetic resonance imaging of fruit; (b) application of segmentation and (c) contour
analysis algorithms; (d) the fruit after being opened.
If we emit a short pulse of radio waves (RF) with the same frequency as the precession
frequency, the protons absorb energy from that pulse. When the RF pulse stops, the
protons release this energy and return to their original orientation. The response of
protons in different tissues varies according to their molecular structure. By combining
magnetic field gradients in an appropriate way, gray-level two-dimensional images of
thin slices of fruit can be obtained. Specific sequences of pulses may increase the
contrast of the response of fatty tissues (like those in the seeds) and those that have a
greater water content (like those in the pulp), or air inside the fruit (Figure 10.8). Clark
et al. (1999) utilized MRI to study the changes produced in the tissues of mandarins
while they are growing. Moltó and Blasco (2000) and Blasco et al. (2002) used the
same technique to detect seeds in static images of mandarins (Figure 10.9).
In these images, seeds contrast very well with the pulp of the fruit and the images can
therefore be segmented using simple thresholds. The problem that arises when detecting
seeds in mandarins in movement, as is the case on a packing line, was successfully
solved by Hernández et al. (2005). In addition, MRI has been applied to discover
forms of internal damage, such as those caused by freezing (Gambhir et al., 2004,
2005; Hernández et al., 2004). Today MRI equipment is still far too expensive, but
application of this technology in agro-food inspection has a very promising future.
Other techniques that can possibly be used to inspect the internal quality of citrus
fruits, but have so far received little attention, are X-rays or computerized axial tomography (CAT) scanning. In the images obtained by CAT scanning (Figure 10.10) the seeds
can be seen, although the contrast with the neighboring area is low and it is therefore
difficult to detect them using pixel-oriented algorithms and algorithms based on searching for contrast differences. The images obtained by means of X-rays (Figure 10.11)
Figure 10.10 Image of the inside of a mandarin obtained using computerized axial tomography (CAT).
Figure 10.11 X-ray image of mandarins with seeds.
represent projections of the whole volume of the mandarin in one plane. As seeds have
a similar density to that of pulp, they are masked and cannot be distinguished.
5 Inspection of clementine and satsuma segments
Apart from being sold as fresh items, citrus (and especially mandarins) are also
marketed as tinned fruit. The industrial process involves peeling the mandarins and
Figure 10.12 Inspection of satsuma segments.
separating the segments automatically, and the segments must then be inspected to
prevent any seeds, pieces of peel or broken segments from being canned. Nevertheless,
the inspection process has still not been automated in the industry. This is partly due to
the difficulty involved in individualizing and separating the product automatically, and
also to the lower specific weight of this industry with respect to the fresh fruit business.
Compared with whole-fruit inspection, the main characteristics that differentiate the analysis of these images for automatic inspection purposes are that segments
are always wet, and many objects of interest appear in each image. Moreover, handling
of segments in the line is far more difficult because they have a very soft structure and
thus break very easily.
Automatic inspection systems generally attempt to analyze certain external parameters, such as the size, shape, and the presence of seeds. Backlighting is a usual means to
detect the contour of the segments. This illumination is achieved by applying a uniform
light with the object located between the light source and the camera. By illuminating
the object from the rear, the background is intensely and uniformly lit, while
the object itself appears much darker. This results in a high-contrast image of a dark object
on a light background. The seeds, which are more opaque than the pulp, will also stand
out more sharply against the background.
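The backlighting principle described above can be illustrated with a minimal sketch. It assumes an 8-bit grayscale image stored as rows of integers; the two thresholds (200 for the lit background, 60 for opaque seeds) are hypothetical values chosen for the example and are not taken from the cited studies.

```python
def segment_backlit(gray, background_min=200, seed_max=60):
    """Label each pixel of a backlit grayscale image (rows of 0-255 ints).

    Labels: 0 = bright background, 1 = translucent pulp, 2 = opaque seed.
    Both thresholds are hypothetical and would need calibration for a
    real light source and camera.
    """
    labels = []
    for row in gray:
        out = []
        for value in row:
            if value >= background_min:
                out.append(0)          # light passes unobstructed
            elif value <= seed_max:
                out.append(2)          # almost no light passes: seed
            else:
                out.append(1)          # partially transmitted light: pulp
        labels.append(out)
    return labels


def count_pixels(labels, wanted):
    """Count pixels whose label belongs to the set `wanted`."""
    return sum(1 for row in labels for lab in row if lab in wanted)


# Toy 5 x 5 image: bright background, a 3 x 3 segment, one dark seed pixel.
image = [
    [250, 250, 250, 250, 250],
    [250, 120, 110, 125, 250],
    [250, 115,  40, 118, 250],
    [250, 122, 119, 121, 250],
    [250, 250, 250, 250, 250],
]
labels = segment_backlit(image)
segment_area = count_pixels(labels, {1, 2})   # pulp plus seed
seed_area = count_pixels(labels, {2})
```

The contour of the segment is then simply the boundary of the non-background region, which is why backlighting simplifies the subsequent shape analysis.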
Simultaneous analysis of several morphological parameters can determine the shape
of irregular objects and distinguish objects with different shapes and areas. Tomás et al.
(1994) and Torres et al. (1994) used a combination of the area, the perimeter, the center
of mass, and the moments of inertia to detect broken segments and other defects in
satsuma segments. Blasco et al. (2007c) used the fast Fourier transform (FFT) of
the perimeter signature, as well as morphological parameters, to estimate the shape
of the segment and to detect broken segments. Galindo et al. (1997) used histogram
analysis and calculation of a number of morphological characteristics to detect the
presence of seeds in satsuma segments. Blasco et al. (2006)
employed a double segmentation procedure on RGB images. First, they distinguished
the object from the background by applying a threshold in the B band, and then they used
the R band to differentiate between segment and seed (Figure 10.12). In this work they
also analyzed the shape of the segment using a combination of morphological features.
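The morphological parameters mentioned in this section (area, perimeter, center of mass, and moments of inertia) can be computed directly from a binary mask. The following is a minimal sketch rather than the method of any of the cited authors: the function name is hypothetical, the perimeter is approximated by counting boundary pixels, and the moments are normalized central second moments.

```python
def shape_features(mask):
    """Compute simple morphological features of a binary mask (rows of 0/1).

    Returns a dict with the area (pixel count), the center of mass, a
    boundary-pixel count as a crude perimeter estimate, and the normalized
    central second moments mu20, mu02 and mu11.
    """
    pixels = [(x, y) for y, row in enumerate(mask)
              for x, v in enumerate(row) if v]
    area = len(pixels)
    if area == 0:
        raise ValueError("mask contains no object pixels")
    cx = sum(x for x, _ in pixels) / area
    cy = sum(y for _, y in pixels) / area
    mu20 = sum((x - cx) ** 2 for x, _ in pixels) / area
    mu02 = sum((y - cy) ** 2 for _, y in pixels) / area
    mu11 = sum((x - cx) * (y - cy) for x, y in pixels) / area

    height, width = len(mask), len(mask[0])

    def is_set(x, y):
        return 0 <= x < width and 0 <= y < height and bool(mask[y][x])

    # A boundary pixel has at least one 4-neighbor outside the object.
    neighbors = ((1, 0), (-1, 0), (0, 1), (0, -1))
    perimeter = sum(
        1 for x, y in pixels
        if not all(is_set(x + dx, y + dy) for dx, dy in neighbors)
    )
    return {"area": area, "centroid": (cx, cy), "perimeter": perimeter,
            "mu20": mu20, "mu02": mu02, "mu11": mu11}


# Toy mask: a 4 x 2 rectangle of object pixels.
mask = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
features = shape_features(mask)
```

A broken segment would typically show a smaller area and altered second moments compared with whole segments, which is why such features are combined for defect detection.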
6 Conclusions
Citrus fruits are the primary fruit crop in international trade in terms of value. There
are two clearly differentiated markets in the citrus sector: the fresh citrus fruits market,
with a predominance of oranges, and the processed citrus products market, consisting
mainly of orange juice. For these reasons, a lot of effort has been put into automating
the inspection of citrus quality. Machines are capable of sorting fruit by size and
weight, but consumers place greater value on quality attributes such as the appearance of the
fruit, its taste (which is related to the sugar-to-acid ratio), and its high vitamin content.
Furthermore, packing houses need to detect rotten fruit in order to avoid the spread
of fungal infections and blemishes that can be a focus of future infections, causing
undesired losses of quality.
In the last 20 years, research and development on machine vision inspection has
focused on the measurement of size and shape, color sorting, and detection
of blemishes, with great success. Current commercial machines are able
to achieve these objectives at an adequate processing speed of close to 15 fruit/s
per line. The next step is to identify the defects and sort the
fruit according to their potential negative effects. This objective has been partially achieved
in commercial machines, but more effort is required to improve their precision and
processing speed. The possibility of illuminating and acquiring scenes with new
light sources outside the visible spectrum has offered new methods to reach this objective.
Internal quality of fruit is also essential for consumers, but this cannot be assessed
with current machines. Efforts have been directed towards the estimation of sugar and
acid content, detection of seeds, internal physiological disorders, fungal infections,
and presence of insects, but current machines are unable to achieve this objective online and thus results are still at the laboratory level. It is very difficult to use NIR
spectroscopy, which has been successfully applied for the estimation of sugar and
acid contents in other fruit, in citrus fruits due to the thickness of their skin. The
reduction in the price of novel technologies that have been used in medicine, such as
magnetic resonance spectrometry and imaging, low-power X-rays, and computerized
axial tomography, opens the door to obtaining non-invasive internal images of the fruit
that can be used for quality inspection, but these technologies are still far from being
applied in real-time systems. The on-line assessment of internal quality in citrus is still
a major challenge for researchers.
References
Affeldt HA, Heck RD (1995) Illumination methods for automated produce inspection:
design considerations. Applied Engineering in Agriculture, 19 (6), 871–880.
Aleixos N, Blasco J, Moltó E (1998) Design of a vision system for real-time inspection of
oranges. In Proceedings of the 14th International Conference on Pattern Recognition
and Image Analysis, Barcelona, Spain, Vol. I, pp. 387–394.
Aleixos N, Blasco J, Moltó E (1999) Design of a vision system for real-time inspection
of oranges. In Proceedings of the VIIIth National Symposium on Pattern Recognition
and Image Analysis, Bilbao, Spain, pp. 387–394.
Aleixos N, Blasco J, Navarrón F, Moltó E (2002) Multispectral inspection of citrus in real
time using machine vision and digital signal processors. Computers and Electronics
in Agriculture, 33 (2), 121–137.
Ariana DP, Lu R, Guyer DE (2006) Near-infrared hyperspectral reflectance imaging for
detection of bruises on pickling cucumbers. Computers and Electronics in Agriculture,
53, 60–70.
Biale JB (1964) Growth, maturation, and senescence in fruits. Science, 146 (3646),
Blasco J, Moltó E (2001) Citrus peel defect classification using a Fourier-based technique.
In Proceedings of the IXth Spanish Symposium on Pattern Recognition and Image
Analysis, Benicassim, Spain, Vol. I, pp. 417–422.
Blasco J, Moltó E, Alamar MC (2002) Detection of seeds in mandarins using magnetic resonance imaging. In Proceedings of the 6th International Conference on Applications
of Magnetic Resonance in Food Science, Paris, France, p. 47.
Blasco J, Cubero S, Arias R, Juste F, Moltó E (2006) On-line quality grading of mandarin segments by computer vision. In XVIth CIGR World Congress: Agricultural
Engineering for a Better World, Book of Abstracts, Bonn, Germany, pp. 623–624.
Blasco J, Aleixos N, Moltó E (2007a) Computer vision detection of peel defects in citrus
by means of a region oriented segmentation algorithm. Journal of Food Engineering,
81 (3), 535–543.
Blasco J, Aleixos N, Gómez J, Moltó E (2007b) Citrus sorting by identification of the most
common defects using multispectral computer vision. Journal of Food Engineering, 83
(3) 384–393 (available on-line at: http://dx.doi.org/10.1016/j.jfoodeng.2007.03.027).
Blasco J, Cubero S, Arias R, Gómez J, Juste F, Moltó E (2007c) Development of a computer
vision system for the automatic quality grading of mandarin segments. Lecture Notes
in Computer Science, 4478, 460–466.
Campbell BL, Nelson RG, Ebel RC, Dozier WA, Adrian JL, Hockema BR (2004)
Fruit quality characteristics that affect consumer preferences for satsuma mandarins.
Hortscience, 39 (7), 1664–1669.
Carrión J, Torregrosa A, Ortí E, Moltó E (1998) First results of an automatic citrus sorting machine based on an unsupervised vision system. International Conference on
Agricultural Engineering, AgEng 98, Oslo, Norway, EurAgEng Paper No. 98F-019.
Carrión J, Steinmetz V, Moltó E (1999) An adaptative, unsupervised image analysis
algorithm for external inspection of oranges. In Proceedings of the VIIIth National
Symposium on Pattern Recognition and Image Analysis, Bilbao, Spain, Vol. II, p. 57.
Cerruto E, Failla S, Schillaci G (1996) Identification of blemishes on oranges. International Conference on Agricultural Engineering, AgEng 96, Madrid, Spain, EurAgEng
Paper No. 96F-017.
Clark CJ, Richardson AC, Marsh KB (1999) Quantitative magnetic resonance imaging of
satsuma mandarin fruit during growth. Hortscience, 34 (6), 1071–1075.
European and Mediterranean Plant Protection Organization (EPPO) (2004). Citrus.
EPPO/OEPP Bulletin, 34 (1), 43–56.
FAOSTAT (2006). URL: http://faostat.fao.org/ (accessed February 2006).
Fierro LP, Velasco MP, Velasco CZ, Leon-Tellez J (2004) Digital image analysis in the design
and implementation of a visual inspection system for fruit grading. In Proceedings of
the Society of Photo-optical Instrumentation Engineers (SPIE), 5622, ICO Regional
Meeting, pp. 91–96.
Fujita S, Tono T (1985) The relationship between browning due to rind-oil spot of citrus
fruit peel and the absorption spectra of its extract. Journal of the Japanese Society for
Horticultural Science, 54 (1), 109–115.
Gaffney JJ (1973) Reflectance properties of citrus fruit. Transactions of the ASAE, 16 (2),
Galindo M, López JA, Contreras LA, Tomás LM (1997) Defects modelling through artificial
vision techniques, applied to satsuma and tangerine slices quality control. In Robotics
and Automated Machinery for Bio-productions, BIO-ROBOTICS 97, Gandía, Spain,
pp. 89–94.
Gambhir PN, Choi YJ, McCarthy MJ (2004) Development of rapid and non-invasive nuclear
magnetic resonance method for identifying freeze damaged citrus fruits. 2004 IFT
Annual Meeting, Las Vegas, USA.
Gambhir PN, Choi YJ, Slaughter DC, Thompson JF, McCarthy MJ (2005) Proton
spin–spin relaxation time of peel and flesh of navel orange varieties exposed to freezing
temperature. Journal of the Science of Food and Agriculture, 85, 2482–2486.
Gómez J, Blasco J, Aleixos N, Juste F, Moltó E (2006) Hyperspectral computer vision
system for early detection of Penicillium digitatum in citrus fruits. XVIth CIGR World
Congress: Agricultural Engineering for a Better World, Book of Abstracts, Bonn,
Germany, pp. 241–242.
Harrel RC (1991) Processing of color images with Bayesian discriminate analysis. In 1st
International Seminar on Use of Machine Vision Systems for the Agricultural and
Bio-Industries, Montpellier, France, pp. 11–20.
Hernandez N, Barreiro P, Ruiz-Altisent M, Ruiz-Cabello J, Fernandez-Valle ME (2004)
Detection of freeze injury in oranges by magnetic resonance imaging of moving
samples. Applied Magnetic Resonance, 26 (3), 431–445.
Hernandez N, Barreiro P, Ruiz-Altisent M, Ruiz-Cabello J, Fernandez-Valle ME (2005)
Detection of seeds in citrus using MRI under motion conditions and improvement with
motion correction. Concepts in Magnetic Resonance: Part B – Magnetic Resonance
Engineering, 26B (1), 81–92.
Klinker GJ, Shafer SA, Kanade T (1987) Using a color reflection model to separate
highlights from object color. In Proceedings of the 1st International Conference on
Computer Vision, ICCV87, London, UK, pp. 145–150.
Kondo N (1995) Quality evaluation of orange fruit using neural networks. In Food
Processing Automation IV, Chicago, USA, pp. 95–101.
Kondo N, Murase H, Monta M, Shibano Y, Mohri K (1995). Study on quality evaluation of
orange fruit using image processing. In IFAC International Federation of Automatic
Control Preprints, pp. 93–96.
Kondo N, Ahmad U, Monta M, Murase H (2000) Machine vision based quality evaluation of
Iyokan orange fruit using neural networks. Computers and Electronics in Agriculture,
29 (1–2), 135–147.
Mehl PM, Chao K, Kim MS, Chen YR (2002) Detection of defects on selected apple
cultivars using hyperspectral and multispectral image analysis. Transactions of the
ASAE, 18 (2), 219–226.
Mehl PM, Chen YR, Kim MS, Chan DE (2004) Development of hyperspectral imaging
technique for the detection of apple surface defects and contaminations. Journal of
Food Engineering, 61, 67–81.
Miller WM, Drouillard GP (2001) Multiple feature analysis for machine vision grading of
Florida citrus. Applied Engineering in Agriculture, 17 (5), 627–633.
Moltó E, Blasco J (2000) Detection of seeds in mandarins using magnetic resonance imaging. International Society of Citriculture Congress 2000, Orlando, Florida, USA,
Paper No. 403.
Moltó E, Pla F, Juste F (1992) Vision systems for the location of citrus fruit in a tree canopy.
Journal of Agricultural Engineering Research, 52, 101–110.
Moltó E, Aleixos N, Blasco J, Navarrón F (2000) Low-cost, real-time inspection of oranges
using machine vision. In Agricontrol 2000. International Conference on Modelling
and Control in Agriculture, Horticulture and Post-harvested Processing, Wageningen,
The Netherlands, pp. 309–314.
Obenland D, Neipp P (2005) Chlorophyll fluorescence imaging allows early detection and
localization of lemon rind injury following hot water treatment. Hortscience, 40 (6),
Plá F, Juste F (1995) A thinning algorithm to characterize fruit stems from profile images.
Computers and Electronics in Agriculture, 13, 301–314.
Plá F, Juste F, Ferri FJ (1993) Feature extraction of spherical objects in image analysis.
An application to citrus picking robot. Computers and Electronics in Agriculture, 8,
Recce M, Plebe A, Taylor J, Tropiano G (1998) Video grading of oranges in real time.
Artificial Intelligence Review, 12, 117–136.
Ruiz LA, Moltó E, Juste F, Pla F, Valiente R (1996) Location and characterization of the
stem–calyx area on oranges by computer vision. Journal of Agricultural Engineering
Research, 64, 165–172.
Slaughter DC, Harrell RC (1989) Discriminating fruit for robotic harvest using color in
natural outdoor scenes. Transactions of the ASAE, 32 (2), 757–763.
Tao Y, Morrow CT, Heinemann PH, Sommer HJ (1995) Fourier-based separation technique
for shape grading of potatoes using machine vision. Transactions of the ASAE, 38 (3),
Tomás LM, Torres R, López JA, Doménech G (1994) Color image processing and artificial
vision techniques, used for detection, segmentation and identification of satsuma
slices. In Third International Conference on Automation, Robotics and Computer
Vision, Vol. III, pp. 1955–1959.
Torres R, Tomás LM, López JA, Doménech G (1994) Automatic vision inspection system
for the analysis and detection of breakages and defects of satsuma slices. In Proceeding
of the International Conference on Systems, Man and Cybernetics, San Antonio, USA,
pp. 853–858.
United Nations Economic Commission for Europe (UNECE) (2004) UNECE Standard
FFV-14. URL: http://www.unece.org (accessed March 2006).
Uozumi JL, Kawano S, Iwamoto M, Nishinari K (1987) Spectrophotometric system for
quality evaluation of unevenly colored fruit. Journal of Japan Society of Food Science
and Technology, 34, 163–170.
Ying YB, Xu ZG, Fu XP, Liu YD (2004) Non-invasive maturity detection of citrus with computer vision. In Proceedings of the Society of Photo-optical Instrumentation Engineers
(SPIE), 5271, Monitoring Food Safety, Agriculture, and Plant Health, pp. 97–107.
Yu B (2003) Recognition of freehand sketches using mean shift. In Proceedings of the ACM
(IUI’03), Miami, Florida, USA, pp. 204–210.
Quality Evaluation of Strawberries
Masateru Nagata and Jasper G. Tallada
Faculty of Agriculture, University of Miyazaki, Miyazaki,
889-2192, Japan
1 Introduction
1.1 Overview of production
The strawberry (Fragaria x ananassa Duch.) is well known for its attractive appearance,
unique flavor, and nutritional content (Hancock, 1999). It has achieved widespread
popularity, being included in the regular diets of millions of people as a fresh table
fruit, a garnish for salads and cakes, a filling when processed into jams and pastes, a
unique flavoring for ice creams, milk shakes, candies and juices, and in many other
preparations. The world production of strawberries (Table 11.1) is led by the United
States, followed by Spain and Japan. While the demand for the fruit in Japan is generally
stable at more than 200 000 tonnes annually, the production costs are expected to
increase due to aging farmworkers and a declining interest in agriculture (MAFF, 2004).
This is supported by some estimates of the labor requirement in Japan for Toyonoka
and Nyoho varieties that reveal substantially high rates, which may range from 2000 to
2500 hours for a 0.1-ha farm area (Fushihara, 1998). About 60–65 percent (or 1200–
1500 h) of the labor is devoted to harvesting, sorting, and packing operations (Bato,
2000). These operations are tedious and highly repetitive, and must be done within the
short harvest season. Inasmuch as the strawberries are normally harvested fully ripe to
maximize sweetness and flavor, extra care must be taken to avoid mechanical damage,
which adds further complications to their processing.
1.2 Necessity for quality measurement in Japan
Japan produces strawberries of several varieties, mainly under a forcing culture (either
annual hills or bench-culture methods) according to the region in which they are popularly grown (Figure 11.1). Some of these varieties are Reiko, Toyonoka, Nyoho,
Tochitome, Akihime, and Sachinoka (Bato, 2000). Although most strawberries have
common general characteristics, each variety is graded according to the ripeness, size,
Computer Vision Technology for Food Quality Evaluation
ISBN: 978-0-12-373642-0
Copyright © 2008, Elsevier Inc.
All rights reserved
Table 11.1 Strawberry production in metric tons of selected top producing countries.
United States of America
Korea Republic of
Russian Federation
United Kingdom
Serbia and Montenegro
862 828
343 105
205 300
180 241
171 314
195 661
141 130
128 000
130 000
104 276
105 000
70 612
52 000
57 819
34 400
46 100
32 072
25 057
27 300
19 142
749 520
326 000
208 600
185 000
242 118
184 314
130 688
125 000
117 000
110 130
90 000
68 137
35 000
52 737
33 800
36 500
34 518
34 696
29 700
22 934
893 670
328 700
210 000
185 000
154 830
153 962
142 032
130 000
120 000
110 000
90 000
68 137
52 000
48 700
39 200
37 000
36 000
34 811
29 700
27 562
Source: FAO, United Nations.
Figure 11.1 Strawberries in Japan are mainly grown in greenhouses, using a forcing culture and
annual-hill system.
and shape, based on standards defined by the Japan Agricultural Co-operatives (JA)
in each prefecture. The grades will command different prices in the market and hence
have different packaging requirements. As explained earlier, strawberry harvesting and
processing are mostly performed manually. Because of the need for careful handling
of the soft fruits, sorting and grading are done manually, and packing into plastic
clamshells according to size and quality grades is accomplished within the farmers’
premises. The packed fruits are immediately placed in holding boxes for later pick-up
by an agent of JA. In some cases the fruits are placed in refrigerated storage if there
are delays in pick-up.
Recently, the government of Japan has expressed the desire for a product traceability system, which requires an even more sophisticated set of mechanisms for the
accurate measurement of fruit properties in order to tie them to their production area
and cultivation methods. This can enhance the safety of foods and provide people with
appropriate information about what they consume, but it will also bring greater challenges and
opportunities for the agricultural sector – especially in implementing the procedures to
enhance the general quality of products. Meanwhile, some preliminary developmental
research studies have been initiated on robotic harvesting of strawberries that works
within the framework of a product traceability system (Kondo et al., 2005). Nevertheless, a commercial working system that can integrate the previous technologies is yet to
be developed. For some years now computer vision has been successfully employed for
high-speed grading of a number of fruits, and it is certainly an enabling technology to
address the labor problems and successfully automate an on-line post-harvest system
for strawberries.
1.3 Computer vision technologies for quality measurement
To address the special requirements for processing of the strawberry fruits, computer
vision-based studies were initiated to investigate automation of the measurement of
fruit qualities. Generally, the techniques may be divided into two major categories: the
first deals with the estimation of external quality parameters, such as size, shape, and
the level of ripeness, to judge their class grades; the second category employs more
sophisticated techniques that deal with non-destructive assessment of internal quality
parameters such as firmness, sweetness, and pigment content. Color CCD (charge-coupled device)-based computer vision systems have been successfully employed for
rapid and consistent assessment of external quality parameters in many fruits (Abbott,
1999). Since spectrometric optical methods are accurate for chemometric analysis of
internal quality, a hyperspectral imaging or imaging spectrometry system offers an
even more reliable approach to configuring specific components of a multispectral
imaging application, especially for evaluating spatially variable characteristics such as
bruising and pigmentation distribution.
2 Grading of size, shape, and ripeness
2.1 Standards for quality grades
For each particular variety, strawberries in Japan are graded based on size, shape,
and ripeness, by standards such as that shown in Figure 11.2, to determine packaging
Figure 11.2 Each variety of strawberry is graded according to ripeness, size, and shape. (Chart labels: Grade A with bright red color; Grade A with pale red color; Grade B with bright red color; Grade B with pale red color; Grade A limit; head with green color; odd shape; substandard fruits; top layer; bottom layer.)
requirements. These in turn will determine the handling operation, the market price,
and, consequently, the gross sales of the product. Since the uniformity of appearance
and the manner of presentation greatly affect the consumers’ purchase decision, farmers
devote time to grading the fruits carefully and properly immediately after harvesting,
regardless of the time that this may take. To reduce the time, several studies have
been conducted to grade strawberries using computer vision; these are discussed in the
following section.
2.2 Preliminary study for size and shape judgment
The earliest study dealt with judgment of the shape of the fruit (Figure 11.3), and
classified the fruit into three classes for the fresh food market (Nagata et al., 1996).
Strawberries were manually placed in a rotary tray conveyor and images were taken
using a color CCD camera (ELMO EC-202 II; Elmo Co., Ltd., Nagoya, Japan), as
shown in Figure 11.4.
A general threshold was selected to isolate the strawberry from the background
using the CIE (Commission Internationale de l’Éclairage) L*a*b* color model, since
the color planes of a* and b* were not strongly affected by changes in illumination
levels (Cao et al., 1999). From the binary image of the fruit, six features (lengths of
lines in pixels) were extracted from the berries (Figure 11.5) by image processing.
Parameter Wmax was the longest horizontal line across the fruit, and the height H was
the perpendicular distance from the bench line to the tip of the fruit. The rest of the
parameters were computed as shown in Figure 11.5.
Figure 11.3 Patterns for the judgment of shape grades (Grade A, Grade B, Grade C) of strawberry fruits for the fresh food market: A, Excellent; B, Good; C, Reject.
Figure 11.4 Automatic sorting system for strawberries: (1) rotary table; (2) CCD camera; (3) computer;
(4) neural network classification model; (5) sorting robot; (6) grading table.
Figure 11.5 Linear features extracted from strawberries for shape classification.
The slant length L was the distance between the midpoints of Wmax and W1.
Dimensionless parameters K1 to K6 were computed based on the following formulas:
\[
\begin{cases}
K_1 = W_1 / W_{\max} \\
K_2 = W_2 / W_{\max} \\
K_3 = W_3 / W_{\max} \\
K_4 = W_4 / W_{\max} \\
K_5 = H / W_{\max} \\
K_6 = L / H
\end{cases}
\tag{11.1}
\]
The parameters K1 to K6 were used as inputs to a three-layer artificial neural network
used as the shape classification judgment model. Tests were conducted using four
varieties of strawberries, namely Reiko (122 pieces), Toyonoka (187 pieces), Nyoho
(170 pieces), and Akihime (167 pieces), consisting of several sizes of fruit. Since the
algorithm requires careful upright positioning of the fruit, several angular departures
from the vertical were explored, from 0° to 15° clockwise and counter-clockwise, at
intervals of 3°. The results showed that the judgment accuracy ranged from 94 to 98
percent, with misclassification occurring mostly when the strawberries were maximally
displaced from the upright orientation.
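The dimensionless ratios K1 to K6 defined above are straightforward to compute once the line lengths have been measured in pixels. The sketch below only illustrates the arithmetic; the function name and the sample measurements are hypothetical, and the neural network that consumes the ratios is not shown.

```python
def shape_ratios(w_max, w1, w2, w3, w4, height, slant):
    """Return [K1, ..., K6] from line lengths measured in pixels.

    K1..K4 = W1..W4 / Wmax, K5 = H / Wmax, K6 = L / H.
    """
    if w_max <= 0 or height <= 0:
        raise ValueError("w_max and height must be positive")
    return [w1 / w_max, w2 / w_max, w3 / w_max, w4 / w_max,
            height / w_max, slant / height]


# Hypothetical measurements of the same fruit imaged at two magnifications.
small = shape_ratios(100, 90, 70, 45, 20, 120, 30)
large = shape_ratios(200, 180, 140, 90, 40, 240, 60)
```

Because every parameter is a ratio of lengths, the two calls give the same values, which suggests why such features describe shape independently of fruit size or camera distance.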
For size judgment, a simple regression analysis sought a linear relationship between
the projected area of the binary fruit image and the measured fruit mass. The results
showed high coefficients of determination (93–97 percent) for several varieties of
strawberries (Cao et al., 1996).
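The size judgment above rests on an ordinary least-squares fit of fruit mass against projected area. A minimal sketch of such a fit, with its coefficient of determination, is given here; the area and mass values are invented for illustration and are not data from the cited study.

```python
def fit_line(xs, ys):
    """Least-squares fit y = slope * x + intercept; also returns R squared."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r_squared = 1.0 - ss_res / ss_tot
    return slope, intercept, r_squared


# Hypothetical calibration data: projected area (mm^2) and mass (g).
areas = [800, 1000, 1200, 1500, 1900]
masses = [8.1, 10.2, 11.9, 15.3, 19.0]
slope, intercept, r_squared = fit_line(areas, masses)

# Predicted mass for a new fruit with a projected area of 1300 mm^2.
predicted_mass = slope * 1300 + intercept
```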
2.3 Advanced techniques for size and shape judgment
A subsequent study aimed to improve the accuracy of judgment by considering fruits at
any orientation (Bato et al., 1999; Nagata et al., 2000). This time, however, a different
approach was employed to isolate the bench line by first computing the moment center
of the fruit and then successively drawing exploratory lines at intervals of 30°,
15°, and 1°, respectively, until the maximum diameter was found (Figure 11.6). The
algorithm then proceeded as discussed above. Some exploratory tests were performed
using several sizes of Akihime strawberries to determine the stability of the shape
parameters at varying angles of presentation. The results showed that the computed
parameters did not differ significantly (at the 95 percent confidence level) from those of
the upright-oriented (0°) berries.
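The text describes the orientation search only in outline, so the following sketch is an interpretation rather than the published algorithm. It assumes that the search amounts to a coarse-to-fine scan (30°, then 15°, then 1° steps) for the longest chord passing through the moment center of a binary fruit mask; the calyx detection indicated in Figure 11.6 is omitted, and all function names are hypothetical.

```python
import math


def moment_center(mask):
    """Centroid (x, y) of the object pixels in a binary mask (rows of 0/1)."""
    pts = [(x, y) for y, row in enumerate(mask)
           for x, v in enumerate(row) if v]
    n = len(pts)
    return sum(x for x, _ in pts) / n, sum(y for _, y in pts) / n


def chord_length(mask, cx, cy, angle_deg, step=0.25):
    """Length of the run of object pixels through (cx, cy) along angle_deg."""
    height, width = len(mask), len(mask[0])
    dx = math.cos(math.radians(angle_deg))
    dy = math.sin(math.radians(angle_deg))
    total = 0.0
    for sign in (1, -1):                 # march outward in both directions
        r = 0.0
        while True:
            x = int(round(cx + sign * (r + step) * dx))
            y = int(round(cy + sign * (r + step) * dy))
            if not (0 <= x < width and 0 <= y < height) or not mask[y][x]:
                break
            r += step
        total += r
    return total


def find_max_diameter(mask):
    """Coarse-to-fine search (30, 15, 1 degree steps) for the longest chord."""
    cx, cy = moment_center(mask)
    best_angle, span = 0.0, 90.0
    for step in (30, 15, 1):
        candidates = []
        angle = best_angle - span
        while angle <= best_angle + span:
            candidates.append(angle)
            angle += step
        best_angle = max(candidates,
                         key=lambda a: chord_length(mask, cx, cy, a))
        span = step                      # next pass refines around the best
    return best_angle % 180, chord_length(mask, cx, cy, best_angle)


# Synthetic test object: an ellipse (semi-axes 30 and 12 px) tilted by 40 deg.
tilt = math.radians(40)
ellipse = []
for y in range(81):
    row = []
    for x in range(81):
        u = (x - 40) * math.cos(tilt) + (y - 40) * math.sin(tilt)
        v = -(x - 40) * math.sin(tilt) + (y - 40) * math.cos(tilt)
        row.append(1 if (u / 30) ** 2 + (v / 12) ** 2 <= 1 else 0)
    ellipse.append(row)

angle, w_max = find_max_diameter(ellipse)
```

On the synthetic ellipse the recovered angle should lie close to the 40° tilt and the chord length close to 60 pixels, within the limits of pixel discretization.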
The study also aimed to develop a simple system with a computer program for size
and shape judgment of Akihime strawberries in an on-line fashion (Bato et al., 2000).
This time, however, a simpler heuristic for shape judgment was used based on the area
ratio (area of projected binary image of the fruit above Wmax to the area of the enclosing
rectangle MNOP), as shown in Figure 11.7.
Cut-off points were used to classify the sizes of the berries, based on the total
projected area of the fruits; the cut-offs were derived from a calibration. Test results showed
near-perfect classification accuracy in both size and shape judgments. A
diagram of the on-line system is shown in Figure 11.8, and a flowchart of the
grading program is shown in Figure 11.9.
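The area-ratio heuristic can be sketched as follows. Several points are assumptions made for the example: the fruit is taken to be already upright with its tip in the first occupied row of the mask, Wmax is taken as the widest row, and the enclosing rectangle is taken to span from the tip to the Wmax row. The cut-off values 0.745 and 0.837 are those that appear in the flowchart of Figure 11.9; assigning the lowest band to Grade A is one reading of that flowchart and should be checked against the original figure.

```python
def area_ratio(mask):
    """Ratio of the fruit area between the tip and the Wmax row to the
    area of the rectangle (Wmax wide) enclosing that part of the fruit.

    `mask` is a binary image (rows of 0/1) with the fruit tip at the top.
    """
    widths = [sum(row) for row in mask]
    w_max = max(widths)
    if w_max == 0:
        raise ValueError("mask contains no object pixels")
    base = widths.index(w_max)                      # first widest row
    top = next(i for i, w in enumerate(widths) if w > 0)
    fruit_area = sum(widths[top:base + 1])
    rect_area = w_max * (base - top + 1)
    return fruit_area / rect_area


def grade_shape(ratio, a_cut=0.745, b_cut=0.837):
    """Map an area ratio to a shape grade (band assignment is assumed)."""
    if ratio < a_cut:
        return "A"
    if ratio < b_cut:
        return "B"
    return "C"


# Toy conical fruit: the rows widen from the tip down to the Wmax row.
conical = [
    [0, 0, 0, 1, 0, 0, 0],
    [0, 0, 1, 1, 1, 0, 0],
    [0, 1, 1, 1, 1, 1, 0],
    [1, 1, 1, 1, 1, 1, 1],
    [0, 1, 1, 1, 1, 1, 0],
]
ratio = area_ratio(conical)
grade = grade_shape(ratio)
```

A pointed, conical berry fills little of its enclosing rectangle and therefore gives a low ratio, whereas a blunt or irregular berry gives a ratio closer to one.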
Figure 11.6 An algorithm to calculate the maximum diameter Wmax and the height H at any fruit orientation. (Panel labels: frozen image; moment center; 30° lines; 30° lines that intersect the calyx; 15° lines; 1° lines; shortest 1° line; maximum diameter, Wmax; Wmax and height, H.)
Figure 11.7 Diagram for the classification criterion area ratio (Ra) for shape grades A, B, and C of
strawberry of Akihime variety.
Figure 11.8 Pictorial diagram of a belt-type strawberry sorting system. (Labels: sorting system; belt speed; 3000 mm.)
Figure 11.9 Flowchart of grading program for the on-line strawberry sorting system. Direction part: freeze image and binarize; extract RGB image frames; calculate area A, locate moment center and scan boundaries; locate green calyx and reference line; find the shortest 1° diagonal; locate Wmax and H; compute the area ratio RA. Judgment part: shape decisions at RA < 0.745 and RA < 0.837, displaying “Grade A”, “Grade B”, or “Grade C”; size decisions at A < 1240 mm² and A < 1949 mm², displaying “1L”, “2L”, or “3L”. Separation part: shape A, no robot arm movement; shape B, move fruit to left side; shape C, move fruit to right side; robot arm moves to initial position.
2.4 Grading of ripeness
Ripeness grades of strawberries are gauged according to the extent of coverage (as a
percentage) of red coloring on the fruit surface. Although working with the red plane
of the RGB (red, green, blue) color system, on which digital images are based, seems
to be a direct option, the a* plane of the CIE L*a*b* color model provides a systematic
and more consistent level of color because of decoupling of the luminance information
from the hue values. This is especially useful when small changes in lighting intensity
are expected. Figure 11.10 shows an image of a sample processed for judgment of ripeness.
Figure 11.10 Ripeness level is determined by computing the ratio of number of red pixels to the number of
pixels of the berry object: (a) original image; (b) berry object; (c) red pixels.
By taking the ratio of the red color pixels at a certain threshold to the total area (pixels)
of the binary fruit image, the level of ripeness can be estimated by equation (11.2):
\[
R_i = \frac{D_r}{D_a} \times 100
\tag{11.2}
\]
where Ri is the percentage ratio as measurement of the level of ripeness for sample i,
Dr is the number of red pixels after thresholding in the a* plane, and Da is the total
number of pixels of the fruit berry object, which is easily obtained by thresholding on
the L* plane. Thus, it is necessary to use two image planes to obtain successfully the
fruit object and the “ripe” pixels. The complications introduced by the calyx can be
disregarded, since farmers normally use only the visible parts of the berry.
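The ripeness ratio can be sketched in a few lines. The conversion below is the standard sRGB to CIE L*a*b* transform for a D65 white point; everything else is an assumption made for the example: the background is taken to be near-white so that fruit pixels are those with L* below a threshold, and the thresholds `l_max_fruit` and `a_min_red` are hypothetical values that would need calibration.

```python
def _linearize(channel):
    """Convert an 8-bit sRGB channel value to linear light in [0, 1]."""
    c = channel / 255.0
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def _f(t):
    """Nonlinearity used in the CIE L*a*b* definition."""
    delta = 6.0 / 29.0
    if t > delta ** 3:
        return t ** (1.0 / 3.0)
    return t / (3 * delta ** 2) + 4.0 / 29.0


def rgb_to_lab(r, g, b):
    """Convert one 8-bit sRGB pixel to (L*, a*, b*), D65 white point."""
    rl, gl, bl = _linearize(r), _linearize(g), _linearize(b)
    x = 0.4124564 * rl + 0.3575761 * gl + 0.1804375 * bl
    y = 0.2126729 * rl + 0.7151522 * gl + 0.0721750 * bl
    z = 0.0193339 * rl + 0.1191920 * gl + 0.9503041 * bl
    fx, fy, fz = _f(x / 0.95047), _f(y / 1.0), _f(z / 1.08883)
    return 116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz)


def ripeness_percent(pixels, l_max_fruit=95.0, a_min_red=20.0):
    """R_i = 100 * D_r / D_a for an iterable of (R, G, B) pixels.

    D_a counts fruit pixels (L* below l_max_fruit, assuming a near-white
    background); D_r counts fruit pixels whose a* exceeds a_min_red.
    """
    d_a = d_r = 0
    for r, g, b in pixels:
        l_star, a_star, _ = rgb_to_lab(r, g, b)
        if l_star < l_max_fruit:
            d_a += 1
            if a_star > a_min_red:
                d_r += 1
    if d_a == 0:
        raise ValueError("no fruit pixels found")
    return 100.0 * d_r / d_a


# Toy image: 4 white background pixels, 3 red pixels, 1 pale unripe pixel.
pixels = [(255, 255, 255)] * 4 + [(200, 30, 40)] * 3 + [(200, 220, 150)]
ripeness = ripeness_percent(pixels)
```

As in the text, two planes are needed: L* isolates the berry object and a* isolates the red pixels within it.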
3 Detection of bruises and fecal contamination
3.1 The importance of detecting bruises
Farmers normally hand-harvest strawberries at a fully-ripe stage to maximize their
sweetness and flavor. However, owing to their delicately small size and soft texture
they are quite prone to mechanical damage caused by bruising, which can happen at
several points during processing. Internal compression bruises occur primarily during
harvesting, sorting, and packaging, when the fingers of workers press too hard on
the fruits, or when the fruits are stacked too high, leading to high lateral pressures
after being placed in trays or bins. This type of bruising is difficult to detect because
it exists beneath the fruit skin, which renders the color vision approach somewhat
inadequate because of small and inconsistent differences between the color levels of
bruised and non-bruised tissues. Meanwhile, other bruise types, such as impact and
vibration bruises, may also occur, but these are relatively negligible.
The presence of latent defects such as bruises not only causes the strawberries to fail
to fulfill the quality expectations of the consumers, but also incurs a loss of trust and
confidence, and subsequent economic losses, in bringing products of lesser quality
from the source to the retail market. Bruises on the berries may even raise issues
of product safety, since they may harbor pathogens that can be deleterious to human health.
3.2 Color imaging for bruise detection
Early studies were performed evaluating the feasibility of using color machine vision
and near-infrared imaging to detect bruises on strawberries (Shrestha et al., 2001;
Shrestha, 2002). Several levels of controlled bruising force were applied to some
100 fully-ripe Akihime strawberry fruits through a flat steel tool in a universal testing machine. Using a Sony DCR-VX1000 CCD video color camera (Sony Co., Tokyo,
Japan) under a specially designed ring fluorescent light assembly that reduced the specular reflections, 8-bit RGB color images of bruised and non-bruised fruits were acquired
and analyzed by Adobe Photoshop software (Figure 11.11). The study also aimed to
compare bruise development between berries held at room temperature (18 °C and 65
percent relative humidity) and those stored at low temperatures (2 °C and 95 percent
relative humidity) in order to understand the effects of storage time. The performances
of RGB and CIE L*a*b* color systems in detecting bruises were likewise compared.
The results showed that force levels exceeding 2 N caused some color changes, based
on RGB chromatic image differences, between bruised and non-bruised tissues of the
berries; however, the changes were not enough to detect the extent of bruises, which
thereby necessitated the use of a second color system. Since the strawberries were
mostly red in color, the a* plane of CIE L*a*b* provided a more efficient means to
detect bruises, particularly those of more than 48 hours in age. Reductions in a* of about 6 points in bruised relative to non-bruised tissues 24 hours after
bruising, and of about 25 points 48 hours after bruising, were found to be sufficient to suggest
a possible method for the detection of bruises. However, no conclusive technique was
achieved using this approach.
Figure 11.11 The color-image capturing system for bruise detection: (a) color video camera; (b) color
monitor; (c) ring fluorescent lamps; (d) sample stage; (e) strawberry fruit sample; (f) light intensity
regulator; (g) computer.
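The a* comparison described above can be illustrated with a short sketch. It assumes the camera's 8-bit RGB values can be treated as sRGB under a D65 white point, which the original study does not state; the function names `srgb_to_a_star` and `a_star_drop` and the region masks are hypothetical and serve only for illustration.

```python
import numpy as np

def srgb_to_a_star(rgb):
    """Return the CIE a* channel for 8-bit pixels, ASSUMING sRGB and a D65 white point."""
    rgb = np.asarray(rgb, dtype=float) / 255.0
    # Undo the sRGB transfer curve to obtain linear light.
    linear = np.where(rgb <= 0.04045, rgb / 12.92, ((rgb + 0.055) / 1.055) ** 2.4)
    # Linear sRGB to CIE X and Y (Z is not needed for a*).
    x = linear @ np.array([0.4124, 0.3576, 0.1805])
    y = linear @ np.array([0.2126, 0.7152, 0.0722])
    xn, yn = 0.95047, 1.0  # D65 reference white

    def f(t):
        delta = 6.0 / 29.0
        return np.where(t > delta ** 3, np.cbrt(t), t / (3 * delta ** 2) + 4.0 / 29.0)

    return 500.0 * (f(x / xn) - f(y / yn))

def a_star_drop(image_rgb, bruised_mask, sound_mask):
    """Mean a* of non-bruised tissue minus mean a* of bruised tissue."""
    a_star = srgb_to_a_star(image_rgb)
    return float(a_star[sound_mask].mean() - a_star[bruised_mask].mean())
```

A positive value from `a_star_drop` indicates that the marked bruised region is less red than the marked sound region; deciding how large a drop is meaningful would still require calibration against known bruises.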
3.3 NIR imaging for bruise detection
Another preliminary study used a near-infrared (NIR) spectrometer and a NIR imaging camera to identify optimal wavelengths for bruise detection (Nagata et al., 2001).
Using a Hamamatsu PMA-11 spectrometer (Hamamatsu Photonics, Japan) with spectral sensitivity from 600 to 1000 nm and a NIR light source C1358-02 (Hamamatsu
Photonics, Japan) incorporating an 800-nm high-pass filter, spectral reflectance scans
from bruised and non-bruised tissues of strawberries with several levels of bruising
were obtained. The raw reflectance plots showed minimum differences of intensities at
around 860 nm for the different levels of bruising with respect to non-bruised tissues,
but large differences were observed at the range of 945–975 nm. This was used as the
basis for selecting 860-nm and 960-nm high-pass filters for the NIR imaging camera.
A near-infrared imaging system was developed based on a Vidicon C2741-03H
(Hamamatsu Photonics, Japan) camera fitted with high-pass filters under NIR lighting
provided by two units of C1358-02 (Hamamatsu Photonics, Japan). The set-up was
placed inside an air-cooled illumination chamber covered with heavy black curtains to
eliminate stray external light. Images were obtained using the two filters mentioned
above, and a judgment algorithm based on image subtraction was devised. An alternate
algorithm used a reference image of a “standard” non-bruised strawberry as a basis
from which bruises on fruits were detected. Although the results showed some promise,
further studies were recommended.
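As a rough illustration of a judgment based on image subtraction, the sketch below subtracts the image taken through the 960-nm filter from the one taken through the 860-nm filter and thresholds the result. The direction of the subtraction, the threshold value, and the function name `bruise_mask_by_subtraction` are assumptions made for this example, not details reported by the study.

```python
import numpy as np

def bruise_mask_by_subtraction(img_860, img_960, fruit_mask, threshold):
    """Flag fruit pixels whose 860-nm minus 960-nm intensity exceeds a threshold.

    ASSUMPTION: bruised tissue looks darker in the 960-nm image, so the
    difference is larger there. The threshold is hypothetical and would
    need to be calibrated on labeled images.
    """
    # Convert to float first so that unsigned 8-bit images cannot wrap around.
    diff = np.asarray(img_860, dtype=float) - np.asarray(img_960, dtype=float)
    return (diff > threshold) & np.asarray(fruit_mask, dtype=bool)
```

Restricting the result to `fruit_mask` keeps background pixels from being reported as bruises.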
3.4 Hyperspectral imaging for bruise detection
3.4.1 Hyperspectral imaging set-up and analysis of data
A subsequent VIS/NIR hyperspectral imaging study based on a Varispec liquid crystal
tunable filter (LCTF, Cambridge Research and Instrumentation, Inc., Woburn, MA,
USA) was carried out to identify specific wavelengths and evaluate several judgment
algorithms for the detection of bruises on strawberries (Nagata et al., 2006). Some 120
Akihime strawberries at two levels of ripeness were carefully obtained. Six levels
of bruising force (0 to 3 N) were applied to the fruits under controlled conditions using a universal testing
machine (STA-1150; Orientec Corporation, Japan), as shown in Figure 11.12.
Spectral images were acquired using an Apogee AP2E 14-bit monochrome CCD
camera (Apogee Instruments, Inc., Auburn, CA, USA) with the CRI VIS/NIR Varispec
LCTF (650 to 1100 nm with 10-nm FWHM bandwidth; Cambridge Research and
Instrumentation, Inc., Woburn, MA, USA). Lighting was provided by a Dolan–Jenner Fiberlite PL950 quartz tungsten-halogen light source (Dolan–Jenner Industries,
St. Lawrence, MA, USA) that had smooth emission in the NIR range. The entire system
was placed in an air-cooled (22 °C) and light-tight chamber (Figure 11.13).
Figure 11.12 The Orientec STA-1150 universal testing machine comprised the bruising set-up for a
strawberry using a 25-mm diameter ball tip (inset).
Using MATLAB version 6.5 and its Image Processing Toolbox (The Mathworks,
Inc., Natick, MA) for all image-processing tasks, 8-bit mask images were derived from
the 980-nm spectral images and used to manually mark off bruised and non-bruised
pixels of tissues for all the fruits (Figure 11.14).
Spectral data were obtained from flat-field corrected images automatically, employing an image-processing script and using the following equation:
\[ I_{\mathrm{norm}}(x, y) = m \, \frac{I_{\mathrm{sample}}(x, y) - I_{\mathrm{dark}}(x, y)}{I_{\mathrm{reference}}(x, y) - I_{\mathrm{dark}}(x, y)} \qquad (11.3) \]
where Inorm (x, y), Isample (x, y), Ireference (x, y), and Idark (x, y) are the flat-field corrected,
sample, reference, and dark images, respectively, for a pixel at location coordinates
(x, y); m is a factor based on the mean diffuse reflectance of the reference panel, a
Spectralon panel having 99 percent reflectance (SRT-99-050, Labsphere, Inc., North
Sutton, NH), which was assumed to be 1.0 for simplicity.
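The flat-field correction can be written in a few lines. The following is a minimal sketch of the equation above, not the authors' MATLAB script; the function name `flat_field_correct` and the guard against a zero denominator are additions for illustration.

```python
import numpy as np

def flat_field_correct(sample, reference, dark, m=1.0, eps=1e-12):
    """Relative reflectance: m * (sample - dark) / (reference - dark), per pixel.

    m defaults to 1.0, the value assumed in the text for the 99 percent panel.
    Pixels where reference and dark are (almost) equal are returned as NaN
    rather than dividing by zero; this guard is an addition for the sketch.
    """
    sample = np.asarray(sample, dtype=float)
    reference = np.asarray(reference, dtype=float)
    dark = np.asarray(dark, dtype=float)
    denominator = reference - dark
    denominator = np.where(np.abs(denominator) < eps, np.nan, denominator)
    return m * (sample - dark) / denominator
```

The same function applies unchanged to a single band image or to a whole stack of band images, provided the three arrays have matching shapes.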
Figure 11.13 The VIS/NIR hyperspectral imaging set-up: (a) Apogee AP2E camera; (b) Nikkor lens; (c) CRI
Varispec liquid crystal tunable filter; (d) cooler exhaust; (e) Dolan–Jenner Fiber-Lite light source; (f) Varispec
controller box; (g) sample stage.
Figure 11.14 Pictorial diagram of hyperspectral image data acquisition and analysis for identification of
optimal wavelengths for bruise detection on strawberries. [Diagram labels: Apogee CCD; wavelength selection; 980 nm; assessment of bruise injury.]
Figure 11.15 Spectra of bruised and non-bruised strawberry fruit tissues at different ripeness levels. [Axes: relative reflectance against wavelength (nm), 640–1000 nm. Curves: fully-ripe, non-bruised; 70–80% ripe, non-bruised; fully-ripe, bruised; 70–80% ripe, bruised.]
3.4.2 Results of bruise detection
The plot of the relative reflectance spectra illustrated in Figure 11.15 shows a strong
absorption at about 675 nm owing to chlorophyll, and at around 960–980 nm owing to
moisture. By using stepwise linear discriminant analysis, optimal wavelengths were
selected at 825 and 980 nm that were consequently used by the three bruise judgment
algorithms, namely linear discriminant analysis (LDA), normalized difference (ND),
and a three-layer back-propagation artificial neural network (ANN). The
825 nm wavelength was close to the peak of the spectra and may have served as a
reference wavelength for the computations. The 980 nm wavelength was likely selected
because of the increased and localized absorption of light at this wavelength by the
expelled water from bruised tissues.
Figure 11.16 shows an example of judgment using the three methods, with ANN
covering the widest area of detected bruising, followed by the ND method. The three
methods had similar patterns of performance, with ANN giving the best classification
accuracy. However, the ND method was found to be more attractive because of its
robustness and simplicity of implementation in an on-line system. The study also
identified bruise-detection thresholds of 1.0–1.5 N for the 70–80 percent ripe and 0.5–1.0 N for the
fully-ripe strawberries. It further demonstrated that the detected extent of the
bruised area decreased considerably with storage time, possibly due to some changes
occurring in the tissues – such as re-absorption of expelled cellular materials by tissues
adjacent to the bruised areas on the fruit.
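The appeal of the normalized difference method is easier to see in code. The sketch below assumes the usual form of a normalized difference index, (R825 − R980)/(R825 + R980), computed per pixel from flat-field corrected images; the exact expression and threshold used in the study are not given in the text, so `normalized_difference`, `nd_bruise_mask` and the threshold argument are illustrative assumptions.

```python
import numpy as np

def normalized_difference(r825, r980, eps=1e-12):
    """Per-pixel (R825 - R980) / (R825 + R980) from relative reflectance images."""
    r825 = np.asarray(r825, dtype=float)
    r980 = np.asarray(r980, dtype=float)
    return (r825 - r980) / np.maximum(r825 + r980, eps)

def nd_bruise_mask(r825, r980, threshold):
    """Flag pixels as bruised where the index exceeds a HYPOTHETICAL threshold.

    ASSUMPTION: stronger absorption at 980 nm in bruised tissue lowers R980
    and therefore raises the index.
    """
    return normalized_difference(r825, r980) > threshold
```

Because the index is a ratio of two bands, a uniform change in illumination intensity scales both terms equally and cancels out, which is one plausible reason such a method is simple to run on-line.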
Figure 11.16 Bruises on a strawberry fruit sample detected using three methods of judgment on a 70–80%
ripe strawberry receiving a 2.0-N bruising force: (a) original; (b) results by linear discriminant analysis;
(c) results by normalized difference method; (d) results by using artificial neural network.
3.4.3 Detection of fecal contamination
Increasing concerns regarding the safety of food prompted authorities at the Food and
Drug Administration in the US to recall, in 1997, a major shipment of strawberries
scheduled for a school lunch program, when an association with an outbreak of hepatitis A virus was suspected (GAO, 2000). Fecal contamination of fruit is a major concern
in the US because of the open-field method of cultivation of strawberries. Vargas et al.
(2004) reported a fluorescence imaging study to detect fecal contamination on strawberries at 1 : 10, 1 : 50, and 1 : 100 dilution levels. A hyperspectral imaging system that was
capable of both fluorescence and reflectance modes was used. It had a Specim Imspector v1.7 spectrograph (Spectral Imaging Ltd., Oulu, Finland) with sensitivity ranging
from 425 to 951 nm. An ultraviolet-A fluorescent light source, which was short-pass
filtered through a UG1 Schott Glass filter, provided excitation for fluorescence. Fecal
contaminants were noticeable at 680 nm, which is well known for fluorescence emission caused by chlorophyll a. A greater improvement in the contrast of contaminants
was achieved by obtaining the ratio of images at 680 nm to the images at 745 nm.
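The band-ratio step can be sketched as follows. This is only an illustration of dividing the 680-nm fluorescence image by the 745-nm image; the function name `fluorescence_ratio` and the small floor applied to the denominator are assumptions added for the example.

```python
import numpy as np

def fluorescence_ratio(f680, f745, eps=1e-6):
    """Per-pixel ratio of the 680-nm fluorescence image to the 745-nm image."""
    f680 = np.asarray(f680, dtype=float)
    f745 = np.asarray(f745, dtype=float)
    # Floor the denominator so that dark pixels do not cause division by zero.
    return f680 / np.maximum(f745, eps)
```

Pixels with a high ratio would be the candidates for contamination; the cut-off separating them from clean fruit surface is not stated in the text and would have to be determined experimentally.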
4 Estimation of firmness and soluble-solids content
4.1 The importance of the measurement of internal quality
Recent consumer trends have led to increasing demands for satisfaction of expectations
regarding food quality, and for achieving a relatively homogenous assortment of fruits
in a particular batch – for example, packed fruits in a clam shell or fruit tray having
a similar level of sweetness. The fruit industry operators leverage this for marketing by
providing more information and thresholds regarding quality as part of their guarantee.
In Japan, for example, along with external quality features, peaches and mangoes
are also now being graded based on sweetness in some on-line systems. A guarantee
of sweetness is stamped on the boxes to make the consumers aware of the quality
of contents. This technological trend accentuates the increasing importance of non-destructive measurement of some internal quality parameters of fruits.
Spectrometric approaches, while useful and effective for non-destructive estimation
of internal quality, can only provide partial information, because the measurements
are confined to a relatively small patch or area of a fruit’s surface. Hyperspectral
imaging or imaging spectrometry is now being explored to overcome this limitation
by taking advantage of spatially resolved spectral information. It may also be used to
identify specific wavelengths to configure a multispectral imaging system for such an application.
Two studies have used hyperspectral imaging in the near-infrared range for non-destructive measurement of fruit firmness as an estimate of texture, and of soluble-solids content as an indicator of sweetness; these are discussed in the following sections.
4.2 Measurement of firmness
Nagata et al. (2005) employed the same hyperspectral imaging system used in the earlier
study on bruise detection in strawberries for the non-destructive estimation of firmness
of strawberries. Hyperspectral images (650–1000 nm at 5-nm intervals for 71 images
in a set) were taken from 125 strawberries (Akihime variety) at three levels of ripeness
(50–60 percent, 70–80 percent, and fully-ripe levels). The berries were then subjected
to firmness testing using a universal testing machine, the Orientec Universal Testing
Machine STA-1150 (Orientec Corporation, Japan), which drives a 3-mm cylindrical
steel tip into the center of the fruits at 100 mm/min.
After flat-field correction on the images (equation (11.3)) had been applied, relative reflectance values were extracted and averaged on a small circular area on the
fruit surface most likely coincident with the point where firmness tests were made.
Stepwise linear regression was applied to identify a limited set of wavelengths and to
develop models for estimation, utilizing 60 percent of the samples for calibration and
the remaining 40 percent for validation (prediction). The following parameters were
computed to describe the goodness of fit of the models:
\[ \mathrm{SEC} = \sqrt{\frac{\sum_{i=1}^{N_C} (Y_i - \hat{Y}_i)^2}{N_C - p - 1}} \]
\[ \mathrm{SEP} = \sqrt{\frac{\sum_{i=1}^{N_P} (Y_i - \hat{Y}_i - \mathrm{bias})^2}{N_P - 1}} \]
\[ \mathrm{bias} = \frac{1}{N_P} \sum_{i=1}^{N_P} (Y_i - \hat{Y}_i) \]
where SEC and SEP were the standard errors for calibration and prediction, respectively; Yi and Ŷi were the measured and predicted values, respectively, of firmness for
each sample i; NC and NP were the number of samples in the calibration and prediction
sets, respectively; and p was the number of parameters or wavelengths in the model.
Figure 11.17 Profile of firmness at three ripeness maturity levels of strawberry. [Axes: firmness (MPa) against level of ripeness.]
Table 11.2 Characteristics of prediction models for estimation of firmness.
Fruit group                                Predictors (nm)
70% to fully-ripe maturity level fruits    680, 990
70% to fully-ripe maturity level fruits    680, 990, 650
50% to fully-ripe maturity level fruits    685, 985
50% to fully-ripe maturity level fruits    685, 985, 865
Reported for each model: standard error of calibration (MPa), correlation coefficient of calibration, standard error of prediction (MPa), and correlation coefficient of prediction.
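The goodness-of-fit statistics defined above (SEC, SEP, and bias) are straightforward to compute. The sketch below is one possible implementation using NumPy; the function names are chosen for illustration, and the bias is taken as the mean of measured minus predicted values, which is the sign convention consistent with the SEP expression.

```python
import numpy as np

def prediction_bias(y, y_hat):
    """Mean of (measured - predicted) over the prediction set."""
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    return float(np.mean(y - y_hat))

def standard_error_of_calibration(y, y_hat, p):
    """SEC with N_C - p - 1 degrees of freedom, where p is the number of wavelengths."""
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    return float(np.sqrt(np.sum((y - y_hat) ** 2) / (len(y) - p - 1)))

def standard_error_of_prediction(y, y_hat):
    """Bias-corrected SEP with N_P - 1 degrees of freedom."""
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    residual = y - y_hat
    return float(np.sqrt(np.sum((residual - residual.mean()) ** 2) / (len(y) - 1)))
```

Note that a model whose predictions are uniformly offset from the measurements has a non-zero bias but an SEP of zero, which is why the two quantities are reported together.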
Figure 11.17 shows the firmness profile of the samples, with higher data variability
occurring in less ripe berries. The results of calibration model development for each
of the two groups of samples (the 70–80 percent to fully-ripe sample group, and the
group consisting of the entire sample) are shown in Table 11.2.
Since a wider range of ripeness seemed to give better estimates of firmness, careful
attention should be placed on the selection of samples for study. Figure 11.18 shows
calibration plots for the two groups of strawberries examined in the study. An important aspect of analyzing spectral data that has been gathered at close intervals is the
occurrence of high multicollinearity between contiguous wavebands, which tends to make
the selection of a set of wavelengths less than optimal. As an additional aid for further validation of
the results, and to provide an even wider range of choices, a robust regression search was
performed taking all possible combinations of wavelengths in sets of two and three
wavelengths. From this computationally intensive process, the selected wavelengths
were confirmed to be optimal, and possible ranges of wavelengths could be
selected from the following: 665–685 nm, 755–870 nm, and 955–1000 nm.
Figure 11.18 Prediction plots of the three-wavelength models for estimation of firmness in (a) 70% to
fully-ripe and (b) 50% to fully-ripe strawberries groups. [Axes: predicted firmness (MPa) against measured firmness (MPa). Plot annotations: SEP = 0.258, Rp = 0.599; SEP = 0.350, Rp = 0.786.]
Figure 11.19 Gray-scale version of pseudo-color maps showing the relative distribution of anthocyanin
pigmentation in strawberries. [Panel labels: 10–20% ripe; 30–60% ripe; pseudo color.]
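An exhaustive search over wavelength combinations, as described above, can be sketched as follows. This example substitutes ordinary least squares for the robust regression used in the study and ranks subsets by the bias-corrected standard error of prediction on a held-out set; the function name `best_wavelength_subset` and its arguments are hypothetical.

```python
import itertools
import numpy as np

def best_wavelength_subset(x_cal, y_cal, x_val, y_val, wavelengths, k):
    """Try every k-wavelength subset and return (SEP, wavelengths) of the best one.

    Uses ordinary least squares with an intercept, NOT the robust regression
    of the original study. Columns of x_cal and x_val follow `wavelengths`.
    """
    best = None
    for idx in itertools.combinations(range(len(wavelengths)), k):
        cols = list(idx)
        design_cal = np.column_stack([np.ones(len(y_cal)), x_cal[:, cols]])
        coef, *_ = np.linalg.lstsq(design_cal, y_cal, rcond=None)
        design_val = np.column_stack([np.ones(len(y_val)), x_val[:, cols]])
        residual = y_val - design_val @ coef
        sep = np.sqrt(np.sum((residual - residual.mean()) ** 2) / (len(y_val) - 1))
        if best is None or sep < best[0]:
            best = (float(sep), tuple(wavelengths[i] for i in idx))
    return best
```

With 71 bands there are 2,485 pairs and 57,155 triplets to evaluate, which gives a sense of why the text calls the process computationally intensive.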
Table 11.3 Characteristics of prediction models for estimation of soluble-solids content.
Predictors (nm)
910, 695
910, 695, 680
910, 695, 680, 885
910, 695, 680, 885, 690
Reported for each model: standard error of calibration (%Brix), correlation coefficient of calibration, standard error of prediction (%Brix), and correlation coefficient of prediction.
4.3 Measurement of soluble-solids content
Using a generally similar set-up system with the same procedures, a study for the estimation of soluble-solids content (SSC) as an estimate of the sweetness or sugar content
of Akihime strawberries was conducted (Nagata et al., 2005; Kobayashi, 2006). After
the collection of hyperspectral images, the strawberries were sliced and the juice was
hand-expressed, using gauze, for SSC analysis with a digital refractometer Brixmeter
RA-410 (Kyoto Electronics Manufacturing Co. Ltd., Japan) expressed as Brix degree.
This time, however, the second derivative of the logarithmic absorbance transform of
the relative reflectance was applied, and the results of the model development are shown
in Table 11.3.
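The spectral transform mentioned above can be illustrated with a minimal sketch. It converts relative reflectance R to logarithmic absorbance, log10(1/R), and then takes a simple central finite-difference second derivative along the wavelength axis. The study does not specify its derivative scheme (for example, gap size or smoothing), so this plain three-point formula and the function name are assumptions.

```python
import numpy as np

def second_derivative_log_absorbance(reflectance, step):
    """Second derivative of log10(1/R) along the last (wavelength) axis.

    `step` is the wavelength spacing in nm. A plain three-point central
    difference is used, so the output has two fewer bands than the input.
    """
    absorbance = -np.log10(np.asarray(reflectance, dtype=float))
    return (absorbance[..., 2:] - 2.0 * absorbance[..., 1:-1] + absorbance[..., :-2]) / step ** 2
```

Second derivatives remove constant and linear baseline shifts from a spectrum, which is a common reason for preferring them when sample-to-sample scattering varies.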
5 Estimation of anthocyanin distribution
5.1 Necessity of measurement of anthocyanin content
Strawberries contain a fairly large amount of anthocyanin pigmentation (a flavonoid)
throughout their flesh, causing their remarkable redness of color. With the recent
interest in the health benefits of polyphenols, a technique was in demand to
non-destructively quantify the distribution of anthocyanin within the fruits.
5.2 System set-up and sample analysis
A hyperspectral imaging study (Kobayashi, 2006; Kobayashi et al., 2006) was conducted in the visible range using an Apogee AP2E camera (Apogee Instruments, Inc.,
Auburn, CA, USA) fitted with a Varispec liquid crystal tuneable filter (VS-V153-10HC-20, Cambridge Research and Instrumentation, Inc., Woburn, MA, USA) with four
units of Direct Light (SI Seiko Co., Ltd., Matsuyama, Japan) light sources having integral polarizing and infrared filters. To efficiently remove halation or specular reflections
on the fruit surfaces, a polarizing filter was also attached to the camera lens. Some 120
Akihime strawberries were collected at levels of 10–20 percent ripe, 30–60 percent ripe,
and 70–100 percent ripe, based on red color coverage. After collection of the spectral
image sets (450–650 nm at 1-nm intervals), each berry was immediately sliced and
pigmentation was extracted for 20 hours using 50% v/v acetic acid. To quantify the
amount of pigmentation, spectrometer readings (transmittance mode) were taken from
the liquid extracts. Logarithmic absorbance and second-derivative transforms were performed on the spectrometer readings and the mean reflectance spectral data obtained
from the hyperspectral image datasets.
Table 11.4 Characteristics of prediction models for estimation of anthocyanin content.
Predictors (nm)
508, 506
508, 506, 507
508, 506, 507, 531
508, 506, 507, 531, 533
Reported for each model: standard error of calibration, correlation coefficient of calibration, standard error of prediction, and correlation coefficient of prediction.
5.3 Model development
Analysis of spectrometric data of the aqueous extracted pigment has shown strong
second-derivative peaks at 504, 509, 512, 518 and 524 nm, from which 504 nm was
selected to provide a relative measure of pigmentation contents (Table 11.4). In turn,
these values (at 504 nm) were predicted in vivo using the hyperspectral image data. The
stepwise linear regression analysis showed that the five-wavelength model comprising
spectral image data from 508-, 506-, 507-, 531- and 533-nm wavelength images could
reasonably estimate anthocyanin with a linear correlation of 0.932 for prediction at a
standard error of 0.213. From this, pseudo-color maps were generated to visualize the
distribution of anthocyanin pigmentation (see Figure 11.19).
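Generating a distribution map amounts to applying the fitted linear model to every pixel of the transformed image cube. The sketch below shows this step only; the regression coefficients are not reported in the text, so the `coefficients` and `intercept` arguments, like the function name `predict_map`, are placeholders to be supplied by the user.

```python
import numpy as np

def predict_map(cube, band_wavelengths, model_wavelengths, coefficients, intercept):
    """Apply a linear multi-wavelength model to each pixel of an image cube.

    cube: array of shape (rows, cols, bands), ASSUMED to be already transformed
    in the same way as the calibration spectra. Coefficients are hypothetical
    inputs; the source does not list them.
    """
    cube = np.asarray(cube, dtype=float)
    # Locate the band index of each wavelength used by the model.
    idx = [list(band_wavelengths).index(w) for w in model_wavelengths]
    return intercept + cube[..., idx] @ np.asarray(coefficients, dtype=float)
```

The resulting two-dimensional array can then be passed to any colormap routine to produce a pseudo-color image.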
6 Further challenges
Computer vision has been successfully employed to quantify some quality parameters
of strawberries. Nevertheless, there are still a number of challenges in completing the
picture of fruit quality assessment by optical methods. While color-based vision systems have already gained ground for on-line quality assessment of a number of different
types of fruit, a similar commercial system is yet to be developed for strawberries that
can quickly grade them according to ripeness, size, and shape. As expected, there are
difficulties in the fruit-handling aspect because of the small and delicate nature of the fruit.
An alternative automated approach that is currently being actively studied at the
University of Miyazaki and Yokohama University is the use of robotic harvesters that
can evaluate in situ whether the fruits are harvestable, and then simultaneously assess
the level of quality to put them into proper grade boxes. This would not only solve
the labor management problem, but also provide valuable feedback on the effects of
cultural operations for the future adjustment of growing conditions. Such an approach
can easily be integrated into a food traceability system to account for the origin of
foods in real time.
Increasingly, work will be conducted on better non-destructive evaluation of further internal quality parameters, to include titratable acidity and its interaction with the
soluble sugar content for a better estimate of taste. Likewise, the presence of physiological and pathological disorders (such as fungal infection causing internal rot) should
be considered in future systems. Other modes of light-matter interaction also need
evaluation, especially for a greater number of varieties of strawberries. A multispectral
imaging system with a common-aperture multiple CCD camera is generally regarded as a
good method for quality evaluation, especially when inspection can be accomplished
over the entire fruit surface. The applicability of integrating light-scattering imaging
and fluorescence imaging is yet to be investigated. Finally, most research work focuses
on developing models for single parameter measurement, but it will be more attractive if several parameters are simultaneously estimated using a few wavelengths. This
certainly entails a lot of mathematical optimization, but is nevertheless possible.
7 Conclusions
Computer vision is a necessary and promising technology for analyzing the qualities of
strawberries. Grading fruits according to external quality parameters such as size and
shape can be easily implemented using common color cameras. The non-destructive
measurement of internal qualities such as internal bruising, firmness, and sweetness
requires imaging in narrow bands of wavelengths to comprise a multispectral imaging
system. The central wavelength of filters can be determined either using spectrometry or
through a hyperspectral imaging study. Yet an operational commercial system utilizing
the technology, which seems feasible, is still to be developed. With such a system,
a better assortment of strawberry fruits having consistent quality while satisfying the
requirements of consumers and the processing industry can be achieved in the future.
References
Abbott JA (1999) Quality measurements of fruits and vegetables. Postharvest Biology and
Technology, 15, 207–225.
Bato PM (2000) Studies on harvesting and sorting of strawberry using machine vision. PhD
Thesis, United Graduate School of Kagoshima University for Agricultural Sciences,
Kagoshima, Japan.
Bato PM, Nagata M, Cao Q, Shrestha BP, Nakashima R (1999) Strawberry sorting using
machine vision. ASAE Paper No. 993162, ASAE, St Joseph, MI, USA.
Bato PM, Nagata M, Mitarai M, Cao Q, Kitahara T (2000) Study on sorting system for
strawberry using machine vision (part 2): development of sorting system with direction
and judgment functions for strawberry (Akihime variety). Journal of Japan Society of
Agricultural Machinery, 62(2), 101–110.
Cao Q, Nagata M, Mitarai M, Fujiki T, Kinoshita O (1996) Study on grade judgment of fruit
vegetables using machine vision (part 2): judgment for several varieties of strawberry
by developed software [in Japanese]. Journal of Science and High Technology in
Agriculture, 8(4), 228–236.
Cao Q, Nagata M, Wang H, Bato PM (1999) Orientation and shape extraction of strawberry
by color image processing. ASAE Paper No. 993161, ASAE, St Joseph, MI, USA.
Fushihara H (1998) Reduction of labor in the field. In The Japan Strawberry Seminar,
pp. 18–24.
GAO (2000) School meal program: few outbreaks of foodborne illness report. GAO Reference No. GAO/RCED-00-53 School meal program. Washington: United States
General Accounting Office.
Hancock JF (1999) Strawberries. New York: CABI Publishing.
Kobayashi T (2006) Basic study on quality estimation and testing for fresh fruits and
vegetables using hyperspectral imaging [In Japanese]. PhD Thesis, United Graduate
School of Kagoshima University for Agricultural Sciences, Kagoshima, Japan.
Kobayashi T, Nagata M, Goto Y, Toyoda H, Tallada J (2006) Study on anthocyanin pigment
distribution estimation for fresh fruits and vegetables using hyperspectral imaging
(part 2): visualization of anthocyanin pigment distribution of strawberry (Fragaria
x ananassa Duchesne) [In Japanese]. Journal of Science and High Technology in
Agriculture, 18(1), 50–57.
Kondo N, Ninomiya K, Hayashi S, Ota T, Kubota K (2005) A new challenge of robot for
harvesting strawberry grown on table top culture. ASAE Paper No. 053138, ASAE,
St Joseph, MI, USA.
MAFF, The Ministry of Agriculture, Forestry and Fisheries (2004) Annual Report on Food,
Agriculture and Rural Areas in Japan. MAFF.
Nagata M, Cao Q, Mitarai M, Fujiki T, Kinoshita O (1996) Study on grade judgment of
fruit vegetables using machine vision (part 1): the development of sorting system and
software for shape judgment on multiple-layer neural network [In Japanese]. Journal
of Science and High Technology in Agriculture, 8(4), 217–227.
Nagata M, Bato PM, Mitarai M, Cao Q, Kitahara T (2000) Study on sorting system for
strawberry using machine vision (part 1): development of software for determining
the direction of strawberry (Akihime variety). Journal of Japan Society of Agricultural
Machinery, 62(1), 100–110.
Nagata M, Shrestha BP, Gejima Y (2001) Study on image processing for quality estimation
of strawberries (part 2): detection of bruises on fruit by NIR image processing. Journal
of Science and High Technology in Agriculture, 14(1), 1–9.
Nagata M, Tallada JG, Kobayashi T, Toyoda H (2005) NIR hyperspectral imaging for measurement of internal quality in strawberries. ASAE Paper No. 053131, ASAE, St
Joseph, MI, USA.
Nagata M, Tallada JG, Kobayashi T (2006) Bruise detection using NIR hyperspectral imaging for strawberry (Fragaria x ananassa Duch.). Environment Control in Biology, 44(2), 133–142.
Shrestha BP (2002) Basic studies on quality estimation and sorting system for fruit vegetables. PhD Thesis, United Graduate School of Kagoshima University for Agricultural
Sciences, Kagoshima, Japan.
Shrestha BP, Nagata M, Cao Q (2001) Study on image processing for quality estimation of
strawberries (part 1): detection of bruises on fruit by color image processing. Journal
of Science and High Technology in Agriculture, 13(2), 115–122.
Vargas AM, Kim MS, Tao Y, Lefcourt A, Chen Y (2004) Safety inspection of cantaloupes
and strawberries using multispectral fluorescence imaging techniques. ASAE Paper
No. 043056, ASAE, St Joseph, MI, USA.
Classification and Quality Evaluation of Table Olives
Ricardo Díaz
Instrumentation and Automation Department, Food Technological
Institute AINIA, Paterna (Valencia), 46980, Spain
1 Introduction
The table olive is a food product that is usually consumed as an aperitif or in salads,
and has a wide market in Mediterranean countries. Olives are the fruit of olive trees,
mainly of the variety Olea europaea pomiformis. After the ripening process on the tree
during the summer, the olives are picked, and the product is fermented in big tanks to
give the final product. Therefore, its presentation is essential, and it is also essential
to have a homogeneous appearance in terms of size, color, and, above all, the absence
of defects. Traditionally, the people who are in charge of separating the olives into
categories are expert workers, placed on both sides of the processing line where they
separate the olives that present defects.
Monochromatic cameras were used in the first attempts to automate the grading
process of fruit and vegetables (Nimesh and Delwiche, 1994). Davenel et al. (1988)
designed a system for the automatic detection of surface defects on Golden Delicious
apples, which were graded into four grades with a capability of five fruit per second.
The application of color cameras in olive image analysis allowed the characterization
of the most frequent defects in olives and their colorimetric properties (Díaz et al.,
2000). In small fruit such as olives, classification is based on visual characteristics
like the color of the olive skin and the presence of defects. Okamura et al. (1993) used
machine vision to facilitate the mechanization of the selection process of the olives by
means of charge-coupled device (CCD) color cameras and frame-grabber cards.
The main objective in the application of machine vision to the inspection of olives
and other similar small fruit such as cherries is the identification of fruit with defects
for separation (Delwiche et al., 1993). This process is commonly called sorting, where
the fruit are divided into different classes according to their quality.
Computer Vision Technology for Food Quality Evaluation
ISBN: 978-0-12-373642-0
Copyright © 2008, Elsevier Inc.
All rights reserved
2 Classification of table olives
2.1 Production process
The olive varieties commonly used for processing as table olives include Gorda,
or “the fat one” (Olea europaea Regalis, Clemente); Manzanilla (Olea europaea
pomiformis); and Whiteleafed, or “Hojiblanca” (Olea europaea arolensis), among others
(Garrido-Fernandez et al., 1999). However, Manzanilla is the preferred variety due to
its excellent taste, shape, and color.
Green olives are picked before the ripening cycle, when their color has not changed
and the size is adequate. At this moment the color of the olives is yellowish-green.
The collected olives are treated in lye and finally immersed in brine, where lactic
fermentation takes place (Sanchez et al., 1990). Manzanilla olives have an average
size of 200–280 fruit/kg, and a very good flesh to stone ratio (6 : 1). The crop
is harvested manually in September to avoid any damage to the fruit
(Colmagro et al., 2001). A basic flowchart of the process is shown in Figure 12.1, and
involves three classifications:
• First classification. Here, the ripe fruit are separated from the green ones. This
classification is very important because the ripe fruit, after fermentation, become
too soft. This process needs a very high analysis speed in order to sort all the
collected olives in a short time period. Consequently, an automated system for
sorting based on differences in the average color of the olive is adequate.
• Second classification. After fermentation, the olives have to be classified into four
categories. Each individual fruit is classified according to the size and number
of defects. This is the main problem for the producer companies.
• Third classification. Before canning them, producers carefully select the olives
in order to fill the cans with homogeneous products. The olives selected have to
be of the same hue.
Figure 12.1 Flowchart of olive processing.
The most complex stage is the second classification. A system that is able to perform
this process will also fulfill the requirements for the first and third classifications.
2.2 Classification by quality
When the fermentation process, which is carried out in large tanks, has been completed,
the table olives are classified according to quality (Karaoulanis and Bamnidou, 1995).
The brine olives are taken from the hoppers to the processing line where, after being separated according to their size, they are classified. Manufacturers usually classify olives
using two classes: suitable and unsuitable for consumption. Nevertheless, on some
occasions the producers use three or even four classes. The fourth class of olive is unsuitable for consumption, since it consists of dark olives or those that have large defects
on their surface. Third-class olives have clearly noticeable defects. Although these are
suitable for consumption, they are usually sent to different markets from table olives.
The main difference between the first and second classes is the total absence of defects
in the former; however, these two classes can be combined depending on the season
and the target market. Olives of the four different classes are shown in Figure 12.2.
Figure 12.2 Olives of the (a) first, (b) second, (c) third and (d) fourth classes.
When classification is performed by human operators only two categories are used,
because the operators are located on both sides of the processing line and they remove
only olives with defects while the others continue through the system. However, when
an automatic system is used it is possible to sort olives into more classes (three or four).
2.2.1 Types of defect
The most common defects observed in Manzanilla olives are as follows.
1. Skin damage:
• wired (linear blemish) – shallow linear fissures that appear on the surface of
the fruit but can be revealed through the skin
• scratched – superficial defects that penetrate the flesh, acquiring brown tonalities
• wrinkled – olives with low water content (typically collected from the ground)
• peeled – zones of the epidermis of the olive with a round form that presents
brown tonalities
• beaten – superficial lesions that do not penetrate the flesh, and occasionally
have brown tonalities
• hail-affected – circular blemishes produced in olives after a hailstorm.
2. Flesh damage:
• torn – olives that have been broken during mechanical transport in the
process line
• insect-pitting – dark holes in the olives caused by insects.
3. Textural defects:
• soft olives – olives with a soft texture but no significant colorimetric variations.
4. Color defects:
• dark color – olives that differ clearly from the common green color.
Depending on the nature and origin of the above defects, the olives are sorted into
two, three or four different classes as described previously. The frequency of appearance
of defects for a selected caliber (280/230) in a sample of 50 olives of each category
(first, second, third, and fourth) is shown in Table 12.1.
2.2.2 Characterization of defects
Díaz et al. (2004) measured the color of defects using a Miniscan Hunter Lab spectrocolorimeter MS/S-4000S, and the data obtained were analyzed using the Hunter
Lab Universal Software 3.1 (Hunter Associates Laboratory, Reston, USA). The
reflected light was measured in CIE L*a*b* color coordinates: L* (luminosity),
a* (red/green), and b* (yellow/blue). The study was carried out using D65 illumination
with a 10° observer. The average color of the defects was measured by cutting up
olive pieces and filling the 80-mm circular box used for the reflection measurement.
In some cases there is a noticeable relationship between defect and class – for
instance, the abnormal color that is related to the fourth class. This class consists
of over-fermented olives with a dark color throughout the whole skin, and green olives
with large stains of dark coloration. As can be seen in Table 12.2, the abnormal color is
related to low values of the L*, a*, and b* coordinates. However, torn and broken olives
are also assigned to the fourth class, but in this case the values of the coordinates L*
and b* are the highest, being very close to the values of the skin of first-class olives.
This is because of the absence of the skin – the inner part of the fruit, which is of a
lighter color, appears in the image.
Table 12.1 Frequency of appearance of defect in a batch of 200 olives.
[Table body not recoverable. Surviving row labels: Abnormal color; Without defects.]
Table 12.2 CIELab coordinates of the analyzed defects in olives.
[Only two row labels survive: Wired (linear blemish); Abnormal color. Rows are listed in their original order. The last two column headings are inferred from the values, which match the hue angle and chroma computed from a* and b*.]
Row   L*             a*            b*             h (rad)       C*
1     45.22 ± 5.17   3.40 ± 1.13   19.58 ± 1.97   1.40 ± 0.05   19.90 ± 2.03
2     41.51 ± 2.94   4.12 ± 1.10   17.99 ± 3.84   1.34 ± 0.07   18.50 ± 3.78
3     43.38 ± 2.32   4.28 ± 0.98   18.00 ± 3.64   1.33 ± 0.07   18.54 ± 3.57
4     39.07 ± 3.78   0.86 ± 1.31   10.61 ± 4.78   1.43 ± 0.14   10.73 ± 4.78
5     45.04 ± 1.63   3.45 ± 0.66   22.69 ± 2.25   1.42 ± 0.02   22.96 ± 2.28
6     43.70 ± 4.06   4.04 ± 1.15   19.42 ± 3.78   1.37 ± 0.05   19.86 ± 3.83
7     43.18 ± 3.12   3.42 ± 0.98   18.45 ± 4.73   1.38 ± 0.06   18.80 ± 4.73
8     49.49 ± 4.09   3.31 ± 1.12   24.30 ± 2.13   1.44 ± 0.05   24.55 ± 2.13
With colorimetric analysis, peeled olives, like soft olives, cannot be detected because
they have the same appearance as good olives. However, wired olives
can be detected by a small but measurable colorimetric variation close to the linear defect
on the surface.
The most common defects in the second and third classes are beaten, scratched, and
hail-affected. Based on the defect size, the olive can be assigned to either of these two
classes.
2.3 Industrial needs of table-olive producers
For many years, the sorting of table olives has been performed by human experts, who
remove the poorest-quality olives by inspection over a selection table. Olive producers now need an industrial system that is able to operate on-line
with good reproducibility.
The olive-harvesting season takes place between October and November, meaning
that a lot of fresh samples are collected, and require analysis, over a short period of
time. After the fermentation process, the olives are analyzed in batches. Each batch
consists of all the olives located inside the tank where the fermentation process took
place. The size of this batch depends on the producer, but a common size is around
1 tonne. An automatic system has to be capable of processing at least two batches per
hour, and therefore, a processing capability of 2 tonnes per hour is the minimum speed
required by the industrial processor.
The number of classes that the olives are separated into depends on the producer and
the target market. However, a minimum of two classes is required, and a maximum
of four classes is used on an industrial scale. In the sorting process, a classification
mistake that allocates an olive to an adjacent class can be tolerated, but an olive
must never be assigned to a class more than one above or below its correct level.
3 Application of machine vision
3.1 Industrial system for olive sorting
An industrial system for olive sorting consists of a transport system, a lighting
chamber, an image-processing system, and a rejection system, to send the classified
olives to different lines. As olives have a spherical shape, the transport unit must allow
the linear movement of the fruit with rotation in order for the camera to capture the
whole surface.
In sorting systems it is necessary to singularize the fruit so that they can be separated
after image processing (Gunasekaran and Ding, 1994). Two main kinds of systems are
used: conveyor belts with a mesh where the fruit are allocated, and cylindrical rollers
with holes for the fruit to join a chain. A system based on a belt with rollers is shown
in Figure 12.3. The rollers rotate about their own axes in such a way that three images of each
row of olives are captured, to allow analysis of virtually the whole surface area.
To illuminate the olives and avoid shadows in the images a powerful system with
homogeneous light is needed, such as a system based on high-frequency tubes with
low power consumption. The selection and disposition of the illumination system is
essential (Paulsen and McClure, 1986) because the olives are in brine, and its glint can mask
colorimetric information regarding defects. In order to minimize this effect there are
different techniques – using polarizing filters to remove the shine from the wet olives,
and using filtering screens, made of translucent polymeric film, which scatter the light onto the samples at different angles.
Figure 12.3 The camera (1) takes images that are acquired and processed by the machine vision system
(2). This system sends the parameters of each olive to the control system (3), which decides which class the
olive belongs to and when it should be ejected.
3.2 The image analysis system
The equipment used to capture the images consists of a PC, a Sony XC-003 color
3-CCD camera, and a Matrox Meteor RGB frame grabber (Dorval, Quebec, Canada).
A photocell synchronizes image capture by the camera with the movement of the
rollers. Each image captured has a resolution of 768 × 576 pixels, containing 6 rows
of 11 olives (66 olives per image).
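As a rough check on what these figures imply, the image layout above can be combined with the minimum throughput of 2 tonnes per hour given in Section 2.3. The sketch below is illustrative arithmetic only. It assumes that the caliber 280/230 means 230 to 280 olives per kilogram (about 255 on average), which the text does not state, and that the olives sit on a regular 6 × 11 grid in the image.

```python
# Illustrative arithmetic only; not part of the system described in the chapter.
# Assumption: caliber 280/230 is read as 230-280 olives per kilogram (mean 255).
OLIVES_PER_KG = 255.0
MIN_THROUGHPUT_KG_PER_H = 2000.0          # 2 tonnes per hour, from Section 2.3
IMAGE_WIDTH_PX, IMAGE_HEIGHT_PX = 768, 576
ROWS_PER_IMAGE, OLIVES_PER_ROW = 6, 11    # 66 olives per image


def olives_per_second(kg_per_hour, olives_per_kg):
    """Number of olives the line must classify each second."""
    return kg_per_hour * olives_per_kg / 3600.0


def cell_size_px(width, height, rows, cols):
    """Width and height, in pixels, of the image cell available to one olive."""
    return width / cols, height / rows


rate = olives_per_second(MIN_THROUGHPUT_KG_PER_H, OLIVES_PER_KG)
cell_w, cell_h = cell_size_px(IMAGE_WIDTH_PX, IMAGE_HEIGHT_PX,
                              ROWS_PER_IMAGE, OLIVES_PER_ROW)
print(f"{rate:.1f} olives per second")
print(f"about {cell_w:.1f} x {cell_h:.1f} pixels per olive cell")
```

Under these assumptions the line must handle roughly 140 olives per second, and each olive has a cell of about 70 × 96 pixels, which limits how small a defect can be resolved.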
3.3 Image processing
The processing of an image starts with the transfer of the image to dedicated buffers in
the RAM of the PC, where the image-analysis process is applied. After that,
segmentation is performed to determine whether each pixel belongs to background,
skin, or defect. This process requires previous training to generate a Bayesian classification model. Pre-processing is proposed to filter the segmented image in order to
eliminate noise at the edge-detection stage. The following step is shape recognition,
where an algorithm determines the position of the olive in the image. The area and
the average color of each type of skin and defect are calculated. A flowchart of the
image-analysis process is shown in Figure 12.4.
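The pixel-wise segmentation step can be sketched as follows. This is a minimal illustration and not the implementation of Díaz et al.: it assumes one Gaussian model per class in RGB space, three classes (background, skin, defect), and priors taken from the training-set proportions. The function names and the small regularization term are hypothetical choices.

```python
import numpy as np

CLASSES = ("background", "skin", "defect")


def fit_gaussians(samples):
    """Fit one Gaussian per class.

    samples maps each class name to an (n, 3) array of RGB training pixels.
    """
    total = sum(len(px) for px in samples.values())
    model = {}
    for name, px in samples.items():
        px = np.asarray(px, dtype=float)
        mean = px.mean(axis=0)
        # Small ridge term (an assumption) keeps the covariance invertible.
        cov = np.cov(px, rowvar=False) + 1e-6 * np.eye(3)
        log_det = np.linalg.slogdet(cov)[1]
        log_prior = np.log(len(px) / total)
        model[name] = (mean, np.linalg.inv(cov), log_det, log_prior)
    return model


def segment(image, model):
    """Label each pixel of an (H, W, 3) image with an index into CLASSES."""
    flat = image.reshape(-1, 3).astype(float)
    scores = []
    for name in CLASSES:
        mean, inv_cov, log_det, log_prior = model[name]
        diff = flat - mean
        # Squared Mahalanobis distance of every pixel to the class mean.
        maha = np.einsum("ij,jk,ik->i", diff, inv_cov, diff)
        # Log posterior up to a constant shared by all classes.
        scores.append(log_prior - 0.5 * (maha + log_det))
    labels = np.argmax(np.stack(scores, axis=1), axis=1)
    return labels.reshape(image.shape[:2])
```

Counting the pixels of each label inside the region of one olive then gives area parameters of the kind described below.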
The parameters obtained from each olive in each image are the numbers of pixels
of lighter skin (Skin 1), darker skin or olive profile (Skin 2), light defect (Stain 1),
dark defect such as a bite (Stain 2), and unusual dark color (Stain 3). From 3 images
of each olive, a total of 15 parameters can therefore be obtained. Figure 12.5 is a view
of images following processing.
Figure 12.4 Flowchart of olive-sorting process.
Figure 12.5 After the segmentation process.
To build a model, a set of olives pre-classified by human assessment of fruit quality is needed. The larger the set is, the more robust the model
obtained will be. In this case, a model with 400 pre-classified olives was used (Picus
and Peleg, 2000). After the parameters have been extracted, the influence of each
parameter on each olive class can be analyzed in more detail. The average, maximum,
and minimum values of each parameter in the four classes are represented in Figure 12.6
in a box–whisker diagram. It can be observed that the Skin 1 parameter has a strong influence on the first and second classes, but less influence on the fourth class.
The influence is proportional to the quality of the olives. Meanwhile, the Skin 2 parameter
has an important influence on the third class, but less influence on the second
and fourth classes. This makes sense, because this parameter tries to collect the colorimetric information of the dark skin in the olive profile, which coincides with the color
coordinates of some typical defects of the second and third types. The parameters
Stain 1, Stain 2, and Stain 3 all have an important influence on the fourth class, whereas
Stain 1 has more influence on the second class and Stain 2 on the third class.
3.4 Classification algorithms
In order to perform the classification, it is necessary for a system to be able to learn
from olives that have been pre-classified by professional people; the knowledge of
experts must be used to train the system to reproduce the classification process.
One of the most widely used techniques for image classification is discriminant analysis. Tao et al. (1995) used linear discriminant analysis in a grading system to
classify fruit and vegetables into different classes, while Steinmetz et al. (1999) used
non-linear discriminant analysis for the classification of peaches.
Yang (1993) used three-layered (9-6-3) neural networks for the classification of apples
with a classification accuracy of 96.6 percent. Nagata and Qixin (1998) developed a
grading system for fruit and vegetables using neural network technologies, obtaining a high level of accuracy for strawberries and green peppers (94–98 percent and
89 percent, respectively). Díaz et al. (2004) applied three different learning and classification algorithms – classical Bayesian discriminant analysis, partial least
squares multivariate discriminant analysis, and neural networks with a hidden layer –
to extract parameters, and compare and analyze the grading results in olives. The best
results for olive image processing were obtained by applying these three classification techniques. From the population available, half was used to train the three
types of algorithms and the other half to validate the results.
Figure 12.6 Box–whisker diagrams of the five parameters (Skin 1, Skin 2, Stain 1, Stain 2, and Stain 3) used for the sorting process for the four classes.
Bayesian discriminant analysis (Chtioui et al., 1996) is one of the classical techniques
most widely used in the classification of vegetable products by means of machine
vision. The Mahalanobis distance version to the 1-k nearest neighbors is usually used.
This algorithm is a simplification of the Bayesian classifier, due to the assumption
that the covariance matrices of each class are equal. The function that calculates the
distance from a new vector to the centroid of every class is:
\[ d_M(X, i) = (X - m_i)^T C_i^{-1} (X - m_i) \]
where $i$ is the class, $X$ is the array of characteristics of each olive, $X = [X_1 \ldots X_N]$,
$m_i$ is the mean array of the characteristics of class $i$, and $C_i$ is the covariance matrix
of class $i$.
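A minimum-distance classifier built on this function can be sketched as follows. This is an illustration under stated assumptions, not the code used in the study: the function names are hypothetical, class labels are assumed to be integers, and a pseudo-inverse is used so that a singular covariance estimate does not stop the calculation.

```python
import numpy as np


def fit_class_statistics(X, y):
    """Return {class: (mean vector m_i, inverse covariance of class i)}."""
    stats = {}
    for label in np.unique(y):
        Xi = X[y == label]
        mean = Xi.mean(axis=0)
        cov = np.cov(Xi, rowvar=False)
        # Pseudo-inverse is a choice made here; it tolerates singular matrices.
        stats[int(label)] = (mean, np.linalg.pinv(cov))
    return stats


def mahalanobis_sq(x, mean, inv_cov):
    """Squared Mahalanobis distance (x - m)^T C^-1 (x - m)."""
    diff = x - mean
    return float(diff @ inv_cov @ diff)


def classify(x, stats):
    """Assign x to the class whose centroid is nearest in Mahalanobis terms."""
    return min(stats, key=lambda label: mahalanobis_sq(x, *stats[label]))
```

For example, `stats = fit_class_statistics(X_train, y_train)` followed by `classify(x_new, stats)` returns the label of the nearest class centroid, where `X_train`, `y_train`, and `x_new` are placeholder names for the training features, the expert labels, and a new feature vector.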
In order to obtain the average vectors of each class and the covariance matrix, it is
necessary to carry out previous training of the system to characterize the recognizer.
Normalization is performed by adding the results of the three views of each parameter
and dividing by the total area.
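One possible reading of this normalization is sketched below. It assumes that the "total area" is the sum of all five pixel counts over the three views, which the text does not define precisely; the function name is hypothetical.

```python
import numpy as np

PARAMETERS = ("Skin 1", "Skin 2", "Stain 1", "Stain 2", "Stain 3")


def normalize_views(counts):
    """Sum each parameter over the three views and divide by the total area.

    counts is a 3 x 5 array of pixel counts: one row per view, one column
    per parameter, in the order given by PARAMETERS.
    """
    counts = np.asarray(counts, dtype=float)
    if counts.shape != (3, len(PARAMETERS)):
        raise ValueError("expected a 3 x 5 array of pixel counts")
    per_parameter = counts.sum(axis=0)
    # Assumption: the total area is the sum of all labeled olive pixels.
    total_area = per_parameter.sum()
    if total_area == 0:
        raise ValueError("no olive pixels found in any view")
    return per_parameter / total_area
```

The result is a vector of five area fractions that sum to one, so olives of different sizes become comparable.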
Statistical projection methods, such as principal component analysis (PCA) and
partial least squares (PLS), compress the most important information from the original
data into fewer variables called principal components. These methods are referred to
as projection methods, because they project the data into a new principal component
system. The PLS prediction method has the advantage that, in comparison with other
classical methods, it does not require complex mathematical operations that involve a
huge computational load. This is because it does not require the operation of matrix
inversion. Moreover, PCA and PLS are able to handle data with a certain level of
noise, because these methods only extract the relevant information.
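The projection idea can be illustrated with PCA computed from a singular value decomposition. This is a sketch of PCA only, not of the PLS variant used in the study; the function name is hypothetical, and the variables are centered and scaled to unit variance before projection.

```python
import numpy as np


def pca(X, n_components):
    """Project centered, scaled data onto its first principal components.

    Returns the scores, the loadings (one row per component), and the
    fraction of the total variance explained by each retained component.
    """
    X = np.asarray(X, dtype=float)
    std = X.std(axis=0, ddof=1)
    std[std == 0] = 1.0                      # leave constant columns unscaled
    Z = (X - X.mean(axis=0)) / std
    # The SVD avoids forming or inverting a covariance matrix explicitly.
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    scores = U[:, :n_components] * s[:n_components]
    return scores, Vt[:n_components], explained[:n_components]
```

Summing the returned variance fractions shows how much of the data variability the retained components capture.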
The PLS discriminant technique (PLS-DA), which is a variant of PLS, has been
used because this method allows classification of the samples into specific categories
or classes, whether previously introduced or not.
The main advantage of an artificial neural network is its flexible structure. Although
complex, this method allows adaptation to the conditions of a classification problem
with certain accuracy. With a suitable structure of the neural network and an optimum
training process, an adapted classification system can be obtained to classify olives.
The artificial neural network used is a network of three layers – an input layer, a
hidden layer, and an output layer. The input layer tested with olives was made up of
15 neurons, one for each parameter. The hidden layer was also
made up of 15 neurons, connected to all the input neurons and output neurons. Finally,
there were 4 neurons in the output layer, one for each olive category.
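The 15-15-4 structure described above can be sketched as a forward pass. The logistic activation follows the description in Section 3.5; the random weights are placeholders for illustration only, and the function names are hypothetical.

```python
import numpy as np


def logistic(z):
    """Logistic activation, mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


def init_network(rng, n_in=15, n_hidden=15, n_out=4):
    """Create placeholder weights for a 15-15-4 network (untrained)."""
    return {
        "W1": rng.normal(0.0, 0.1, size=(n_in, n_hidden)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(0.0, 0.1, size=(n_hidden, n_out)),
        "b2": np.zeros(n_out),
    }


def forward(net, x):
    """Propagate one feature vector, or a batch of them, through the network."""
    hidden = logistic(x @ net["W1"] + net["b1"])
    return logistic(hidden @ net["W2"] + net["b2"])


def predict_class(net, x):
    """Return the class (1 to 4) whose output neuron is most active."""
    return int(np.argmax(forward(net, x))) + 1
```

With trained weights, `predict_class(net, x)` would return the category assigned to the 15-element feature vector `x`.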
The learning function used to train the neural network is resilient back-propagation
(Rprop). This function is an adaptive local process that performs supervised batch
learning in multilayer perceptrons (Riedmiller and Braun, 1993). The basic principle
of Rprop is to eliminate the harmful influence of the size of the partial derivative of the error
on the weight step. For this reason, only the sign of the derivative is taken into account
to indicate the direction of the weight update. In this way, the neural network is able to
adapt its weights in order to minimize the error function between the target outputs
and the network outputs in a fast and, at the same time, safe way.
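The sign-based update can be sketched as follows. This is a simplified variant without weight backtracking, written for illustration. The default initial step of 0.2 and maximum step of 10 are the values reported in Section 3.5; the increase and decrease factors (1.2 and 0.5) and the minimum step are commonly used defaults assumed here, and the weight-decay term is omitted.

```python
import numpy as np


def rprop_minimize(grad_fn, w0, n_steps=100, delta0=0.2, delta_max=10.0,
                   delta_min=1e-6, eta_plus=1.2, eta_minus=0.5):
    """Minimize a function with a sign-based Rprop update.

    grad_fn returns the gradient of the error at the given weights. Only the
    sign of each partial derivative sets the direction of the step; the step
    size of each weight adapts separately.
    """
    w = np.array(w0, dtype=float)
    delta = np.full_like(w, delta0)
    prev_grad = np.zeros_like(w)
    for _ in range(n_steps):
        grad = np.asarray(grad_fn(w), dtype=float)
        agreement = grad * prev_grad
        # Same sign as before: grow the step. Sign change: shrink it.
        delta = np.where(agreement > 0,
                         np.minimum(delta * eta_plus, delta_max), delta)
        delta = np.where(agreement < 0,
                         np.maximum(delta * eta_minus, delta_min), delta)
        # After a sign change, skip the update for that weight this round.
        grad = np.where(agreement < 0, 0.0, grad)
        w -= np.sign(grad) * delta
        prev_grad = grad
    return w
```

Because the magnitude of the gradient never enters the update, very steep or very flat regions of the error surface do not produce steps that are too large or too small.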
3.5 Classification results in a real-time sorting system
In the research carried out by Díaz et al. (2004), validation samples were separated into
four different batches corresponding to the four available categories. In this way, it is
possible to analyze the failures in each batch and to see which class they are assigned
to, because although it is acceptable for a second-class olive to be classified as first
class, a fourth-class olive must not be allocated to the first class.
In Tables 12.3–12.5, the four batches of olives are indicated in the rows that correspond to the four categories, previously classified by experts. The columns indicate
the number of olives that each technique assigned to each category. Finally, the last
column shows the success rate that each technique achieved in comparison to the
previous classification performed by the expert.
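The way such a table is read can be sketched as follows: the success rate of each row is the diagonal count divided by the row total, and errors are separated into those that fall in an adjacent class and those that fall further away. The counts in the example are hypothetical and are not the values of Tables 12.3–12.5; the function names are also illustrative.

```python
import numpy as np


def success_rates(confusion):
    """Percentage of each expert class (row) assigned to the same class."""
    confusion = np.asarray(confusion, dtype=float)
    return 100.0 * np.diag(confusion) / confusion.sum(axis=1)


def count_errors_by_distance(confusion):
    """Count errors into an adjacent class and errors two or more classes away."""
    confusion = np.asarray(confusion)
    n = confusion.shape[0]
    rows, cols = np.indices((n, n))
    distance = np.abs(rows - cols)
    adjacent = int(confusion[distance == 1].sum())
    far = int(confusion[distance >= 2].sum())
    return adjacent, far


# Hypothetical counts for illustration only (rows: expert class 1 to 4,
# columns: class assigned by the algorithm).
example = np.array([
    [45,  5,  0,  0],
    [10, 38,  2,  0],
    [ 0,  3, 46,  1],
    [ 0,  1,  2, 47],
])
rates = success_rates(example)
adjacent_errors, far_errors = count_errors_by_distance(example)
```

Errors counted as "far" are the ones that the industrial requirement in Section 2.3 does not tolerate.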
The results of the application of the Mahalanobis algorithm are shown in Table 12.3.
It worked quite well in detecting olives of the poorest category, but failed to work for
the first and second classes of olives.
Following PLS discriminant analysis of the training samples, the three principal
components obtained contained 80 percent of the data variability. This technique allows
representation of the scores to analyze the relationship among the samples (Figure 12.7).
All the parameters extracted from the olive images were used as variables of the model,
and were previously centered and scaled. The results of the classification using the
PLS discriminant algorithm are shown in Table 12.4.
Before training the neural network, it is necessary to select the activation functions of
the neurons and the characteristic parameters of the algorithm. The activation function
selected is the typical “logistic” function. The parameters of the learning function are
the initial update-value (Δ0), the limit for the maximum step (Δmax), and a parameter
related to the weight-decay to improve the error output (α). The following values
were given to these parameters: Δ0 = 0.2, Δmax = 10, and α = 4. The neural network
application results are shown in Table 12.5.
A detailed analysis of these results shows that the principal deviations
were olives accidentally classified into adjacent classes – first-class olives allocated
to the second class, or vice versa. However, this discrimination is not usually made in
the industry, which habitually considers these categories as a single class, because
olives from the second class only have defects of a very small size.
Table 12.3 Classification results with the Mahalanobis algorithm.
[Table body not recoverable; the last column gives % Success.]
Figure 12.7 Plot of the scores of the first and second principal components (legend: Class 1 to Class 4; axis label t[2]). Scores of the first and second
classes are overlapped, while the third and fourth classes are more separated.
Table 12.4 Classification results with PLS discriminant.
[Table body not recoverable; the last column gives % Success.]
Table 12.5 Classification results with neural network with a hidden layer.
[Table body not recoverable; the last column gives % Success.]
The Mahalanobis distance algorithm achieved good classification percentages for
the first, third, and fourth categories. For the second category, a quite reasonable result
was achieved, although it can be seen that 50 percent of second-class olives were
wrongly assigned to the first class. The most relevant failure was the misclassification of
6 percent of third-class olives to the first class.
PLS discriminant analysis improved the classification results of the first and third
classes, although some samples from the fourth class were misclassified to the third.
Furthermore, overlapping between the first two classes was repeated, especially when
classifying second-class samples; 60 percent of the second-class olives were assigned
to the first class.
Finally, neural networks improved the results obtained with previous techniques.
An explanation of these improved results against Mahalanobis distance could be that
the covariance matrices were not equal. The olives of the first and third classes were
classified perfectly, while those of the second and fourth classes had a failure rate of
8.9 percent and 6.7 percent, respectively. With this algorithm, the first and second
classes appear clearly separated, since the overlap presented in the other algorithms
was eliminated. The only notable failure was a second-class olive that was assigned
to the fourth class.
The results can be improved by increasing the image resolution or decreasing the
number of olives per image. However, the former will cause a rise in the final price of
the system for possible users, and the latter will involve a reduction in the production
rate.
4 Industrial applications
The Selector 4000 from Multiscan Technologies is an olive-sorting machine with a
production rate of 4 tonnes per hour and a yield greater than 90 percent. The machine
is fitted with a high-speed rejection system based on electronically controlled air valves, and is
built with stainless steel T-316 and T-304. The system is based on a color camera with a
stroboscopic lighting system. The image analysis, which covers color recognition and even
the shape and size of defects, is able to separate the olives into three different classes.
5 Conclusions
Machine vision is a very useful tool for sorting small fruit such as olives, cherries, apricots, and plums. In a mechanical transportation system with the ability to rotate the fruit,
the entire surface is analyzed. With image analysis, colorimetric differences among the
classes can be identified. By extracting parameters with information regarding the main
characteristics of the fruit, it is possible to perform good separation using a learning
algorithm, which is able to learn from previous pre-classified batches. This process
can be performed in real time at the production rates that olive processors need (up to
4 metric tonnes per hour).
However, sometimes surface evaluation alone is not enough. There are no colorimetric differences between soft olives (fermented olives with a soft texture and a
poor quality level) and first-class olives. Therefore, additional analysis is needed with
technology able to obtain information regarding texture, using firmness sensors and
spectroscopic analysis. The problem of detection of pits has not yet been satisfactorily solved, although the application of image analysis with X-rays and NMR is a
possible solution.
Acknowledgments
The application of machine vision for olives and other small fruits was promoted by the
Ainia Technological Centre, the main objectives of which are to promote technological
research and development, increase production quality, improve competitiveness, and
encourage industries to modernize and diversify.
The samples were prepared by La Española, within a European project funded by the
European Commission, “New Imaging Processing for the Characterisation of Olives
and other Fruits (NIPCO) FAIR-CT97-9505”. In this project, the IVIA collaborated in
the image analysis system design, in the persons of Enrique Moltó and José Blasco.
References
Chtioui Y, Bertrand D, Dattee Y, Devaux MF (1996) Identification of seeds by color imaging:
comparison of discriminant analysis and artificial neural network. Journal of the
Science of Food and Agriculture, 71 (4), 433–441.
Colmagro S, Collins G, Sedgley M (2001) Processing technology of the table olive.
Horticultural Reviews, 25, 235–260.
Davenel A, Guizard CH, Labarre T, Sevila FJ (1988) Automatic detection of surface defects
on fruit by using a vision system. Journal of Agricultural Engineering Research, 41, 1–9.
Delwiche MJ, Tang S, Thompson JF (1993) A high speed sorting for dried prunes.
Transactions of the ASAE, 36 (1), 195–200.
Department of Agriculture, Food Safety & Quality Service (1977) United States Standards for Grades of Green Olives. United States of America, UNITED-STATESSTANDARD, 727–735.
Díaz R, Faus G, Blasco M, Blasco J, Moltó E (2000) The application of a fast algorithm
for the classification of olives by machine vision. Food Research International, 33,
Díaz R, Blasco M, Blasco J, Moltó E (2004) Comparison of three algorithms in the classification of table olives by means of computer vision. Journal of Food Engineering,
61, 101–107.
Garrido-Fernandez A, Romero-Barranco C (1999) Quality of table olives. Grasas y Aceites,
50 (3), 225–230.
Gunasekaran S, Ding K (1994) Using computer vision for food quality evaluation. Food
Technology, 48 (6), 151–154.
Karaoulanis GD, Bamnidou A (1995) Color changes in different processing conditions of
green olives. Grasas y Aceites, 46 (3), 153–159.
Nagata M, Qixin C (1998) Study on grade judgment of fruit vegetables using machine
vision. Japan Agricultural Research Quarterly, 32 (4), 257–265.
Nimesh S, Delwiche MJ (1994) Machine vision methods for defect sorting stonefruit.
Transactions of the ASAE, 37 (6), 1989–1997.
Okamura NK, Delwiche MJ, Thompson JF (1993) Raisin grading by machine vision.
Transactions of the ASAE, 36 (2), 485–492.
Paulsen MR, McClure WF (1986) Illumination for computer vision systems. Transactions
of the ASAE, 29 (5), 1398–1404.
Picus M, Peleg K (2000) Adaptive classification – a case study on sorting dates. Journal
of Agricultural Engineering Research, 76 (4), 409–418.
Riedmiller M, Braun H (1993) A direct adaptive method for faster backpropagation learning: the RPROP algorithm. In Proceedings of the IEEE International Conference on
Neural Networks, San Francisco, pp. 586–591.
Sanchez AH, Rejano L, Duran MC, de Castro A, Montaño A (1990) Elaboración de aceitunas
verdes con tratamiento alcalino a temperatura controlada. Grasas y Aceites, 41 (3),
Steinmetz V, Roger JM, Moltó E, Blasco J (1999) On-line fusion of color camera and
spectrophotometer for sugar content prediction of apples. Journal of Agricultural
Engineering Research, 73 (2), 207–216.
Tao Y, Heinemann PH, Varghese Z, Morrow CT, Sommer HJ (1995) Machine vision for
color inspection of potatoes and apples. Transactions of the ASAE, 38 (5), 1555–1561.
Yang Q (1993) Classification of apple surface features using machine vision and neural
networks. Computers and Electronics in Agriculture, 9 (1), 1–12.
Grading of Potatoes
Franco Pedreschi,1 Domingo Mery2 and Thierry Marique3
1 Universidad de Santiago de Chile, Departamento de Ciencia y
Tecnología de Alimentos, Av. Ecuador 3769,
Santiago de Chile, Chile
2 Pontificia Universidad Católica de Chile, Departamento de
Ciencia de la Computación, Av. Vicuña Mackenna 4860 (143),
Santiago de Chile, Chile
3 Centre pour l’Agronomie et l’Agro-Industrie de la Province
de Hainaut (CARAH), 7800 Ath, Belgium
1 Introduction
Potatoes (Solanum tuberosum) form one of the major agricultural crops in the world, and
are consumed daily by millions of people from diverse cultural backgrounds. Potatoes
are grown in approximately 80 percent of all countries, and worldwide production
stands in excess of 300 million tonnes per year – a figure exceeded only by wheat,
maize, and rice. Large variation in the suitability of potatoes for the processing of
chips and French fries leads to particular quality demands compared to ware potatoes.
In 2001 about 50 percent of the US potato crop was processed, producing 11 300 million
kg of processed potatoes, of which 21.6 percent was made into potato chips.
Grading and sorting of potatoes ensures that derived products meet the defined grade
requirements for sellers, and the expected quality for buyers (Heinemann et al., 1996).
Grading is particularly important for potatoes because the size, shape, color, and defects
depend greatly on environmental conditions and handling, and is performed primarily
by trained human inspectors who assess the potatoes by “seeing” or “feeling” a particular quality attribute. However, there are some disadvantages to using human inspectors,
including inconsistency, short supply of labor, and the expense of the large amounts
of time required due to the huge volume of production. Product experts characterize potato defects and diseases based on color and shape features, and thus computer
vision may improve inspection results and be able to take over the visually intensive
inspection work from human inspectors. Automation is desirable because it can ensure
consistency in product quality and can handle large volumes. A completely automated
inspection station requires the incorporation of machine vision and automation into a
system consisting of the appropriate hardware and software for both product handling
and grading. Factors such as size, shape, greening, cracks, scab, etc. determine the
final grade of a potato (Noordam et al., 2000).
Computer Vision Technology for Food Quality Evaluation
ISBN: 978-0-12-373642-0
Copyright © 2008, Elsevier Inc.
All rights reserved
Every batch of potato tubers must be tested for quality before sale, and visual inspection is of great importance. This is true not only for stocks intended for industrial use,
but also and especially for those intended for domestic use, since potential consumers
attach predominant importance to external appearance. There exists a real need for
standardization of analysis, since quality evaluation determines the acceptance or rejection of submitted potato batches and, of course, the subsequent payment of producers
(Marique et al., 2003, 2005). The objective of this chapter is to present briefly how
potatoes are graded in industry, and describe the principal potato features and surface
defects that determine the strategies that must be applied for accurate grading.
2 Surface defects
There are problems associated with the classic visual evaluation procedures; in particular, results may vary with the assessor (Marique et al., 2003). Evaluation also depends
on the particular potato variety tested, as flesh color ranges from creamy white to buttery yellow. Moreover, defects can show broad variations of shape, aspect, and color
(white, gray, bluish, brown, black, etc.). In practice, several very different criteria are
evaluated, either after harvest or upon delivery (Marique et al., 2005):
• Size distribution as well as the percentage of aberrant shapes (e.g. cracked, forked,
“doll”, “diabolo”, etc.). This can lead to huge losses at peeling or processing. Very
high proportions of such shapes occur as a consequence of bad meteorological
conditions during the growth of the tubers, such as alternating periods of dryness
and rain.
• Surface roughness resulting from bacterial or fungal attacks (e.g. common scab –
Streptomyces sp.; silver scurf – Helminthosporium sp.; and Rhizoctonia solani).
This can lead to the very unpleasant appearance of tubers cultivated in “heavy”
soil (clay).
• Tuber germination. This is generally as a consequence of senescence or of bad
storage at too low a temperature.
• Green spots or regions following exposure to light. Sometimes superficial green
spots can penetrate deeper under the skin of the tuber, and thus affect peeled potatoes as well. Assessment of such defects thus necessitates preliminary peeling
of the sample under controlled conditions.
• Bruises, defined as colored marks that remain after two consecutive passes with
a kitchen vegetable peeler. Lifting and storage of tubers are responsible for
up to 40–50 percent of bruises on domestic potatoes, and up to 100 percent
on loose industrial potatoes. Bruised potatoes lose weight due to an increase in
transpiration, they lose starch because of increased respiration, and they are more
prone to pathogen invasion (Rousselle et al., 1996).
• Tuber diseases due to various viral, bacterial, and fungal agents. The most common are certainly soft rot (Erwinia), late blight (Phytophthora infestans), dry rot
(Fusarium) and gangrene (Phoma). These result in deep invasion and necrosis
of tissue, leading eventually to complete destruction of the tubers. The external
appearance can be extremely varied, in shape, color, or aspect, so microscopic
inspection may in some cases be required for correct identification.
3 Potato classification
Potatoes can be classified into five grades based on the USDA Standards as shown in
Table 13.1 (Heinemann et al., 1996). Some factors that contribute to the grade, such
as size, shape, and external defects, can be assessed by machine vision. The 1991
USDA Standards for Grades of Potatoes define three classes of shape: “well shaped”
(a potato that has the normal shape for the variety); “fairly well shaped” (the potato
is not materially pointed, dumbbell-shaped, or otherwise deformed); and “seriously
misshapen” (the potato is very deformed). These shape requirements are somewhat
abstract and difficult to comprehend, since there are no standard shapes available for
comparison. This is due to the unique shapes assumed by potatoes. These classes need
to be quantified for automated grading – i.e. each shape classification should have a
number or number range associated with it.
It is very important for the potato industry that it supplies tubers of uniform quality
(Thybo et al., 2004). Consequently, the industry needs rapid on-line and at-line methods
in order to:
sort the raw material into the given physical property categories prior to
predict the optimal use of the raw material
adjust processing to obtain the optimal quality of the processed product.
Nuclear magnetic resonance imaging (NMR imaging) is a modern technique which
gives valuable information about not only raw-potato water distribution, but also the
anatomic structures within the tubers. The structures are of importance for the perceived
mechanical properties of cooked potatoes. NMR imaging has been shown to have the
potential to predict potato quality attributes, and therefore may be an attractive method
to implement as in-/at-line grading during production (Thybo et al., 2004).
Table 13.1 The United States Department of Agriculture (USDA)
requirements for size and shape of potatoes (reprinted from Heinemann
et al., 1996©, courtesy of Springer Science and Business Media).
USDA Grade
Minimum diameter (cm)
US Extra No. 1
US No. 1
US Commercial
US Extra No. 2
Fairly well shaped
Fairly well shaped
Fairly well shaped
Not seriously misshapen
Seriously misshapen
The sensory mechanical quality is of the uppermost importance in cooked potatoes, as this is one
of the most critical quality attributes in consumer evaluation. Therefore, development of on-line/at-line sensors enables grading and sorting of potatoes in relation
to their final qualities before marketing (Thybo et al., 2004). These authors used
non-destructive and non-invasive NMR imaging to describe the sensory mechanical quality of cooked potatoes. This was done by studying the correlation between
advanced image-analysis features determined in different regions of raw potatoes, and
sensory mechanical attributes of cooked potatoes. Moreover, correlations between specific image features and sensory data were also investigated. Features extracted from
images of raw potatoes using different image texture analysis methods were able to
classify the sensory mechanical variation in five potato varieties, and to predict the
sensory mechanical attributes in the cooked potatoes. NMR imaging of raw potatoes
also provides structural/anatomic information of importance for sensory perception in
cooked potatoes (Thybo et al., 2004).
For instance, in the potato chip industry, each batch of potato tubers must be tested
for quality before processing, and the visual aspect is therefore of great importance
(Marique et al., 2003). There are different procedures used in grading potatoes all over
the world. Currently, for whole tubers, these procedures are still mainly dependent
on visual inspection. Some research groups are working on new automated ways to
achieve this task, but methods are still under development. However, certain groups
can give a thorough review of the different criteria used to evaluate potato tuber quality
and so determine whether they are suitable for processing – e.g. surface appearance,
disease, shape, bruises, etc. These are strongly linked to image feature extraction, which
is one of the most actively researched topics in computer vision. The major types of
image feature are color, size, shape, and texture. Each of these provides important
information required for food quality evaluation, inspection, and grading. Moreover,
the proper combination of different image features, such as combining size with shape
and color with texture, can often increase the accuracy of the results. Sometimes such
a combination might even reveal some quality attributes that cannot be identified by
using only a single type of image feature.
Recently, the different features of color, size, shape, and texture have been combined for applications in the food industry because this increases the performance of
the methods proposed. Normally, by increasing the number of features used, the performance of the methods proposed can be increased as well. Therefore, to capture more
information about the quality of food from images, multiple features corresponding to
the grading system of the food products should be processed (Brosnan and Sun, 2004;
Du and Sun, 2004).
4 Applications
Various studies related to machine vision inspection of potatoes have been reported in
the literature. Automated inspection stations for machine vision grading of potatoes
on size and shape have been reported (Tao et al., 1990; Deck et al., 1992; Grenander and Manbeck, 1993; Heinemann et al., 1996). The color segmentation results of
a multilayer feed-forward neural network (MLF-NN) and a traditional classifier for
the color inspection of potatoes have been compared (Kim and Tarrant, 1995). The
throughput of the system as reported by Heinemann et al. (1996) was three potatoes
per minute, and the classification results decreased significantly when the potatoes
were moving (Tao et al., 1990). None of the systems described above fulfills the potato
industry’s requirement for high throughput and real-time speed. Besides low throughput, none of the systems is capable of inspecting size, shape, and multiple color defects.
To overcome this low throughput, a PC-based high-speed machine vision system for
potato inspection with a throughput of 50 images per second was proposed (Lee et al.,
1994). The system was able to classify potatoes by size, weight, cross-sectional diameter, shape, and color. The weakness of the system was that the color classification
procedure discriminated between good potatoes and green potatoes only, and detection
of multiple color defects was not possible.
Heinemann et al. (1996) designed and implemented a prototype automatic station
for machine vision inspection and classification of potatoes, which focused on size
and shape. The system included integration of discrete machine vision and automation tasks into a complete software package; building a machine vision inspection
station interfaced with automation equipment and a computer using a data-acquisition
system; and evaluation of the system performance based on two potato quality features, i.e. shape and size. The station specifically consisted of an imaging chamber,
a conveyor, a camera, a sorting unit, and a personal computer for image acquisition,
analysis, and equipment control. The throughput rate of the station was three potatoes
per minute, which was prohibitively slow for sorting large quantities of potatoes but
was almost adequate for grading based on sampling. The motion of potatoes interfered
with accurate assessment of shape, although motion had little effect on determining the
size. The developed automatic inspection station did not consider external defects for
potato grading, and was capable of evaluating size and shape with some limitations.
Zhou et al. (1998) developed a PC-based vision system and applied it in computer-aided potato inspection; it was able to classify 50 potato images per second by the most
important criteria (i.e. potato weight, cross-sectional diameter, shape, and color) in
sorting potatoes practically. An ellipse was used as a shape descriptor for potato-shape
inspection, and color thresholding was performed in the hue–saturation–value (HSV)
color space to detect green color defects. The average efficiency of this system was
91.2 percent for weight inspection and 88.7 percent for diameter inspection. The shape
and color inspection algorithms achieved 85.5 percent and 78.0 percent success rates,
respectively. The overall success rate, combining all the above criteria, was 86.5 percent.
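The color step described above, thresholding in HSV space to find green defects, can be illustrated with a minimal sketch. It is not the implementation of Zhou et al. (1998): the hue window and the saturation and value limits below are hypothetical values chosen only for illustration, and the function names are invented for this example.

```python
import colorsys

# Hypothetical thresholds: a hue window for "green" in degrees, plus minimum
# saturation and value so that grey or very dark pixels are not flagged.
GREEN_HUE_RANGE = (70.0, 170.0)
MIN_SATURATION = 0.25
MIN_VALUE = 0.20


def is_green(rgb):
    """Return True if an 8-bit (R, G, B) pixel falls inside the green window."""
    r, g, b = (c / 255.0 for c in rgb)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)  # h, s and v are all in [0, 1]
    hue_deg = h * 360.0
    return (GREEN_HUE_RANGE[0] <= hue_deg <= GREEN_HUE_RANGE[1]
            and s >= MIN_SATURATION and v >= MIN_VALUE)


def green_fraction(pixels):
    """Fraction of pixels (an iterable of RGB tuples) classified as green."""
    pixels = list(pixels)
    if not pixels:
        return 0.0
    return sum(is_green(p) for p in pixels) / len(pixels)


# Toy data: three tan "skin" pixels and one greenish pixel.
sample = [(200, 170, 110), (190, 160, 100), (210, 180, 120), (120, 170, 90)]
print(green_fraction(sample))  # 0.25
```

In practice the thresholds would have to be tuned per cultivar and per lighting set-up, since skin color shifts the hue of both healthy and greened tissue.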
Greening and other defects such as cracks, common scab, and rhizoctonia are also
important features which influence the qualities that consumers prefer. For a machine
vision system to be successful in the potato-packaging industry, such defects must be
detected (Marique et al., 2005).
As the machine vision system must operate in a potato-packaging plant, extra
demands are imposed on it. Apart from recognizing external defects and detecting
misshapen potatoes, it must also have high accuracy and a capacity of 12 tonnes per
hour. Noordam et al. (2000) thus developed a high-speed machine vision system for
the quality inspection and grading of potatoes. This real-time computer-aided potato
inspection system allowed determination of potato weight, cross-sectional diameter,
shape, and color, which combined with external defects were the four primary features
in sorting potatoes. The High-speed Quality Inspection of Potatoes (HIQUIP) system
incorporated conveyor lanes to transport the potatoes to and from the vision unit. Dust
and dirt were removed before inspection by washing, and a 3-CCD line-scan camera
inspected the potatoes as they passed under the camera. To achieve the required capacity
of 12 tonnes per hour, 11 SHARC Digital Signal Processors (Analog Devices ADSP-21060) performed the image-processing and classification tasks. The total capacity of
the system was about 50 potatoes per second. The color segmentation procedure used
linear discriminant analysis (LDA) in combination with the Mahalanobis distance to
classify the pixels. For the detection of misshapen potatoes, a Fourier-based shape
classification technique and features such as area, eccentricity, and central moments
were used to discriminate between similar-colored defects. Experiments with red- and
yellow-skinned potatoes showed that the system was robust and consistent in classification. After inspection and grading, the potatoes were transported to a packaging
device where they were packed into small bags and sold on the consumer market.
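As a rough illustration of pixel classification with the Mahalanobis distance, the sketch below assigns an RGB pixel to the class whose mean is nearest under a covariance matrix pooled over all classes (the shared-covariance assumption of LDA with equal priors). It is a simplified stand-in, not the HIQUIP implementation; the two class names, the training pixels, and all function names are invented for the example.

```python
def mean_vector(samples):
    n = len(samples)
    return [sum(s[i] for s in samples) / n for i in range(3)]


def scatter(samples, mean):
    """3x3 sum of outer products of deviations from the class mean."""
    m = [[0.0] * 3 for _ in range(3)]
    for s in samples:
        d = [s[i] - mean[i] for i in range(3)]
        for i in range(3):
            for j in range(3):
                m[i][j] += d[i] * d[j]
    return m


def invert_3x3(m):
    """Inverse of a 3x3 matrix via the adjugate."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    if abs(det) < 1e-12:
        raise ValueError("covariance matrix is singular")
    return [[(e * i - f * h) / det, (c * h - b * i) / det, (b * f - c * e) / det],
            [(f * g - d * i) / det, (a * i - c * g) / det, (c * d - a * f) / det],
            [(d * h - e * g) / det, (b * g - a * h) / det, (a * e - b * d) / det]]


def fit(training):
    """training: {label: [RGB tuples]} -> (class means, inverse pooled covariance)."""
    means = {label: mean_vector(px) for label, px in training.items()}
    pooled = [[0.0] * 3 for _ in range(3)]
    for label, px in training.items():
        s = scatter(px, means[label])
        for i in range(3):
            for j in range(3):
                pooled[i][j] += s[i][j]
    dof = sum(len(px) for px in training.values()) - len(training)
    pooled = [[v / dof for v in row] for row in pooled]
    return means, invert_3x3(pooled)


def mahalanobis_sq(pixel, mean, inv_cov):
    d = [pixel[i] - mean[i] for i in range(3)]
    return sum(d[i] * inv_cov[i][j] * d[j] for i in range(3) for j in range(3))


def classify(pixel, means, inv_cov):
    """Label of the class mean closest to the pixel in Mahalanobis distance."""
    return min(means, key=lambda label: mahalanobis_sq(pixel, means[label], inv_cov))


# Invented training pixels for two of the color classes.
training = {
    "skin": [(200, 170, 110), (190, 165, 100), (210, 172, 125), (205, 180, 112), (195, 160, 118)],
    "greening": [(120, 170, 90), (110, 160, 95), (130, 175, 80), (125, 165, 100), (115, 180, 85)],
}
means, inv_cov = fit(training)
print(classify((198, 168, 112), means, inv_cov), classify((118, 168, 88), means, inv_cov))
```

A real system would train on many labeled pixels per class and per cultivar, and would add the remaining color classes such as background and the individual defect types.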
Finally, there are several steps necessary before a potato inspection system can be
deployed in the field (Noordam et al., 2000):
1. The machine vision system must be integrated with a mechanical system
2. The integrated system must be evaluated thoroughly at a packing site, on many
more potatoes with real mixes and varying environmental conditions
3. More complicated algorithms should be explored and compared to assess whether
the additional computational complexity is justified
4. Development of new algorithms to detect potato features such as bruises and
knobs needs to be initiated.
4.1 Automated defect detection
Assessment of potato quality includes checking for a low incidence of colored bruises resulting from
poor storage and manipulation practices. Up to now, automation has mainly focused
on the detection of bruises and necrosis on peeled potatoes. Some work has also been
done to sort incoming potato batches according to shape and green areas. There are
therefore two different aims. For on-line sorting, the important thing is to eliminate any
defective individual potato, since the presence of a single one in a package can result
in rejection by the consumer. For scoring incoming batches, more complex data are
obtained, equivalent to the classic visual evaluation by an operator. Marique et al. (2005)
developed a procedure to process and segment potato images by using Kohonen’s self-organizing map. Anomalous regions could be distinguished in three potato varieties.
Bruises that were very dissimilar in appearance were correctly identified, and some
particular defects, such as green spots, could be located as well.
4.1.1 On-line sorting
For whole potatoes, on-line sorting is applied immediately after peeling to eliminate
tubers presenting necrosis, bruises, and any defects resulting in abnormal coloration.
Cameras scan the stream of tubers and defective individuals are rejected by application
of ultra-fast air ejectors. The same device is generally equipped with lasers that allow
detection of foreign bodies, such as stones, glass, wood, metal, etc. These systems
can discriminate either between objects with similar color and a different aspect, or
between objects with a similar aspect and different colors. The general principle is
that laser light is reflected or scattered in various ways by different objects, thus these
devices can be used for sorting not only whole tubers, but also chips and potato flakes.
In all cases the general principle remains the same apart from very minor changes –
such as lower air pressures being used for lighter products (Marique et al., 2005).
4.1.2 Bruise and green-spot detection
CARAH’s food technology laboratory developed a model based on artificial neural
networks that permits identification and discrimination of both bruises and green spots
on peeled potatoes. This work was initiated to provide help to laboratories devoted to
potato quality evaluation, since it is extremely difficult to standardize assessor response
in different laboratories, even at different times in the same laboratory (Marique et al.,
Artificial neural networks were selected as this kind of mathematical model is
endowed with both good performance and a broad capacity of generalization, especially for complex and non-linear systems. In particular, Kohonen’s self-organizing
map (SOM) is a neural learning structure involving networks that perform dimensionality reduction through conversion of feature space to yield topologically ordered
similarity graphs or maps of clustering diagrams (Schalkoff, 1997). Kohonen’s SOM
has previously been employed in a number of varied recognition tasks, such as medical
diagnostics, multi-scale image segmentation, grapevine genotype clustering (Haring
et al., 1994; Manhaeghe et al., 1994; Mancuso, 2001), and baking curve identification
(Yeh et al., 1995; Hamey et al., 1997, 1998).
When presented with RGB pixel data values from a selection of three healthy potato
cultivars differing in flesh color (Asterix, Bintje, Charlotte), the SOM nodes organize
themselves according to the structure of the data, whose topological and density features
are captured in the node locations, as shown in Figure 13.1.
In a second step, the trained SOM can be presented with data values from bruised and
greenish potatoes. Pixels from healthy parts of the tubers will then be positioned near
the SOM network while pixels from bruised parts will be at a distance (Figure 13.2).
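The two steps just described (fit a SOM to healthy-flesh pixels, then flag pixels that lie far from every node) can be sketched as follows. This is a simplified illustration under stated assumptions, not the model of Marique et al. (2005): it uses a small rectangular 4 x 4 grid rather than a hexagonal 10 x 10 map, synthetic "healthy" RGB values, and a hypothetical distance threshold of 1.5 times the largest training error.

```python
import math
import random

GRID = 4      # hypothetical map size (GRID x GRID nodes)
EPOCHS = 20   # hypothetical number of passes over the training pixels
rng = random.Random(0)


def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def train_som(samples):
    """Fit a small rectangular SOM to RGB samples; returns {(row, col): weight}."""
    nodes = {(r, c): list(rng.choice(samples))
             for r in range(GRID) for c in range(GRID)}
    total_steps = EPOCHS * len(samples)
    step = 0
    for _ in range(EPOCHS):
        for x in rng.sample(samples, len(samples)):
            frac = step / total_steps
            lr = 0.5 * (1.0 - frac)                     # decaying learning rate
            radius = 1.0 + (GRID / 2.0) * (1.0 - frac)  # shrinking neighbourhood
            best = min(nodes, key=lambda k: dist(nodes[k], x))
            for k, w in nodes.items():
                grid_d2 = (k[0] - best[0]) ** 2 + (k[1] - best[1]) ** 2
                h = math.exp(-grid_d2 / (2.0 * radius ** 2))
                for i in range(3):
                    w[i] += lr * h * (x[i] - w[i])      # pull node towards the sample
            step += 1
    return nodes


def quantization_error(nodes, pixel):
    """Distance in RGB space from a pixel to its nearest SOM node."""
    return min(dist(w, pixel) for w in nodes.values())


# Synthetic stand-in for healthy flesh pixels (yellowish, small jitter).
healthy = [(230 + rng.randint(-10, 10), 215 + rng.randint(-10, 10),
            150 + rng.randint(-10, 10)) for _ in range(200)]
som = train_som(healthy)

# Hypothetical rule: abnormal if farther from the map than 1.5x the worst training pixel.
threshold = 1.5 * max(quantization_error(som, p) for p in healthy)


def is_abnormal(pixel):
    return quantization_error(som, pixel) > threshold


print(is_abnormal(healthy[0]), is_abnormal((120, 90, 60)))  # False True
```

Separating bruised from green pixels, as in Figure 13.4, would need a further rule on top of this distance test, for example the direction in RGB space in which the pixel departs from the map.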
Figure 13.3a shows a typical image of a bruised half-potato, selected from a series
of 50 image samples that were submitted to the trained SOM. Figure 13.3b shows the
performance of the SOM: the region detected as a bruise is highlighted, and nothing outside
that region is flagged.
The performance of the trained SOM was extended to more complex bruises, e.g.
bruises with irregular shapes and heterogeneous color (Figure 13.4). Some tubers
also presented green areas due to sunlight exposure (Figure 13.4a). As observed in
Figure 13.4b, the SOM correctly interpreted the RGB shades of the pixels, and good
segmentation of different areas was obtained.
Kohonen’s self-organizing map is thus suitable for identifying both bruised and green
areas on potato flesh. As bruises clearly contrast with healthy potato flesh, which is
very uniform in color, excellent results can be easily obtained. Further developments
will involve improvement in image capture, processing, and measurement, as well as calculation of the
relevant surfaces of healthy and bruised areas.
Figure 13.1 Structure of a trained two-dimensional hexagonal 10 × 10 Kohonen’s SOM in RGB space. Cyan
crosses: cluster of all pixels (RGB values) from three potato varieties. Red dots: neuron positions. Blue lines:
Euclidean distances between adjacent neurons. (Reprinted from Journal of Food Science, Vol. 70, Thierry
Marique, Stephanie Pennincx, Ammar Kharoubi. Image segmentation and bruise identification on potatoes
using a Kohonen’s self-organizing map. Pages 415–417, Copyright 2005, by courtesy of Institute of Food
Technologists.)
Figure 13.2 Distribution in RGB color space of pixels from the half-potato image. In blue: pixels from
healthy parts of the tuber. In red: pixels from the bruised part of the tuber. (Reprinted from Journal of Food
Science, Vol. 70, Thierry Marique, Stephanie Pennincx, Ammar Kharoubi. Image segmentation and bruise
identification on potatoes using a Kohonen’s self-organizing map. Pages 415–417, Copyright 2005, by
courtesy of Institute of Food Technologists.)
Figure 13.3 Image of a half-potato tuber, showing a brownish bruise (a). Bruised area (in red) identified
by the SOM, superposed on the image of the half-potato (b). (Reprinted from Journal of Food Science, Vol. 70,
Thierry Marique, Stephanie Pennincx, Ammar Kharoubi. Image segmentation and bruise identification on
potatoes using a Kohonen’s self-organizing map. Pages 415–417, Copyright 2005, by courtesy of Institute of
Food Technologists.)
Figure 13.4 Half-potato image showing both bruised and green areas (a) and the same image with
superposition of regions identified by the SOM as green (in green) and bruised areas (in red) (b). (Reprinted
from Journal of Food Science, Vol. 70, Thierry Marique, Stephanie Pennincx, Ammar Kharoubi. Image
segmentation and bruise identification on potatoes using a Kohonen’s self-organizing map. Pages 415–417,
Copyright 2005, by courtesy of Institute of Food Technologists.)
4.2 Machine vision system
The complete potato inspection system developed by Noordam et al. (2000) consists of
a conveyor unit, a vision unit, and a rejection unit, which are all placed in a single line.
The conveyor unit consists of two conveyors (SC1 and SC2) to separate the potatoes and
create a single line of potatoes. The speed of conveyor SC2 is slightly higher than that
of conveyor SC1 to separate the potatoes at the transition from SC1 to SC2. Conveyor
SC2 transports the potatoes towards the vision unit, where inspection takes place.
The conveyor belts (VC1 and VC2) of the vision unit, placed one after another,
transport the potato under the camera for inspection. A digital 3-CCD color line-scan
camera scans the narrow gap between the conveyors VC1 and VC2 to achieve in-flight
inspection of the potato. To obtain a 360° view of the potato, mirrors are placed in the
small gap (4 cm) between conveyors VC1 and VC2. The lack of product holders and
the use of mirrors guarantee a full view of the potato.
The camera must grab 2000 lines/s to obtain the required resolution (2 pixels/mm),
and thus it requires powerful lighting equipment. The camera grabs continuously, and
the software detects when a potato passes the gap between the conveyors of VC1
and VC2. Therefore, the camera requires no additional starting signal when a potato
approaches the imaging area.
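The figures quoted for this system can be cross-checked with a few lines of arithmetic. The sketch below assumes that the 2 pixels/mm resolution also applies along the direction of travel (one scan line per pixel pitch), which the text does not state explicitly; the capacity, potato rate, and the three-potatoes-per-minute figure are the values reported earlier in the chapter.

```python
LINE_RATE = 2000        # scan lines per second (from the text)
RESOLUTION = 2          # pixels per mm (from the text)
CAPACITY_T_PER_H = 12   # tonnes per hour (from the text)
POTATOES_PER_S = 50     # potatoes per second (from the text)
SLOW_STATION_PER_MIN = 3  # throughput of the earlier prototype station

# Assumption: along-belt resolution equals RESOLUTION, so each line covers 1/RESOLUTION mm.
belt_speed_mm_s = LINE_RATE / RESOLUTION

# Mean potato mass implied by the capacity and the potato rate.
mean_mass_g = CAPACITY_T_PER_H * 1_000_000 / 3600 / POTATOES_PER_S

# Ratio of the two reported throughputs.
speedup = POTATOES_PER_S * 60 / SLOW_STATION_PER_MIN

print(belt_speed_mm_s, round(mean_mass_g, 1), speedup)  # 1000.0 66.7 1000.0
```

Under that assumption the belt moves at about 1 m/s, and the reported capacity and potato rate imply a mean tuber mass of roughly 67 g.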
After inspection, the potato is transported to the rejection unit. The rejection unit
consists of individually controlled product holders. Each product holder is controlled
by electromagnets. Once a potato arrives at the correct rejection lane, the magnets are
released and the potato drops.
A high grab frequency requires dedicated hardware for the image-processing and
classification tasks. A Spectrum Signal PCI card with 11 SHARC (Analog Devices
ADSP-21060) digital signal processors (DSPs) is responsible for the image acquisition
and classification tasks. One DSP communicates with the Host PC and transports
the measurement results to the screen for visualization. It also performs the color
segmentation, image compression, and spurious pixel removal. The remaining three
DSP are divided over the three conveyors to conduct the operations for color and shape
4.3 Characterization of potato defects
Product experts characterize potato defects and diseases based on color and shape.
Factors such as size, shape, greening, cracks, scab, etc. determine the final grade
of a potato. The potatoes are graded into four different categories depending on the
presence of defects and the area of the defects (Noordam et al., 2000). Similar diseases
in potatoes of different cultivars (scab, skin spot, and black scurf) may have a distinct
color due to the underlying skin color of the potatoes. This requires a different reference
set of images for each potato cultivar. Besides the difference in skin color for different
cultivars, variance in skin structure and shape are also important features. From each
cultivar, an image collection of all possible defects can be created. Each potato image
is accompanied by a sensorial description and stored in a database, which is then
used for the development and testing of the color and shape algorithms.
4.4 Algorithm design
1. Color segmentation. The majority of external defects and diseases are identified
by their color, which makes the classification of pixels into homogeneous regions
an important part of the algorithm. Multilayer feed-forward neural networks
(MLF-NN) and statistical discriminant functions have been used successfully
for the segmentation of potato images (Guttag et al., 1992; Kim and Tarrant,
1995; Heinemann et al., 1996; Marique and Wérenne, 2001). Six different color
classes are identified: background, potato skin, greening, silver scab, outward
roughness, and rhizoctonia. The difference in skin colors means it is not enough
to use a single model for different potato cultivars (Noordam et al., 2000).
2. Discrimination between similar-colored objects. There are several defects and
diseases that have similar colors. For example, defects such as cracks and rhizoctonia both appear black. Discrimination between these defects is important,
because cracks are a more serious defect. Although rhizoctonia and cracks both
appear unappetizing, potatoes with cracks may become rotten and infect other
potatoes, and must therefore be removed from the batch. For discrimination
between cracks and rhizoctonia, additional shape features are used that can differentiate the two. Cracks and growth cracks appear more or less elongated in
comparison with rhizoctonia spots and common scab, and eccentricity (which
can vary from 1 to ∞ and can be considered as a measurement of length/width)
is used to discriminate cracks from rhizoctonia (Noordam et al., 2000).
3. Shape classification. Fourier descriptors (FD) and linear discriminant analysis
(LDA) are used to discriminate good potatoes from misshapen ones. A single
shape model is not enough to segment all potato cultivars into the classes of
good and misshapen. Well-shaped potatoes may vary from round to oval, or
even extremely oval. Therefore, different shape models are created for different
potato cultivars. A shape training set and a shape test set are created for each cultivar
to discriminate good potatoes from misshapen ones (Noordam et al., 2000).
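Two of the shape features named in the list above can be sketched briefly. The first function computes an eccentricity in the length/width sense (1 for a compact region, growing without bound for a thin one) from second-order central moments; the second computes normalized Fourier descriptors of a closed boundary. Both are generic textbook formulations offered as assumptions, since the exact definitions used by Noordam et al. (2000) are not given here, and the subsequent LDA classification of the descriptors is not shown. The function names and toy regions are invented for the example.

```python
import cmath
import math


def eccentricity(pixels):
    """Length/width ratio of a region from its second-order central moments.

    pixels is a list of (x, y) coordinates. The result lies in [1, inf).
    """
    n = len(pixels)
    mx = sum(x for x, _ in pixels) / n
    my = sum(y for _, y in pixels) / n
    sxx = sum((x - mx) ** 2 for x, _ in pixels) / n
    syy = sum((y - my) ** 2 for _, y in pixels) / n
    sxy = sum((x - mx) * (y - my) for x, y in pixels) / n
    # Eigenvalues of the 2x2 covariance matrix: variances along the principal axes.
    half_trace = (sxx + syy) / 2.0
    root = math.sqrt(((sxx - syy) / 2.0) ** 2 + sxy ** 2)
    major, minor = half_trace + root, half_trace - root
    if minor <= 1e-12:
        return math.inf  # degenerate region with zero width
    return math.sqrt(major / minor)


def fourier_descriptors(boundary, n_harmonics=3):
    """Normalized magnitudes of low-order Fourier coefficients of a closed boundary.

    Assumes the boundary points are ordered counter-clockwise, so that the +1
    harmonic dominates. Skipping the DC term removes translation, dividing by
    |c1| removes scale, and taking magnitudes removes rotation and start point.
    """
    n = len(boundary)
    z = [complex(x, y) for x, y in boundary]

    def coeff(u):
        return sum(z[k] * cmath.exp(-2j * math.pi * u * k / n) for k in range(n)) / n

    scale = abs(coeff(1))
    order = [-1] + [s * u for u in range(2, n_harmonics + 1) for s in (1, -1)]
    return [abs(coeff(u)) / scale for u in order]


# Toy regions: a compact 5 x 5 "spot" and a thin 20 x 2 "crack".
spot = [(x, y) for x in range(5) for y in range(5)]
crack = [(x, y) for x in range(20) for y in range(2)]
print(eccentricity(spot), round(eccentricity(crack), 2))

# Toy outlines: a circle and a 2:1 ellipse, 64 points each.
ts = [2.0 * math.pi * k / 64 for k in range(64)]
circle = [(math.cos(t), math.sin(t)) for t in ts]
ellipse = [(2.0 * math.cos(t), math.sin(t)) for t in ts]
print([round(v, 3) for v in fourier_descriptors(ellipse)])
```

For a perfect circle all descriptors are essentially zero, so departures from zero give a compact numeric signature of how far an outline deviates from round, which is the kind of number range the grading standard lacks.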
5 Conclusions
Product inspection is a process that requires evaluation of large quantities of product
based on limited sampling. Inspection is usually conducted by trained human graders,
but the unavailability of these inspectors has led to efforts to automate the process. Prototype automated potato inspection stations based on previously developed algorithms
using shape and size have been developed and tested.
An automated station has been developed which is capable of evaluation of the size
and shape of potato tubers, with some limitations. The motion of the potatoes interferes
with accurate assessment of shape, although motion has little effect on determining
the size. The throughput rate of the station is three potatoes per minute; this would be
prohibitively slow for sorting large quantities of potatoes.
Other researchers have developed an affordable real-time computer-aided potato
inspection system for inspecting potato weight, cross-sectional diameter, shape, and
color which are the four primary features in sorting potatoes in practice. This machine
vision system is capable of handling up to 50 potato images per second, improving the
classification accuracy of previously developed systems for detecting other features
while still achieving real-time performance.
Some researchers have implemented algorithms to detect potato features such as
bruises, and have shown that Kohonen’s self-organizing map is suitable for identifying
both bruised and green areas on potato flesh. A two-dimensional SOM can be fitted
to RGB space distribution of pixels corresponding to three different potato varieties.
Pixels situated too far from the SOM are then identified as abnormal. As bruises
clearly contrast with healthy potato flesh, which is very uniform in color, excellent
results should be easily obtained. Further developments will involve improvement in
image capture, measurement, and processing, as well as assessment of the relevant
surfaces of healthy and bruised areas.
The authors acknowledge financial support from FONDECYT Project No. 1030411.
References
Brosnan T, Sun D (2004) Improving quality inspection of food products by computer
vision – a review. Journal of Food Engineering, 61, 125–135.
Deck S, Morrow CT, Heinemann PH, Sommer HJ (1992) Neural networks for automated
inspection of product. American Society of Agricultural Engineers, 92, 3594–3601.
Du C, Sun D (2004) Recent developments in the applications of image processing techniques for food quality evaluation. Trends in Food Science and Technology, 15,
Grenander U, Manbeck KM (1993) A stochastic shape and color model for defect detection
in potatoes. Journal of Computational and Graphical Statistics, 2, 131–151.
Guttag K, Gove RJ, Van Aken JR (1992) A single-chip multiprocessor for multimedia: the
MVP. IEEE Computer Graphics and Applications, 12, 53–64.
Hamey LGC, Yeh JC-H, Ng C (1997) Objective bake assessment using image analysis and artificial intelligence. In Cereals ’97: Proceedings of the 47th Australian
Cereal Chemistry Conference, Perth, Australia, Royal Australian Chemical Institute,
pp. 180–184.
Hamey LGC, Yeh JC-H, Westcott T, Sung SKY (1998) Pre-processing color images with
a self-organising map: baking curve identification and bake image segmentation. In
Proceedings of the 14th International Conference on Pattern Recognition, Brisbane,
Australia, pp. 1771–1775.
Haring S, Viergever MA, Kok JN (1994) Kohonen networks for multiscale image
segmentation. Image and Vision Computing, 12, 339–344.
Heinemann PH, Pathare NP, Morrow CT (1996) An automated inspection station for
machine-vision grading of potatoes. Machine Vision and Applications, 9, 14–19.
Kim Y, Tarrant J (1995) Computer-aided inspection of potatoes and other agriculture products by digital image processing. A focused technology initiative proposal. University
of Washington, Seattle, WA.
Lee W, Kim Y, Gove RJ, Read CJ (1994) Media station 5000: Integrating video and audio.
IEEE Multimedia, 1, 50–61.
Mancuso S (2001) Clustering of grapevine (Vitis vinifera L.) genotypes with Kohonen
neural networks. Vitis, 40, 59–63.
Manhaeghe C, Lemahieu I, Voglaers D, Colardyn F (1994) Automatic initial estimation of
the left ventricular myocardial midwall in emission tomograms using Kohonen maps.
IEEE Transactions on Pattern Analysis and Machine Intelligence, 64, 259–266.
Marique T, Wérenne J (2001) A general artificial neural network for the modeling of culture
kinetics of different CHO strains. Cytotechnology, 36, 55–60.
Marique T, Kharoubi A, Bauffe P, Ducattillon C (2003) Modelling of fried potato chips color
classification using artificial neural network. Journal of Food Science, 68, 2263–2266.
Marique T, Pennincx S, Kharoubi A (2005) Image segmentation and bruise identification
on potatoes using a Kohonen’s self-organizing map. Journal of Food Science, 70,
Noordam JC, Otten GW, Timmermans AJM, van Zwol BH (2000) High speed potato grading
and quality inspection based on a color vision system. Proceedings of SPIE: The
International Society for Optical Engineering, 3966, 206–219.
Rousselle P, Robert Y, Crosnier JC (1996) La pomme de terre. Production, amélioration,
ennemis et maladies, utilisations. Paris: INRA éditions, p. 607.
Schalkoff RJ (1997) Artificial Neural Networks. New York: McGraw-Hill, p. 448.
Tao Y, Morrow CT, Heinemann PH, Sommer HJ (1990) Automated machine vision
inspection of potatoes. American Society of Agricultural Engineers, 90, 3531–3539.
Thybo AK, Szczypinski PM, Karlsson AH, Donstrup S, Stodkilde-Jorgensen HS,
Andersen HJ (2004) Prediction of sensory texture quality attributes of cooked potatoes
by NMR-imaging (MRI) of raw potatoes in combination with different image analysis
methods. Journal of Food Engineering, 61, 91–100.
Yeh JCH, Hamey LGC, Westcott CT, Sung SKY (1995) Color baking inspection system
using hybrid artificial neural networks. In Proceedings of the IEEE International
Conference on Neural Networks, Vol. I, Perth, Australia, pp. 37–42.
Zhou L, Chalana V, Kim Y (1998) PC-based machine vision system for real-time computer-aided potato inspection. International Journal of Imaging Systems and Technology, 9,
Quality Evaluation of Fruit by Hyperspectral Imaging
Renfu Lu
US Department of Agriculture, Agricultural Research Service,
Sugarbeet and Bean Research Unit, Michigan State University,
East Lansing, MI 48824, USA
This chapter was prepared as part of the author’s official duties as a US
government employee, and hence cannot legally be copyrighted.
Mention of commercial products in the book chapter is only to provide factual
information for the reader and does not imply endorsement by the United States
Department of Agriculture.
1 Introduction
As consumers are demanding better quality and safer food products, there is an increasing need for rapid and non-destructive quality evaluation of fresh fruit. Considerable
research has been reported on developing non-destructive techniques for the evaluation of fresh and raw food, and agricultural products, including fruits and vegetables
(Abbott et al., 1997; Lu and Abbott, 2004). Currently, optical techniques are widely
used for measuring, monitoring, controlling, grading, and sorting product items to
ensure their quality and consistency in the food and agricultural industries, because
they are rapid, non-invasive, and relatively easy to implement. Imaging is commonly
used to quantify the external or surface characteristics of product items, including size,
shape, and color. Conventional color or black-and-white imaging techniques, however,
are inadequate for measuring chemical constituents or internal quality attributes of
fresh fruit because they only record the spatial distribution of light intensities over a
broadband spectrum without detailed information for individual wavelengths. Many
chemical constituents are only sensitive to specific wavelengths or narrow wavebands.
Spectroscopy, on the other hand, measures light reflectance or transmittance at individual wavelengths or narrow wavebands, and is thus useful for ascertaining or measuring
Computer Vision Technology for Food Quality Evaluation
ISBN: 978-0-12-373642-0
Copyright © 2008, Elsevier Inc.
All rights reserved
the chemical constituents of the product. Visible and near-infrared (Vis/NIR) spectroscopy, which covers the spectral region between 400 nm and 2500 nm, has become
an important non-destructive technique for chemical analysis and quality assessment
of a large class of agricultural and food products (Williams and Norris, 2001). The
technique is now being used in on-line sorting and grading of individual fruit for
internal quality attributes (e.g. soluble solids) in some modern fruit packinghouses and
food processing plants. Conventional Vis/NIR spectroscopy only provides point or area
measurements, and it therefore cannot quantify the spatial variation or distribution of
properties and attributes in the product item. Moreover, the technique is largely empirical, relying on the development of calibration models relating spectral information to
reference measurements that are often destructive.
Hyperspectral imaging (also called imaging spectroscopy) combines the main features of imaging and spectroscopy so that it can provide both spatial and spectral
information about an object simultaneously. A typical hyperspectral image is three-dimensional (and is hence called a hyperspectral image cube); two axes represent
spatial dimensions and the third axis represents a spectral dimension. Each hyperspectral image contains an enormous amount of information about the object. The technique
is thus useful for ascertaining and quantifying the spatial distribution of certain chemical constituents or quality attributes of product items. Hyperspectral imaging was first
developed for remote satellite sensing of the earth and for military reconnaissance over
20 years ago. In the past decade, the technique has received increasing attention for
quality assessment and safety inspection of food and agricultural products (Lu
and Chen, 1998). Considerable recent research has been reported on hyperspectral
imaging for detecting internal quality attributes of fruit (Martinsen and Schaare, 1998;
Lu, 2003a), defects on fruit (Lu, 2003b; Ariana et al., 2006), and fecal contaminants
on poultry products and on fresh fruits and vegetables (Kim et al., 2002; Park et al.,
This chapter presents new applications of hyperspectral imaging for measuring the
optical properties of fruits and assessing their quality attributes. A brief overview is
given of current techniques for measuring the optical properties of turbid and opaque
biological materials. A detailed description of a hyperspectral imaging technique
for measuring optical properties is then presented, which includes its principle, its
instrument set-up, calibration procedure, data analysis, and system validation. Examples are given of applying this technique to measure the optical properties of fresh
apples, peaches, and juices, and to assess fruit quality attributes based on the optical
properties and empirical data analysis methods. The chapter ends with some concluding
remarks on the prospects of this promising technique.
2 Techniques for measuring optical properties
When a continuous wave light-beam impinges on an opaque object such as a fruit, several forms of light interaction with the fruit will take place. Some light will be reflected
at the surface of the fruit, giving observers a sensation of glossiness or shininess, which
is called surface or specular reflectance. Specular reflectance should be avoided in most
imaging and spectroscopic applications. The majority of the light, however, will penetrate the fruit. Light will attenuate in the fruit as a result of scattering and absorption,
which is wavelength-dependent. Some light will be absorbed in the fruit tissue, and
is subsequently converted into other forms of energy (such as heat and fluorescence).
Some light will reflect back and re-emerge (after multiple scattering) from the same
side of the fruit around the beam incident point; this is often called diffuse reflectance.
Some light may go through the whole fruit and emerge from the opposite side; this is
termed transmittance, and is essentially another form of diffuse reflectance.
Absorption and scattering are thus two basic phenomena when light interacts with
a turbid or opaque material, which may be characterized by the refractive index (n)
and the absorption (µa ) and scattering (µs ) coefficients. Light absorption is related to
the chemical constituents of the material. Scattering, on the other hand, is influenced
by density, cells, and intra- and extra-cellular structures, and hence may be useful
for ascertaining structural or physical characteristics of the material. Measurement of
optical properties can thus help us better understand how light interacts with the fruit,
and assess its structural and chemical properties.
Some basic work was done in measuring the optical properties of food and agricultural products based on the Kubelka–Munk equations (Kortüm, 1969) in the 1970s and
1980s (Birth, 1978, 1982; Birth et al., 1978), but little further progress was made in
understanding and measuring the optical properties of food and agricultural products
until recently. In the meantime, considerable progress has been made over the past
two decades in measuring the optical properties of biological tissues in the biomedical field because of the potential for medical diagnosis and human-tissue monitoring
(Tuchin, 2000; Vo-Dinh, 2003). Several techniques have been developed for determining the optical properties of biological tissues. These techniques may be classified into
three categories based on their measurement principles: time resolved (Patterson et al.,
1989), frequency domain (Patterson et al., 1991; Chance et al., 1998), and spatially
resolved (Groenhuis et al., 1983; Kienle et al., 1996).
2.1 Time-resolved techniques
In time-resolved techniques, a short laser pulse is sent to the scattering medium
(Figures 14.1a, 14.1b). After entering the scattering medium, photons will travel in
three different patterns: ballistic, snake, and diffuse (Mobley and Vo-Dinh, 2003). Ballistic photons travel in a straight path, and they contain geometric information about
the medium but are of little use in assessing the optical properties of the medium.
Snake photons will zigzag forward and only deviate slightly from a straight path; they
can carry information on the optical properties of the medium and any foreign object
on their pathway to the detector. Diffuse photons are the dominant component in the
scattering medium; they undergo multiple scattering actions in the medium, and have
different directions and path lengths. The diffuse component carries useful information
about the optical properties and local inhomogeneities of the scattering medium. For
the backscattering configuration shown in Figure 14.1a, only the diffuse component is
recorded, and the path distribution for diffuse photons is banana-shaped (the shaded
Figure 14.1 (a) Schematic of a backscattering sensing mode for measuring the optical properties of a
scattering medium with (b) time-resolved and (c) frequency-domain techniques. Solid lines in (b) and (c)
represent the light source and dotted lines are light detected. The shaded area in (a) denotes the diffuse
photon path distribution.
area in Figure 14.1a). Diffuse photons can be recorded using the time-correlated single
photon counting technique within a short time period (approximately 10⁻⁹–10⁻¹² s) at
a specified distance r from the beam incident point (Tuchin, 2000). The time-resolved
or temporal signals can be described by a diffusion approximation equation of the
non-stationary radiation transfer theory (Patterson et al., 1989). By using an inverse
algorithm to fit the diffusion approximation equation to the temporal profile data,
it may be possible to determine the absorption (µa ) and scattering (µs ) coefficients
simultaneously. In practice, it is often convenient to determine the reduced scattering coefficient [µs′ = µs(1 − g)], in which g is the anisotropy parameter, so that the
measurement of g can be avoided. The advent of tunable lasers has made it possible
to determine µa and µs′ at individual wavelengths over a broad spectral region. The
time-resolved technique was recently applied for measuring the optical properties of
fruits and vegetables (Cubeddu et al., 2001a, 2001b).
2.2 Frequency-domain techniques
Frequency-domain techniques use modulated light in the frequency range between
50 MHz and 1000 MHz to illuminate the scattering medium (Figures 14.1a, 14.1c).
The magnitude of the scattered light and the corresponding phase shift at modulation
Figure 14.2 Principle of measuring the optical properties of a scattering medium using steady-state
spatially resolved spectroscopic or imaging technique.
frequencies are recorded, from which µa and µs can be determined using an appropriate equation derived from the non-stationary radiation transfer theory (Tromberg et al.,
1997; Chance et al., 1998). The current measurement schemes are based on heterodyning of optical and transformed signals (Chance et al., 1998). Frequency-domain
techniques are simpler and more reliable in data interpretation and noise immunity
compared with time-resolved techniques (Tuchin, 2000). While considerable research
has been reported on using frequency-domain techniques for medical diagnosis applications, the potential of the technique for food and agricultural products is yet to be explored.
Time-resolved and frequency-domain techniques are useful for detecting abnormal tissues in a multi-component and heterogeneous medium. However, they need
expensive and sophisticated instrumentation, and measurements require good contact
between the sample and the detecting probe, thus making them time-consuming. The
techniques are not currently viable for assessing, sorting, and grading food and agricultural products because of their limitations with regard to speed, cost, and restrictive
measurement requirements.
2.3 Spatially-resolved techniques
Compared with time-resolved and frequency-domain techniques, spatially-resolved
spectroscopic or imaging techniques are less sophisticated in instrumentation and relatively easy to implement. Such techniques need a steady-state light source and detecting
device(s) to measure the spatial distribution of backscattered reflectance at individual
wavelengths (Figure 14.2). There are two types of sensing configuration for measuring the optical properties of the scattering medium. The first uses multiple detecting
optic fibers, which are connected to individual spectrometers or a single imaging
spectrometer, to measure backscattered reflectance at different distances from the light
illuminating point (Lin et al., 1997; Mourant et al., 1997; Dam et al., 2001). This
sensing configuration allows simultaneous measurement of optical properties at multiple wavelengths or over a specific spectral region. However, measurements again
require good contact between the detecting probes and the medium, which can be
problematic for many food products. The second type of sensing configuration is non-contact reflectance imagery (Wang and Jacques, 1995; Kienle et al., 1996). An imaging
device, such as a charge-coupled device (CCD) camera, is used to capture backscattered reflectance images from the scattering medium generated by a point light beam.
When a monochromatic light (such as a laser) is used, the imager can only measure
optical properties at one single wavelength. However, with the use of bandpass filters
or tunable filters (such as an acousto-optic tunable filter and a liquid crystal tunable
filter) the imager can capture spectral reflectance images at multiple wavelengths, and
thus it can determine optical properties over a spectral region. The reflectance imagery
configuration can be carried out without contacting the medium, which is advantageous
for food and agricultural products because of sanitation and safety concerns. However,
a large number of images is needed in order to determine the optical properties of
the sample over a large spectral range, which requires considerable time, and thus the
technique may not be suitable for the rapid measurement of food samples.
3 The hyperspectral imaging system
This section describes a new technique using hyperspectral imaging in line-scan mode
for the rapid acquisition of spatially-resolved scattering profiles at the wavelengths of
450–1000 nm. The theory, instrument set-up and calibration, data analysis, and system
validation are presented. Compared with the steady-state spatially-resolved techniques
discussed above, the hyperspectral imaging technique is faster and simpler, and is
capable of measuring the optical properties of turbid or opaque foods over a broad
spectral range simultaneously.
3.1 Theory
Light scattering and absorption in a scattering medium can be described by the radiation
transfer equation, also known as the Boltzmann equation, which in its original form
cannot be solved analytically (Tuchin, 2000). However, when scattering is dominant
(i.e. µs′ ≫ µa), the transfer of photons may be considered a diffusion process, which
leads to considerable simplification of the Boltzmann equation. Analytical solutions
to the simplified steady-state diffusion equation may be obtained for some special
situations. Farrell et al. (1992) studied diffuse reflectance at the surface of a semi-infinite turbid medium when it is impinged upon by an infinitely small, steady-state
light source vertically (Figure 14.2). An analytical solution was derived from the diffusion equation to describe diffuse reflectance, Rf , at the surface of the semi-infinite
medium as a function of the source-detector distance and the optical properties of the
investigated medium (i.e. the absorption and reduced scattering coefficients and the
relative refractive index), which is given by the following equation:

$$R_f(r) = \frac{a'}{4\pi}\left[\frac{1}{\mu_t'}\left(\mu_{\mathrm{eff}} + \frac{1}{r_1}\right)\frac{\exp(-\mu_{\mathrm{eff}}\, r_1)}{r_1^2} + \left(\frac{1}{\mu_t'} + \frac{4A}{3\mu_t'}\right)\left(\mu_{\mathrm{eff}} + \frac{1}{r_2}\right)\frac{\exp(-\mu_{\mathrm{eff}}\, r_2)}{r_2^2}\right] \tag{14.1}$$

where r is the distance from the incident point, a′ is the transport albedo
[a′ = µs′/(µa + µs′)], µs′ is the reduced scattering coefficient [µs′ = µs(1 − g)] in
which g is the anisotropy parameter, µeff is the effective attenuation coefficient
{µeff = [3µa(µa + µs′)]^(1/2)}, and µt′ is the total interaction coefficient (µt′ = µa + µs′).
Parameters r1 and r2 are the distances from the observation point at the interface to the
isotropic source and the image source:

$$r_1 = \left[\left(\frac{1}{\mu_t'}\right)^2 + r^2\right]^{1/2} \tag{14.2}$$

$$r_2 = \left[\left(\frac{1}{\mu_t'} + \frac{4A}{3\mu_t'}\right)^2 + r^2\right]^{1/2} \tag{14.3}$$

where A is an internal reflection coefficient determined by the mismatch of relative
refractive index, nr, at the interface, and it may be calculated by the following empirical
equation (Groenhuis et al., 1983):

$$A = \frac{1 + r_d}{1 - r_d} \tag{14.4}$$

in which

$$r_d \approx -1.44\, n_r^{-2} + 0.710\, n_r^{-1} + 0.668 + 0.0636\, n_r \tag{14.5}$$

$$n_r = \frac{n_{\text{sample}}}{n_{\text{air}}} \tag{14.6}$$
For many biological materials, the refractive index may be considered constant and
wavelength-independent (Nichols et al., 1997; Dam et al., 1998). Consequently, once
A is determined, the shape of the spatial reflectance profile is uniquely determined by
µa and µs′ through equation (14.1). Conversely, if the reflectance profile resulting from
a point light source over the surface of the opaque medium is known, µa and µs′ may
be determined by applying an inverse algorithm to equation (14.1). Farrell’s model
(equation (14.1)) provides an excellent description of the spatial diffuse reflectance
profiles for turbid and opaque biological materials in which scattering is dominant
(Farrell et al., 1992; Nichols et al., 1997; Gobin et al., 1999).
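As an illustration, Farrell's model can be evaluated numerically in a few lines. The following Python sketch is not taken from the cited works. It assumes the standard form of the Farrell et al. (1992) solution, R_f(r) = (a′/4π)[z0(µeff + 1/r1)exp(−µeff r1)/r1² + zi(µeff + 1/r2)exp(−µeff r2)/r2²], with z0 = 1/µt′ and zi = z0 + 4A/(3µt′). The function names and the default relative refractive index of 1.35 are assumptions made only for this example.

```python
import math


def internal_reflection_coefficient(n_r):
    """Internal reflection coefficient A from the empirical relation
    of Groenhuis et al. (1983), for relative refractive index n_r."""
    r_d = -1.44 * n_r ** -2 + 0.710 * n_r ** -1 + 0.668 + 0.0636 * n_r
    return (1.0 + r_d) / (1.0 - r_d)


def farrell_reflectance(r, mu_a, mu_s_prime, n_r=1.35):
    """Diffuse reflectance R_f at source-detector distance r (cm).

    mu_a and mu_s_prime are the absorption and reduced scattering
    coefficients in 1/cm. The default n_r is illustrative only.
    """
    mu_t = mu_a + mu_s_prime                # total interaction coefficient
    albedo = mu_s_prime / mu_t              # transport albedo a'
    mu_eff = math.sqrt(3.0 * mu_a * mu_t)   # effective attenuation coefficient
    a_coeff = internal_reflection_coefficient(n_r)
    z0 = 1.0 / mu_t                         # depth of the isotropic source
    zi = z0 + 4.0 * a_coeff / (3.0 * mu_t)  # offset of the image source
    r1 = math.sqrt(z0 ** 2 + r ** 2)
    r2 = math.sqrt(zi ** 2 + r ** 2)
    term1 = z0 * (mu_eff + 1.0 / r1) * math.exp(-mu_eff * r1) / r1 ** 2
    term2 = zi * (mu_eff + 1.0 / r2) * math.exp(-mu_eff * r2) / r2 ** 2
    return albedo / (4.0 * math.pi) * (term1 + term2)
```

For instance, farrell_reflectance(0.16, 0.1, 10.0) evaluates the model 1.6 mm from the incident point for hypothetical values µa = 0.1 cm⁻¹ and µs′ = 10 cm⁻¹; the computed reflectance decreases monotonically as r increases.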
Figure 14.3 Hyperspectral imaging system for acquiring diffuse reflectance profiles from the fruit over the
spectral range of 400–1000 nm (Lu et al., 2006).
3.2 Instrument set-up
A hyperspectral imaging system in line-scan mode (also called pushbroom mode), as
shown schematically in Figure 14.3, is used for measuring the spatial distribution of
diffuse reflectance at the wavelengths of 450–1000 nm. The system consists of a high-performance back-illuminated 16-bit CCD camera (Model C4880-21, Hamamatsu
Corporation, Japan), an imaging spectrograph (ImSpector V9, Specim, Finland), a
zoom lens, and a computer-controlled DC light source coupled to an optic fiber with a
micro-lens connecting to the fiber-exiting end. The camera is equipped with a Peltier
cooling device to cool the CCD detector to −40 °C to improve the dynamic range and
the signal-to-noise ratio of the CCD detector. The use of a high-performance CCD
camera with a large dynamic range is necessary because light attenuation in opaque
fruit is so significant that the diffuse reflectance profile changes dramatically in a
short distance. The imaging spectrograph is based on the prism–grating–prism principle, and does not have moving mechanical components (Figure 14.4). It has a slit
9.8 mm long and 80 µm wide, and line-scans the sample. When the incoming radiation
passes the prism–grating–prism unit, it is dispersed into different wavelengths without
altering its spatial information. The dispersed light is projected onto the pixels of the
CCD detector, creating a special two-dimensional image: one axis represents a spectral
dimension and the other axis a spatial dimension (Figure 14.4). If the sample is moving perpendicularly to the scanning direction, the imaging system can acquire spectral
information in the second spatial dimension along the direction of the moving sample
by synchronizing the integration time with the speed of the moving sample. As a result,
Figure 14.4 Principle of the prism–grating–prism imaging spectrograph for acquiring spatial and spectral
information from an object (reproduced courtesy of Specim, Finland).
a three-dimensional image cube is created; each pixel taken from the image cube is
thus associated with a spectral curve.
As a broadband light beam (∼1.0 mm in size) is incident upon the sample, it generates a diffuse reflectance image at the sample surface. The hyperspectral imaging
system line-scans the surface of the sample 1.6 mm off the incident point to avoid
saturation of the CCD detector caused by high-intensity signals near the incident point
(Figure 14.3). Instead of obtaining a three-dimensional hyperspectral image cube, the
hyperspectral imaging system (Figure 14.3) only acquires a single two-dimensional diffuse reflectance image from each sample. The two-dimensional hyperspectral image
carries sufficient information to determine the optical properties of the sample, since
the diffuse reflectance profiles are symmetric with respect to the incident point. This
sensing method has the advantage of being rapid because the spatial reflectance distribution at the surface of the sample can be acquired, from a single scan, for the
wavelengths of 450–1000 nm simultaneously.
3.3 Instrument calibration
Proper calibration is required before the hyperspectral imaging system is used for imaging. Calibration requirements may vary depending on the application. Two forms of
calibration, spectral and geometric, are generally required for all applications. Spectral
calibration ensures that each pixel on the CCD area array is assigned to an appropriate wavelength, whereas geometric calibration corrects any distance distortions for
individual pixels of the image. A complete commercial hyperspectral imaging system is usually pre-calibrated spectrally and geometrically. However, many researchers
still assemble a hyperspectral imaging system with components from different vendors. Thus, spectral and geometric calibration must be performed. Spectral calibration
is usually performed using calibration lamps, such as xenon, mercury, and krypton
lamps and lasers, which have known spectral characteristics covering the range of the
hyperspectral imaging system (Lu and Chen, 1998). Geometric distortions cause a
straight line from the imaging scene to appear curved (“smile”) or rotate by a small
angle (“keystone”) on the image-displaying monitor. Full-scale geometric calibration
can be quite involved and a detailed description of the calibration procedure can be
found in Lawrence et al. (2003). Latest versions of imaging spectrographs can limit
geometric distortions to one pixel, thus eliminating the need to perform full-scale
geometric calibration.
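A minimal sketch of spectral calibration is given below. It assumes that a straight-line relation between detector pixel index and wavelength is adequate, which is a simplifying choice for this example rather than a procedure described in the cited references. The emission wavelengths are well-known mercury (435.83 and 546.07 nm) and krypton (760.15 and 811.29 nm) lines; the pixel positions are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Known emission lines (nm): two mercury lines and two krypton lines.
lamp_lines_nm = [435.83, 546.07, 760.15, 811.29]
# Hypothetical pixel indices along the spectral axis where the peaks appear.
peak_pixels = [20.0, 110.0, 285.0, 327.0]

slope, intercept = fit_line(peak_pixels, lamp_lines_nm)


def pixel_to_wavelength(pixel):
    """Map a spectral-axis pixel index to a wavelength in nm."""
    return slope * pixel + intercept
```

With more calibration lines, a higher-order polynomial could be fitted in the same way if the residuals of the straight-line fit are too large for the application.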
When the hyperspectral imaging system shown in Figure 14.3 is used to measure
optical properties, one additional calibration procedure is recommended for correcting
non-uniform instrument responses of the imaging system. The non-uniform instrument
response refers to uneven intensity values for individual pixels from each spatial row
of the two-dimensional hyperspectral image at a specific wavelength when the imaging scene (flat surface) is under perfectly uniform illumination. Any optical design
imperfection in the imaging spectrograph and the CCD detector, and factors such as
the viewing angle and specific setting of the focusing lens, can cause non-uniform
instrument responses. Two different calibration methods for the non-uniform instrument response may be used; the first is based on the use of reference sample(s) with
known optical properties, while the second is direct calibration of the imaging system. The reference sample-based method can give good correction of the non-uniform
instrument response if the reference sample is properly selected and its absorption and
scattering coefficients are already known (Qin and Lu, 2006a). However, this method
has some drawbacks in that measurement results may be influenced by the reference
samples. To ensure good correction for the non-uniform instrument response, a set of
reference samples with a range of known optical properties should be used. Here we
describe a direct calibration method, which does not require reference samples and
thus is more robust and applicable to a large class of opaque samples.
The direct calibration method needs a precision motor-controlled stage, a highly
stable broadband light source (e.g. a quartz tungsten halogen lamp), and a reference
standard such as a white Teflon disk (Figure 14.5). The light sources commonly used for
hyperspectral imaging applications are DC-regulated, but their stability is not sufficient
for instrument response corrections. To calibrate the instrument response accurately,
a light intensity controller with a feedback function should be installed with the
DC-controlled light source to monitor and correct any lamp output changes in order
to maintain constant short- and long-term output from the lamp. Addition of the light
intensity controller can improve the light source stability by 10 times or more, to
±0.1 percent (Peng and Lu, 2006). Both the light source and the Teflon disk need
to be mounted on the same platform, which is in turn fixed to the precision stage
(Figure 14.5). The same light optic fiber and coupling lens assembly as shown in Figure 14.3 is used to deliver a broadband light beam to the Teflon disk. The precision stage
is properly aligned so that it moves in the direction parallel to the scanning line of the
imaging spectrograph. The hyperspectral imaging system collects spectral reflectance
images from the Teflon disk for each pre-specified displacement by the stage over a
distance range covered by the imaging system (the maximum scattering distance is
normally less than 40 mm). Since the imaging system is kept stationary and there is no
change in the relative positions of the light source and the Teflon standard, any intensity
change in the reflectance images for different stage positions must have been caused by
Figure 14.5 Experimental set-up for calibrating the non-uniform instrument response of the hyperspectral
imaging system shown in Figure 14.3.
Figure 14.6 Non-uniform instrument responses of the hyperspectral imaging system at 550, 650, and
750 nm. The distance starts from the beam incident point.
the non-uniform response of the imaging system. Peak intensity values are extracted
from individual rows (along the spatial axis) of each hyperspectral image recorded for
each stage position. Instrument response curves for individual wavelengths are finally
obtained by plotting peak intensity values versus displacement of the stage.
Figure 14.6 shows the normalized instrument response curves for the hyperspectral
imaging system at three wavelengths (550, 650, and 750 nm), covering one-half of the
total scattering distance of 40 mm. Non-uniformity within 10 mm is less than 5 percent,
and it reaches 40 percent at 20 mm. Such non-uniform responses can cause considerable
signal distortions at positions far away from the light incident point, and hence should
be corrected. To correct the non-uniform instrument response, pixels from each spatial
scattering profile are multiplied by the reciprocals of the corresponding normalized
instrument response data for each wavelength.
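The correction step can be sketched as follows in Python. This is only an illustration under the assumption that the normalized instrument response has already been sampled at the same pixel positions as the scattering profile; the function name and the numerical values are hypothetical.

```python
def correct_profile(profile, response):
    """Correct one spatial scattering profile at a single wavelength.

    profile  : raw intensities along the scanning line
    response : normalized instrument response at the same pixel
               positions (1.0 at the reference position)
    Each pixel is multiplied by the reciprocal of the response.
    """
    if len(profile) != len(response):
        raise ValueError("profile and response must have equal length")
    return [p * (1.0 / s) for p, s in zip(profile, response)]


# Hypothetical example: the response drops to 60 percent at the far pixel.
raw = [100.0, 95.0, 60.0]
resp = [1.0, 0.95, 0.60]
corrected = correct_profile(raw, resp)
```

In this made-up example a perfectly uniform scene that was recorded as 100, 95, and 60 counts is restored to approximately 100 counts at every pixel.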
3.4 Determination of absorption and reduced scattering
A typical two-dimensional hyperspectral image of an apple over the wavelengths
450–1000 nm and a scattering distance of approximately 30 mm is shown in Figure 14.7a. Due to limited gray-scale levels, the actual light intensities of individual
pixels on the hyperspectral image in this figure cannot be displayed adequately. This
two-dimensional hyperspectral image can be interpreted in two different ways. First,
each row of pixels taken from the image represents a spatial scattering profile consisting
of pixels from the scanning line of the sample at a specific wavelength (Figure 14.7b).
Therefore, each image may be viewed as being composed of hundreds of scattering
profiles over the spectral region of 450–1000 nm. Each scattering profile is approximately symmetric with respect to the incident point, since the scanning line of the
hyperspectral imaging system is perpendicular to the illuminating beam, and the beam
size (1.0 mm) and incident angle (<15°) are small. Second, if a column of pixels is
taken from the image in Figure 14.7a, it represents a single spectrum for a specific
pixel on the scanning line from the sample (Figure 14.7c). The image can thereby also
be considered as being composed of hundreds of spectra coming from individual pixels
on the scanning line of the sample.
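The two readings of the image can be made concrete with a small Python sketch. It assumes the image is stored as a list of rows, one row per waveband, so that image[band][position] is an intensity; the helper names and the tiny 3 × 5 image are hypothetical.

```python
def scattering_profile(image, band_index):
    """Row of the image: intensity versus position at one wavelength."""
    return list(image[band_index])


def spectrum_at(image, position_index):
    """Column of the image: intensity versus wavelength at one position."""
    return [row[position_index] for row in image]


# Hypothetical image with 3 wavebands and 5 positions (arbitrary counts).
image = [
    [10, 40, 90, 40, 10],    # waveband 0
    [20, 60, 120, 60, 20],   # waveband 1
    [5, 25, 70, 25, 5],      # waveband 2
]

profile_band1 = scattering_profile(image, 1)
center_spectrum = spectrum_at(image, 2)
```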
Since the spatial scattering profiles from single rows contain noise, it is helpful
to average the data over several rows of pixels and convert the two-sided spatial
scattering profiles into one-sided scattering profiles to improve the signal-to-noise
ratio. Each experimental scattering profile, Re(r), is then normalized by its peak
value at the distance closest to the incident point (ro = 0.15–0.16 cm), which gives
Ren(r) = Re(r)/Re(ro). The same normalization procedure is applied to the Farrell
model in equation (14.1), i.e. Rfn(r) = Rf(r)/Rf(ro). The normalization step is necessary because the experimental data and Farrell model are not on the same scale.
Moreover, normalization also avoids the need for measuring absolute reflectance.
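The pre-processing steps described above (row averaging, conversion to a one-sided profile, and normalization) might be sketched as follows. The function names, the assumption that the incident-point pixel index is known, and the sample numbers are all hypothetical.

```python
def average_rows(rows):
    """Average several adjacent rows of pixels, position by position."""
    n = len(rows)
    return [sum(column) / n for column in zip(*rows)]


def fold_profile(profile, center):
    """Convert a two-sided profile into a one-sided profile by averaging
    the pixels at equal distances on both sides of index `center`."""
    half = min(center, len(profile) - 1 - center)
    return [(profile[center - k] + profile[center + k]) / 2.0
            for k in range(half + 1)]


def normalize(profile):
    """Divide by the value closest to the incident point (first element)."""
    return [value / profile[0] for value in profile]


# Hypothetical two rows of a scattering image, symmetric about index 2.
rows = [[1.0, 2.0, 4.0, 2.0, 1.0],
        [3.0, 4.0, 8.0, 4.0, 1.0]]
averaged = average_rows(rows)          # [2.0, 3.0, 6.0, 3.0, 1.0]
one_sided = fold_profile(averaged, 2)  # [6.0, 3.0, 1.5]
normalized = normalize(one_sided)      # [1.0, 0.5, 0.25]
```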
After the data pre-processing, the normalized Farrell model Rfn (r) is used to fit
the normalized experimental data Ren (r). Several non-linear curve-fitting algorithms
are available at this stage. A trust-region non-linear least squares fitting algorithm is
commonly used to extract the best-fit estimates of µa and µs′ (Qin and Lu, 2006b). To
improve the curve-fitting results, the data fitting is performed in three separate steps.
First, both µa and µs′ are treated as unknown variables and the estimated values of the
two parameters for all the samples are obtained. Second, the µs′ values obtained in the
first step are fitted with the wavelength-dependent function µs′ = αλ^(−β), which holds
for a large class of biological materials (Doornbos et al., 1999). Third, the results of
µs′ from the second step are inserted into the normalized Farrell model, and the fitting
Figure 14.7 (a) A two-dimensional hyperspectral scattering image for an apple; (b) spatial scattering
profiles (intensity in CCD counts versus distance in mm) at four wavelengths (610, 676, 725, and 807 nm) taken from four rows of the image; and (c) spectral profiles (intensity in CCD counts versus wavelength in nm) at three different
distances from the incident point (0.00, 1.57, and 3.74 mm) taken from three columns of the image. The offset distance between the
scanning line and the beam incident center is 1.5 mm, which is not considered in (b) and (c) (adapted
from Lu 2003a).
procedure is repeated with µa as the only unknown. This three-step approach reduces
the fitting noise for both µa and µs′, and thus improves the fitting accuracies compared
with the one-step approach.
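The second step, smoothing the scattering estimates with the power-law function, can be illustrated with a short Python sketch. Taking logarithms turns µs′ = αλ^(−β) into the straight line ln µs′ = ln α − β ln λ, so an ordinary linear least-squares fit suffices here. This log-log fit is a simplification chosen for the example; the text does not state which fitting method was used for this step, and the function name and test values are hypothetical.

```python
import math


def fit_power_law(wavelengths_nm, mu_s_prime):
    """Estimate alpha and beta in mu_s' = alpha * wavelength ** (-beta)
    by a straight-line least-squares fit in log-log space."""
    x = [math.log(w) for w in wavelengths_nm]
    y = [math.log(m) for m in mu_s_prime]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    beta = -slope
    alpha = math.exp(mean_y - slope * mean_x)
    return alpha, beta


# Hypothetical noise-free data generated with alpha = 2000 and beta = 0.8.
wavelengths = [500.0, 600.0, 700.0, 800.0, 900.0]
estimates = [2000.0 * w ** -0.8 for w in wavelengths]
alpha, beta = fit_power_law(wavelengths, estimates)
```

The smoothed values αλ^(−β) would then be held fixed while µa alone is fitted in the third step.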
The curve-fitting procedure is normally accomplished by minimizing the sum of
squares of the differences between experimental data and the Farrell model. The scattering profiles have a large dynamic range within a short distance of 10 mm (Figure 14.7b).
Consequently, the data points with high intensity values (closer to the light source) tend
to have a much larger impact on the error calculations, while the data points with small
absolute values (greater distances) have little influence on the goodness-of-fit. When
scattering is dominant (µs′ ≫ µa), the reflectance data close to and far from the incident
point do not have an equal sensitivity to the optical properties of the sample. Signals
close to the source depend strongly on the anisotropy parameter and scattering coefficient, whereas those far from the source exhibit large dependence on the absorption
effect (Kienle and Patterson, 1997; Bevilacqua and Depeursinge, 1999). Hence, it is
necessary to use different weighting methods to calculate the total sum of errors at the
first and third steps of the curve-fitting procedure described above. For the first step
of determining µs′, the following equal weighting method is recommended:

$$\text{Fitting error} = \sum_{i=1}^{N}\left[R_{en}(r_i) - R_{fn}(r_i)\right]^2 \tag{14.7}$$

where i = 1, 2, 3, …, N. For the third step of determining µa, the relative weighting
method (equation (14.8)) should be considered in calculating the fitting errors:

$$\text{Fitting error} = \sum_{i=1}^{N}\left[\frac{R_{en}(r_i) - R_{fn}(r_i)}{R_{en}(r_i)}\right]^2 \tag{14.8}$$
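The difference between the two weighting methods is easy to see numerically. The Python sketch below assumes that the equal weighting error is the plain sum of squared residuals between the normalized experimental data and the normalized Farrell model, and that the relative weighting error divides each residual by the experimental value before squaring. The function names and the three-point profiles are hypothetical.

```python
def equal_weight_error(measured, model):
    """Sum of squared residuals; dominated by high-intensity points."""
    return sum((m - f) ** 2 for m, f in zip(measured, model))


def relative_weight_error(measured, model):
    """Sum of squared relative residuals; low-intensity points far from
    the incident point contribute on an equal footing."""
    return sum(((m - f) / m) ** 2 for m, f in zip(measured, model))


# Hypothetical normalized profiles: intensity falls by 100 times.
measured = [1.0, 0.1, 0.01]
model = [0.9, 0.11, 0.012]

e_equal = equal_weight_error(measured, model)
e_relative = relative_weight_error(measured, model)
```

In this made-up case the first point contributes almost all of the equal-weight error, whereas the last point contributes most of the relative-weight error, which is why the latter is better suited to estimating µa from the tail of the profile.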
3.5 System validation
The hyperspectral imaging system should be validated against solid or liquid simulation samples (also called tissue phantoms in biomedical research) with known optical
properties to ensure that it meets the expected performance. Simulation samples are
commonly made with an absorbing material, a scattering material, and a diluent. Common scattering materials include Intralipid, Liposyn, and Nutralipid, whereas black
India ink and blue and green dyes are often used as absorbing materials. Liquid simulation samples are easy to prepare, but they cannot model the complexity of a real
sample in terms of shape, size, and structure. Solid simulation samples can be prepared to resemble actual samples of complex geometry/structure, and can be made
using either transparent hosts (such as polymers, silicone or gelatin) or inherently scattering materials like wax. Ideally, simulation samples should be prepared such that
they resemble actual samples to be measured in shape, size, and other physical characteristics. A few simulation samples should be prepared in order to cover the range
of optical properties that would be expected from the actual samples to be measured.
In addition, it is important that simulation samples meet the basic assumption of scattering dominance (µs′ ≫ µa). A more detailed description of the requirements for and
preparation of simulation samples can be found in Tuchin (2000).
Figure 14.8 shows the actual and calculated absorption and reduced scattering coefficients for three simulation samples made from three different absorbing materials
(blue dye, green dye, and black India ink) and the common Intralipid material. A total
of 36 simulation samples were prepared, 12 for each type of simulation sample. These
Figure 14.8 Comparison of actual and Farrell model-predicted values of the absorption (µa) and reduced scattering (µs′) coefficients, in cm⁻¹, as functions of wavelength (nm)
for three types of simulation samples: (a) the blue dye-Intralipid sample; (b) the green dye-Intralipid sample; and (c) the black
India ink-Intralipid sample (Qin and Lu, 2006b).
simulation samples covered a large range of optical properties (µa = 0.0–0.8 cm−1 and
µs = 2.2–23.2 cm−1 ) that would be expected from different fruits. A detailed description of the validation procedure using simulation samples is given in Qin and Lu
(2006b). The calculated values of µa and µs match the actual values rather well for
the three samples over the entire spectral range between 500 and 900 nm. The average
errors for µa and µs for all 36 samples were 16 percent and 11 percent, respectively,
which are comparable to some reported studies using other spatially-resolved spectroscopic techniques (Kienle et al., 1996; Nichols et al., 1997). Higher errors for µa could
be due in part to its low absolute values compared with those of µs .
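The remark that low absolute values of µa inflate its error can be illustrated with a short Python sketch. It assumes that the reported average error is a mean absolute relative error; the chapter does not define the metric, and the function name and all numbers below are hypothetical.

```python
def mean_relative_error(actual, estimated):
    """Mean absolute relative error, returned as a fraction of the actual value.

    Assumed metric for illustration; actual values must be nonzero.
    """
    if len(actual) != len(estimated) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(e - a) / abs(a) for a, e in zip(actual, estimated)) / len(actual)


# Hypothetical values in cm^-1: the same absolute error (0.02 cm^-1) is applied
# to small absorption coefficients and to much larger reduced scattering coefficients.
mu_a_actual = [0.10, 0.20, 0.40]
mu_a_fitted = [0.12, 0.22, 0.42]
mu_s_prime_actual = [10.0, 15.0, 20.0]
mu_s_prime_fitted = [10.02, 15.02, 20.02]

print("relative error, mu_a:  ", mean_relative_error(mu_a_actual, mu_a_fitted))
print("relative error, mu_s': ", mean_relative_error(mu_s_prime_actual, mu_s_prime_fitted))
```

The same absolute deviation produces a far larger relative error for the small µa values than for µs′. A relative metric of this kind is also undefined when the actual value is zero, which matters for samples near the lower end of the µa range.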
Once validated with simulation samples, the hyperspectral imaging system is ready
to measure the optical properties of turbid and opaque samples.
4 Applications
This section presents optical property data measured from fresh apples, peaches, and
fruit juices. Preliminary application examples are given to demonstrate how absorption
and reduced scattering coefficients may be used to predict fruit firmness and soluble-solids content. Moreover, results are presented using empirical approaches to describe
scattering profiles for assessing the firmness and soluble-solids content of apples and
peaches.
4.1 Optical properties of fruits and juices
Optical properties were obtained from 650 Golden Delicious apples and 800 Red Haven
peaches over the spectral range of 530–950 nm in two recent studies. The procedures
described in previous sections were applied to determine the absorption and reduced
scattering coefficients from the apple and peach samples. Since apples and peaches
have a curved surface, the scattering profiles from the two-dimensional hyperspectral
image (Figure 14.7) may have underestimated the actual reflectance intensities at the
surface of the fruit. For this reason, corrections for the fruit size effect on individual
scattering profiles (Figure 14.7b) should be considered. An equation was derived to
correct the fruit size effect based on the assumption that the fruit is approximately
spherical and the angular reflectance intensity distribution for a given point at the
surface of the fruit obeys the Lambertian Cosine Law (Kortüm, 1969). A detailed
description of the fruit size correction is given in Qin and Lu (2006b).
The absorption and reduced scattering coefficients for three Golden Delicious apples
and three Red Haven peaches are presented in Figure 14.9. The absorption coefficient
for the three Golden Delicious apples increases steadily from 530 nm and reaches a peak
at 675 nm, which corresponds to the chlorophyll absorption waveband. The µa value
decreases rapidly between 675 nm and 720 nm, and then slightly between 720 nm and
900 nm. Beyond 900 nm, the µa value starts to increase, which is due to water absorption around 950 nm (Figure 14.9a). Large differences in the absolute value of µa were
observed for all test apples in the visible range (530–700 nm). The reduced scattering coefficient for the three apples decreases slightly with increasing wavelength
(Figure 14.9b). The µs′ spectra for all apples are nearly linear with wavelength. Values
of µs′ for the 650 Golden Delicious apples are in the range of 13–18 cm⁻¹, which are
at least one order of magnitude greater than those of µa (0.0–0.3 cm⁻¹). The general patterns and
ranges of µa and µs′ over the wavelengths 530–950 nm for apples obtained in this study
are comparable to those reported by Cubeddu et al. (2001a, 2001b).
Compared with apples, peaches exhibit somewhat different patterns in their µa spectra (Figure 14.9a). The absorption coefficient decreases between 530 nm and 650 nm,
increases between 650 nm and 675 nm, and reaches a peak at 675 nm. Absorption
peaks at 675 nm for the peaches are less prominent compared with those for apples,
indicating lower chlorophyll content in the peaches. Between 720 nm and 900 nm, µa
increases slightly or remains nearly constant, in contrast to the decreasing trend for
apples. Water absorption for the peaches at 950 nm is more conspicuous than that for
apples. The reduced scattering coefficient for peaches changes in a pattern similar to
that for apples, although the values are not the same.
Figure 14.9 (a) Absorption (µa) and (b) reduced scattering (µs′) coefficients of three Golden Delicious apples
and three Red Haven peaches over the spectral range of 530–950 nm. [Plots of µa (cm⁻¹) and µs′ (cm⁻¹) versus wavelength (nm).]

Figure 14.10 shows values of the absorption and reduced scattering coefficients of
six juice samples, including three commercial brands of orange and citrus juice. Values
of µa for four of the fruit juice samples (Tropicana and Minutemaid orange juice, a citrus
juice, and an orange-pineapple mixed juice) are negligible over the entire spectral region
of 530–900 nm (Figure 14.10a). This indicates that these four juice samples behave like
pure scattering materials. Water is the predominant component in all juice samples, but
values of µa for water are less than 0.006 cm⁻¹ in the spectral region of 500–700 nm
and between 0.006 and 0.070 cm⁻¹ for 700–900 nm (Hale and Querry, 1973). The nonlinear curve-fitting algorithm may not accurately determine the absorption coefficient
at such low levels. The grape juice shows a decreasing trend in µa from 0.35 cm⁻¹ at
530 nm to nearly zero at 610 nm and beyond. The V8 juice sample (a mixture of multiple
fruit and vegetable juices) has the highest value of µa (0.80 cm⁻¹) at 530 nm, and its µa
decreases rapidly to approximately zero at 710 nm and beyond.
Values of the reduced scattering coefficient differ among the six juice samples,
and they decrease steadily with increasing wavelength (Figure 14.10b). The µs′
values for the Minutemaid orange juice sample and the orange-pineapple juice sample
are close to each other over the entire spectral region. However, large differences in the
µs′ values exist between the two orange juice samples (Tropicana and Minutemaid) over
the entire spectral region of 530–900 nm. These juice samples can be identified based
on the reduced scattering coefficient. Nevertheless, it would be difficult to separate the
four juice samples (that is, all but the V8 and grape juice samples) based on µa.
Figure 14.10 (a) Absorption (µa) and (b) reduced scattering (µs′) coefficients for six juice samples. [Plots of µa (cm⁻¹) and µs′ (cm⁻¹) versus wavelength (nm); legend entries include Minutemaid orange, Tropicana orange, and V8 vegetable.]
4.2 Quality assessment of fruits
Optical property data are useful in at least two situations. First, they can be used to
study and quantify light scattering and distribution within the fruit via Monte Carlo
simulations or other numerical methods to solve the diffusion theory model. This helps
us to understand the penetration of light in the fruit tissue and properly design a light
source-detector configuration for more effectively assessing fruit quality. Fraser et al.
(2003) studied light distribution and penetration depth in mandarins at 808 nm using
Monte Carlo simulations. Since no optical property data were available for mandarins,
they chose the absorption and reduced scattering coefficients based on the data for
other fruits and trial simulation results. Secondly, the optical property data can also
be directly used for predicting chemical constituents and quality attributes of fruits.
This section shows a preliminary application example in the latter case of using the
optical properties µa and µs′ to assess quality attributes of apple fruit. It then describes
two empirical methods to analyze hyperspectral scattering images for predicting fruit
firmness and soluble-solids content (SSC) of apples and peaches.
4.2.1 Assessing fruit quality by optical properties
A recent study (Qin and Lu, 2006b) showed that both absorption and scattering coefficients of homogenized milk at 600 nm were highly correlated with the fat content
(r = 0.995 and 0.998, respectively). Such results may not be totally surprising, since
milk is an ideal homogeneous scattering material for which the diffusion approximation
theory is expected to work well. However, as apples have complex structural properties
and irregular geometry, the diffusion theory may not work as well for apples as for milk.
Absorption and reduced scattering coefficients were obtained from 650 Golden
Delicious apples over the wavelengths of 530–950 nm, using the procedure described
earlier. The firmness of these apples was measured using a Magness-Taylor (MT) probe
mounted on a Texture Analyzer (Model TA.XT2i, Stable Micro Systems, Surrey, UK),
and SSC was measured from the juice released during the MT testing using a digital
refractometer. The apples were then divided into two groups: three-fourths for calibration and one-fourth for validation. Multiple linear regression models were developed by
relating µa, µs′, and µeff to the fruit firmness and SSC of the calibration samples. The
models were then used to predict the fruit firmness and SSC of the validation samples. Figure 14.11 shows the correlations of µa, µs′, and µeff with the fruit firmness
of Golden Delicious apples from the validation group. The absorption coefficient has
the lowest correlation with MT firmness (r = 0.56; Figure 14.11a). A better correlation
(r = 0.60) is obtained between the reduced scattering coefficient and MT firmness
(Figure 14.11b). The best correlation with MT firmness (r = 0.67) is achieved with the
effective attenuation coefficient (Figure 14.11c). While the correlations are still low,
they are comparable with or better than those reported in studies using conventional
Vis/NIR spectroscopy (Lu et al., 2000; Park et al., 2003). Lower correlation between
MT firmness and µa can be expected because firmness is largely related to the structural
or mechanical properties of fruit. However, when µa, µs′, and µeff were used to predict the SSC of the apples, the correlations (r = 0.34, 0.44, and 0.47, respectively)
were much worse than those obtained with Vis/NIR spectroscopy. Lower correlation
with µs′ can be expected, since it is primarily related to the structural properties of the
fruit. The absorption coefficient µa, by contrast, would be expected to correlate well with the fruit SSC,
as light absorption is related to the chemical properties of the fruit. One possible
reason for its poor correlation could be the lower accuracy in determining µa compared with µs′. Since the absolute values of µa for Golden Delicious apples are generally at least one order of magnitude lower
than those of µs′ (Figure 14.9), the inverse curve-fitting algorithm may have introduced
larger errors in determining the absorption coefficient of individual apples. Further, the
experimental scattering profiles are normalized prior to the curve-fitting to avoid the
need to measure absolute reflectance. This normalization process could also lower
the fitting accuracy. Hence, improvements in the data analysis algorithm are needed
in determining the optical properties of fruit, especially the absorption coefficient.
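The calibration and validation procedure described above can be sketched in Python as follows. This is a minimal illustration on synthetic data, not the authors' implementation: the random three-fourths/one-fourth split, the bias-corrected definition of SEP, the function names, and the synthetic "spectra" and "firmness" values are all assumptions.

```python
import numpy as np


def split_calibration_validation(n_samples, calibration_fraction=0.75, seed=0):
    """Randomly split sample indices into calibration and validation sets (assumed scheme)."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    n_cal = int(round(calibration_fraction * n_samples))
    return order[:n_cal], order[n_cal:]


def fit_mlr(X, y):
    """Fit a multiple linear regression with intercept by ordinary least squares."""
    design = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


def predict_mlr(coef, X):
    """Apply fitted coefficients (intercept first) to new predictor values."""
    return coef[0] + X @ coef[1:]


def correlation_and_sep(y_true, y_pred):
    """Correlation coefficient r and a bias-corrected standard error of prediction."""
    r = np.corrcoef(y_true, y_pred)[0, 1]
    residual = y_pred - y_true
    sep = np.sqrt(np.sum((residual - residual.mean()) ** 2) / (len(residual) - 1))
    return r, sep


# Synthetic stand-in for optical coefficients at five wavelengths and MT firmness (N).
rng = np.random.default_rng(42)
n_fruit, n_wavelengths = 200, 5
spectra = rng.normal(15.0, 1.5, size=(n_fruit, n_wavelengths))
true_weights = np.array([2.0, -1.0, 0.5, 0.0, 1.5])
firmness = 40.0 + spectra @ true_weights + rng.normal(0.0, 2.0, n_fruit)

cal, val = split_calibration_validation(n_fruit)
coef = fit_mlr(spectra[cal], firmness[cal])
r, sep = correlation_and_sep(firmness[val], predict_mlr(coef, spectra[val]))
print(f"validation r = {r:.2f}, SEP = {sep:.2f} N")
```

Reporting r and SEP on the validation group only, as the chapter does, keeps the performance estimate independent of the samples used to fit the model.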
4.2.2 Assessing fruit quality using empirical approaches
This chapter has so far focused on using the fundamental approach (i.e. the diffusion
theory model) to determine the optical properties of fruits. While promising results
have been obtained on predicting fruit firmness by optical properties, the results for
predicting fruit SSC are far from satisfactory. Two empirical methods are presented
here for describing the scattering profiles of the hyperspectral images. The methods,
as presented below, are simpler and generally better than the fundamental approach in
predicting fruit firmness and the SSC of apples and peaches.
Figure 14.11 Prediction of Magness-Taylor (MT) firmness for Golden Delicious apples using
(a) absorption coefficient µa (r = 0.56, SEP = 11.2 N); (b) reduced scattering coefficient µs′ (r = 0.60, SEP = 10.8 N); and (c) effective attenuation
coefficient µeff (r = 0.67, SEP = 10.1 N), over the spectral range of 530–950 nm. [Plots of predicted firmness (N) versus MT firmness (N).]

Figure 14.12 Relative mean spectra obtained from the hyperspectral scattering images of 10 Golden
Delicious apples for a total scattering distance of 25 mm after correction by the mean spectrum of a white
Teflon standard. [Plot of relative reflectance versus wavelength (nm).]
In an experiment conducted in 2003, hyperspectral images were collected from 700
Golden Delicious apples that had been kept in a controlled environment (2 percent O2
and 3 percent CO2 at 0 °C) for about 5 months. Mean reflectance values were calculated
for each scattering profile for a total scattering distance of 25 mm (Figure 14.7) over the
wavelengths of 500–950 nm. Relative mean spectra of apples were obtained by dividing
the sample mean reflectance by the mean reflectance of a white Teflon standard (with the
dark current subtracted). Figure 14.12 shows the relative mean reflectance spectra for
10 Golden Delicious apples over wavelengths of 500–950 nm. Similar to the absorption
spectra in Figure 14.9, there are two absorption peaks for the mean relative reflectance
spectra, at 680 nm and 950 nm, due to chlorophyll and water, respectively. Compared
with the µa spectra in Figure 14.9, chlorophyll absorption peaks of the mean spectra
in Figure 14.12 appear to be sharper and more prominent. The relative reflectance
increases steadily over the wavelengths 725–900 nm.
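The calculation of a relative mean spectrum described above can be sketched in Python as follows. The array layout (rows are wavelength bands, columns are positions along the scattering distance), the function name, and the choice to subtract the dark current from both the sample and the reference image are assumptions; the chapter does not spell out these details.

```python
import numpy as np


def relative_mean_spectrum(sample_image, reference_image, dark_image):
    """Relative mean reflectance per wavelength band.

    Each input is a 2-D array with shape (n_wavelengths, n_spatial_positions).
    The dark current is subtracted from both images (an assumed convention),
    each profile is averaged over the spatial axis, and the sample mean is
    divided by the reference mean.
    """
    sample_mean = (sample_image - dark_image).mean(axis=1)
    reference_mean = (reference_image - dark_image).mean(axis=1)
    return sample_mean / reference_mean


# Tiny hypothetical example: two wavelength bands, four spatial positions.
dark = np.full((2, 4), 10.0)
reference = np.full((2, 4), 110.0)
sample = np.array([[60.0, 60.0, 60.0, 60.0],
                   [35.0, 35.0, 35.0, 35.0]])
print(relative_mean_spectrum(sample, reference, dark))
```

In practice the spatial axis would be cropped to the 25 mm scattering distance before averaging, and the bands would be restricted to 500–950 nm.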
Principal component analysis was applied to reduce the data dimensionality and
extract the main features. A back-propagation feed-forward neural network with inputs
of principal component scores was used to predict fruit firmness and SSC. Good predictions of fruit firmness and SSC of Golden Delicious apples are obtained using
relative mean spectra, with correlation coefficients of 0.85 and 0.89, respectively, and
the corresponding standard errors of prediction (or SEP) of 6.9 N and 0.72 percent
(Figure 14.13). Both firmness and SSC prediction results are considerably better than
those obtained using µa, µs′, or µeff, and the firmness prediction results are also better
than those using Vis/NIR spectroscopy (Lu et al., 2000; Park et al., 2003).
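The dimensionality-reduction step can be illustrated with a short Python sketch that computes principal component scores through a singular value decomposition. This is one common way to carry out principal component analysis and is not necessarily the procedure used in the study; the function name and the synthetic spectra are assumptions, and the neural network that takes the scores as inputs is omitted here.

```python
import numpy as np


def pca_scores(spectra, n_components):
    """Project mean-centered spectra onto their leading principal components.

    spectra has shape (n_samples, n_wavelengths). Returns the scores, the
    component loadings, and the fraction of variance explained by each component.
    """
    centered = spectra - spectra.mean(axis=0)
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    scores = centered @ components.T
    explained = singular_values ** 2 / np.sum(singular_values ** 2)
    return scores, components, explained[:n_components]


# Synthetic spectra driven by two latent factors plus a little noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(100, 2))
loadings = rng.normal(size=(2, 50))
spectra = latent @ loadings + 0.01 * rng.normal(size=(100, 50))

scores, components, explained = pca_scores(spectra, n_components=2)
print("scores shape:", scores.shape)
print("variance explained by two components:", round(float(explained.sum()), 4))
```

The scores (one row per fruit) would then serve as the inputs to a regression model or network in place of the full spectra.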
In another study, hyperspectral scattering images were obtained from 700 Red Haven
peaches using the same imaging system shown in Figure 14.3 but with a slightly
different arrangement for the light beam (1.6 mm in diameter and 17° incident angle).
Figure 14.13 Prediction of (a) Magness-Taylor (MT) firmness (r = 0.85, SEP = 6.90) and (b) soluble-solids content (SSC) (r = 0.89, SEP = 0.72) for
Golden Delicious apples using a back-propagation feed-forward neural network with the mean spectra of
hyperspectral scattering images over the wavelengths of 500–950 nm. [Plots of predicted firmness (N) versus MT firmness (N), and predicted SSC (Brix) versus measured SSC (Brix).]
A typical hyperspectral image from a peach fruit is shown in Figure 14.14a, which is
somewhat different from the one for the apple in Figure 14.7. First, the peach fruit has
lower reflectance at 600 nm and below, since its absorption coefficient appears to be
higher than that for apples. Second, the scattering profiles for peaches are broader than
those of apples. However, this does not necessarily mean that the reduced scattering
coefficient of peaches is higher than that of apples, since absorption and scattering are
Figure 14.14 (a) Hyperspectral image of a peach fruit covering the spectral range 500–1000 nm and a
scattering distance of 30 mm, and (b) a spectral scattering profile (circles) fitted by the two-parameter
Lorentzian function (equation (14.9), thin solid line). The offset distance of 1.6 mm is ignored in applying
the Lorentzian distribution function. [Axis labels: spectral axis; intensity (CCD count) versus distance (mm).]
Instead of using mean spectra, a two-parameter Lorentzian distribution function was
proposed to describe each scattering profile over wavelengths of 600–1000 nm:

$$R(x) = \frac{b}{1 + (x/c)^2} \qquad (14.9)$$

where x is the distance in mm, b represents the peak value of the scattering profile, and
c is the full scattering width at half maximum (mm). For convenience, the scattering
distance on the left side of the beam incident point was designated to be negative and the
right side to be positive. The offset distance of 1.6 mm between the light beam center
and the scanning line was not considered. As shown in Figure 14.14b, the two-parameter
Lorentzian distribution function gives an excellent fit to the scattering profiles. For all
700 peach fruit, the average values of the correlation coefficient for the curve-fitting
results are greater than 0.995 for wavelengths of 600–1000 nm. Spectra of Lorentzian
parameters b and c for selected peach fruit are shown in Figure 14.15. The scattering
width, represented by Lorentzian parameter c, is relatively flat over the entire spectral
region. While the parameter b, which has not been corrected by a standard, changes
dramatically over the spectral region of 600–1000 nm, the magnitude of change for
the parameter c is much smaller. The values of Lorentzian parameter c vary between
2.5 mm and 3.8 mm for all peach fruit. There are two downward peaks on the parameter
c spectra; one is around 675 nm and the other at 950 nm, due to chlorophyll and water.

Figure 14.15 Spectra of Lorentzian parameters b and c (equation (14.9)) for selected peach fruit (Lu and
Peng, 2006). [Plots of parameter b (CCD counts) and parameter c (mm) versus wavelength (nm).]

Figure 14.16 Prediction of Magness-Taylor (MT) firmness of peaches using (a) Lorentzian parameter b
(peak value) (r = 0.85, SEP = 15.9); (b) Lorentzian parameter c (scattering width) (r = 0.82, SEP = 17.7); and (c) the combination of parameters b and c (r = 0.88, SEP = 14.2)
(Lu and Peng, 2006). [Plots of predicted firmness (N) versus MT firmness (N).]
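Fitting a two-parameter Lorentzian profile at a single wavelength can be sketched in Python as below. The sketch assumes the functional form R(x) = b / (1 + (x/c)²) and, for simplicity, uses a linearized least-squares fit (1/R is linear in x²) rather than a nonlinear curve-fitting routine; the chapter does not state which fitting algorithm was used. The function names and the synthetic profile are hypothetical.

```python
import numpy as np


def lorentzian(x, b, c):
    """Assumed two-parameter Lorentzian profile: R(x) = b / (1 + (x / c)^2)."""
    return b / (1.0 + (x / c) ** 2)


def fit_lorentzian_linearized(x, intensity):
    """Estimate b and c from the linear relation 1/R = 1/b + x^2 / (b c^2).

    This is an illustrative shortcut. It weights the low-intensity tails
    heavily, so a nonlinear least-squares fit is preferable for noisy data.
    """
    slope, intercept = np.polyfit(x ** 2, 1.0 / intensity, 1)
    b = 1.0 / intercept
    c = np.sqrt(intercept / slope)
    return b, c


# Synthetic, noise-free profile; distances left of the incident point are negative.
x = np.linspace(-15.0, 15.0, 61)            # scattering distance, mm
profile = lorentzian(x, b=3000.0, c=3.0)    # intensity, CCD counts

b_fit, c_fit = fit_lorentzian_linearized(x, profile)
r_fit = np.corrcoef(profile, lorentzian(x, b_fit, c_fit))[0, 1]
print(f"b = {b_fit:.1f} CCD counts, c = {c_fit:.3f} mm, r = {r_fit:.4f}")
```

Repeating the fit for every wavelength band yields the spectra of b and c that are then related to fruit firmness.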
Multiple linear regression models were developed relating Lorentzian parameters b, c,
and their combination b & c (set side by side according to wavelength) to fruit firmness
for the calibration samples. Both Lorentzian parameters b and c are well correlated with
the firmness of peach fruit; however, parameter c, the scattering width, gives worse
predictions of fruit firmness than does parameter b (Figure 14.16). Better predictions
of fruit firmness (r = 0.88) are obtained when parameters b and c are combined.
The above two application examples indicate that the empirical approach gives better
predictions of fruit firmness and SSC for apples and peaches than those obtained with
absorption and scattering coefficients. Poor results from the fundamental approach
could be attributed to errors in determining the absorption and reduced scattering
coefficients. Moreover, the empirical approach also compares favorably with NIR
spectroscopy in predicting fruit firmness. This could be due to the fact that the hyperspectral scattering method can better characterize the scattering properties of the fruit
than does NIR spectroscopy.
5 Conclusions
Rapid and accurate determination of the internal quality of fresh fruit poses technical
challenges because of the complex structural, physical, and chemical properties of
fruit. Researchers are continuing to investigate new, better methods and techniques for
assessing fruit quality. The hyperspectral imaging technique described in this chapter
provides a new opportunity for determining the optical properties and quality of fruit
and other food and agricultural products. Compared with other techniques currently
available, the hyperspectral imaging technique is faster, simpler, and easier to use for
determining the optical properties of turbid and opaque food and agricultural products.
The chapter has presented new optical property data for fresh apples, peaches, and
fruit juices, which will be useful in quantitative analysis of light absorption and scattering in these products. While absorption and reduced scattering coefficients are related
to fruit firmness and soluble-solids content, the results are still far from satisfactory.
This could be due to errors introduced in the curve-fitting process, and the deviation
of fruit shape from the model. Thus, improvements in the data analysis algorithm are
needed so that the absorption and scattering coefficients can be determined more accurately and consistently. The approach of using empirical equations to describe scattering
profiles achieved better prediction of fruit firmness and soluble-solids content than did
the fundamental approach. Hence the hyperspectral scattering technique is potentially
useful in assessing, sorting, and grading fruit quality. However, since hyperspectral
scattering images contain a large amount of information, further research should be
done on developing new or improved mathematical and/or statistical methods for better
description of spatial scattering profiles in order to achieve accurate prediction of fruit
firmness and soluble-solids content.
Acknowledgments
The author would like to thank Mr Jianwei Qin, a PhD graduate student in Biosystems
Engineering at Michigan State University, for his assistance in preparing this chapter.
Nomenclature
Wavelength, µm
µa  Absorption coefficient, cm⁻¹
µeff  Effective attenuation coefficient [µeff = [3µa(µa + µs′)]^(1/2)], cm⁻¹
µs  Scattering coefficient, cm⁻¹
µs′  Reduced scattering coefficient [µs′ = µs(1 − g)], cm⁻¹
µt′  Total interaction coefficient (µt′ = µa + µs′), cm⁻¹
Internal reflection coefficient
a′  Transport albedo [a′ = µs′/(µa + µs′)]
b  Lorentzian function parameter representing the peak value of the scattering profile, CCD counts
c  Lorentzian function parameter representing the scattering width at half maximum, mm
g  Scattering anisotropy parameter
Refractive index of the ambient air
Refractive index of the medium or sample
Relative refractive index
r  Distance from the incident point, cm
ro  The distance between the beam incident point and the scanning line (ro = 0.16 cm), cm
Distance from the observation point at the interface to the isotropic source in Farrell model (equation (14.1)), cm
Distance from the observation point at the interface to the image source in Farrell model (equation (14.1)), cm
Re(r)  Diffuse reflectance measured from a sample at distance r from the incident point
Ren(r)  Normalized experimental diffuse reflectance [Ren = Re(r)/Re(ro)]
Rf(r)  Diffuse reflectance at distance r calculated from Farrell model (equation (14.1))
Rfn(r)  Normalized Farrell model-calculated diffuse reflectance [Rfn = Rf(r)/Rf(ro)]
x  Scattering distance used in Lorentzian distribution function (equation (14.7)), mm
References
Abbott JA, Lu R, Upchurch BL, Stroshine RL (1997) Technologies for nondestructive
quality evaluation of fruits and vegetables. In Horticultural Reviews, 20 (Janick J, ed.).
New York: John Wiley & Sons, Inc., pp. 1–121.
Ariana DP, Lu R, Guyer DE (2006) Near-infrared hyperspectral reflectance imaging
for detection of bruises on pickling cucumbers. Computers and Electronics in
Agriculture, 53 (1), 60–70.
Bevilacqua F, Depeursinge C (1999) Monte Carlo study of diffuse reflectance at source-detector separations close to one transport mean free path. Journal of the Optical
Society of America A – Optics Image Science and Vision, 16 (2), 2935–2945.
Birth GS (1978) The light scattering properties of foods. Journal of Food Science, 43,
Birth GS (1982) Diffuse thickness as a measure of light scattering. Applied Spectroscopy,
36 (6), 675–682.
Birth GS, Davis CE, Townsend WE (1978) The scattering coefficient as a measure of pork
quality. Journal of Animal Science, 46 (3), 639–645.
Chance B, Cope M, Gratton E, Ramanujam N, Tromberg B (1998) Phase measurement of
light absorption and scattering in human tissue. Review of Scientific Instruments, 69
(10), 3457–3481.
Cubeddu R, D’Andrea C, Pifferi A, Taroni P, Torricelli A, Valentini G, Dover C,
Johnson D, Ruiz-Altisent M, Valero C (2001a) Nondestructive quantification
of chemical and physical properties of fruits by time-resolved reflectance
spectroscopy in the wavelength range 650–1000 nm. Applied Optics, 40 (4),
Cubeddu R, D’Andrea C, Pifferi A, Taroni P, Torricelli A, Valentini G, Ruiz-Altisent M,
Valero C, Ortiz C, Dover C, Johnson D (2001b) Time-resolved reflectance spectroscopy
applied to the nondestructive monitoring of the internal optical properties in apples.
Applied Spectroscopy, 55 (1), 1368–1374.
Dam JS, Andersen PE, Dalgaard T, Fabricius PE (1998) Determination of tissue optical
properties from diffuse reflectance profiles by multivariate calibration. Applied Optics,
37 (4), 772–778.
Dam JS, Pedersen CB, Dalgaard T, Fabricius PE, Aruna P, Andersson-Engels S (2001)
Fiber-optic probe for noninvasive real-time determination of tissue optical properties
at multiple wavelengths. Applied Optics, 40 (7), 1155–1164.
Doornbos RMP, Lang R, Aalders MC, Cross FW, Sterenborg HJCM (1999) The determination of in vivo human tissue optical properties and absolute chromophore
concentrations using spatially resolved steady-state diffuse reflectance spectroscopy.
Physics in Medicine and Biology, 44 (4), 967–981.
Farrell TJ, Patterson MS, Wilson B (1992) A diffusion-theory model of spatially resolved,
steady-state diffuse reflectance for the noninvasive determination of tissue optical
properties in vivo. Medical Physics, 19 (4), 879–888.
Fraser DG, Jordan RB, Künnemeyer R, McGlone VA (2003) Light distribution inside
mandarin fruit during internal quality assessment by NIR spectroscopy. Postharvest
Biology and Technology, 27 (2), 185–196.
Gobin L, Blanchot L, Saint-Jalmes H (1999) Integrating the digitized backscattered image
to measure absorption and reduced-scattering coefficients in vivo. Applied Optics, 38
(9), 4217–4227.
Groenhuis RAJ, Ferwerda HA, Ten Bosch JJ (1983) Scattering and absorption of turbid
materials determined from reflection measurements–1. Theory. Applied Optics, 22
(16), 2456–2462.
Hale GM, Querry MR (1973) Optical constants of water in the 200-nm to 200-micrometer
wavelength region. Applied Optics, 12 (3), 555–563.
Kienle A, Patterson MS (1997) Determination of the optical properties of semi-infinite turbid media from frequency-domain reflectance close to the source. Physics in Medicine
and Biology, 42 (9), 1801–1819.
Kienle A, Lilge L, Patterson MS, Hibst R, Steiner R, Wilson BC (1996) Spatially resolved
absolute diffuse reflectance measurements for noninvasive determination of the optical scattering and absorption coefficients of biological tissue. Applied Optics, 35 (13),
Kim MS, Lefcourt AM, Chao K, Chen YR, Kim I, Chan DE (2002) Multispectral detection
of fecal contamination on apples based on hyperspectral imagery: Part I – application
of visible and near-infrared reflectance imaging. Transactions of the ASAE, 45 (6),
Kortüm G (1969) Reflectance Spectroscopy – Principles, Methods, Applications. New York:
Lawrence KC, Park B, Windham WR, Mao C (2003) Calibration of a pushbroom hyperspectral imaging system for agricultural inspection. Transactions of the ASAE, 46 (2),
Lin SP, Wang L, Jacques SL, Tittel FK (1997) Measurement of tissue optical properties
by the use of oblique-incidence optical fiber reflectometry. Applied Optics, 36 (1),
Lu R (2003a) Imaging spectroscopy for assessing internal quality of apple fruit. ASAE
Paper No. 036012, ASAE, St Joseph, MI, USA.
Lu R (2003b) Detection of bruises on apples using near-infrared hyperspectral imaging.
Transactions of the ASAE, 46 (2), 523–530.
Lu R, Abbott JA (2004) Force/deformation techniques for measuring texture. In Texture
in Food: Volume 2: Solid Foods (Kilcast D, ed.). Cambridge: Woodhead Publishing
Limited, pp. 109–145.
Lu R, Chen YR (1998) Hyperspectral imaging for safety inspection of foods and agricultural
products. In Proceedings of SPIE Vol. 3544 – Pathogen Detection and Remediation
for Safe Eating (Chen YR, ed.). Bellingham: SPIE, pp. 121–133.
Lu R, Peng Y (2006) Hyperspectral scattering for assessing peach fruit firmness. Biosystems
Engineering, 93 (2), 161–171.
Lu R, Guyer DE, Beaudry RM (2000) Determination of firmness and sugar content of apples
using near-infrared diffuse reflectance. Journal of Texture Studies, 31 (4), 615–630.
Lu R, Qin J, Peng Y (2006) Measurement of the optical properties of apples by hyperspectral
imaging for assessing fruit quality. ASABE Paper No. 066179, St Joseph, MI, USA.
Martinsen P, Schaare P (1998) Measuring soluble solids distribution in kiwifruit using near-infrared imaging spectroscopy. Postharvest Biology and Technology, 14 (3), 271–281.
Mobley J, Vo-Dinh T (2003) Optical properties of tissue. In Biomedical Photonics
Handbook (Vo-Dinh T, ed.). Boca Raton: CRC Press LLC, pp. 2, 1–75.
Mourant JR, Fuselier T, Boyer J, Johnson TM, Bigio IJ (1997) Predictions and measurements
of scattering and absorption over broad wavelength ranges in tissue phantoms. Applied
Optics, 36 (4), 949–957.
Nichols MG, Hull EL, Foster TH (1997) Design and testing of a white-light, steady-state diffuse reflectance spectrometer for determination of optical properties of highly
scattering systems. Applied Optics, 36 (1), 93–104.
Park B, Lawrence KC, Windham WR, Buhr RJ (2002) Hyperspectral imaging for detecting
fecal and ingesta contaminants on poultry carcasses. Transactions of the ASAE, 45 (6),
Park B, Abbott JA, Lee KJ, Choi CH, Choi KH (2003) Near-infrared diffuse reflectance for
quantitative and qualitative measurement of soluble solids and firmness of Delicious
and Gala apples. Transactions of the ASAE, 46 (6), 1721–1731.
Patterson MS, Chance B, Wilson BC (1989) Time resolved reflectance and transmittance
for the noninvasive measurement of tissue optical properties. Applied Optics, 28 (12),
Patterson MS, Moulton JD, Wilson BC, Berndt KW, Lakowicz JR (1991) Frequency-domain
reflectance for the determination of the scattering and absorption properties of tissue.
Applied Optics, 30 (31), 4474–4476.
Peng Y, Lu R (2006) Improving apple fruit firmness predictions by effective correction of
multispectral scattering images. Postharvest Biology and Technology, 41 (3), 266–274.
Qin J, Lu R (2006a) Hyperspectral diffuse reflectance imaging for rapid, noncontact
measurement of the optical properties of turbid materials. Applied Optics, 45 (32),
Qin J, Lu R (2006b) Measurement of the optical properties of apples using hyperspectral
diffuse reflectance imaging. ASABE Paper No. 063037, St Joseph, MI, USA.
Tromberg BJ, Coquoz O, Fishkin JB, Pham T, Anderson E, Butler J, Cahn M, Gross JD,
Venugopalan V, Pham D (1997) Non-invasive measurements of breast tissue optical
properties using frequency-domain photon migration. Philosophical Transactions:
Biological Sciences, 352 (1354), 661–668.
Tuchin V (2000) Tissue Optics: Light Scattering Methods and Instruments for Medical
Diagnosis. Bellingham: SPIE Press.
Vo-Dinh T (ed.) (2003) Biomedical Photonics Handbook. Boca Raton: CRC Press LLC.
Wang L, Jacques SL (1995) Use of a laser beam with an oblique angle of incidence to
measure the reduced scattering coefficient of a turbid medium. Applied Optics, 34
(13), 2362–2366.
Williams PC, Norris K, ed. (2001) Near-Infrared Technology in the Agricultural and Food
Industries, 2nd edn. St Paul: AACC, Inc.
Quality Evaluation of Wheat
Digvir S. Jayas¹, Prabal K. Ghosh², Jitendra Paliwal², and Chithra Karunakaran³
¹ Stored-Grain Ecosystems, Winnipeg, Manitoba, Canada, R3T 2N2
² Department of Biosystems Engineering, University of Manitoba, Winnipeg, Manitoba, Canada, R3T 5V6
³ Canadian Light Source, Saskatoon, Saskatchewan, Canada, S7N 0X4
1 Introduction
Global wheat production was about 630 million tonnes in 2005 from a total harvested
area of 217 Mha (FAOSTAT, 2006). The US, Canada, Argentina, France, and Australia
are the major wheat-growing and exporting countries in the world. Different wheat
classes and varieties are grown in these countries for different end uses. The quality of
wheat varies significantly due to differences in physiology, growing conditions, crop
management practices, and grain handling and storage techniques. Quality monitoring
of wheat is therefore an important consideration in these countries and around the
world to ensure marketing of wheat of consistent quality. Different grading standards
are established with respect to the various wheat classes produced in these countries
on the basis of end-use characteristics.
For example, in Canada, producers store their grain on the farm and deliver it in
farm trucks to the primary (country) elevators (grain-handling and storage facilities).
The grain is graded by visual inspection and in comparison with standard samples
(Anonymous, 1987). The standard samples are prepared every year to reflect the environmental conditions during harvest and carryover from the previous crop year. The
main grading factors are test weight (kg/hl), varietal purity, vitreousness, soundness,
and maximum limits of foreign materials. Full description of the grade standards and
their requirements for all the Canadian wheat classes is publicly available at the Canadian Grain Commission’s website (http://grainscanada.gc.ca/Pubs/GGG/ggg-e.htm).
Although moisture and protein content are not grading factors, the former is determined to assess the drying costs and safe storage period, while the latter is determined
to segregate wheat by protein content in order to charge a premium for high-protein
wheat. Similar factors are used in other countries.
Railcars transport the grain from primary elevators to terminal elevators (export
elevators). The terminal elevators receive, grade, clean, temporarily store, and ship
grain to importing countries. Also, the Canada Grain Act (1975) specifies zero tolerance
for stored-product insects in grain, necessitating the detection of such pests. Apart from
test weight, moisture content, and protein content determination, all other factors are
resolved by visual inspection. The test weight is determined, using a standard procedure,
by measuring the amount of wheat required to fill a known volume container, and is
expressed in kg/hl or lb/bu. The moisture content is determined using a specified
moisture meter (e.g. Model 1200A, Seedburo Equipment Co., Chicago, IL, or Model
919/3.5, Labtronics, Winnipeg, MB), and protein content is determined using a near-infrared spectroscopy based instrument (e.g. Tecator InfraTec Model 1225, Foss North
America, Silver Springs, MD). There is a need to develop fast, accurate, automated,
objective, on-line systems to monitor other grading factors and several non-grading
factors (e.g. the presence of insects or insect-damaged kernels) that affect the end-use
quality of wheat. Although still at the research stage, there are several situations where
elevators can use machine vision, soft X-ray systems or other imaging systems for
grain quality monitoring – for example:
• to positively identify grain on receiving it so that automation can proceed and grains of different types will not be mixed
• to determine the contents of grain samples collected before and after a cleaning system, for optimization of the cleaning process
• to monitor the constituents of grain as it is being loaded onto ships so that grain meets the specifications of the importing country
• to determine the impurities by other classes of wheat in the main class
• to determine the foreign material in wheat
• to detect immature life stages inside kernels of cereals, and immature and adult insects outside grain kernels.
For terminal elevators, the grain industry, and farmers, there can be significant
potential benefits in adopting these advanced technologies. For example, if low-level
infestations can be detected quickly, then infested grain would not be mixed with other
non-infested grain. Such detection would reduce the cost of chemical fumigation by
treating the small quantity of infested grain rather than a large bin full of contaminated grain.
This chapter gives general descriptions of machine vision, near infrared spectroscopy
(NIRS), hyper-spectral imaging, soft X-ray imaging, and thermal imaging systems, and
representative results obtained for wheat quality monitoring. Also, a brief description
is given of devices built to singulate kernels for presentation to the imaging systems,
and four potential applications of the current technology at grain-handling facilities
are highlighted.
2 Machine vision
2.1 Context for wheat quality monitoring
Machine vision, also known as computer vision, is the science that deals with object
recognition and classification by extracting useful information about the object from its
image or image set. It is a branch of artificial intelligence combining image-processing
and pattern-recognition techniques. The major tasks performed by a machine vision
system can be grouped into three processes: image acquisition, processing or analysis,
and recognition. Various characteristics of the objects are extracted and final decisions
are made using different image-processing algorithms and pattern-recognition techniques, respectively. Machine-vision based inspection is already in commercial use
in automotive, electronics, and other industries. Many of the industrial objects being
inspected are of defined size, shape, color, and texture. Agricultural or biological
objects, including grain kernels, on the other hand, are of variable size, shape, color,
and texture. In addition, these features may vary from year to year, by growing region
within a year, and even over a single growing season. These variations pose additional
image-processing challenges. Further challenges come from the work environment
in the grain-handling facilities – for example, the simple task of identifying a sprocket
on the railcar for a robot to open and close the gate requires detection of different types
of sprockets, located at different positions on the railcars, under different lighting
conditions (Jayas et al., 2005).
Substantial work dealing with the use of morphological (size and shape) features
for classification of different grain species, classes, varieties, damaged grains, and
impurities has been reported in the literature (e.g. Neuman et al., 1987; Keefe, 1992;
Majumdar and Jayas, 2000a). A few investigations with selected clean samples have
been carried out using color and reflectance features (Hawk et al., 1970; Neuman
et al., 1989a, 1989b; Sapirstein and Bushuk, 1989; Shatadal et al., 1995a; Crowe
et al., 1997; Luo et al., 1999a; Majumdar and Jayas, 2000b; Paliwal et al., 2004a) for
classification of cereal grains and their varieties, and for correlating the vitreosity and
grain hardness of durum wheat. Only limited research has been reported based on the
use of textural features (Majumdar et al., 1999; Majumdar and Jayas, 2000c, 2000d)
for classification purposes. Efforts have also been made to integrate all these features
into a single classification vector (Paliwal et al., 1999) for grain kernel identification.
Most of the reported studies on machine vision analysis of grain samples have used a
manual system for placement of kernels in the camera’s field of view (FOV) to ensure
that kernels do not touch (Keefe and Draper, 1986; Zayas et al., 1986, 1989; Myers
and Edsall, 1989; Neuman et al., 1989a, 1989b; Symons and Fulcher, 1988a, 1988b,
1988c, 1988d; Sapirstein and Bushuk, 1989). Also, these studies did not report the
variations in the features caused by the samples of clean grain from various growing locations. For industrial applications, however, an automated sample presentation
device will be required and the system will have to be able to handle unclean samples
from many growing regions. A commercial grain-handling facility handles grains of
different grades grown under a wide range of climatic variables. All these grains must
be identified and quantified for practical implementation in optimizing the cleaning of
grain and the shipping of consistent quality grain.
2.2 Area-scan imaging
In area-scan imaging, a camera is used to capture the image of a specified area. This
area may include one or more singulated objects or a bulk sample. The camera used for
imaging might be analog or digital. The analog area-scan based system (Figure 15.1)
includes a monochrome or color camera with or without zoom lens or close-up lens,
a camera control unit, a monochrome or color monitor, a frame-grabbing board, and
a computer system. To provide rigid, stable support and easy vertical movement, the
camera is usually mounted on a stand.
Illumination can be provided using incandescent, halogen, or fluorescent lighting.
Luo et al. (1997) evaluated incandescent, halogen, and fluorescent lamps, as well as
a fluorescent lamp with a light controller incorporated as part
of its power supply. Output gray levels from the three bands of the camera (red, green,
and blue, or RGB) were recorded for a range of lamp supply voltages and for an 8-hour
period with constant lamp supply voltages. Illumination uniformities over the FOV of
the camera were examined. Based on the results, the fluorescent lamps with controller
were found to give the most stable and uniform illumination. Also, fluorescent tubes
operated at a cooler temperature. Therefore, fluorescent lights are recommended for
continuous illumination in machine vision systems. Different shapes of fluorescent
lights (e.g. strip, ring, panel or compact) are currently being used. Light-emitting
diodes (LED) with an increased life, a smaller size, and easier handling are available
in a wide variety of shapes (e.g. ring, line, area or spot for producing directional light
and enhanced brightness). These are slowly replacing the fluorescent lighting systems.
Figure 15.1 Area-scan imaging system (photograph courtesy of Canadian Wheat Board Centre for Grain Storage Research, Winnipeg, MB, Canada).
To uniformly light the sample, a light diffuser is usually required. For a monochrome
camera the signal is converted to a gray-scale image and saved. For a color camera
system, the composite color signal is converted by the camera control unit (at a speed
of 30 frames per second for NTSC cameras and at 25 frames per second for PAL
cameras) into three parallel analog RGB video signals and a synchronous signal.
The frame grabber installed in the computer system digitizes the RGB analog video
signals from the camera control unit into digital images and stores them for further
analysis. Frame grabbers can be replaced by digital cameras. The image-acquisition
algorithms and procedures are specific to the camera and the associated system. Spatial
calibration of the system should be done using an object of known consistent size (e.g.
a coin), and color calibration should be done using a standard card (e.g. the Kodak
color card). Such calibrations should be carried out regularly to ensure that no drift
has occurred in the light or in the system. Once the images have been acquired, the
objects must be separated from the background and then algorithms must be written
to extract the necessary features. For a color-camera based system, such details are
given elsewhere (Karunakaran et al., 2001). Continuous improvement is under way
for developing new and advanced cameras with increased performance in terms of
producing high-resolution images in a short time, and easy and secure connectivity
for faster data transfer to the computer system through Camera Link, FireWire, Gigabit
Ethernet or USB ports. Camera manufacturers are currently focused on incorporating
digital signal processors (DSP) inside the camera or in the frame-grabbing units for
faster image processing. A uniform application programming interface (API) is also
being developed to allow all compliant cameras to be programmed with a common
generic programming interface without writing separate algorithms for specific camera
systems. Connolly (2006) describes the recent trends in machine vision technologies
and future directions.
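The spatial calibration step described above can be illustrated with a short Python sketch. It converts pixel measurements to millimetres using a reference object of known size and flags drift between successive calibrations. The function names, the 1 percent tolerance, and the example numbers are assumptions made for illustration and are not taken from the cited studies.

```python
def mm_per_pixel(reference_size_mm, measured_size_px):
    """Spatial scale from a reference object of known size (e.g. a coin)."""
    if reference_size_mm <= 0 or measured_size_px <= 0:
        raise ValueError("sizes must be positive")
    return reference_size_mm / measured_size_px


def scale_has_drifted(baseline_scale, current_scale, tolerance=0.01):
    """True if the scale changed by more than `tolerance` (as a fraction)."""
    return abs(current_scale - baseline_scale) / baseline_scale > tolerance


# Hypothetical example: a 20.0 mm reference disc spans 100 pixels.
scale = mm_per_pixel(20.0, 100.0)   # millimetres per pixel
kernel_length_mm = 31 * scale       # a kernel measured as 31 pixels long
```

Repeating the measurement at regular intervals and comparing the new scale with the baseline is one simple way to detect a camera or stand that has moved.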
2.3 Line-scan imaging
A line-scan imaging system acquires images of moving objects one line at a time
and then assembles the lines into a rectangular or square image. This requires precise
control of the movement of the conveyor system so that consecutive lines being captured
neither overlap nor skip space. The line-scan camera-based system includes a conveyor
belt system, a monochrome or color line-scan camera and associated frame-grabbing
board, a power supply, a computer system, and an illumination source (Figure 15.2).
Similar to an area-scan camera, the line-scan camera can be fitted with lenses to focus
selectively on moving objects. The frame-grabber board supplies control information
to the camera, and monitors the speed of the conveyor belt via the output from a
rotary shaft encoder. The grain sample can be poured into a hopper attached to the
conveyor belt. Two-dimensional images are created by combining the desired number
of sequential lines, and the image files are saved. An example of a line-scan camera
system is described by Crowe et al. (1997). The current focus on developing API and
advanced DSP-based cameras may lead to the replacement of frame-grabbing boards.
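The requirement that consecutive lines neither overlap nor skip space can be expressed as a simple relationship between belt speed and the size of a pixel on the belt. The Python sketch below is an illustration only: it assumes square pixels, a belt driven by a roller fitted with a rotary encoder, and hypothetical function names and values that do not come from the system of Crowe et al. (1997).

```python
import math


def required_line_rate(belt_speed_mm_s, pixel_size_mm):
    """Lines per second so that each line covers exactly one pixel of belt travel."""
    return belt_speed_mm_s / pixel_size_mm


def encoder_pulses_per_line(pulses_per_rev, roller_diameter_mm, pixel_size_mm):
    """Encoder pulses that should elapse between two consecutive line captures."""
    belt_travel_per_pulse_mm = math.pi * roller_diameter_mm / pulses_per_rev
    return pixel_size_mm / belt_travel_per_pulse_mm


def assemble_frames(lines, lines_per_frame):
    """Group sequential scan lines into two-dimensional images of fixed height."""
    usable = len(lines) - len(lines) % lines_per_frame
    return [lines[i:i + lines_per_frame] for i in range(0, usable, lines_per_frame)]


# Hypothetical example: 0.1 mm pixels on a belt moving at 100 mm/s.
line_rate = required_line_rate(100.0, 0.1)
```

Triggering the camera from the encoder rather than from a fixed clock keeps the line spacing constant even if the belt speed fluctuates.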
The shape, size, color, and textural information can be extracted from the acquired
images of individual kernels, but only color and textural information can be extracted
from bulk images. If the classification is to be based on the information from individual
objects, then there is a need to present all objects individually for imaging to an area-scan camera or to a line-scan camera. If classification is to be based on information
from bulk images, then there is a need to present bulk samples consistently to a camera.
Three systems developed and evaluated at the Canadian Wheat Board Centre for Grain
Storage Research (CWBCGSR) are highlighted below.
Figure 15.2 Line-scan imaging system, showing the grain feeder (photograph courtesy of Canadian Wheat Board Centre for Grain Storage Research, Winnipeg, MB, Canada).
Figure 15.3 An automated seed presentation device for use in machine vision identification of grain (photograph courtesy of Canadian Wheat Board Centre for Grain Storage Research, Winnipeg, MB, Canada).
2.4 Sample presentation devices
Figure 15.4 Dual conveyor grain presentation device: (a) schematic diagram showing the line-scan camera, lamp shield, grain hopper, secondary conveyor belt, and primary conveyor belt (Visen, 2002), (b) model prototype (photograph courtesy of Canadian Wheat Board Centre for Grain Storage Research, Winnipeg, MB, Canada).
A kernel-positioning system (Figure 15.3) was designed and fabricated to automatically
pick up and separate kernels of various grain types (Jayas et al., 1999). The system was
tested for its ability to separate kernels of four grain types: wheat (cv. Katepwa), barley
(cv. Manley), canola (cv. Tobin), and lentils (cv. Eston). Wheat and barley were tested
at three different moisture contents, and canola was tested at two moisture contents.
To determine whether the system had any bias towards the selection of certain seeds,
mixtures of different grain types were also used in the tests. The separation success
rates for samples of wheat, barley, canola, and lentils were 92, 79, 97, and 89 percent,
respectively. Non-separation events were either no kernels or multiple kernels. The
ability of the system to pick up and separate kernels was not influenced by moisture
content. In mixtures of grains (e.g. barley in wheat at 1, 3, and 5 percent levels by mass)
there was no significant difference in the number of “imageable” wheat kernels. The
system, however, had a bias to pick more kernels of the major grain component present
in the mixtures of wheat and canola. For example, in a 99–95 percent canola and 1–5
percent wheat sample, the device picked less than 0.2 percent wheat. The unit also has
many moving parts, and could therefore require considerable repair and maintenance
when used in an industrial setting.
To remove bias for picking canola (small) kernels and to make a unit that can be
used in the grain industry, where the environment is inherently dusty and vibration
prone, another prototype was built and tested (Figures 15.4a, 15.4b) (Visen, 2002). The
device consisted of a hopper and a seed-positioning device (slotted roll) that delivered
the kernels to a conveyor belt in a single layer. This conveyor belt was positioned
slightly over a second conveyor belt of the same size, running in the same direction,
but at a higher speed. As the seeds reached the end of the first belt, they fell onto
the second belt and dispersed. A line-scan camera, situated at some distance down
the length of this belt, then took images of individual seeds as they passed by. Three
different speed differentials were tested for separating five different classes of grain.
This unit is less complex, and successfully separated kernels of lentils, oats, barley,
wheat, and canola. The separated kernels are presented on a moving belt to the camera.
Figure 15.5 Singulation device for presentation of grains to a soft X-ray machine, with labeled seed hopper, rotating disc, acrylic board, wheat kernel, and polyethylene foam (source: Melvin et al.,
The device was successful in separating the kernels of all these grain types, with mean
separation percentages of 94.5 percent, 96.1 percent, and 95.8 percent for the three
belt-speed combinations used (unpublished data). When grain samples were mixed
with secondary grains at levels of 1 percent, 3 percent, and 5 percent, little or no
difference was exhibited in the performance of the device. The images acquired from
the line-scan camera were compared to those obtained using an area-scan camera for
five cereal grains, namely barley, oats, rye, wheat, and durum wheat. No significant
difference was found in the morphological, color, and textural features of the images
acquired by area- or line-scan cameras.
A device to present individual kernels to the X-ray machine was developed and
tested (Figure 15.5) (Melvin et al., 2003). The device was able to present individual
kernels with 60–80 percent efficiency for different grain types. Failure resulted from
presenting two kernels or no kernels to the viewing area. From a practical point of
view, an empty viewing area is not a concern because the algorithm can easily detect and discard
such images. Failures with two kernels can be handled by integrating a separation algorithm.
2.5 Development of separation algorithms
Mechanical systems such as described in section 2.4 cannot separate all kernels, and
therefore there is a need to separate touching kernels using software. An algorithm to
separate contiguous grain-kernel image-regions was developed (Shatadal et al., 1995b,
1995c). The disconnect algorithm is based on the principles of mathematical morphology. The disconnect algorithm was successful in separating conjoint kernel regions of
Hard Red Spring (HRS) wheat, durum wheat, barley, rye, and oats, with success rates
of 95, 95, 94, 89, and 79 percent, respectively. The speed of processing the images was
slow and required improvement.
Another algorithm was developed to fit ellipses to images of separated and touching
kernels with random orientations (Shashidhar et al., 1997; Visen et al., 2001). The
algorithm was evaluated for its ability to count objects in the images and to estimate
length, width, perimeter, and area of individual objects. The estimated parameters were
compared with measured parameters obtained using algorithms for extraction of morphological features. All 300 kernels were counted correctly; however, the randomness
of sampling points for the ellipse fitting may result in missing some objects in other
trial runs. Most of the estimated size features were not significantly different from the
measured parameters at P > 0.05.
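To illustrate how length, width, and orientation can be estimated from an ellipse, the following Python sketch computes the ellipse that has the same second-order central moments as a kernel region. This moment-based approach is a swapped-in simplification chosen for brevity. It is not the random-sampling ellipse fit used in the cited studies, and the function name and synthetic test region are assumptions.

```python
import math


def equivalent_ellipse(pixels):
    """Ellipse with the same centroid and second central moments as the region.

    `pixels` is a list of (x, y) coordinates belonging to one object.
    Returns (centroid, semi_major, semi_minor, orientation_in_radians).
    """
    n = len(pixels)
    cx = sum(x for x, _ in pixels) / n
    cy = sum(y for _, y in pixels) / n
    mu20 = sum((x - cx) ** 2 for x, _ in pixels) / n
    mu02 = sum((y - cy) ** 2 for _, y in pixels) / n
    mu11 = sum((x - cx) * (y - cy) for x, y in pixels) / n

    # Eigenvalues of the 2x2 covariance matrix of the pixel coordinates.
    half_trace = (mu20 + mu02) / 2.0
    spread = math.sqrt(((mu20 - mu02) / 2.0) ** 2 + mu11 ** 2)
    lam_major = half_trace + spread
    lam_minor = max(half_trace - spread, 0.0)

    # For a uniformly filled ellipse, the variance along a semi-axis a is a^2 / 4.
    semi_major = 2.0 * math.sqrt(lam_major)
    semi_minor = 2.0 * math.sqrt(lam_minor)
    orientation = 0.5 * math.atan2(2.0 * mu11, mu20 - mu02)
    return (cx, cy), semi_major, semi_minor, orientation


# Synthetic "kernel": a filled ellipse with semi-axes of 20 and 10 pixels.
region = [(x, y) for x in range(-20, 21) for y in range(-10, 11)
          if (x / 20.0) ** 2 + (y / 10.0) ** 2 <= 1.0]
centre, a, b, theta = equivalent_ellipse(region)
```

Kernel length and width then follow as twice the semi-major and semi-minor axes, which can be compared with values measured directly from the object boundary.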
An improved version of an ellipse-fitting algorithm combined with the mathematical
morphology method was developed and tested (Zhang et al., 2005). Typical touching
kernel patterns of four grain types, namely barley, CWAD wheat, Canada Western
Red Spring (CWRS) wheat, and oats, obtained from composite samples from several
growing locations across the western Canadian prairies were used to test this algorithm.
The accuracies of separation were 92.4 percent (barley), 96.1 percent (CWAD wheat),
94.8 percent (oats), and 97.3 percent (CWRS wheat).
A morphological image-processing algorithm based on watershed segmentation of
a distance transform graph of connected binary imagery was developed by Wang and
Paliwal (2006). The algorithm dealt with an “oversegmentation” problem in original
watershed segmentation by reconstructing internal markers through a series of morphological operations. The internal markers were then used to join overly segmented parts
belonging to the same component. Closed boundaries of each connected component
were finally pruned and extracted. The algorithm was applied to separate touching kernels of six grain types, namely CWRS wheat, Canada Western Hard White (CWHW)
wheat, CWAD wheat, six-row barley, rye, and oats. The segmentation method was
most successful on the three types of wheat kernels, and achieved correct segmentation rates of 94.4 percent (CWRS wheat), 92.0 percent (CWHW wheat), and 88.6
percent (CWAD wheat). The method was not as suitable for the three other grain types,
with segmentation rates of 55.4 percent (oats), 79.0 percent (rye), and 60.9 percent
(six-row barley). Sound CWRS wheat kernels were mixed with CWAD wheat and
broken wheat kernels, so that they were in contact, and were segmented using the developed watershed algorithm. Five geometric features were extracted from disconnected
binary images, and linear classifiers based on Mahalanobis distance were used to identify wheat dockage. The linear classifier identified 96.7 percent of adulterated CWAD
wheat kernels and 100 percent of broken CWRS wheat kernels.
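A linear classifier based on Mahalanobis distance, as mentioned above, assigns each object to the class whose mean is nearest when distances are measured with a covariance matrix pooled across classes. The Python sketch below shows the idea with only two features to keep the matrix algebra short, whereas the cited work used five geometric features. The feature values (area in pixels and aspect ratio), class labels, and function names are hypothetical.

```python
def mean_vector(samples):
    n = len(samples)
    return (sum(s[0] for s in samples) / n, sum(s[1] for s in samples) / n)


def pooled_covariance(groups):
    """Within-class covariance pooled over all classes (two features)."""
    sxx = syy = sxy = 0.0
    total = 0
    for samples in groups.values():
        mx, my = mean_vector(samples)
        for x, y in samples:
            sxx += (x - mx) ** 2
            syy += (y - my) ** 2
            sxy += (x - mx) * (y - my)
        total += len(samples)
    dof = total - len(groups)
    return ((sxx / dof, sxy / dof), (sxy / dof, syy / dof))


def invert_2x2(m):
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("covariance matrix is singular")
    return ((d / det, -b / det), (-c / det, a / det))


def mahalanobis_sq(x, mean, inv_cov):
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    return (dx * (inv_cov[0][0] * dx + inv_cov[0][1] * dy)
            + dy * (inv_cov[1][0] * dx + inv_cov[1][1] * dy))


def classify(x, means, inv_cov):
    """Label of the class with the smallest squared Mahalanobis distance."""
    return min(means, key=lambda label: mahalanobis_sq(x, means[label], inv_cov))


# Hypothetical training data: (area in pixels, aspect ratio).
groups = {
    "sound": [(3000, 1.9), (3100, 2.0), (2900, 2.1), (3050, 1.95)],
    "broken": [(1500, 1.2), (1600, 1.1), (1400, 1.3), (1550, 1.25)],
}
means = {label: mean_vector(samples) for label, samples in groups.items()}
inv_cov = invert_2x2(pooled_covariance(groups))
```

Because all classes share one covariance matrix, the decision boundary between any two classes is a straight line (a hyperplane with more features), which is why the classifier is described as linear.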
2.6 Morphological, color, and textural algorithms
Once images are acquired, algorithms are needed for thresholding, pre-processing
operations, and segmentation, and for feature extraction from digital images of various
types of cereal grains and dockage fractions. Such algorithms were developed and
evaluated over several years at the CWBCGSR (Majumdar et al., 1996a, 1996b, 1999;
Nair and Jayas, 1998; Luo et al., 1999a, 1999b; Majumdar and Jayas, 1999a, 1999b,
2000a, 2000b, 2000c, 2000d). Paliwal et al. (2003a) further improved these algorithms,
which were coded in Microsoft Visual C++ environment. The program is used to
extract morphological, color, and textural features from different grains and dockage
fractions. A variation of the same program was used to extract color and textural features
from bulk samples of grain (Visen et al., 2004b). The program can batch process a large
number of image files stored on local and remote computers (connected by network),
and is smart enough to skip corrupted and non-existent files. The program output (i.e.
features of the objects in the image) can be written to a new text file or be appended
to an existing file. The output file consists of information about the specific filenames
from which the features of an object were extracted. This facilitates the back-tracking
of image files and their constituent objects from the corresponding feature values. The
contents of the output text files were tab delimited to enable easy export to spreadsheets.
The program is flexible enough to incorporate new features without necessitating any
major changes in the core program. The modular nature of the program enables the user
to choose specific features that must be extracted from the objects in the image files.
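The batch-processing behavior described above (skipping unusable files, appending to an existing file, and recording the source filename in tab-delimited output) can be sketched in a few lines of Python. This is an illustrative outline and not the authors' Visual C++ program. The function name, the column layout, and the `extract_features` callback are assumptions.

```python
import csv
import os


def batch_extract(image_paths, extract_features, output_path, append=True):
    """Write one tab-delimited row per object: filename, object index, features.

    `extract_features` is a caller-supplied function (hypothetical here) that
    takes a file path and returns a list of feature lists, one per object.
    Missing files and files that raise an error are skipped.
    Returns the number of files processed successfully.
    """
    processed = 0
    mode = "a" if append else "w"
    with open(output_path, mode, newline="") as handle:
        writer = csv.writer(handle, delimiter="\t")
        for path in image_paths:
            if not os.path.isfile(path):
                continue  # non-existent file
            try:
                objects = extract_features(path)
            except Exception:
                continue  # treated as a corrupted or unreadable file
            for index, features in enumerate(objects):
                writer.writerow([os.path.basename(path), index, *features])
            processed += 1
    return processed
```

Keeping the filename and object index in every row is what allows a feature value to be traced back to its image and object, and passing a different `extract_features` function is one way to add or select features without changing the core loop.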
2.6.1 Morphological features
Morphological features illustrate the appearance of an object. Algorithms were
developed to extract morphological features based on basic size features (e.g. area,
perimeter, bounding rectangle, centroid, lower-order moments (normal, central, and
invariant), length and width, and angle of orientation) and derived shape features (e.g.
roundness, radius ratio, box ratio, area ratio, aspect ratio, and the coefficient of variation
of radii).
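As a concrete illustration of basic size and derived shape features, the Python sketch below computes a few of them from a binary mask of a single object. The exact definitions used in the cited work are not given in this chapter, so the definitions here (perimeter as the count of exposed pixel edges, roundness as 4π·area/perimeter², and so on) and the function name are assumptions.

```python
import math


def morphological_features(mask):
    """Basic size and shape features of the single object in a binary mask.

    `mask` is a list of rows containing 0 (background) or 1 (object).
    """
    pixels = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    if not pixels:
        raise ValueError("mask contains no object pixels")
    filled = set(pixels)
    area = len(pixels)
    # Perimeter: number of pixel edges that border the background.
    perimeter = sum(1 for r, c in pixels
                    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1))
                    if (r + dr, c + dc) not in filled)
    rows = [r for r, _ in pixels]
    cols = [c for _, c in pixels]
    height = max(rows) - min(rows) + 1
    width = max(cols) - min(cols) + 1
    return {
        "area": area,
        "perimeter": perimeter,
        "bounding_box": (min(rows), min(cols), height, width),
        "centroid": (sum(rows) / area, sum(cols) / area),
        "aspect_ratio": max(height, width) / min(height, width),
        "area_ratio": area / (height * width),
        "roundness": 4.0 * math.pi * area / perimeter ** 2,
    }


# A 2 x 4 pixel rectangle surrounded by background.
example_mask = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
features = morphological_features(example_mask)
```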
2.6.2 Gray-scale and color features
The gray values for monochrome images and the ratio of primary colors (i.e. red, green,
and blue) for color images are used for object recognition. The three primary colors
(RGB) are sometimes converted into the hue, saturation, and intensity (HSI) system or
the L*a*b* (CIELAB) color scheme for easy human perception. In the HSI system, hue
represents the dominant wavelength (i.e. pure color), saturation refers to the amount
of white light mixed with the hue or the pure color, and intensity is the brightness of
the achromatic light. In the L*a*b* color scheme, L* represents the lightness of the
color (L* = 0 yields black and L* = 100 indicates white), a* represents its position between magenta
(positive values of a*) and green (negative values of a*), and b* represents its position between yellow
(positive values of b*) and blue (negative values of b*). Algorithms were developed
to extract color features, based on means, variances, ranges, histograms, and invariant
moments of red, green, and blue bands.
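The conversion from RGB to HSI mentioned above can be sketched as follows. The Python function uses the commonly published geometric formulation of the HSI model. Whether the cited studies used exactly this variant is not stated in the chapter, so it should be read as one reasonable assumption. Inputs are assumed to be scaled to the range 0 to 1.

```python
import math


def rgb_to_hsi(r, g, b):
    """Convert RGB values in [0, 1] to (hue in degrees, saturation, intensity)."""
    intensity = (r + g + b) / 3.0
    if intensity == 0:
        return 0.0, 0.0, 0.0  # black: hue and saturation are undefined, use 0
    saturation = 1.0 - min(r, g, b) / intensity
    numerator = 0.5 * ((r - g) + (r - b))
    # The radicand is non-negative in exact arithmetic; guard against rounding.
    denominator = math.sqrt(max((r - g) ** 2 + (r - b) * (g - b), 0.0))
    if denominator == 0:
        hue = 0.0  # gray pixel: hue is undefined, use 0 by convention
    else:
        ratio = max(-1.0, min(1.0, numerator / denominator))
        hue = math.degrees(math.acos(ratio))
        if b > g:
            hue = 360.0 - hue
    return hue, saturation, intensity
```

For example, pure red maps to a hue of 0 degrees with full saturation, while any gray pixel has zero saturation regardless of its intensity.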
2.6.3 Textural features
The texture of an object can be described based on the spatial distribution of image
intensities. Textural features thus provide information on the surface properties of
the objects, such as smoothness, coarseness, fineness or granulation. For example,
a smooth object has low variation in spatial intensities, whereas a coarsely textured
object has highly variable spatial intensities. Textural features can be described by
Fourier transformation or statistical approaches. However, sometimes two objects can
have the same morphological and color features; therefore, algorithms were developed
to extract textural features from gray-level histograms (GLH), gray-level co-occurrence
matrices (GLCM), and gray-level run-length matrices (GLRM) for red (R), green
(G), and blue (B) bands, and different combinations of RGB bands. The gray-level
co-occurrence matrix provides information about the distribution of gray-level intensities with respect to the relative position of the pixels with equal intensities. The
gray-level run-length matrix represents the occurrence of collinear and consecutive
pixels of the same or similar gray levels in an object. Readers are referred to Gonzalez
and Woods (1992); Majumdar et al. (1996b); Majumdar and Jayas (2000a, 2000b,
2000c, 2000d); and Karunakaran et al. (2001) for details of all the morphological,
color, or textural features extracted for image analysis.
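To make the gray-level co-occurrence matrix concrete, the Python sketch below builds a normalized GLCM for one pixel offset and derives three commonly used texture measures from it. The choice of offset, the number of gray levels, and the three measures are illustrative assumptions. The full feature set of the cited studies is described in the references given above.

```python
def glcm(image, dx, dy, levels):
    """Normalized co-occurrence matrix for the pixel offset (dx, dy).

    `image` is a list of rows of integer gray levels in range(levels).
    Entry [i][j] is the fraction of pixel pairs in which a pixel of level i
    has a neighbor of level j at the given offset.
    """
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
    total = sum(sum(row) for row in counts)
    if total == 0:
        raise ValueError("offset leaves no valid pixel pairs")
    return [[n / total for n in row] for row in counts]


def glcm_features(p):
    """Contrast, energy, and homogeneity of a normalized GLCM."""
    levels = len(p)
    cells = [(i, j) for i in range(levels) for j in range(levels)]
    return {
        "contrast": sum(p[i][j] * (i - j) ** 2 for i, j in cells),
        "energy": sum(p[i][j] ** 2 for i, j in cells),
        "homogeneity": sum(p[i][j] / (1 + abs(i - j)) for i, j in cells),
    }


# Tiny two-level image, horizontal neighbor one pixel to the right.
p = glcm([[0, 0, 1], [0, 1, 1]], dx=1, dy=0, levels=2)
texture = glcm_features(p)
```

A smooth surface concentrates the matrix near its diagonal, giving low contrast and high homogeneity, whereas a coarse surface spreads the entries away from the diagonal.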
2.6.4 Testing and optimization
A database was formed of high-resolution digital images of individual kernels and bulk
samples of the five most common Canadian grain types (barley, CWAD wheat, CWRS
wheat, oats, and rye) collected from 23 growing locations across western Canada. The
constituents of dockage were also divided into five broad categories (broken wheat
kernels, chaff, buckwheat, wheat-heads, and canola) and imaged. For the individual kernels, a total of 230 features (51 morphological, 123 color, and 56 textural)
were extracted from these images, and classification was performed using a four-layer
back-propagation network (BPN) (Jayas et al., 2000) and a non-parametric statistical classifier. Because shape and size information is irrelevant for bulk samples, only
color and textural features were extracted for them. Different feature models, namely
morphological (only for individual grain kernels and contaminants), color, textural, and
a combination of these, were tested for their classification performances. The results
of these classification processes were used to test the feasibility of a machine-vision
based grain cleaner.
For individual grain kernels, while using the BPN classifier, classification accuracies of over 98 percent were obtained for barley, CWRS wheat, oats, and rye (Paliwal
et al., 2003b). Because of its misclassification with CWRS wheat, CWAD wheat gave
a lower classification accuracy of 91 percent. For the dockage fractions, because of
the uniqueness in their size and/or color, broken wheat kernels, buckwheat, and canola
could be classified with almost 100 percent accuracy. The classification accuracies of
chaff and wheat-heads were low because they did not have well-defined shapes (Paliwal
et al., 2003a). The back-propagation network outperformed the non-parametric classifier in almost all instances of classification (Table 15.1). None of the three feature
sets (morphological, color, or texture) on its own was capable of giving high classification accuracies. Combination of the three improved the classification significantly.
However, the use of all the features together did not give the best classification results,
as many of the features were redundant and did not contribute much towards the classification process. A feature set consisting of the top 20 morphological, color, and textural
features each gave the best results (Paliwal et al., 2003a). The better classification
accuracies obtained using neural network classifiers were in accordance with earlier
studies done at the CWBCGSR to compare their performance with statistical classifiers
(Jayas et al., 2000; Paliwal et al., 2001; Visen et al., 2002, 2004a).

Table 15.1 Classification accuracies (percent) of singulated cereal grain kernels determined by the BPN classifier, with non-parametric classifier results in parentheses (Visen, 2002).

Grain type    Classification percentages using features
                                                        Top 60       Top 30
Barley        96.5 (93.2)  93.8 (71.5)  94.2 (91.4)  98.1 (90.4)  97.9 (95.1)
CWAD wheat    89.4 (90.7)  92.9 (79.1)  91.5 (92.5)  90.5 (90.3)  90.1 (90.6)
CWRS wheat    98.3 (97.0)  99.0 (92.3)  94.9 (95.6)  98.7 (97.1)  98.5 (93.6)
Oats          95.0 (91.4)  92.9 (66.8)  90.8 (91.1)  98.4 (95.8)  97.8 (94.7)
Rye           92.8 (91.4)  94.5 (87.8)  95.2 (96.2)  98.9 (98.0)  98.0 (98.2)

Table 15.2 Classification accuracies of bulk cereal grains determined by a BPN classifier (Visen,
CWAD wheat, CWRS wheat
Classification percentages using features: Top 40, Top 20
To quantify the amount of impurity in a grain sample, a relationship between the
morphology and mass of the kernel (or dockage particle) was investigated. The area of
a particle in a given image gave the best estimate of its mass. This relationship was
tested and validated for quantifying the amount of impurity in a sample before and
after passing it through a lab-scale cleaner (Paliwal et al., 2004b, 2005). For automated
operation, the cleaner should have a decision support system that adjusts its
parameters (such as vibration rate and grain flow rate) according to the amount of
impurity being removed from the sample. This amount was assessed from the change
in the ranges of morphological features of the particles before and after the sample
was passed through the cleaner, which was significant (Paliwal et al., 2004b). This
information can be used to optimize a machine-vision based cleaner’s performance.
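The area-to-mass relationship described above can be illustrated with an ordinary least-squares line fitted to calibration data and then applied to the particles found in an image. The Python sketch below is a minimal illustration. The linear form, the function names, and all numbers are assumptions, since the chapter does not give the fitted equations.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predicted_mass(area, model):
    slope, intercept = model
    return slope * area + intercept


def impurity_mass_fraction(grain_areas, dockage_areas, grain_model, dockage_model):
    """Estimated mass fraction of dockage from projected particle areas."""
    grain_mass = sum(predicted_mass(a, grain_model) for a in grain_areas)
    dockage_mass = sum(predicted_mass(a, dockage_model) for a in dockage_areas)
    return dockage_mass / (grain_mass + dockage_mass)


# Hypothetical calibration: projected areas (pixels) against weighed masses (mg).
model = fit_line([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0])
```

Computing the estimated impurity fraction for images taken before and after the cleaner gives the quantity a decision support system would need in order to adjust the cleaner settings.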
For the bulk samples, classification accuracies of over 98 percent were obtained
for all the grain types (Visen et al., 2004b). The best results were obtained using a
combination of both color and textural features. Other than oats, all the grain types
could be classified with close to 100 percent accuracy using an optimized
set of just 20 features (Table 15.2). As classification of bulk samples will be required
to identify the contents of a railcar, perfect classification is not essential in such cases.
3 Soft X-ray imaging
Figure 15.6 Soft X-ray imaging system, with the X-ray tube labelled (photograph courtesy of Canadian Wheat Board Centre of Grain
Storage Research, Winnipeg, MB, Canada).
An X-ray image is formed by penetrating, high-energy photons of 0.1–100 nm wavelength passing through an object. Two types of X-ray imaging are generally practiced
in the agri-food industry: soft X-rays with a wavelength of 1–100 nm, of low energy
and less penetrating power; and hard X-rays (or X-ray computed tomography) with
a wavelength of 0.1–1 nm, of high energy and greater penetration power, which are
restricted to use in high-density objects. The X-ray technique provides images based
on object density differences. A soft X-ray imaging system includes a fluoroscope
which produces soft X-rays and real-time images (Figure 15.6), a computer system,
and a digitizer. Current X-ray systems require that kernels be placed manually on the
platform between the X-ray tube and detector (Karunakaran et al., 2003a). Automation of this technology to scan a monolayer of bulk sample moving on a conveyor belt
would be ideal for use in the grain industry. Real-time hard X-ray imaging systems are
available for continuous food product inspection. The shielding of low-energy X-rays
and development of an X-ray detector to detect soft X-rays fast enough in a continuous
system are the hurdles in the development of a soft X-ray machine. However, industries
are at present working towards creating such a system where these machines would be
able to scan singulated grain kernels to detect insect infestation (Karunakaran et al.,
2004a). These machines can scan grain kernels at the rate of 60 g/min, like a continuous
machine vision system that captures color images of grain for identification (Crowe
et al., 1997). X-ray images can be acquired at different voltage and current settings.
For imaging grains, a 15-kV potential and 65-µA current work best (Karunakaran
et al., 2003a). Images formed on the detection screen are captured by a charge-coupled
device (CCD) monochrome camera and digitized into 8-bit gray-scale images at a spatial resolution unique to the system. A computer system is used for image acquisition
and post-processing.
3.1 Soft X-rays for insect infestation detection in grain
Artificial infestations by different life stages of Cryptolestes ferrugineus (Stephens),
Tribolium castaneum (Herbst), Plodia interpunctella (Hubner), Sitophilus oryzae (L.),
and Rhyzopertha dominica (F.) in CWRS wheat kernels were created. Manually
separated wheat kernels (kernels placed with the crease facing down), uninfested
and infested by different life stages of the insects, were X-rayed at 15 kV and
65 µA. Histogram features, histogram and shape moments, and textural features using
co-occurrence and run-length matrices were extracted for each kernel from the X-ray
images. A total of 57 extracted features were used to identify uninfested and infested
kernels using statistical and neural network classifiers (Karunakaran et al., 2003a,
2003b, 2003c, 2004b, 2004c, 2004d).
Table 15.3 Classification accuracies of CWRS wheat kernels uninfested
and infested by stored-grain insects, using BPN and linear-function
parametric classifier (all 57 features). Rows: insect type and stages (C. ferrugineus, T. castaneum, P. interpunctella, S. oryzae, R. dominica, insect damage kernel); columns: classification percentages using BPN and linear-parametric classifier.
The linear-function parametric classifier and back-propagation neural network
(BPNN) identified more than 84 percent of infestations caused by C. ferrugineus and
T. castaneum larvae (Table 15.3). The infestations by C. ferrugineus pupae and adults
were identified with more than 96 percent accuracy, and 97 percent of kernels infested
by P. interpunctella larvae were identified by both the linear-function parametric classifier and BPNN. Kernels infested by different stages of S. oryzae and R. dominica larvae
were identified with more than 98 percent accuracy by the linear-function parametric
classifier and BPNN. The linear-function parametric classifier and BPNN performed
better than the quadratic-function parametric and non-parametric classifiers for the
identification of infested kernels by different insects. The soft X-ray method detected
the presence of live larvae inside the infested kernels. This was achieved by image
subtraction of two consecutive images of kernels that had live active insects inside them.
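The image-subtraction idea can be sketched as follows. This is a minimal illustration and not the authors' implementation. It assumes two registered 8-bit gray-scale images of the same kernel, stored as nested lists. The function names and both thresholds are hypothetical.

```python
def frame_difference(img_a, img_b):
    """Absolute pixel-wise difference of two equally sized gray-scale images."""
    return [[abs(a - b) for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(img_a, img_b)]


def has_movement(img_a, img_b, pixel_threshold=20, min_pixels=3):
    """Flag a kernel when enough pixels change between consecutive images."""
    diff = frame_difference(img_a, img_b)
    changed = sum(1 for row in diff for value in row if value > pixel_threshold)
    return changed >= min_pixels


# Hypothetical 4 x 4 images: the dark region (gray level 90) shifts between frames.
frame_1 = [[200, 200, 200, 200],
           [200,  90,  90, 200],
           [200,  90, 200, 200],
           [200, 200, 200, 200]]
frame_2 = [[200, 200, 200, 200],
           [200, 200,  90, 200],
           [200, 200,  90,  90],
           [200, 200, 200, 200]]

print(has_movement(frame_1, frame_2))  # the dark region moved
print(has_movement(frame_1, frame_1))  # identical frames
```

A static kernel gives a difference image close to zero everywhere, so only a moving insect leaves a residual. In practice the thresholds would have to be tuned to the noise level of the detector.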
4 Near-infrared spectroscopy and hyperspectral imaging
The near-infrared (NIR) region extends from 780 nm to 2500 nm in wavelength. The
most important aspect of near-infrared spectroscopy (NIRS) as an analytical tool is that
it can determine the chemical composition and physicochemical behavior of foods and
their raw materials. This is due to the fact that NIRS analyzes the sample in a way that
reflects the actual number of molecules of individual constituents in the sample (Murray
and Williams, 1990). It is known that all organic matter consists of atoms, mainly carbon, oxygen, hydrogen, nitrogen, phosphorus, and sulfur, with minor amounts of other
elements. These atoms combine by covalent and electrovalent bonds to form molecules
(Campbell et al., 2002). Without external radiation, the molecules vibrate at their fundamental energy levels at ambient temperature. When radiated using a light source with
continuous spectral output, only light at particular wavelengths is absorbed. The energy
of photons at those wavelengths corresponds to the energy gaps between two fundamental energy levels, or overtones, and combinations of vibration levels. Absorption
of light in the NIR region involves transfer of radiation energy into mechanical energy
associated with the motion of atoms bonded together by chemical bonds (Wang, 2005).
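To give a sense of the photon energies involved, the following sketch converts the limits of the NIR region (780 nm and 2500 nm) into photon energy and wavenumber using E = hc/λ. The physical constants are standard values. The function names are chosen here for illustration only.

```python
PLANCK = 6.62607015e-34          # Planck constant, J s
LIGHT_SPEED = 2.99792458e8       # speed of light in vacuum, m/s
ELECTRON_VOLT = 1.602176634e-19  # joules per electron volt


def photon_energy_ev(wavelength_nm):
    """Photon energy in eV for a wavelength given in nanometres (E = hc / lambda)."""
    wavelength_m = wavelength_nm * 1e-9
    return PLANCK * LIGHT_SPEED / wavelength_m / ELECTRON_VOLT


def wavenumber_per_cm(wavelength_nm):
    """Wavenumber in cm^-1 for a wavelength given in nanometres."""
    return 1e7 / wavelength_nm


for wavelength in (780.0, 2500.0):
    print(f"{wavelength:6.0f} nm: {photon_energy_ev(wavelength):.3f} eV, "
          f"{wavenumber_per_cm(wavelength):.0f} cm^-1")
```

The NIR region therefore spans photon energies of roughly 0.5 to 1.6 eV.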
4.1 Measurement modes of near-infrared radiation
When electromagnetic radiation interacts with a sample, it may be absorbed, transmitted
or reflected. Based on sample properties and forms of propagation of NIR light in the
sample, measurement of this radiation can be carried out by using the following modes
(Wang, 2005):
1. Transmittance. This is applied to measure transparent samples possessing a minimum light scattering effect. Usually, a sample in liquid form or in a solvent is presented
in a glass or quartz cell, since glass is transparent to NIR light. The fraction of
radiation (Ip/Is) transmitted by the sample is called transmittance. In practice,
transmittance is converted to absorbance as in the following relationship:
$$A = \log\left(\frac{1}{T}\right) = \log\left(\frac{I_s}{I_p}\right)$$
where A is the absorbance in absorbance units (AU), T is the transmittance (no
unit), Is is the incoming light energy (J), and Ip is the transmitted light energy (J).
The relationship of the concentration of a sample, the sample thickness, and
the absorbance is governed by the Beer–Lambert law (Swinehart, 1972):
A = abc
where a is a constant called the absorptivity (l/(mol · m)), b is the sample
thickness (m), and c is the concentration of a sample (mol/l).
2. Transflectance. This is a modified version of transmittance. A retro-reflector
is often employed behind the sample cuvette to double the optical path length
through the sample.
3. Diffuse reflectance. This is applied to the measurement of solid samples and
is perhaps the most accepted measurement mode in NIR spectroscopy. The
Kubelka–Munk function has been introduced to describe the energy of reflected
radiation using two constants called the scattering constant (s) and the absorption constant (k). For a special case of an opaque layer of infinite thickness, the
relationship could be given by the equation (Kubelka and Munk, 1931):
$$F(R_\infty) = \frac{(1 - R_\infty)^2}{2R_\infty} = \frac{k}{s}$$
where F(R∞ ) is the Kubelka–Munk function (no unit), R∞ is the reflectance of
the infinitely thick layer (no unit), k is the absorption constant (mm−1 ), and s is
the scattering constant (mm−1 ) (Birth and Zachariah, 1976).
Apparent absorbance, as given in the following equation, is used in practice
instead of the Kubelka–Munk function:
$$A_R = \log\left(\frac{1}{R}\right)$$
where R is the measured reflectance and AR is the apparent absorbance, also in absorbance units (AU), which is
assumed to be proportional to concentration (c).
4. Interactance. This is a modified version of diffuse reflectance. The collected
radiation signal travels a much longer distance in the sample, and is assumed to
be richer in information on sample constituents than that collected under diffuse
reflectance mode.
5. Diffuse transmittance. This is different from diffuse reflectance in that the diffuse
transmittance signal is collected after light has traveled through the sample and
emerged on the other side of it. This mode is often used at short NIR wavelengths
with a turbid liquid or solid sample with a thickness of 10–20 mm.
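The quantities used in the transmittance and diffuse reflectance modes can be computed with a few lines of code. The sketch below assumes the standard forms A = log(Is/Ip), A = abc, F(R∞) = (1 − R∞)²/(2R∞) and AR = log(1/R), with base-10 logarithms (an assumption, as the text writes only "log"). The function names and input values are hypothetical.

```python
import math


def absorbance_from_transmittance(incoming, transmitted):
    """A = log10(1 / T), where T = transmitted / incoming."""
    transmittance = transmitted / incoming
    return math.log10(1.0 / transmittance)


def beer_lambert_concentration(absorbance, absorptivity, thickness):
    """Solve A = a * b * c for the concentration c."""
    return absorbance / (absorptivity * thickness)


def kubelka_munk(r_inf):
    """F(R_inf) = (1 - R_inf)^2 / (2 * R_inf), equal to k / s."""
    return (1.0 - r_inf) ** 2 / (2.0 * r_inf)


def apparent_absorbance(reflectance):
    """A_R = log10(1 / R), used in practice in place of F(R_inf)."""
    return math.log10(1.0 / reflectance)


# Hypothetical readings: 10 percent of the incoming light is transmitted.
a_value = absorbance_from_transmittance(incoming=100.0, transmitted=10.0)
print(f"Absorbance: {a_value:.2f} AU")
print(f"Concentration: {beer_lambert_concentration(a_value, 100.0, 0.01):.2f} mol/l")
print(f"F(R) at R = 0.5: {kubelka_munk(0.5):.2f}")
print(f"Apparent absorbance at R = 0.1: {apparent_absorbance(0.1):.2f} AU")
```

With an absorptivity of 100 l/(mol · m) and a 0.01 m path, an absorbance of 1 AU corresponds to a concentration of 1 mol/l.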
4.2 NIR spectroscopy instrumentation
Practical application of NIR spectroscopy has been around for several decades, and
there has been a wide array of instruments available for different end-user purposes.
Each kind of instrument is based on different working principles and possesses certain performance characteristics. Currently, in the agricultural and food-related fields,
spectroscopic instruments can be put into two categories (Wang, 2005): dispersive and
non-dispersive systems.
4.2.1 Dispersive systems
Most dispersive systems are based on diffraction gratings. According to its instrument
configuration, this type of spectrometer can be divided into a scanning monochromator
and a spectrograph. A scanning monochromator works by mechanically rotating the
diffraction grating to tune the wavelength of light to be received by the detector, whereas
a spectrograph utilizes a linear array detector such as a charge-coupled device (CCD)
or a photodiode array (PDA) in place of a single element detector, and light signals at
multiple wavelengths can be detected simultaneously. Diffraction-grating based instruments are relatively low in cost and very capable in many industrial sectors. Drawbacks
for the scanning diffraction-grating monochromators include their relatively slow scanning speed, and a degrading system performance over time due to mechanical fatigue
of moving parts. Compared to a scanning monochromator, a spectrograph is faster in
speed, has no moving parts, and thus is robust in structure.
Another type of dispersive system employs electronically tunable filters, such as
acousto-optical tunable filters (AOTF). Using AOTF as the dispersive device, spectrometers can be constructed with no moving parts, having very high scanning speed, a
wide spectral working range, and random wavelength access (Eilert et al., 1995). Compared
to diffraction-grating spectrometers, the electronically tunable filter-based instruments
have a much higher cost and thus are not widely used.
4.2.2 Non-dispersive systems
There are three main groups of non-dispersive systems. The first group of spectrometers is based on the use of Fourier transform (FT) and the Michelson interferometer.
This type of instrument is mainly used in research laboratories. The working principle
of such a spectrometer enables the system to achieve excellent wavelength precision
and accuracy, a very high signal-to-noise ratio (SNR), and a relatively fast scanning
speed. Since it utilizes a Michelson interferometer to create the conditions for optical
interference by splitting light into two beams and then recombining them after a path
difference has been introduced using a moving mirror, the system is very delicate.
Therefore, its performance is sensitive to mechanical vibrations and dust.
The second group of non-dispersive systems is based on a limited number of interference filters. These are the simplest and cheapest NIR instruments. Optical filters
are usually chosen according to the absorption wavelengths used for the most popular
applications – e.g. protein, moisture, and oil content in agricultural samples. Therefore,
interference-filter based instruments are only designed for a limited range of routine analyses.
The third group is the light-emitting diode (LED) based instruments. This type of
instrument employs an array of LEDs as the illumination sources that emit narrow bands
of NIR light. As the emitting wavelengths are predetermined, the instrument is usually
dedicated to a specific series of measurements. Both LED and filter-based instruments
satisfy the need for low-cost, specific applications, and portable instrumentation for
field analyses.
Generally, the selection of an appropriate instrumentation configuration depends
on the purpose of the application. More extensive reviews on instrumentation for
vibrational NIR spectroscopy can be found in, for example, Osborne et al. (1993) and
Coates (1998).
4.3 Near-infrared hyperspectral imaging
With the advent of electronically tunable filters and computers with immense computational power, it is now possible to acquire NIR images along with spectral data.
This technique, known as hyperspectral imaging, has shown the potential to provide
more information about the functional components of grain than is possible with NIRS
or optical imaging alone. It can be considered an extension of multispectral imaging,
where images are captured at a much smaller number of wavelengths by placing a wheel
with a limited number of band-pass filters in front of a camera. Multispectral imaging
systems are constrained by the slow filter-switching speed and the rather large size of
the filter wheel. The latest generation of wavelength filters is based on electronically
controlled liquid-crystal elements in a Lyot-type birefringent design. These liquid-crystal tunable filters (LCTF) select a transmitted wavelength range while blocking all
others, providing rapid selection of any wavelength in the visible to NIR range. Such
filters can be combined with charge-coupled device (CCD) cameras to create powerful
spectral imaging instruments (Figure 15.7). The strengths of LCTFs include compactness, large apertures and fields of view, low wavefront distortion, flexible throughput
control, and low power requirements (Jha, 2000).
Figure 15.7 Near-infrared (NIR) imaging system, with labelled components: InGaAs camera, halogen-tungsten lamp, and data storage and analysis (photograph courtesy of Canadian Wheat Board Centre
of Grain Storage Research, Winnipeg, MB, Canada).
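A hyperspectral image is commonly handled as a three-dimensional data cube, with one gray-scale image per wavelength band. The sketch below assumes a layout of cube[band][row][column] stored as nested lists. The function names, wavelengths, and reflectance values are hypothetical and serve only to show how spectral and spatial information are combined.

```python
def pixel_spectrum(cube, row, col):
    """Spectrum of a single pixel: one value per wavelength band."""
    return [band[row][col] for band in cube]


def band_image(cube, wavelengths, target_nm):
    """Spatial image at the band whose wavelength is closest to target_nm."""
    index = min(range(len(wavelengths)),
                key=lambda i: abs(wavelengths[i] - target_nm))
    return cube[index]


def mean_spectrum(cube, pixels):
    """Average spectrum over a list of (row, col) pixels, e.g. one kernel."""
    return [sum(band[r][c] for r, c in pixels) / len(pixels) for band in cube]


# Hypothetical cube: three bands, each a 2 x 2 image of reflectance values.
wavelengths = [1000, 1100, 1200]  # nm
cube = [
    [[0.10, 0.20], [0.30, 0.40]],
    [[0.15, 0.25], [0.35, 0.45]],
    [[0.20, 0.30], [0.40, 0.50]],
]

print(pixel_spectrum(cube, 0, 1))
print(band_image(cube, wavelengths, 1090))
print(mean_spectrum(cube, [(0, 0), (1, 1)]))
```

Averaging the spectra of all pixels belonging to one kernel gives a single-kernel spectrum comparable to a conventional NIRS measurement, while the band images retain the spatial distribution.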
4.4 The application of NIR spectroscopy and hyperspectral imaging systems
The development of a near-infrared (NIR) spectroscopy system for measuring the
moisture and protein content in wheat, the kernel vitreousness or hardness, fungal
contamination, scab or mould damage, and insect infestation has made the measurement of these quality factors objective, and the system has been adopted by the industry
(Delwiche, 1998, 2003; Delwiche and Hruschka, 2000; Wang et al., 2002). NIR spectroscopy has replaced the chemically intensive Kjeldahl method for protein content
measurement in many countries. For proper functioning of the NIR system, large
amounts of reference data from different growing regions should be used for calibration. Once properly calibrated, it is a rapid technique requiring small sample sizes. NIR
spectroscopy has the potential to be used for measuring the hardness and vitreousness
of kernels, for color classification, the identification of damaged kernels, the detection
of insect and mite infestation, and the detection of mycotoxins (Singh et al., 2006).
Also, the feasibility of using reflectance characteristics for quick identification of bulk
grain samples has been assessed (Mohan et al., 2004).
The NIR spectroscopic method detects infested grain kernels based on differences
in spectral reflectance (Dowell et al., 1999). The cuticular polysaccharides (chitin
content) of insects have a different spectral reflectance from that of water, protein,
starch, or other chemical constituents in the grain. This method was successfully used
to identify wheat kernels infested by the larval stages of R. dominica, S. oryzae, and
S. cerealella using wavelengths in the range of 1000–1350 nm and 1500–1680 nm.
It inspected 300 kernels in 3 minutes, and detected third and fourth instars of S. oryzae
with 95 percent accuracy (Dowell et al., 1998). The unique chemical composition of
the cuticle of different insect species was used to identify 11 different primary and
secondary grain feeders with more than 99 percent accuracy (Dowell et al., 1999).
Ridgway and Chambers (1996) determined that the NIR method detected infestation
in samples containing 270 or more insects/kg of grain. By analyzing single kernels,
the NIR method was able to detect infestation of S. oryzae in wheat only after the third
instar stage (Dowell et al., 1998). No difference was detected between the spectra of
kernels partially consumed by insects, and sound kernels (Dowell et al., 1998).
Near-infrared hyperspectral imaging systems based on LCTFs have gained
widespread popularity in medical imaging, but their applications in the field of agricultural products have so far been very few. Evans et al. (1998) used an LCTF-based
imaging system to evaluate the vigor of bean plants at different nitrogen-stress levels. Although the authors could successfully quantify these stresses in plants, their
imaging system suffered from a slow response of the LCTFs in attenuating desired
wavelengths. Similar concerns were shared by Archibald et al. (1998), who developed
a system to analyze wheat protein and determine color classification on a single-kernel
basis. With advances in the LCTF technology and the faster computational speed of
personal computers, the problem of the slow response of LCTFs has been overcome.
This is evident from a recent publication by Cogdill et al. (2004), who used a similar
hyperspectral imaging system and found it to have very fast wavelength tuning capability. They obtained accurate predictions for moisture concentration in maize but not for oil
content; however, they attributed the errors in predicting oil content
to the reference method rather than the spectrometer.
5 Thermal imaging
The thermal image is generated from the infrared radiation (700 nm–1 mm) emitted from
an object at a given temperature. In other words, thermal imaging provides a surface-temperature map of an object. The thermal imaging system (Figure 15.8) includes
an infrared thermal camera (such as the ThermaCAM™ SC500, of FLIR Systems,
Burlington, Ontario, Canada, an uncooled focal plane array type camera capable
of generating images of 320 × 240 pixels in the spectral range 7.5–13.0 µm) and a
computer system. The thermal resolution of such a camera is quite high (approximately
0.07 °C at 30 °C). Close-up lenses (for example, of 50-µm focal length) are usually
attached to the original lens of the camera (FOV 24° × 18°) to obtain magnified thermal
images of a kernel.
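As background to the 7.5–13.0 µm spectral range quoted above, Wien's displacement law gives the wavelength at which a body at a given temperature emits most strongly. The sketch below assumes that the kernel radiates approximately as a blackbody, which is an idealization not stated in the text. The function name is chosen here for illustration.

```python
WIEN_CONSTANT = 2.897771955e-3  # Wien displacement constant, m K


def peak_emission_wavelength_um(temperature_c):
    """Wavelength of peak blackbody emission (micrometres) at a temperature in deg C."""
    temperature_k = temperature_c + 273.15
    return WIEN_CONSTANT / temperature_k * 1e6


peak = peak_emission_wavelength_um(30.0)
print(f"Peak emission at 30 deg C: {peak:.2f} um")
print("Within the 7.5-13.0 um camera range:", 7.5 <= peak <= 13.0)
```

Objects near ambient temperature therefore emit most strongly inside the working range of such a camera.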
5.1 Application of thermal imaging
Thermal imaging has been demonstrated to detect insect-infested kernels and to distinguish different
classes of wheat (Manickavasagan et al., 2006a, 2006b). In thermal imaging, the
emitted energy is represented as a two-dimensional image. This imaging technique is
a non-contact type, but it requires the creation of temperature differences in an object,
either by heating or cooling, to obtain internal information. By heating or cooling
kernels of wheat which were initially at a uniform temperature, it is possible to show
the differences between sound and infested kernels, or kernels of different classes.
At present, thermal imaging is at the research stage; however, it has already shown the
potential to provide information associated with wheat quality degradation.
Figure 15.8 Thermal imaging system (photograph courtesy of Canadian Wheat Board Centre of Grain
Storage Research, Winnipeg, MB, Canada).
6 Potential practical applications of machine vision technology
6.1 Automation of railcar unloading
To automate the handling of the contents of a railcar, it is necessary to collect a grain
sample and rapidly identify it using an imaging system. In such a situation, the grain
sample can be presented in bulk and imaged. The classification accuracies from the
bulk images were nearly 100 percent for five grain types. Also, the feasibility of using
reflectance characteristics for quick identification of bulk grain samples was assessed.
Based on these studies (Paliwal et al., 2001, 2003a, 2003b, 2005; Visen et al., 2001,
2002, 2004a, 2004b; Mohan et al., 2004), it was concluded that a system based on the
analysis of bulk images could be developed for automation of railcar unloading.
6.2 Optimization of grain cleaning
To automate a cleaner, it is desirable that it should have a decision support system to
adjust its parameters (such as vibration rate, grain flow rate, etc.) by calculating the
amount of impurity being removed from the sample. This can be done by calculating
the change in the ranges of morphological features of the particles before and after
the sample is passed through the cleaner. The ranges of morphological features change
significantly when a sample is passed through the cleaner, and thus can be used to
provide feedback to the system.
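The feedback idea can be sketched as a comparison of feature ranges before and after cleaning. The code below is only an illustration. The feature names, the sample values, the 20 percent threshold, and the decision rule are all hypothetical and are not taken from the cited studies.

```python
def feature_range(values):
    """Range (maximum minus minimum) of one morphological feature."""
    return max(values) - min(values)


def range_reduction(before, after):
    """Relative reduction in the range of each feature after cleaning."""
    reductions = {}
    for name in before:
        r_before = feature_range(before[name])
        r_after = feature_range(after[name])
        # Guard against a zero range, which would make the ratio undefined.
        reductions[name] = (r_before - r_after) / r_before if r_before else 0.0
    return reductions


def needs_adjustment(reductions, min_reduction=0.2):
    """Assumed rule: adjust the cleaner if any feature range shrank too little."""
    return any(value < min_reduction for value in reductions.values())


# Hypothetical particle measurements (pixels) before and after cleaning.
before = {"area": [400, 900, 1000, 1100, 1600], "length": [20, 40, 42, 45, 70]}
after = {"area": [850, 900, 1000, 1100, 1150], "length": [38, 40, 42, 45, 48]}

reductions = range_reduction(before, after)
print(reductions)
print("Adjust cleaner settings:", needs_adjustment(reductions))
```

A large reduction in range suggests that undersized and oversized particles, which are mostly dockage, have been removed. A small reduction could be used to trigger a change in vibration rate or grain flow rate.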
6.3 Quality monitoring of export grains
The grain being loaded onto ships for the export market is usually blended from different
bins containing grain with different degrees of cleanliness to meet the specified tolerances for foreign material by the importing customer. Using high-resolution images
of kernels of five grain types (barley, CWAD wheat, CWRS wheat, oats, and rye),
and five broad categories of dockage constituents (broken wheat kernels, chaff, buckwheat, wheat-heads, and canola), analyses were performed for their classification.
Different feature models, viz. morphological, color, textural, and a combination of the
three, were tested for their classification performances using a neural network classifier. Kernels and dockage particles with well-defined characteristics (e.g. CWRS
wheat, buckwheat, and canola) showed near-perfect classification, whereas particles
with irregular and undefined features (e.g. chaff and wheat-heads) were classified with
accuracies of around 90 percent. The similarities in shape and size of some of the particles of chaff and wheat-heads to those of the kernels of barley and oats adversely
affected the classification accuracies of the latter. With calibration, algorithms can be
used to monitor and control the blending of grain.
6.4 Detection of low-level insect infestation
The Berlese funnel method, currently used by the Canadian Grain Commission to detect
infestations, extracted 67.2, 50.5, and 81.0 percent of first, second, and third instars
of C. ferrugineus larvae, respectively, in 6 hours. The same infested kernels were all
identified as being infested by the trained BPNN using the features extracted from the
soft X-ray images (Karunakaran et al., 2003a, 2003b, 2003c, 2004a, 2004b, 2004c,
2004d). Potential exists to identify uninfested and infested kernels (including kernels
infested by external and internal grain feeders) using soft X-rays.
Acknowledgments
This chapter summarizes results from studies carried out by several research trainees,
who were supervised by Dr Jayas. Their contributions are gratefully acknowledged.
Sections 4.1 and 4.2 are reproduced from the MSc. thesis of Mr Wenbo Wang, who
was supervised by Dr Paliwal, and we are thankful to him.
References
Anonymous (1987) Official Grain Grading Guide. Winnipeg, Canadian Grain Commission.
Archibald DD, Thai CN, Dowell FE (1998) Development of short-wavelength near-infrared
spectral imaging system for grain color classification. Proceedings of the SPIE, 3543,
Birth GS, Zachariah GL (1976) Spectrophotometry of agricultural products. In Quality
Detection in Foods (Gaffney JJ, ed.). St Joseph: ASAE, 6–11.
Campbell NA, Reece JB, Mitchell LG, Taylor MR (2002) Biology, Concepts and
Connections, 4th edn. San Francisco: Benjamin Cummings.
Canada Grain Act (1975) Canada grain regulations. Canada Gazette Part II, 109 (14).
Coates J (1998) Vibrational spectroscopy, instrumentation for infrared and Raman
spectroscopy. Applied Spectroscopy Reviews, 33 (4), 267–425.
Cogdill RP, Hurburgh Jr. CR, Rippke GR (2004) Single-kernel maize analysis by near-infrared hyperspectral imaging. Transactions of the ASAE, 47 (1), 311–320.
Connolly C (2006) Machine vision developments. Sensor Review, 26 (4), 277–282.
Crowe TG, Luo X, Jayas DS, Bulley NR (1997) Color line-scan imaging of cereal grain
kernels. Applied Engineering in Agriculture, 13 (5), 689–694.
Delwiche SR (1998) Protein content of single kernels of wheat by near-infrared reflectance
spectroscopy. Journal of Cereal Science, 27, 241–254.
Delwiche SR (2003) Classification of scab- and other mold-damaged wheat kernels by
near-infrared reflectance spectroscopy. Transactions of the ASAE, 46 (3), 731–738.
Delwiche SR, Hruschka WR (2000) Protein content of bulk wheat from near-infrared
reflectance spectroscopy. Cereal Chemistry, 77 (1), 86–88.
Dowell FE, Throne JE, Baker JE (1998) Automated nondestructive detection of internal insect infestation of wheat kernels using near-infrared reflectance spectroscopy.
Journal of Economic Entomology, 91 (4), 899–904.
Dowell FE, Throne JE, Wang D, Baker JE (1999) Identifying stored-grain insects using
near-infrared spectroscopy. Journal of Economic Entomology, 92 (1), 165–169.
Eilert AJ, Danley WJ, Wang X (1995) Rapid identification of organic contaminants in
pretreated waste water using AOTF near-IR spectrometry. Advanced Instrumentation
Control, 50 (Pt. 2), 87–95.
Evans MD, Thai CN, Grant JC (1998) Development of a spectral imaging system based on
a liquid crystal tunable filter. Transactions of the ASAE, 41 (6), 1845–1852.
FAOSTAT (2006) FAOSTAT Classic – Crops Primary. Food and Agriculture Organization
of the United Nations (available on-line at http://faostat.fao.org/site/408/default.aspx,
accessed 11 November 2006).
Gonzalez RC, Woods RE (1992) Digital Image Processing. Reading: Addison-Wesley.
Hawk AL, Kaufmann HH, Watson CA (1970) Reflectance characteristics of various grains.
Cereal Science Today, 15, 381–384.
Jayas DS, Murray CE, Bulley NR (1999) An automated seed presentation device for use
in machine vision identification of grain. Canadian Agricultural Engineering, 41 (2),
Jayas DS, Paliwal J, Visen NS (2000) Multi-layer neural networks for image analysis of agricultural products. Journal of Agricultural Engineering Research, 77 (2),
Jayas DS, Mohan AL, Karunakaran C (2005) Unloading automation implemented in grain
industry. Resource, September, 6–7.
Jha AR (2000) Infrared Technology, Applications to Electro-Optics, Photonic Devices and
Sensors, 1st edn. Hoboken: Wiley-Interscience, Inc.
Karunakaran C, Visen NS, Paliwal J, Zhang G, Jayas DS, White NDG (2001) Machine
vision systems for agricultural products, CSAE Paper No. 01-305, Mansonville QC,
Karunakaran C, Jayas DS, White NDG (2003a) X-ray image analysis to detect infestations
caused by insects in grain. Cereal Chemistry, 80 (5), 553–557.
Karunakaran C, Jayas DS, White NDG (2003b) Soft X-ray image analysis to detect wheat
kernels damaged by Plodia interpunctella (Lepidoptera, Pyralidae). Sciences des
Aliments, 23 (5–6), 623–631.
Karunakaran C, Jayas DS, White NDG (2003c) Soft X-ray inspection of wheat kernels
infested by Sitophilus oryzae. Transactions of the ASAE, 46 (3), 739–745.
Karunakaran C, Jayas DS, White NDG (2004a) Soft X-rays, a potential insect detection
method in the grain handling facilities. In International Quality Grains Conference,
Indianapolis, IN, July 19–22.
Karunakaran C, Jayas DS, White NDG (2004b) Detection of infestations by Cryptolestes
ferrugineus inside wheat kernels using a soft X-ray method. Canadian Biosystems
Engineering, 46, 7.1–7.9.
Karunakaran C, Jayas DS, White NDG (2004c) Detection of internal wheat seed infestation
by Rhyzopertha dominica using X-ray imaging. Journal of Stored Products Research,
40, 507–516.
Karunakaran C, Jayas DS, White NDG (2004d) Identification of wheat kernels damaged
by the red flour beetle using X-ray images. Biosystems Engineering, 87 (3), 267–274.
Keefe PD (1992) A dedicated wheat grain image analyser. Plant Varieties and Seeds, 5 (1),
Keefe PD, Draper SR (1986) The measurements of new characters for cultivar identification
in wheat using machine vision. Seed Science and Technology, 14, 715–724.
Kubelka P, Munk F (1931) Ein Beitrag zur Optik der Farbanstriche. Zeitschrift für Technische
Physik, 12, 593–604.
Luo X, Jayas DS, Crowe TG, Bulley NR (1997) Evaluation of light sources for machine
vision. Canadian Agricultural Engineering, 39 (4), 309–315.
Luo XY, Jayas DS, Symons SJ (1999a) Comparison of statistical and neural network methods for classification of cereal grains using machine vision. Transactions of the ASAE,
42 (2), 413–419.
Luo XY, Jayas DS, Symons SJ (1999b) Identification of damaged kernels in wheat using
a color machine vision system. Journal of Cereal Science, 30 (1), 49–59.
Majumdar S, Jayas DS (1999a) Single-kernel mass determination for grain inspection using
machine vision. Applied Engineering in Agriculture, 15 (4), 357–362.
Majumdar S, Jayas DS (1999b) Classification of bulk samples of cereal grains using
machine vision. Journal of Agricultural Engineering Research, 73, 35–47.
Majumdar S, Jayas DS (2000a) Classification of cereal grains using machine vision, I.
Morphology models. Transactions of the ASAE, 43 (6), 1669–1675.
Majumdar S, Jayas DS (2000b) Classification of cereal grains using machine vision, II.
Color models. Transactions of the ASAE, 43 (6), 1677–1680.
Majumdar S, Jayas DS (2000c) Classification of cereal grains using machine vision, III.
Texture models. Transactions of the ASAE, 43 (6), 1681–1687.
Majumdar S, Jayas DS (2000d) Classification of cereal grains using machine vision,
IV. Morphology, color, and texture models. Transactions of the ASAE, 43 (6),
Majumdar S, Jayas DS, Symons SJ (1999) Textural features for grain identification.
Agricultural Engineering Journal, 8 (4), 213–222.
Majumdar S, Jayas DS, Hehn JL, Bulley NR (1996a) Classification of various grains using
optical properties. Canadian Agricultural Engineering, 38 (2), 139–144.
Majumdar S, Luo X, Jayas DS (1996b) Image processing and its applications in food
process control. In Computerized Control Systems in the Food Industry (Mittal GS,
ed.). New York: Marcel Dekker, Inc., pp. 207–234.
Manickavasagan A, Jayas DS, White NDG (2006a) Thermal imaging to detect infestation
by Cryptolestes ferrugineus inside wheat kernels. Journal of Stored Products Research
Manickavasagan A, Jayas DS, White NDG, Paliwal J (2006b) Wheat class identification
using thermal imaging. Transactions of the ASABE (submitted).
Melvin S, Karunakaran C, Jayas DS, White NDG (2003) Design and development of a
grain kernel singulation device. Canadian Biosystems Engineering, 45, 3.1–3.3.
Mohan AL, Jayas DS, White NDG, Karunakaran C (2004) Classification of bulk oil seeds,
speciality seeds and pulses using their reflectance characteristics. In Proceedings of
International Quality Grains Conference, Indianapolis, IN.
Murray I, Williams PC (1990) Chemical principles of near-infrared technology. In Near-infrared Technology in the Agricultural and Food Industries (Williams PC, Norris KH,
eds). St Paul: American Association of Cereal Chemists, Inc., pp. 17–34.
Myers DG, Edsall KJ (1989) The application of image processing techniques to the
identification of Australian wheat varieties. Plant Varieties and Seeds, 2, 109–116.
Nair M, Jayas DS (1998) Dockage identification in wheat using machine vision. Canadian
Agricultural Engineering, 40 (4), 293–298.
Neuman MR, Sapirstein HD, Shwedyk E, Bushuk W (1987) Discrimination of wheat class
variety by digital image analysis of whole grain samples. Journal of Cereal Science,
6 (2), 125–132.
Neuman MR, Sapirstein HD, Shwedyk E, Bushuk W (1989a) Wheat grain color analysis
by digital image processing. I. Methodology. Journal of Cereal Science, 10, 175–182.
Neuman MR, Sapirstein HD, Shwedyk E, Bushuk W (1989b) Wheat grain color analysis by
digital image processing. II. Wheat class discrimination. Journal of Cereal Science,
10, 183–188.
Osborne BG, Fearn T, Hindle PH (1993) Practical NIR Spectroscopy, with Applications in
Food and Beverage Analysis, 2nd edn. New York: John Wiley & Sons.
Paliwal J, Shashidhar NS, Jayas DS (1999) Grain kernel identification using kernel
signature. Transactions of the ASAE, 42 (6), 1921–1924.
Paliwal J, Visen NS, Jayas DS (2001) Evaluation of neural network architectures for cereal
grain classification using morphological features. Journal of Agricultural Engineering
Research, 79 (4), 361–370.
Paliwal J, Visen NS, Jayas DS, White NDG (2003a) Cereal grain and dockage identification
using machine vision. Biosystems Engineering, 85 (1), 51–57.
Paliwal J, Visen NS, Jayas DS, White NDG (2003b) Comparison of a neural network and
a non-parametric classifier for grain kernel identification. Biosystems Engineering,
85 (4), 405–413.
Paliwal J, Borhan MS, Jayas DS (2004a) Classification of cereal grains using a flatbed
scanner. Canadian Biosystems Engineering, 46, 3.1–3.5.
Paliwal J, Jayas DS, Visen NS, White NDG (2004b) Feasibility of a machine-vision based
grain cleaner. Applied Engineering in Agriculture, 20 (2), 245–248.
Paliwal J, Visen NS, Jayas DS, White NDG (2005) Quantification of variations in machine-vision-computed features of cereal grains. Canadian Biosystems Engineering, 47,
Ridgway C, Chambers J (1996) Detection of external and internal insect infestation in wheat
by near-infrared reflectance spectroscopy. Journal of Science in Food and Agriculture,
71 (2), 251–264.
Sapirstein HD, Bushuk W (1989) Quantitative determination of foreign material and vitreosity in wheat by digital image analysis. In ICC ’89 Symposium, Wheat End-Use
Properties (Salovaara H, ed.), Lahti, Finland, pp. 453–474.
Shashidhar NS, Jayas DS, Crowe TG, Bulley NR (1997) Processing of digital images
of touching kernels by ellipse fitting. Canadian Agricultural Engineering, 39 (2),
Shatadal P, Jayas DS, Hehn JL, Bulley NR (1995a) Seed classification using machine vision.
Canadian Agricultural Engineering, 37 (3), 163–167.
Shatadal P, Jayas DS, Bulley NR (1995b) Digital image analysis for software separation and
classification of touching grains, I. Disconnect algorithm. Transactions of the ASAE,
38 (2), 635–643.
Shatadal P, Jayas DS, Bulley NR (1995c) Digital image analysis for software separation
and classification of touching grains, II. Classification. Transactions of the ASAE,
38 (2), 645–649.
Singh CB, Paliwal J, Jayas DS, White NDG (2006) Near-infrared spectroscopy, applications
in the grain industry. CSBE Paper No. 06-189, Canadian Society for Bioengineers,
Winnipeg, MB.
Swinehart DJ (1972) The Beer-Lambert law. Journal of Chemical Education, 39 (7),
Symons SJ, Fulcher RG (1988a) Relationship between oat kernel weight and milling yield.
Journal of Cereal Science, 7, 215–217.
Symons SJ, Fulcher RG (1988b) Determination of variation in oat kernel morphology by
digital image analysis. Journal of Cereal Science, 7, 219–228.
Symons SJ, Fulcher RG (1988c) Determination of wheat kernel morphological variation
by digital image analysis, I. Variation in Eastern Canadian milling quality wheats.
Journal of Cereal Science, 8, 210–218.
Symons SJ, Fulcher RG (1988d) Determination of wheat kernel morphological variation
by digital image analysis, II. Variation in cultivars of white winter wheats. Journal of
Cereal Science, 8, 219–229.
Visen NS (2002) Machine vision based grain handling system. PhD Thesis, Department of
Biosystems Engineering, University of Manitoba, Winnipeg, Manitoba, Canada.
Visen NS, Shashidhar NS, Paliwal J, Jayas DS (2001) Identification and segmentation of
occluding groups of grain kernels in a grain sample image. Journal of Agricultural
Engineering Research, 79 (2), 159–166.
Visen NS, Jayas DS, Paliwal J, White NDG (2002) Specialist neural networks for cereal
grain classification. Biosystems Engineering, 82 (2), 151–159.
Visen NS, Jayas DS, Paliwal J, White NDG (2004a) Comparison of two neural network architectures for classification of singulated cereal grains. Canadian Biosystems
Engineering, 46, 3.7–3.14.
Visen NS, Paliwal J, Jayas DS, White NDG (2004b) Image analysis of bulk grain samples
using neural networks. Canadian Biosystems Engineering, 46, 7.11–7.15.
Wang D, Dowell FE, Dempster R (2002) Determining vitreous subclass of hard red spring
wheat using visible/near-infrared spectroscopy. Cereal Chemistry, 79 (3), 418–422.
Wang W (2005) Design and evaluation of a visible-to-near-infrared spectrograph for grain
quality assessment. MSc Thesis, Department of Biosystems Engineering, University
of Manitoba, Winnipeg, Manitoba, Canada.
Wang W, Paliwal J (2006) Separation and identification of touching kernels and dockage
components in digital images. Canadian Biosystems Engineering, 48, 7.1–7.7.
Zayas I, Lai FS, Pomeranz Y (1986) Discrimination between wheat classes and varieties by
image analysis. Cereal Chemistry, 63, 52–56.
Zayas I, Pomeranz Y, Lai FS (1989) Discrimination of wheat and non-wheat components
in grain samples by image analysis. Cereal Chemistry, 66, 233–237.
Zhang G, Jayas DS, White NDG (2005) Separation of touching grain kernels in an image
by ellipse fitting algorithm. Biosystems Engineering, 92 (2), 135–142.
Quality Evaluation of Rice
Yukiharu Ogawa
Faculty of Horticulture, Chiba University, Matsudo, Chiba,
271-8510 Japan
1 Introduction
Rice (Oryza sativa L.) is one of the major commercial cereal grains worldwide, along
with wheat and corn. According to FAO (2006) estimates, approximately 628 million tonnes
of rice were produced worldwide in 2005, and world trade in the commodity that year
was 29.9 million tonnes. Over 90 percent of rice is produced and consumed in Asia.
Since the mapping of the rice genome began, genetic research on rice has progressed
considerably, and rice is now studied in many academic fields, including plant science, breeding, crop science, and food science. Although the aims of
rice studies vary, quality evaluation of the grain as a foodstuff is one of the main goals.
Computer vision technology, which advances continually as both hardware and software
develop, can contribute to such quality evaluation
by assessing the quality of rice grains objectively, consistently, and quantitatively.
In this chapter, various techniques and methods for the quality evaluation of rice
using computer vision technology are described. Rice research has various aspects,
as mentioned above, and the significance of the rice quality differs within each – for
example, the quality of rice as a foodstuff is different from that as a raw material. An
outline of rice quality is thus described in the next section. Rice as a raw material (“raw
rice”) and as a prepared foodstuff (“cooked rice”) is classified in the following sections
and described together with the different evaluation techniques.
2 Quality of rice
The word “quality” is extremely abstract. Consequently, before describing its evaluation, the quality of rice has to be defined. Parameters, which must be expressed by
actual and measurable objects or properties, are also required to evaluate the quality.
Computer Vision Technology for Food Quality Evaluation
ISBN: 978-0-12-373642-0
Copyright © 2008, Elsevier Inc.
All rights reserved
Rice grows in a paddy field and is harvested as a plant seed material. The harvested
rice seed undergoes post-harvest processing, including drying, storing, and hulling.
The hulled rice is commonly known as brown rice. Usually, the brown rice grain is
milled to remove its bran layer, including the pericarp, testa, aleurone layer, etc. This
process is also called polishing. The milled rice appears as a white, semi-transparent
grain. Unlike other cereal grains, rice is usually purchased primarily as the milled
product by the consumer and is consumed as a steamed or boiled product – cooked
rice. Thus, the meaning of rice quality depends on the processing stage the rice has reached.
Rice as a plant seed material, brown grain, and milled product can be regarded as the
raw material of foodstuffs. The parameters for the quality evaluation of rice as a raw
material are therefore concerned with biological and post-harvest handling properties.
These properties are influenced by characteristics under genetic control, environmental conditions, and processing conditions. Physical
properties (i.e. measurable attributes such as grain size, shape, and color variance),
which depend on the cultivar and growth environment, can be considered the primary parameters for the quality evaluation of raw rice (Hoshikawa, 1993a). Defects and fissures,
which are associated with post-harvest processing and market value, are also regarded
as parameters. Moreover, the water content and distribution within the raw grain also
influence its storage properties. The chemical contents and distribution in the grain
related to the morphological, histological, and structural properties are also considered
in assessing raw rice quality. The content of chemical compounds also defines the
nutritional quality. The aroma is related to the compounds, and is thus another quality
attribute of rice, although it cannot currently be represented as a visual parameter.
Steamed or boiled rice as a cooked product is a foodstuff, and therefore its quality
is based on its eating quality, which is related to the physical, chemical, and physicochemical properties of the cooked grain. Among these properties, the texture of cooked
rice products is one of the most important, and it has usually been measured
by sensory analysis. Cooked rice products have a high moisture content, and their
starch granules are gelatinized by boiling water during cooking. In general, starch gelatinization is related to the physicochemical properties, which influence the cooked rice
texture. Consequently, the water distribution in a grain during cooking is an
important parameter for the quality evaluation of cooked rice. The starch granules in
the individual grains gelatinize, and the grains swell as gelatinization proceeds.
As a result, the grain-scale macro-structure changes drastically during cooking, and
such structural changes to the interior and exterior of the grain are reflected in the
rice texture. Accordingly, the structural properties of the cooked grain, including its
surface structure (which is concerned with appearance), are also important parameters for quality evaluation. The rice-grain interior consists mostly of starch and starch
granules. The starch in a grain is enveloped in endosperm cells, which are composed
of cell-wall materials. Histological microstructures, such as cell formation and distribution, must therefore be related to the physical and physicochemical properties,
such as hardness and stickiness. The cell-scale micro-structure of the cooked grain is
an important parameter for quality evaluation. The thermal condition, aroma, taste,
etc. can also be regarded as quality evaluation parameters for the cooked rice grain,
although these have not so far been visualized.
3 Quality evaluation of raw rice
3.1 Physical properties
Visual inspection by human inspectors is a primary commercial method of grain quality
inspection. Automated inspection equipment and methods are therefore important
and in demand. Computer-aided machine vision systems can provide objective,
consistent, and quantitative measurements. They can also automatically and accurately
inspect visual qualities such as grain contour, size, color variance and distribution,
and damage. Image-processing techniques for computer-aided machine vision systems
have been developed, for example for determining the physical dimensions of milled
kernels (Goodman and Rao, 1984). Pattern-recognition techniques can also be used
as an aid in grain characterization, and can be an effective method for identifying
and classifying the grains (Lai et al., 1986). Sakai et al. (1996) demonstrated the
use of two-dimensional image analysis for the determination of the shape of brown
and polished rice grains of four varieties. The sample grains were polished by three
different polishing methods. The rice varieties were well separated by image analysis
using suitable dimension and shape factors, whereas grains polished by different
methods could not be differentiated accurately. Water condensation on the grain
surface, caused by changes in temperature and relative humidity during storage,
leads to a deterioration in quality. Atungulu et al. (2003)
investigated the relationship between the amount of condensed water, estimated by
thermodynamic simulation and experimental results, and the values of color
indices such as RGB and/or HSI (hue, saturation, intensity) in the resulting grain images. They concluded that
condensation on the grain surface caused the hue and intensity of the HSI indices to deviate
from their initial values, and that these deviations were also related to the environment
surrounding the grains, such as temperature and relative humidity.
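The hue and intensity deviations described above can be illustrated with a minimal sketch. It assumes the common geometric RGB-to-HSI conversion, which may differ from the formulation used by Atungulu et al. (2003). The function names and the pixel values are hypothetical, and the simple averaging of hue is adequate only when the hues lie well away from the 0/360 degree wrap-around, as is typical for yellowish-brown grain.

```python
import math


def rgb_to_hsi(r, g, b):
    """Convert one RGB pixel (components in 0..1) to (hue in degrees, saturation, intensity)."""
    intensity = (r + g + b) / 3.0
    if intensity == 0.0:
        return 0.0, 0.0, 0.0  # black pixel: hue and saturation are undefined, report 0
    saturation = 1.0 - min(r, g, b) / intensity
    numerator = 0.5 * ((r - g) + (r - b))
    denominator = math.sqrt((r - g) ** 2 + (r - b) * (g - b))
    if denominator == 0.0:
        hue = 0.0  # gray pixel: hue is undefined, report 0
    else:
        ratio = max(-1.0, min(1.0, numerator / denominator))  # guard against rounding
        hue = math.degrees(math.acos(ratio))
        if b > g:
            hue = 360.0 - hue
    return hue, saturation, intensity


def mean_hsi(pixels):
    """Average H, S and I over a list of RGB pixels (naive, non-circular hue mean)."""
    converted = [rgb_to_hsi(*p) for p in pixels]
    n = len(converted)
    return tuple(sum(c[k] for c in converted) / n for k in range(3))


def hue_intensity_deviation(initial_pixels, current_pixels):
    """Return (hue deviation, intensity deviation) of the current image from the initial one."""
    h0, _, i0 = mean_hsi(initial_pixels)
    h1, _, i1 = mean_hsi(current_pixels)
    return h1 - h0, i1 - i0


# Hypothetical grain-region pixels before and after condensation (assumed values).
initial = [(0.80, 0.70, 0.50), (0.78, 0.69, 0.52)]
current = [(0.70, 0.60, 0.45), (0.69, 0.61, 0.47)]
d_hue, d_intensity = hue_intensity_deviation(initial, current)
print(f"hue deviation: {d_hue:.2f} deg, intensity deviation: {d_intensity:.3f}")
```

In practice the pixel lists would come from a segmented grain region of the captured image rather than from hand-entered values.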
In general, an individual object should be placed under a camera for image processing, and clear images must be provided for the machine vision system. Boundary
extraction and geometrical feature measurements on physically touching objects are
therefore classic problems when a real-time machine vision inspection system is implemented. Touching objects yield connected regions in the image after segmentation
from the background, thus making measurements of individual objects impossible
without further pre-processing. Considering this, Shatadal et al. (1995a) developed
an algorithm to segment connected grain-kernel image regions. Their algorithm
used images transformed by mathematical morphology operations, and succeeded in separating rice kernels in the image. The geometrical features were also
extracted from both software-separated and physically separated kernels for pattern
classification (Shatadal et al., 1995b). The authors noted an important limitation of the algorithm: it failed when the connected kernels formed
a relatively long isthmus or bridge. Later, Wang and Chou (2004) developed a more
efficient method to segment touching rice kernels using the active contour model and
the inverse gradient vector flow field. The inverse gradient vector flow field was first
proposed to automatically generate a field center for every individual rice kernel in an
image. These centers were then employed as the reference for setting initial deformable
contours that were required for building an active contour model. It was found that the
complete contours of touching objects identified by this approach could facilitate subsequent image processing to obtain the geometric, texture, and color characteristics of
An automatic inspection system with multiple kernel image inspection can speed
up the process. However, as mentioned above, to present many tiny grain kernels in
an oriented form for machine vision inspection is not an easy task, and to calculate
grading parameters of every kernel with many kernels touching each other randomly is
very difficult and time-consuming. Therefore, a high-performance automatic grain
quality inspection system requires an efficient device that presents multiple grain
kernels without contact between them. Wan (2002) developed
an automatic kernel-handling system, consisting of an automatic inspection machine
and an image-processing unit. His system could continually present matrix-positioned
grain kernels to charge-coupled device (CCD) cameras, singularize each kernel image
from the background, and discharge kernels to assigned containers. The inspection
machine had scattering and positioning devices, a photographing station, a parallel
discharging device, and a continuous conveyer belt with carrying holes for the grain
kernels. The image-processing unit and the inspection machine were designed to work
concurrently to provide high throughput of individual kernel images. Wan et al. (2002)
also investigated the performance of this automatic quality
inspection system in evaluating various rice appearance categories, such as sound,
cracked, chalky, immature, dead, broken, and damaged kernels. Carter et al. (2006) proposed
both cluster and discriminant analyses for establishing the suitability of the measured
parameters for authentication of granular food, using their own digital imaging
system and fuzzy logic. Results demonstrated that it might be possible to distinguish
between different varieties of the same rice.
Milling of rough rice is usually conducted to produce white and polished edible
grain, owing to consumer preference. The important parameters to evaluate milled rice
quality are grain size and shape, whiteness, and cleanliness, which are correlated with
the transaction price of the rice. These factors are closely related to the process of
milling, in which rough rice is first subjected to dehusking or removal of hulls, and
then to the removal of the brownish outer bran layer. Finally, polishing is carried out to
remove the bran particles and to provide surface gloss to the edible white portion. The
degree of milling determines the extent of removal of the bran layer from the surface
of the milled kernels, and is thus related to the whiteness of rice. Yadav and Jindal
(2001) developed techniques, based on two-dimensional imaging of milled rice kernels,
for estimating the degree of milling and the head rice yield, i.e. the weight
percentage, relative to the rough rice weight, of milled kernels that retain at least
three-fourths of the original brown rice length. The permitted quantity
of broken rice kernels is specified when milled rice is bought, and broken rice
kernels normally have only half the value of whole or head rice. The weight percentage
of whole kernels remaining after milling is one of the important physical characteristics
that determine the rice quality. The amount of broken rice kernels is determined mainly
by visual selection of these kernels from a large quantity of rice. The length and width
of rice kernels are generally measured using a caliper. These analyses can be
performed much faster and more accurately using machine vision systems. Dalen
(2004) reported that the size and size distribution of rice and the amount of broken rice
kernels could be determined by image analysis using flatbed scanning. He demonstrated
that his flatbed scanner with image analysis was a fast, easy, and low-cost method of
Rapid moisture adsorption by low-moisture rice grains may cause the grains to fissure, which degrades the eating quality of the rice when cooked.
Mechanical stress and strain act on the rice grain continuously. Stress-fissured kernels break more readily than sound
kernels during harvesting, handling, and milling, and thereby reduce the quality and
the market value of the grain. Stress-fissure detection therefore remains an important task in rice
quality evaluation. Lan et al. (2002) developed a machine vision system with a CCD
black/white camera, an image frame-grabber, a computer, and an image-processing
program to obtain images of fissured rice grains. Analysis of the processed images revealed
that the fissure patterns of long-grain and medium-grain rice differ. Fissures in medium-grain kernels
were detected using image-processing methods such as gamma correction, histogram
equalization, erosion, regional enhancement, and edge detection. Because of the difference in
fissure pattern, long-grain kernels were instead processed by
gamma correction, high-pass filtering, contrast adjustment, and regional enhancement.
Their computer vision system was able to reveal 94 percent
of all the fissure lines detected by a human expert in medium grains, and 100 percent
in long grains.
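Two of the pre-processing steps named above, gamma correction and histogram equalization, can be sketched as follows. This is a minimal illustration on an 8-bit grayscale image stored as nested lists, not the program of Lan et al. (2002); the function names, the gamma value, and the sample pixel values are assumptions made for the example.

```python
def gamma_correct(image, gamma):
    """Apply out = 255 * (in / 255) ** gamma to every pixel of an 8-bit image."""
    lut = [round(255.0 * (v / 255.0) ** gamma) for v in range(256)]
    return [[lut[v] for v in row] for row in image]


def equalize_histogram(image):
    """Spread the gray levels of an 8-bit image using its cumulative histogram."""
    hist = [0] * 256
    for row in image:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)
    cdf_min = next(c for c in cdf if c > 0)  # CDF value at the darkest level present
    if total == cdf_min:
        return [row[:] for row in image]  # constant image: nothing to equalize
    lut = [max(0, round(255.0 * (c - cdf_min) / (total - cdf_min))) for c in cdf]
    return [[lut[v] for v in row] for row in image]


# Hypothetical low-contrast kernel image (assumed values).
kernel = [
    [50, 50, 60],
    [50, 100, 60],
    [60, 100, 150],
]
brightened = gamma_correct(kernel, 0.5)   # gamma < 1 brightens dark regions
stretched = equalize_histogram(brightened)
for row in stretched:
    print(row)
```

A gamma below 1 lifts dark regions, which may make thin fissure lines easier to separate from the kernel body before equalization stretches the contrast; suitable parameter values would have to be tuned for the imaging setup.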
The stress distribution inside a rice kernel, which is the origin of kernel fissures,
is difficult to estimate by the imaging method. A finite-element analysis method can
simulate such stress distributions in a kernel by computer. Jia et al. (2002) mapped
and analyzed the distributions of radial, axial, tangential, and shear stresses in a kernel
during drying by the finite-element simulation combined with high-speed microscopy
imaging of the fissure appearance. As a result, they found that two distinct stress
zones existed inside a rice kernel during drying – a tensile zone near the surface, and a
compressive zone close to the center. It was also found that, as drying proceeded, radial,
tangential, and shear stresses gradually approached zero in magnitude and became
neutral in direction after 60 minutes of drying at 60 °C and 17 percent relative humidity.
Only axial stress remained at a pronounced level, even after drying, which helped to
explain the fact that most fissures propagate perpendicular to the longitudinal axis of
the rice kernel.
3.2 Water content and distribution
Water distribution in a rice kernel is one of the important parameters for the quality
evaluation of rice grains. Changes in the water distribution of a rice seed during its morphological development as a plant material are also important for elucidating the quality
of the matured grain after harvest, because starch and other compounds are accumulated
through assimilate transport with water during development. To visualize the
morphological development of the rice caryopsis together with information on the moisture
distribution pattern, nuclear magnetic resonance (NMR) micro-imaging was applied
as a non-destructive measurement technique (Horigane et al., 2001). In general, the
NMR imaging technique can be used for the non-destructive and non-invasive determination of moisture distribution and mobility in a grain. Thus, the moisture distribution
pattern can be discussed in relation to the morphological development. The moisture
distribution images of young tissue, the pericarp vascular bundle, and the endosperm
up to 25 days after anthesis were obtained, and the route for water supply to, or drainage
from, the embryo was observed in their study. The three-dimensional structure of developing spikelets was represented as a maximum-intensity projection image, which is a
transparent view through an object reconstructed from three-dimensional
data. These images showed that the moisture content of older caryopses
was lower than that of younger ones, because the
signal intensities in the maximum-intensity projection images decreased during development. The
increases in width and thickness of the caryopsis, and the junction between palea and
lemma, were observed in cross-sectional NMR images. These findings supported the
view that water flows from the pericarp vascular bundle into the nucellus.
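The maximum-intensity projection mentioned above keeps, for each ray through the volume, only the largest voxel value. A minimal sketch is given below, assuming the three-dimensional data are stored as nested lists indexed volume[z][y][x]; the function name and the sample values are hypothetical and are not taken from Horigane et al. (2001).

```python
def max_intensity_projection(volume, axis=0):
    """Project a 3-D volume (indexed volume[z][y][x]) to 2-D by taking the maximum along one axis."""
    nz, ny, nx = len(volume), len(volume[0]), len(volume[0][0])
    if axis == 0:  # look along z: result is indexed [y][x]
        return [[max(volume[z][y][x] for z in range(nz)) for x in range(nx)]
                for y in range(ny)]
    if axis == 1:  # look along y: result is indexed [z][x]
        return [[max(volume[z][y][x] for y in range(ny)) for x in range(nx)]
                for z in range(nz)]
    if axis == 2:  # look along x: result is indexed [z][y]
        return [[max(volume[z][y][x] for x in range(nx)) for y in range(ny)]
                for z in range(nz)]
    raise ValueError("axis must be 0, 1 or 2")


# Hypothetical 2 x 2 x 2 block of signal intensities (assumed values).
signal = [
    [[1, 2], [3, 4]],
    [[5, 0], [1, 9]],
]
print(max_intensity_projection(signal, axis=0))
```

Because only the brightest voxel along each ray survives, regions with high water signal remain visible through overlying tissue, which is why a lower overall projection intensity can be read as a lower moisture content.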
Water is removed from harvested rice seed by drying, and the drying process
thus greatly affects the rice quality. Ishida et al.
(2004) visualized changes in the water distribution in a rough rice seed during drying
by the single point mapping imaging technique combined with magnetic resonance
imaging (MRI). They traced the decrease of water in rice seeds after harvesting at
various drying temperatures, and compared the decrease in image intensity, which
was proportional to the removable water content, with results for grains dried by the oven-drying method. Their experimental results showed that the water content of fully
ripened seeds was approximately 20 percent. This is a low moisture content compared
with typical seeds, because physiological drying of the husk occurs before harvesting.
With a water content of less than 20 percent, MRI images could not be obtained by
the spin-echo method; however, it was possible to obtain images in a short time by
the single point mapping imaging method, and thus the process of water reduction
was traceable. Water in this concentration range was adsorbed on the surface of
molecular structures in the grain, while water at contents below 7 percent was tightly
bound. In their results, water was present mainly in the grain kernels and not in the
husk, and embryos contained rather large amounts of water. The water signal
in the images weakened as the drying time elapsed and as the drying temperature was
increased. It disappeared uniformly from all areas of the endosperm of the seeds.
The moisture content and its distribution in a rice grain were also observed by a
method based on electromagnetic imaging (Lim et al., 2003). Because the dielectric
constant of water is much greater than that of the dry material of grain, the dielectric
constant of grain is correlated with its water content. Such correlation forms the basis
for rapid determination of the moisture content of rice grain electrically. Even though
this provides relatively reliable moisture measurement, it generally lacks spatial resolution. The method based on electromagnetic imaging mapped the two-dimensional
moisture distribution in rice grains, and a quantitative image reconstruction algorithm
with simultaneous iterative reconstruction was employed to achieve rapid convergence
to an acceptable final solution.
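The idea of a simultaneous iterative reconstruction can be illustrated on a small linear problem. The sketch below is not the algorithm of Lim et al. (2003): electromagnetic imaging is a nonlinear inverse problem, whereas this example assumes a hypothetical linear measurement model A x = b with non-negative weights, and applies the classical simultaneous iterative reconstruction technique (SIRT), in which all measurements update all pixels at once. The function name, the measurement matrix, and the "moisture" values are assumptions made for illustration.

```python
def sirt(A, b, iterations=300, relaxation=1.0):
    """Solve A x = b iteratively; every measurement corrects every pixel in each pass."""
    m, n = len(A), len(A[0])
    row_sums = [sum(abs(a) for a in row) for row in A]
    col_sums = [sum(abs(A[i][j]) for i in range(m)) for j in range(n)]
    x = [0.0] * n  # start from an empty image
    for _ in range(iterations):
        # Residual of each measurement, normalized by its row sum.
        residual = []
        for i in range(m):
            predicted = sum(A[i][j] * x[j] for j in range(n))
            residual.append((b[i] - predicted) / row_sums[i] if row_sums[i] else 0.0)
        # Back-project all residuals simultaneously, normalized by column sums.
        for j in range(n):
            if col_sums[j]:
                correction = sum(A[i][j] * residual[i] for i in range(m)) / col_sums[j]
                x[j] += relaxation * correction
    return x


# Hypothetical 2 x 2 moisture map [x0, x1; x2, x3] observed through five sums
# (two rows, two columns, one diagonal); values are assumed for the example.
A = [
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 0, 0, 1],
]
b = [3, 7, 4, 6, 5]  # measurements generated by the map [1, 2, 3, 4]
print([round(v, 4) for v in sirt(A, b)])
```

A nonlinear problem would typically wrap an update of this kind inside an outer loop that re-linearizes the forward model, which is beyond the scope of this sketch.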
3.3 Compound contents and distribution
3.3.1 Microscopic imaging
The chemical-compound content and distribution in a grain, which affects the morphological and histological characteristics, is generally fixed in the growth stage and
has an influence on the rice quality. Illustrations and/or photographs were used to represent such physical structures of rice until computer vision was developed (Bechtel
and Pomeranz, 1978a). The internal structures of the grain were also determined precisely by hand (Hoshikawa, 1993b). Histological micrographs of the grain,
reproduced as photographs, described the anatomical nature of rice seed
as a plant material. Electron micrographs, including scanning electron micrographs
(SEM), are usually used to observe the ultrastructure in micrometer-scale samples.
For example, Bechtel and Pomeranz (1977, 1978b, 1978c) captured the ultrastructure
of the mature ungerminated rice caryopsis by light- and electron-microscopy. Watson
and Dikeman (1977) also employed SEM for observations of the endosperm, aleurone,
germ, and hull of the grain, with the objective of obtaining a better understanding of
rice ultrastructure. Such anatomical studies contribute to the inspection of biochemical properties and structural characteristics, which influence the availability of rice’s
nutrients as a foodstuff. In other words, the nutritional quality of foodstuffs is directly
related to the nature of nutrient storage in the grains, which can be observed using
a microscope (Yiu, 1993). A microscope can visualize the structural details that are
required for analyzing histological characteristics, and can also obtain image data by
the use of digital capturing apparatus.
The anatomical and histological structures, which can be analyzed by computer
vision, are also related to the physical properties. For the measurement of such
microstructures, with the internal chemical-compound content and distribution of a biological material, a traditional sectioning technique using a microtome has been applied.
Although this technique is destructive, the distribution of various chemical compounds
and their roughly quantitative values in a section can be visualized and analyzed in
two dimensions by suitable staining and/or various imaging methods. Many
studies have observed the histological components in a small segment of rice grain by
light microscopy with histochemical staining. However, whole-size sections of a rice
grain of the high quality required for microscopic observation have not been
obtained by standard sectioning methods, because the rice grain has poor mechanical
properties for sectioning and is poorly infiltrated by embedding matrices such
as paraffin. Moreover, the moisture content of rice grains is too low for the collection of frozen sections. Furukawa et al. (2003) stained cross-sections
of rice kernels and observed them by light microscopy and confocal
laser scanning microscopy. Rice cross-sections 200 µm thick were obtained using a laser
blade and microtome. They then applied immunofluorescence labeling with specific
antibodies as a histological staining technique, for visualization of the distribution of
proteins stored in endosperm tissues. As a result, localization of two types of protein
bodies in endosperm tissue was observed. It was also found that low-glutelin rice was
different from the other cultivars not only in the major storage protein composition,
but also in the distribution of storage proteins in endosperm tissue.
A special sectioning method using cellulose tape was proposed by Palmgren (1954)
for the study of large, hard, and brittle specimens. The adhesive-tape method facilitated preparation and improved the quality of the resulting sections of the whole
body of a baby rat, which could then be stained for histological and histochemical
characteristics (Kawamoto and Shimizu, 1986). Ogawa et al. (2003a) employed the
adhesive-tape method, combined with a better preparation technique for preserving
microstructural details, to obtain whole rice-kernel sections. This method combined tape-aided sectioning on a standard microtome with autofluorescence
visualization under a microscope in the ultraviolet (UV) range, to observe the
histological properties of a complete, whole-size rice section.
The tape-aided sectioning procedure for this method is as follows:
1. The sample is dehydrated in a graded ethanol series followed by xylene, transferred to melted paraffin, allowed to infiltrate, and embedded by hardening of
the paraffin.
2. Embedded rice kernels are sectioned at ambient temperature on a standard microtome equipped with disposable blades. Each kernel is sectioned until the desired
portion is exposed. A piece of adhesive tape is then firmly pressed to the face
of the specimen block. While the tape is held, the microtome is advanced to cut
a section stuck to the tape. Note that the adhesive tape is a special product made
of polyester coated with a solvent-type acrylic resin that serves as an adhesive.
3. The tape-section is affixed to a glass slide with the specimen side facing up, and
is deparaffinized in xylene. Sections suitable for microscopic observation are
thus easily obtained.
3.3.2 Virtual visualization
To observe the internal composition in three dimensions, Levinthal and Ware (1972)
developed a three-dimensional reconstruction technique using serial section images
and interactive formation. This technique was applied to measure the three-dimensional
physical structure of the central nervous system of a simple animal. Because the interactive
formation method for three-dimensional reconstruction is based on constructed outlines of the
sectioned objects, the objects could not be reconstructed precisely.
Ogawa et al. (2000, 2001) developed a modified three-dimensional reconstruction and visualization technique. This technique is a combination of tape-aided serial
sectioning, staining and digital imaging, and virtual rendering by computerized reconstruction. The concept of this technique is embodied in the schematic diagram presented
in Figure 16.1. By tape-aided microtome sectioning, it was found that a set of serial
sections of a rice grain could be prepared and preserved with their own set of relative
position data. Two positioning rods were also carefully embedded with their long axes
perpendicular to the bottom plane of the embedding mold and sectioned with the sample, as shown in Figure 16.1. After sectioning, a single set of serial sections was stained
by a suitable histochemical dye and was captured by a charge-coupled device (CCD)
camera. As the stained areas represented areas containing a dye–target complex, the
distribution of each compound in the section was visualized in two dimensions. Since
all sections of the sample grain and the positioning rods were stuck to adhesive tape,
Figure 16.1 Schematic diagram of the virtual three-dimensional reconstruction and visualization technique using tape-aided serial
sections with position adjustment markers. The three steps are (1) serial sectioning, (2) digital imaging, and (3) virtual rendering.
Labeled elements: knife blade of microtome; embedded material including positioning rods; adhesive tape; obtained sections;
stained sections; CCD camera; stained section image (2D image); reconstructed 3D model (stacked 2D images); reconstructed positioning rods.
the relative position of each serial section could be adjusted by referencing the position
adjustment markers if the captured position of the section images differed from one
to another. All adjusted section images of a set of serial sections were stacked in the
memory of a personal computer to produce a three-dimensional plotting model, using
the volume-rendering method.
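The position adjustment and stacking described above can be sketched in a few lines of code. This is a minimal illustration rather than the procedure of Ogawa et al.: it assumes that the marker positions have already been located in each section image, that a whole-pixel translation is sufficient to correct the misalignment, and all function names are hypothetical.

```python
def centroid(points):
    """Mean (x, y) of a list of marker pixel coordinates."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def marker_offset(reference_markers, section_markers):
    """Translation (dx, dy) that moves a section's markers onto the reference."""
    rx, ry = centroid(reference_markers)
    sx, sy = centroid(section_markers)
    return (rx - sx, ry - sy)


def shift_image(image, dx, dy, fill=0):
    """Shift a 2D image (list of rows) by whole pixels, padding with `fill`."""
    height, width = len(image), len(image[0])
    shifted = [[fill] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            nx, ny = x + dx, y + dy
            if 0 <= nx < width and 0 <= ny < height:
                shifted[ny][nx] = image[y][x]
    return shifted


def stack_sections(images, marker_sets):
    """Align every section to the first one, then stack into a 3D volume."""
    reference = marker_sets[0]
    volume = []
    for image, markers in zip(images, marker_sets):
        dx, dy = marker_offset(reference, markers)
        volume.append(shift_image(image, round(dx), round(dy)))
    return volume  # indexed as volume[z][y][x]
```

With two positioning rods per section, a rotation could also be estimated from the line joining the rods; that refinement is omitted here for brevity.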
The distribution of various compounds in a rice kernel could be visualized in a virtual
three-dimensional model. Figure 16.2 shows images of a sample section and its stained
result. The thickness of the section was 10 µm. A rice section and sliced positioning
rods as position adjustment markers were stuck to adhesive tape, and therefore the
relative position between the serial sections could be adjusted as described above. A
double-staining method with a combination stain of coomassie brilliant blue (CBB)
solution and iodine solution was applied to the section for the visualization of protein
and starch distribution (Figure 16.2b); protein was stained blue by CBB, and starch
was stained purple or brown by iodine solution. The compound distributions on the
section, which were clearly differentiated by color, were thus visualized. Protein was mainly
distributed around the edges of the section, while starch was distributed in the inner area.
Figure 16.3 shows a three-dimensional plotting image reconstructed from a set of
double-stained serial sections in a personal computer. The distributions of the stained
protein and starch compounds in a rice grain were visualized in three dimensions.
The embryo of the grain pointed towards the top, and a 1/10 opacity ratio was
employed. Since the three-dimensional plotting model was reconstructed using the
volume-rendering method, the voxel data, which were produced by pixel data and the
thickness of the sections, could represent the position of compounds in the plotting
model. The size of one voxel of this model was 10 × 13 × 13 µm, because the pixel
size of the two-dimensional images was 13 × 13 µm and the section thickness was
10 µm. Therefore, each voxel represented not only the shape of the grain as drawn by a
polygon, but also positional information regarding compounds as voxel data, and this
could be virtually subtracted by data-processing techniques.
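The voxel arithmetic above can be written out as a short sketch. The function names are hypothetical, and the axis order (section thickness first, then the two pixel dimensions) is an assumption that follows the 10 × 13 × 13 µm ordering given in the text.

```python
def voxel_dimensions_um(section_thickness_um, pixel_height_um, pixel_width_um):
    """Voxel edge lengths (dz, dy, dx): section thickness plus 2D pixel size."""
    return (section_thickness_um, pixel_height_um, pixel_width_um)


def voxel_volume_um3(dimensions_um):
    """Volume of one voxel in cubic micrometers."""
    dz, dy, dx = dimensions_um
    return dz * dy * dx


def voxel_position_um(z_index, y_index, x_index, dimensions_um):
    """Physical position of a voxel's origin corner within the model."""
    dz, dy, dx = dimensions_um
    return (z_index * dz, y_index * dy, x_index * dx)


# Values taken from the text: 10 µm sections, 13 × 13 µm pixels.
dims = voxel_dimensions_um(10, 13, 13)
print(voxel_volume_um3(dims))  # 1690
```

Each voxel therefore covers 1690 µm³, and a voxel index can be converted directly into a position within the grain.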
Figure 16.2 Images of (a) a sample section and (b) its stained result. The circular areas (c) are the
position adjustment markers, and their thickness was 10 µm. These sections were stuck to an adhesive tape,
and thus the relative position of each serial section could be adjusted by referencing the position adjustment
markers if the captured position varied. Staining was performed by soaking in 0.05-N iodine solution for 30
seconds and washing with distilled water, then soaking in CBB solution for 30 seconds and washing with
removal solution. The magnification bar is 1 mm.
Figure 16.3 An image of a three-dimensional plotting model reconstructed from a series of double-stained
sections in a personal computer. The two positioning rods (a) are also reconstructed from position
adjustment markers. The volume-rendering method is used for this reconstruction technique. The wrinkles of
the plotting images, which look like the contour lines of a contour map, are caused by section thickness. This
is a peculiarity of this visualization technique.
Figure 16.4 shows virtually divided images of the distribution of protein and starch
in a sample rice grain. These images are extracted from that in Figure 16.3, based on
the color differences in compound staining. Consequently, it can be visualized and
observed that the protein, represented by the dark areas in Figure 16.4, is located in
the surrounding parts of the grain and embryo. Starch is located in the interior portions. Because this three-dimensional visualization technique is based on histochemical
technology, it can visualize the distribution of various compounds in a rice grain.
Figure 16.4 Virtually divided images of the distributions of protein and starch in a sample rice grain: (a)
protein is distributed at the outer parts of the grain and embryo; (b) starch is located in the interior portions.
Ogawa et al. (2002a) also developed another three-dimensional visualizing technique
for the observation of rice-seed structure in three dimensions. A three-dimensional
internal structure observation system (3D-ISOS, Toshiba Machine Co. Ltd, Numazu,
Japan) was applied to observe the structure of rice caryopses during development. This system
can slice a sample material sequentially and capture each cross-section using a color
CCD imaging device (DXC-930, Sony Co., Tokyo, Japan), incorporating uniform
lighting conditions. Because the captured images of the cross-section are sequentially
digitized, they can be virtually stacked in a personal computer using the volume-rendering method. As a result, the three-dimensional structure of the sample material
can be visualized by displaying the stacked image set. To obtain samples of dyed rice
seed, a cut stem bearing a panicle, collected 30 days after flowering (before the fully ripe stage), was placed in a 0.1% rhodamine B solution in distilled water for 2 days
to imbibe the dye. Figure 16.5 shows images of the resulting three-dimensional model
of the rice seed produced by virtually stacking the serial image set of the cross-sections. Figure 16.5a shows the image of the simple three-dimensional model, while
Figure 16.5b represents a three-dimensional form of the vascular bundle. This model
was extracted from Figure 16.5a by image processing to suppress the green and white
voxels. Using this technique, the three-dimensional structure of the vascular bundles
can be observed by color extraction based on natural pigmentation or artificial dyeing.
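The color extraction used to isolate the dyed vascular bundles can be sketched as a per-voxel test that keeps strongly red voxels and suppresses the rest, including green and white ones. The RGB thresholds and function names below are hypothetical values chosen for illustration, not settings reported by Ogawa et al. (2002a).

```python
def is_dyed_red(voxel, min_red=150, max_other=100):
    """True if an (R, G, B) voxel is strongly red (assumed thresholds)."""
    red, green, blue = voxel
    return red >= min_red and green <= max_other and blue <= max_other


def extract_dyed_voxels(volume, background=(0, 0, 0)):
    """Keep red-dyed voxels and replace all others (e.g. green, white) by background.

    `volume` is indexed as volume[z][y][x] and holds (R, G, B) tuples.
    """
    return [
        [[voxel if is_dyed_red(voxel) else background for voxel in row]
         for row in plane]
        for plane in volume
    ]
```

White voxels fail the test because their green and blue components are high, and green voxels fail because their red component is low, so only the dyed structure remains in the extracted model.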
Ogawa et al. (2002b) also determined the lipid distribution of a brown rice kernel
in three dimensions by application of the tape-aided sectioning technique. Lipid is one
of the major constituents of rice grain, and its distribution is not uniform in the brown
rice kernel, as measured by chemical analysis for the graded milling flours (Kennedy
et al., 1974). It was also reported that the outer layer of rice kernels, i.e.
the bran including the germ, had larger amounts of lipid than did the inner parts, i.e.
the core or inner endosperm. Stored rice, especially that stored for an extended period
after harvesting, does not have a pleasant odor when it is cooked. This odor is linked
Figure 16.5 Images of (a) the simple three-dimensional model and (b) the three-dimensional form of
the vascular bundle of the rice seed structure. The vascular bundle (c), dyed red by rhodamine B, appears
as a cage-like structure (note that this is red in the actual model). Because the seed was immature, the hull,
including the palea (d) and the lemma (e), is represented as paler parts (green in the actual image). The
rachilla (f) and upper (g) and lower (h) glume are also shown. The rhodamine B solution is distributed
through the vascular bundles by diffusion. Usually the rice seed has eight vascular bundles, but only six are
shown in the resulting image because two pairs of vascular bundles at the interlocking edges of the palea
and lemma are merged by the diffusion of the dye.
to the enzyme reactions and/or lipid autoxidation (Yasumatsu and Moritaka, 1964).
Lipid autoxidation, which influences the rice quality, is immediately triggered by air
contact. The lipids that are located in the outer area of a kernel are considered to be
more oxidizable than those in the inner parts. The observation of lipid distribution in
three dimensions is thus a significant improvement to the research carried out to this
point. In general, histochemical techniques have been applied for the observation of
chemical distributions in a section. In order to obtain sections from a material, paraffin is commonly used as the embedding material. Consequently, paraffinization and
deparaffinization steps are required, including a xylene-soaking process. As not only
paraffin but also the lipid content is removed from the thin section by the xylene-soaking
process, the common paraffin-embedding method is not suitable for the observation
of the real lipid distribution in a grain section. Although the resin-embedding methods
using polymeric resin, or frozen-section methods for materials with high water content
have been used for lipid observation, a small piece of chopped specimen is needed to
obtain sections. Therefore, the lipid distribution in an area as large as a whole rice section is difficult to measure by the usual histochemical technique except by tape-aided
sectioning. By application of tape-aided sectioning, preparatory steps for sectioning
(such as sample dehydration, paraffinization, and deparaffinization procedures, which
would influence lipid content) can be safely omitted for the kernel and its sections.
Sample grains can be directly embedded in the liquid paraffin, but the liquid paraffin
cannot infiltrate into the grain because of moisture in the kernel. Because the waxy
paraffin slices, which surround the rice section and are also stuck to the adhesive tape,
repel the staining solution, only the grain section was stained. Figure 16.6 shows the
resulting images of a virtually divided model for the three-dimensional isolated lipid
Figure 16.6 (a) An image of a virtually-divided three-dimensional visualizing model for the isolated lipid
distribution and (b) its schematic form. In the three-dimensional model, black-stained parts such as the seed
coat and the embryo are intentionally erased for better observation of the internal lipid distribution. Thus,
this model represents areas below the seed coat of the rice kernel three-dimensionally. Because
the sum of all the stained areas does not correlate
quantitatively to the sum of all lipid-containing tissues, the display of the lipid distribution must be
considered qualitative rather than quantitative.
distribution, and its schematic form. The distinct lipid distribution at the divided plane
can be shown. It is clear that the lipid tends to distribute at the dorsal side more prominently than at the ventral side in the sample kernel. Juliano (1972) and Takeoka et al.
(1993) reported that lipid in the endosperm of rice existed most prominently in the cells
of the aleurone layer, and its content was very small in the starch-storing tissue, which
was located in the inner area of the rice kernel. By this visualization technique, the
differences in lipid distribution in rice kernels of various cultivars, growth conditions,
and post-harvest processing can be measured. Moreover, not only Sudan Black B but
also other dyes can be applied for this technique. For example, by visualizing
differentiated lipid contents, which can be classified into fatty acids, neutral lipids, and
so on, the technique has the potential to shed light on many phenomena, such as the mechanism of
lipid autoxidation in a rice kernel.
3.3.3 Other imaging techniques
Atomic force microscopy (AFM) is a micro-imaging technique in which a sharp, probing tip is scanned over the surface of a sample. Interactions between the tip and the
sample are translated into a three-dimensional image with resolution ranging from
nanometers to micrometers. Using the AFM imaging technique, morphological features in the natural state and topographical information regarding biological samples,
such as biological membranes, cell surface, and the molecular structure of various
biological macromolecules, can be obtained. Dang and Copeland (2003) applied AFM
imaging on the surface of cut grains of several rice varieties chosen on the basis of
different amylose-amylopectin ratios and cooking properties. The angular starch structures (3–8 µm in size) were arranged in layers approximately 400 nm apart. The layers
represented the growth rings of starch granule formation, and the cross-striations in
each layer corresponded to the blocklets of amorphous and crystalline regions within
the starch granule. Such blocklets had an average size of 100 nm, and were proposed
to comprise approximately 280 amylopectin side-chain clusters.
The photoluminescence imaging technique, which is based on the spectral characteristics of visible light emitted from organic and inorganic compounds under UV
irradiation, with video imaging and digital image processing, is suitable for quick
and non-destructive quality control in various types of processing. Visible light photoluminescence from polished rice and some other starches was evaluated using a
two-dimensional photoluminescence imaging technique in a quality control system for
foods (Katsumata et al., 2005). The visible light photoluminescence from starchy foods had a broad peak
at a wavelength of 462 nm under ultraviolet illumination at
365 nm. Peak intensity of photoluminescence varies with the variety and the source of
rice. The brightness over the photoluminescence image of rice of a single breed, from
a single source, follows a Gaussian distribution curve. The deviation
of the fitted brightness from the Gaussian distribution curve, which is estimated
as the χ² value, and the correlation coefficient increased in specimens of rice
blended from various species. Most grains, including rice, are composed of amylopectin,
amylose, amino acids, fatty acids, inorganic minerals, etc. Although the origin of
the visible light luminescence from starchy foods is unidentified, many compounds
emit visible light photoluminescence under UV irradiation. Thus, the relative contents
of amylopectin and amylose, the concentration of amino acids and inorganic minerals
such as Ca, Na, and K, may influence the photoluminescence intensity of rice. Because
the quality of rice was influenced by these compounds, Katsumata and colleagues concluded that the photoluminescence imaging technique was potentially useful for quality
evaluation of the rice. For example, the blended rice from different species could be
detected using a two-dimensional photoluminescence imaging technique.
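As an illustration of the Gaussian-deviation measure described above, the following sketch bins brightness values, fits a Gaussian from the sample mean and standard deviation, and computes a Pearson χ² statistic. The moment-based fit, the number of bins, and the function name are assumptions made for illustration; Katsumata et al. (2005) may have fitted and binned their data differently.

```python
import statistics


def gaussian_chi_square(brightness, n_bins=8):
    """Pearson chi-square between a brightness histogram and a fitted Gaussian.

    Assumes the values are not all identical (non-zero spread).
    """
    fitted = statistics.NormalDist(statistics.mean(brightness),
                                   statistics.pstdev(brightness))
    lowest, highest = min(brightness), max(brightness)
    width = (highest - lowest) / n_bins

    # Observed counts per equal-width bin between the lowest and highest value.
    observed = [0] * n_bins
    for value in brightness:
        index = min(int((value - lowest) / width), n_bins - 1)
        observed[index] += 1

    # Expected counts from the fitted Gaussian over the same bins.
    chi_square = 0.0
    for i, count in enumerate(observed):
        left = lowest + i * width
        right = left + width
        expected = len(brightness) * (fitted.cdf(right) - fitted.cdf(left))
        if expected > 0:
            chi_square += (count - expected) ** 2 / expected
    return chi_square
```

Under these assumptions, a brightness sample drawn from one Gaussian population gives a small statistic, whereas a mixture of two populations with different mean brightness gives a much larger one, which is the behavior exploited for detecting blended rice.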
4 Quality evaluation of cooked rice
4.1 Water distribution
Changes of the water distribution and the internal structure of a rice grain during
cooking are closely related to the gelatinization characteristics, which influence the
texture profiles of cooked rice. The gravimetric change in the rice grain during boiling
was analyzed using a shell and core model developed by Suzuki et al. (1997). The model
assumes that gelatinization proceeds much more rapidly than water diffusion
in a grain. This means that a partly boiled grain has an ungelatinized core covered
by a gelatinized shell, which influences the eating quality of cooked rice. The core
and the shell should therefore be measurable by the moisture profile in a grain. The
geometrical changes in grains during cooking, which were followed by the kinetic study
using various models, can be measured by on-line image acquisition. Ramesh (2001)
inspected the swelling characteristics of whole rice grains under various temperatures
using a digital image-analysis system on a real-time basis, and evaluated the cooking
kinetics of the whole rice grains. In order to determine the hot water hydration kinetics,
he carried out two-dimensional image analysis on basmati rice, a long-grain
type to which a spherical mathematical model for moisture movement could not be applied.
The projected area of rice grains was converted into swelling ratios for the comparison
of the swelling at different temperatures. The reaction rate constant and the activation
energy for the hot water hydration were obtained from the swelling data. The hydration
data were further analyzed to generalize a polynomial equation correlating swelling
ratios to the heating time and temperature. Ramesh also concluded that this helped the
design of cooking equipment by providing a progressive increase in volume towards
the discharge end to accommodate the swelling as cooking proceeds.
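The quantities named above can be related in a short sketch. It assumes that the swelling ratio is the projected area at a given time divided by the initial projected area, and that the temperature dependence of the rate constant follows the Arrhenius equation, k = A exp(−Ea/RT); both are common conventions adopted here for illustration, and the text does not state which definitions Ramesh (2001) used.

```python
import math

GAS_CONSTANT = 8.314  # J mol^-1 K^-1


def swelling_ratio(projected_area, initial_projected_area):
    """Projected area relative to the initial projected area (assumed definition)."""
    return projected_area / initial_projected_area


def activation_energy(k1, temp1_kelvin, k2, temp2_kelvin):
    """Activation energy (J/mol) from rate constants at two temperatures.

    Derived from the Arrhenius equation:
    ln(k2 / k1) = (Ea / R) * (1 / T1 - 1 / T2).
    """
    return (GAS_CONSTANT * math.log(k2 / k1)
            / (1.0 / temp1_kelvin - 1.0 / temp2_kelvin))
```

With rate constants available at more than two temperatures, a linear regression of ln k against 1/T would normally be preferred to the two-point form shown here.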
Takeuchi et al. (1997a, 1997b) developed a real-time measurement technique for
visualization of the moisture profiles in a rice grain during boiling by quick, one-dimensional nuclear magnetic resonance (NMR) imaging with the adaptation of a
multi-echo technique. Changes in the moisture distribution in a rice grain during boiling
were evaluated by the proposed technique. The moisture content of the grain increased
during boiling, and thus influenced gelatinization of the starch in the starch-storing
cells. The moisture population map, which was represented by a virtual slice of the
partly boiled grain, showed an asymmetrical progress of the moisture uptake in the
grain. Therefore, they hypothesized that the water diffusion (moisture absorbance) in
a grain was possibly restricted by cell-wall components of the starch-storing cells and
the protein layers such as aleurone and subaleurone. They found, however, that cell-wall materials had little effect on resisting moisture migration. Watanabe et al. (2001)
proposed a non-Fickian diffusion mathematical model for water migration in rice grains
during cooking based on the NMR-imaging technique. The migration of water is driven
by the gradient of water demand, which is defined as the difference between the ceiling
moisture content and the existing moisture content in the model. Their model was
demonstrated to have potential for describing the anomalous characteristic features of
water migration in a grain during cooking. The total limited water content within the
rice grain was calculated and employed as an indicator of both the concentration and
the distribution of water in the grain during cooking, observed by the NMR imaging
technique (Kasai et al., 2005).
4.2 Grain-scale macro-structure
Horigane et al. (1999) discovered the existence of hollows in a cooked rice grain, and
proposed a mechanism to explain their formation, using NMR micro-imaging of protons (¹H). Their NMR micro-imaging techniques have mainly been limited to test-tube
samples, allowing only a few grains to be analyzed at a time, although it is necessary to analyze multiple samples to ensure rapid, statistically sound conclusions. The
samples were observed and analyzed by using two-dimensional images of longitudinal
and transverse sections from three-dimensional NMR micro-images. Dark spots were
found in the transverse sections and were surrounded by a peripheral layer of high
proton-density. Such dark spots existed only within the grain, and caused no lacerations on the grain surface. The authors therefore hypothesized that a dark spot was
due to either a low proton-density substance, or gases that appeared as a hollow. The
presence of gases in the hollow regions was only known to occur in cracks or fissures,
and they confirmed, by a photomicrograph of the longitudinal section, that the dark
spots were enclosed in hollow regions of the cooked grains. The hollows were related
to structural changes that must have occurred during cooking, because no hollows
were observed in uncooked grains except for cracks or fissures. They also found that
the hollows appeared in the measurement of time-series images of the center layer in
grains during cooking, which indicated changes in the internal structure and in water
distribution. Accordingly, they concluded that the hollows originated from cracks or
fissures of raw grain, and were caused by the sealing of such lacerations with gelatinized starch in the peripheral layer in combination with expansion of the grain during
cooking. The hollows in a grain make the endosperm tissue less homogenous, and
therefore influence the texture and structural properties of cooked rice. The hollows
were also detected by other imaging techniques. Suzuki et al. (1999) applied X-ray
imaging and light transmittance photography techniques for the detection of hollows
in a grain. They concluded that light transmittance photography was an effective and
useful technique, although NMR imaging, the operation of which is very difficult,
could provide more precise images and three-dimensional and quantitative measures
of the depth and volume of hollows. Suzuki et al. (2002) also reported that hollow size,
which was easily measured by the light transmittance method, was different in each
Hollow volumes were measured by the NMR micro-imaging technique (Horigane
et al., 2000). The size, shape, and total volume of hollows for five cultivars with various
amylose content were measured. Factors that influence hollow shape and volume should
be related to either grain expansion or the gelatinization characteristics of the starchy
endosperm. Endosperm amylose content is one such factor. However, they found that
the volume increased during gelatinization, which was negatively related to the amylose
content. Cracks and fissures in the grain, which are also important for the formation of
hollows, occur prior to cooking in most cultivars due to soaking or changes in relative
humidity. It seems unlikely that the differences in hollow formation among cultivars
can be explained by the presence or absence of cracks and fissures. Thus, the hollow volume,
or for example the change in hollow ratio during cooking, should be measured and
calculated from the three-dimensional images constructed from serial slice images of
the samples. In the research work of Horigane et al. (2000), the volume of hollows
increased with grain volume and length during cooking below 100 °C. In contrast,
the volume subsequently decreased during prolonged boiling. It was also assumed
that there was a relationship between amylose content and hollow formation, based on
their hypothetical model. However, it was concluded that there was no correlation
between the final hollow volume and shape, and individual parameters such as flour
gelatinization and amylose content. NMR micro-imaging was performed on a 7.1 T
NMR spectrometer, and its settings resulted in a total acquisition time of 4.6 hours.
Therefore, this approach inevitably involves a long acquisition time and is not suitable
for real-time observation of water transport in such a complex microstructure. Because
a long scanning time and low spatial resolution lead to erroneous results, a short scan
time and high spatial resolution are critical factors in the investigation of the cooking
behavior of rice kernels.
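A hollow ratio of the kind mentioned above can be computed by counting voxels in segmented three-dimensional images. The sketch below assumes that binary masks of the grain and of the hollows are already available, and that the hollow ratio is defined as hollow volume divided by grain volume; neither the segmentation step nor this exact definition is given in the text, and the function names are hypothetical.

```python
def count_voxels(mask):
    """Number of non-zero voxels in a mask indexed as mask[z][y][x]."""
    return sum(1 for plane in mask for row in plane for value in row if value)


def hollow_ratio(grain_mask, hollow_mask):
    """Hollow volume as a fraction of grain volume (assumed definition)."""
    return count_voxels(hollow_mask) / count_voxels(grain_mask)


def volume_mm3(mask, voxel_size_mm):
    """Physical volume of the masked region for a (dz, dy, dx) voxel size in mm."""
    dz, dy, dx = voxel_size_mm
    return count_voxels(mask) * dz * dy * dx
```

Applying `hollow_ratio` to each image of a time series would give the change in hollow ratio during cooking, provided the acquisition is fast enough to resolve it.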
Mohoric et al. (2004) proposed optimized three-dimensional NMR imaging with
high spatial resolution based on the rapid acquisition relaxation enhanced (RARE)
imaging method for the monitoring of cooking of a single rice kernel in real time. They
aimed to achieve both high temporal and high spatial resolution in order to establish
relationships between moisture content profiles, rice kernel microstructures, and the
extent of gelatinization, and to develop the general pattern of moisture ingress. They
used the three-dimensional RARE imaging sequence to record images with a resolution of
128 × 32 × 16 voxels, each with a volume of 117 × 156 × 313 µm³. An image was scanned
in 64 seconds, and the images in time series spanned 30 minutes of the cooking process.
Results were obtained from such real-time observation at high resolution, and the water
uptake was determined by analysis of the magnetic resonance imaging (MRI). Results
were compared with previous studies, and the general pattern of moisture ingress –
i.e. the shape of moisture profiles and the actual facts – were generalized. Based on
these results, a sophisticated model of water uptake in a three-dimensional substrate
structure during different types of water diffusion, and the swelling of the substrate,
can be developed for the simulation and interpretation of three-dimensional ingress
patterns of moisture as observed by MRI.
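The acquisition figures quoted above imply an imaged extent and a number of time points that can be checked with simple arithmetic. The sketch assumes that the voxel dimensions are listed in the same axis order as the image matrix and that images were acquired back to back without pauses; the function names are hypothetical.

```python
def field_of_view_mm(matrix, voxel_size_um):
    """Imaged extent along each axis: number of voxels times voxel edge length."""
    return tuple(count * size / 1000.0 for count, size in zip(matrix, voxel_size_um))


def images_in_series(total_seconds, seconds_per_image):
    """Number of complete images that fit into the observation period."""
    return total_seconds // seconds_per_image


# Values taken from the text: 128 × 32 × 16 voxels of 117 × 156 × 313 µm,
# 64 seconds per image, 30 minutes of cooking.
print(field_of_view_mm((128, 32, 16), (117, 156, 313)))  # about 15.0 × 5.0 × 5.0 mm
print(images_in_series(30 * 60, 64))  # 28
```

Under these assumptions the imaged volume is roughly 15 × 5 × 5 mm, which is enough to enclose a single kernel, and the 30-minute series holds 28 complete images.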
4.3 Cell-scale micro-structure
Histological compound distributions, related to the physical, chemical and physicochemical properties of cooked rice, are also important in evaluating cooked rice
qualities. In general, textural properties of cooked rice correlate strongly with
the morphological structures of a single kernel, and observation of histological structures (such as the compound distributions in the cooked grain) thus contributes to
the quality inspection and evaluation of cooked rice. Microscopic techniques can be
applied to individual grain kernels to identify the histological characteristics of cooked
rice, and can also visualize structural details required for the evaluation of histological
characteristics. To examine the digestibility of rice protein, Bradbury et al. (1980)
photographed the condition of histological components in a small segment of boiled
rice kernel using the electron microscope. However, the rice kernel has inhomogeneous structures (Takeoka et al., 1993), and therefore not only small segments but also
whole kernel sections are required for the evaluation of cooked rice grains. Although
the frozen-sectioning method can be proposed for the collection of whole sections of
a cooked kernel, it cannot produce quality sections because cooked rice kernels have
poor physical properties for sectioning. Ogawa et al. (2003b) thus applied a combination method of tape-aided sectioning on a standard microtome and an autofluorescence
visualization technique by microscope in the ultraviolet (UV) range to observe the
histological properties, such as the location of phenolic cell wall materials, which were
responsible for the autofluorescence produced using UV light (Fulcher, 1982). As a
result, cell distributions, cell formations, and disruptions can be visualized. They also
used scanning electron microscopy (SEM) as a complementary tool to fluorescence
microscopy. Figure 16.7 shows a microscopy image and an autofluorescent image of
the same longitudinal section (20 µm thick) of a cooked rice kernel, which is stuck to
adhesive tape. These images demonstrate that quality sections for the cooked rice kernel can
Figure 16.7 (a) Sample microscopy and (b) autofluorescent images of the same longitudinal section of
cooked rice kernel stuck to a piece of adhesive tape. For better observation of the visualized cell distributions,
simple image processing (inversion of the negative image and contrast enhancement) was applied to the autofluorescent image.
The section thickness is 20 µm; the magnification bar is 1 mm.
Figure 16.8 Magnifications of (a) the fringe and (b) the central area of the autofluorescent image in
Figure 16.7. The magnification bar is 100 µm.
be obtained using the tape-sectioning method. In the autofluorescent image (Figure
16.7b), it can be seen that cell walls are destroyed at the outer area of kernels. Cell
walls around the inner area are not destroyed in the non-void section (Figure 16.8).
As demonstrated by the cell distributions and cell-wall formations at the various areas
in the section, it was posited that cell walls tended to be damaged at the cells around
the border between the rice and water during cooking, because internal areas of the
non-void section had a clear distribution of cells similar to that of milled rice kernels.
When rice is cooked, some compounds (such as carbohydrate and lipid) are dissolved
into the cooking water, which gradually becomes concentrated and turns into a viscous liquid during boiling. This liquid becomes the membrane cover of the surface of
cooked grain kernels in the final cooking stage, and is thus related to the eating quality
Figure 16.9 Images of microscopy for the histological sections of the compressed rice sample and their
autofluorescent images focused in the void area. The compression ratios were 30% (a, d), 50% (b, e),
70% (c, f). These are perpendicular to the longitudinal direction. The magnification bar for the
microscopy images is 1 mm, and for the autofluorescence images is 100 µm.
of rice (Hoshikawa, 1993a). Cell disruptions allow the dissolution of such internal
compounds into the cooking water. Furthermore, cell disruptions are related to the
texture. Because it is considered that the balance between cell disruptions and the dissolution of compounds into cooking water depends on the cooking process, differences
in eating quality, which are caused by changes in cooking condition and recognized
by experience, must be influenced by the histological and structural properties of the
cooked grain kernels.
Ogawa’s visualization technique can reveal the relationship between the histological,
structural, and textural properties of cooked rice. Ogawa et al. (2006) studied the
structural changes occurring in the cooked rice grain, after compression to a specific
percentage, to show the relationship between texture and structure. The images are
shown in Figure 16.9.
The resistance force of a cooked rice grain was measured during compression (Figure
16.10). The resistance force increased with the compression ratio: it was linear
up to 40 percent compression and increased non-linearly when the compression
ratio was above 50 percent. Microscopy samples were sectioned parallel
to and viewed perpendicular to the direction of compression. The sections were collected at approximately the mid-point of each kernel in order to visualize the effects of
compression. As the compression ratio increased, voided areas (empty or water-filled
cavities) in the samples decreased and disappeared with higher compression ratios. At
the initial phase of compression, voided areas offer little resistance to crushing. As
compression increased, the dense and starchy material of the kernel absorbed the pressure and the cells began to be crushed. The differences in structure at various points
of compression explain the linear behavior of the resistance force versus compression ratios up to 40 percent, and the non-linear behavior when the compression ratio
increases beyond this level. In the uncompressed cooked kernel, relatively intact cells
are found in the voided area. Surrounding the void area there are cells with disrupted
cell walls and, therefore, free starch granules. Starch granules in the disrupted cells
Figure 16.10 Averaged resistance force (N) of cooked rice kernels against compression ratio (%).
have already had access to water and become completely gelatinized with cooking,
so the voids are sealed by gelatinized starch. Structural details of the voided areas
show the way that voids and surrounding tissues change during compression. In the
uncompressed kernels, the void is relatively narrow and pointed towards the lateral
sides. Compression of the kernel at 30 percent causes the void to become wider in the
center and perhaps split more towards the periphery of the kernel, whereas individual
cells surrounding the void become only slightly distorted. Compressions of 50 percent
and 70 percent decreased the void volume and somewhat distorted the shapes of cells,
affecting cell integrity. The effects of compression at 50 percent are less than those at
70 percent. Areas without voids are also compared structurally with those at the edge
of the kernel. The cells of a cooked and uncompressed kernel are radially oriented as in
a raw kernel, and the cells appear to be mostly intact. Compression at 70 percent causes
cells to be more rounded in shape due to the effects of crushing, with surprisingly few
areas where the cell walls have been torn. Cell walls are evidently capable of plastic
deformation not only upon cooking but also with compression. The degree to which
the kernel structure changed during the various compression tests, combined with the
linear and then non-linear behavior of the resistance force versus the compression ratio,
indicates that the voided areas and cell walls have an effect on texture.
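The linear and non-linear regimes described above can be separated numerically. The following is a minimal sketch, assuming hypothetical force readings (they are illustrative values, not data from Ogawa et al., 2006): a straight line is fitted to the readings up to 40 percent compression, and the excess of the measured force over that line is then computed for every point.

```python
import numpy as np

# Hypothetical, illustrative readings (NOT data from Ogawa et al., 2006).
compression_pct = np.array([0, 10, 20, 30, 40, 50, 60, 70], dtype=float)
force_n = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.2, 5.5, 9.0])


def fit_linear_region(x, y, limit_pct=40.0):
    """Least-squares line through the points with x <= limit_pct."""
    in_region = x <= limit_pct
    slope, intercept = np.polyfit(x[in_region], y[in_region], deg=1)
    return slope, intercept


def excess_over_line(x, y, slope, intercept):
    """Measured force minus the force predicted by the linear fit."""
    return y - (slope * x + intercept)


slope, intercept = fit_linear_region(compression_pct, force_n)
excess = excess_over_line(compression_pct, force_n, slope, intercept)
```

An excess close to zero up to 40 percent and growing beyond it is the numerical signature of the behavior attributed in the text to the collapse of voids followed by crushing of the dense starchy tissue.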
SEM images were also used to visualize changes of the compound formations contained in the rice grain during cooking or processing. Sandhya and Bhattacharya (1995)
determined the relative rigidity/fragility of starch granules by SEM. They reported that
the low-amylose rice starch showed total granule disintegration after 60 minutes of
cooking at 95°C, but that high-amylose granules demonstrated only marginal disorganization in concentrated (12%) pastes. Waxy starch granules disintegrated even at
70°C. In dilute (1%) pastes, by contrast, granules swelled without appreciable disruption, apparently more so in low-amylose starch than in high-amylose starch.
Their results also indicated that the low-amylose starch granules were weak and fragile,
and thus swelled and disintegrated easily. The high-amylose rice starch was relatively
strong and rigid, so it resisted swelling or disintegration. They therefore concluded that
the relative rigidity/fragility of starch granules is key to differences in rice quality.
5 Conclusions
The use of computer vision technology for raw and cooked rice quality evaluation, and
results from this technique, have been summarized in this chapter, though there may
be more related research such as near-infrared spectroscopic imaging and ultra-weak
photon-emission imaging. Because computer vision technology is still progressing
along with the development of hardware and software, many as yet unidentified characteristics are likely to be revealed. Such computerized techniques for rice quality evaluation should
be followed not only by “real imaging techniques” but also by “virtual visualization
techniques” – for example, hardness distribution with textural mechanics by parametric
References
Atungulu G, Nishiyama Y, Koide S (2003) Evaluation of condensation phenomena on grains
by image analysis. Agricultural Engineering Journal, 12 (1&2), 65–78.
Bechtel DB, Pomeranz Y (1977) Ultrastructure of the mature ungerminated rice (Oryza
sativa) caryopsis. The caryopsis coat and the aleurone cells. American Journal of
Botany, 64 (8), 966–973.
Bechtel DB, Pomeranz Y (1978a) Implications of the rice kernel structure in storage,
marketing, and processing: a review. Journal of Food Science, 43 (5), 1538–1542.
Bechtel DB, Pomeranz Y (1978b) Ultrastructure of the mature ungerminated rice (Oryza
sativa) caryopsis. The germ. American Journal of Botany, 65 (1), 75–85.
Bechtel DB, Pomeranz Y (1978c) Ultrastructure of the mature ungerminated rice (Oryza
sativa) caryopsis. The starchy endosperm. American Journal of Botany, 65 (6), 684–
Bradbury JH, Collins JG, Pyliotis NA (1980) Methods of separation of major histological
components of rice and characterization of their proteins by amino acid analysis.
Cereal Chemistry, 57 (2), 133–137.
Carter RM, Yan Y, Tomlins K (2006) Digital imaging based classification and authentication
of granular food products. Measurement Science and Technology, 17, 235–240.
Dalen GV (2004) Determination of the size distribution and percentage of broken kernels
of rice using flatbed scanning and image analysis. Food Research International, 37,
Dang JMC, Copeland L (2003) Imaging rice grains using atomic force microscopy. Journal
of Cereal Science, 37, 165–170.
Food and Agriculture Organization of the United Nations (FAO) (2006) I. Production. FAO
Rice Market Monitor, 9 (1), 3.
Fulcher RG (1982) Fluorescence microscopy of cereals. Food Microstructure, 1, 167–175.
Furukawa S, Mizuma T, Kiyokawa Y, Masumura T, Tanaka K, Wakai Y (2003) Distribution
of storage proteins in low-glutelin rice seed determined using fluorescent antibody.
Journal of Bioscience and Bioengineering, 96 (5), 467–473.
Goodman DE, Rao RM (1984) A new, rapid, interactive image analysis method for determining physical dimensions of milled rice kernels. Journal of Food Science, 49,
Horigane AK, Toyoshima H, Hemmi H, Engelaar WMHG, Okubo A, Nagata T (1999)
Internal hollows in cooked rice grains (Oryza sativa cv. Koshihikari) observed by
NMR micro imaging. Journal of Food Science, 64, 1–5.
Horigane AK, Engelaar WMHG, Toyoshima H, Ono H, Sasaki M, Okubo A, Nagata T (2000)
Differences in hollow volumes in cooked rice grain with various amylose contents as
determined by NMR micro imaging. Journal of Food Science, 65, 408–412.
Horigane AK, Engelaar WMHG, Maruyama S, Yoshida M, Okubo A, Nagata T (2001)
Visualization of moisture distribution during development of rice caryopses (Oryza
sativa L.) by nuclear magnetic resonance microimaging. Journal of Cereal Science,
33, 105–114.
Hoshikawa K (1993a) Quality and shape of rice grains. In Science of the Rice Plant.
Vol. 1. Morphology (Matsuo T, Hoshikawa K, eds). Tokyo: Food and Agriculture Policy
Research Center, pp. 377–412.
Hoshikawa K (1993b) Rice seed, germination and seedlings. In Science of the Rice Plant.
Vol. 1. Morphology (Matsuo T, Hoshikawa K, eds). Tokyo: Food and Agriculture Policy
Research Center, pp. 91–109.
Ishida N, Naito S, Kano H (2004) Loss of moisture from harvested rice seeds on MRI.
Magnetic Resonance Imaging, 22, 871–875.
Jia CC, Yang W, Siebenmorgen TJ, Bautista RC, Cnossen AG (2002) A study of rice
fissuring by finite-element simulation of internal stress combined with high-speed
microscopy imaging of fissure appearance. Transactions of the ASAE, 45 (3),
Juliano BO (1972) The rice caryopsis and its composition. In Rice, Chemistry and
technology (Houston DF, ed). St Paul: American Association of Cereal Chemists,
pp. 16–74.
Kasai M, Lewis A, Marica F, Ayabe S, Hatae K, Fyfe CA (2005) NMR imaging investigation
of rice cooking. Food Research International, 38, 403–410.
Katsumata T, Suzuki T, Aizawa H, Matashige E, Komuro S, Morikawa T (2005) Nondestructive evaluation of rice using two-dimensional imaging of photoluminescence. Review
of Scientific Instruments, 76 (7), 073702, 1–4.
Kawamoto T, Shimizu M (1986) A method for preparing whole body sections suitable
for autoradiographic, histological and histochemical studies. Stain Technology, 61,
Kennedy BM, Schelstraete M, Del Rosario AR (1974) Chemical, physical, and nutritional
properties of high-protein flours and residual kernel from the overmilling of uncoated
milled rice. I. Milling procedure and protein, fat, ash, amylose, and starch content.
Cereal Chemistry, 51, 435–448.
Lai FS, Zayas I, Pomeranz Y (1986) Application of pattern recognition techniques in the
analysis of cereal grains. Cereal Chemistry, 63 (2), 168–172.
Lan Y, Fang Q, Kocher MF, Hanna MA (2002) Detection of fissures in rice grains using
imaging enhancement. International Journal of Food Properties, 5 (1), 205–215.
Levinthal C, Ware R (1972) Three dimensional reconstruction from serial sections. Nature,
236, 207–210.
Lim MC, Lim KC, Abdullah MZ (2003) Rice moisture imaging using electromagnetic
measurement technique. Transactions of the Institution of Chemical Engineers, Part C,
81 (3), 159–169.
Mohoric A, Vergeldt F, Gerkema E, de Jager A, van Duynhoven J, van Dalen G, Van As H
(2004) Magnetic resonance imaging of single rice kernels during cooking. Journal of
Magnetic Resonance, 171, 157–162.
Ogawa Y, Sugiyama J, Kuensting H, Ohtani T, Hagiwara S, Kokubo K, Kudoh K, Higuchi T
(2000) Development of visualization technique for three-dimensional distribution of
protein and starch in a brown rice grain using sequential stained sections. Food Science
and Technology Research, 6 (3), 176–178.
Ogawa Y, Sugiyama J, Kuensting H, Ohtani T, Hagiwara S, Liu XQ, Kokubo M,
Yamamoto A, Kudoh K, Higuchi T (2001) Advanced technique for three-dimensional
visualization of compound distributions in a rice kernel. Journal of Agricultural and
Food Chemistry, 49 (2), 736–740.
Ogawa Y, Kuensting H, Sugiyama J, Ohtani T, Liu XQ, Kokubo M, Kudoh K, Higuchi T
(2002a) Structure of a rice grain represented by a new three-dimensional visualization
technique. Journal of Cereal Science, 36 (1), 1–7.
Ogawa Y, Kuensting H, Nakao H, Sugiyama J (2002b) Three-dimensional lipid distribution
of a brown rice kernel. Journal of Food Science, 67 (7), 2596–2599.
Ogawa Y, Orts WJ, Glenn GM, Wood DF (2003a) A simple method for studying whole
sections of rice grain. Biotechnic & Histochemistry, 78 (5), 237–242.
Ogawa Y, Glenn GM, Orts WJ, Wood DF (2003b) Histological structures of cooked rice
grain. Journal of Agricultural and Food Chemistry, 51 (24), 7019–7023.
Ogawa Y, Wood DF, Whitehand LC, Orts WJ, Glenn GM (2006) Compression deformation
and structural relationships of medium grain cooked rice. Cereal Chemistry, 83 (6),
Palmgren A (1954) Tape for microsectioning of very large, hard or brittle specimens. Nature,
174, 46.
Ramesh MN (2001) An application of image analysis for the study of kinetics of hydration
of milled rice in hot water. International Journal of Food Properties, 4 (2), 271–284.
Sakai N, Yonekawa S, Matsuzaki A, Morishima H (1996) Two-dimensional image analysis of the shape of rice and its application to separating varieties. Journal of Food
Engineering, 27, 397–407.
Sandhya MR, Bhattacharya KR (1995) Microscopy of rice starch granules during cooking.
Starch/Stärke, 46 (9), 334–337.
Shatadal P, Jayas DS, Bulley NR (1995a) Digital image analysis for software separation and
classification of touching grain: I. disconnect algorithm. Transactions of the ASAE,
38 (2), 635–643.
Shatadal P, Jayas DS, Bulley NR (1995b) Digital image analysis for software separation
and classification of touching grain: II. classification. Transactions of the ASAE, 38
(2), 645–649.
Suzuki K, Aki M, Kubota K, Hosaka H (1997) Studies on the cooking rate equations of
rice. Journal of Food Science, 42 (6), 1545–1548.
Suzuki M, Horigane AK, Toyoshima H, Yan X, Okadome H, Nagata T (1999) Detection of
internal hollows in cooked rice using a light transmittance method. Journal of Food
Science, 64, 1027–1028.
Suzuki M, Kimura T, Yamagishi K, Shinmoto H (2002) Discrimination of cooked mochiminori and koshihikari rice grains by observation of internal hollows using light
transmittance photography. Food Science and Technology Research, 8 (1), 8–9.
Takeoka Y, Shimizu M, Wada T (1993) Morphology and development of reproductive
organs. In Science of the Rice Plant. Vol. 1. Morphology (Matsuo T, Hoshikawa K,
eds). Tokyo: Food and Agriculture Policy Research Center, pp. 339–376.
Takeuchi S, Fukuoka M, Gomi Y, Maeda M, Watanabe H (1997a) An application of magnetic
resonance imaging to the real time measurement of the change of moisture profile in
a rice grain during boiling. Journal of Food Engineering, 33, 181–192.
Takeuchi S, Maeda M, Gomi Y, Fukuoka M, Watanabe H (1997b) The change of moisture
distribution in a rice grain during boiling as observed by NMR imaging. Journal of
Food Engineering, 33, 281–297.
Wan YN (2002) Kernel handling performance of an automatic grain quality inspection
system. Transactions of the ASAE, 45 (2), 369–377.
Wan YN, Lin CM, Chiou JF (2002) Rice quality classification using an automatic grain
quality inspection system. Transactions of the ASAE, 45 (2), 379–387.
Wang YC, Chou JJ (2004) Automatic segmentation of touching rice kernels with an active
contour model. Transactions of the ASAE, 47 (5), 1803–1811.
Watanabe H, Fukuoka M, Tomiya A, Mihori T (2001) A new-Fickian diffusion model for
water migration in starchy food during cooking. Journal of Food Engineering, 49,
Watson CA, Dikeman E (1977) Structure of the rice grain shown by scanning electron
microscopy. Cereal Chemistry, 54 (1), 120–130.
Yadav BK, Jindal VK (2001) Monitoring milling quality of rice by image analysis.
Computers and Electronics in Agriculture, 33, 19–33.
Yasumatsu K, Moritaka S (1964) Fatty acid compositions of rice lipid and their changes
during storage. Agricultural and Biological Chemistry, 28 (5), 257–264.
Yiu SH (1993) Food microscopy and the nutritional quality of cereal foods. Food Structure,
12, 123–133.
Quality Evaluation of Corn/Maize
Stephen J. Symons and Muhammad A. Shahin
Grain Research Laboratory, Winnipeg, Manitoba, Canada, R3C 3G8
1 Introduction
There has been a long-term consistent effort by scientists to move away from subjective evaluation of seed properties and towards objective inspection. Even prior to
computer-based quantification, simple seed parameters were being quantified using
numerical techniques – for example, oat kernels were placed on size grids and measurements were determined by estimating the proportion of squares covered by the kernel
(Baum and Thompson, 1976). The earliest attempts at quantifying quality parameters in cereals using automated or computer-based assessment were in the application
of imaging to assess gliadin electrophoregrams. Lookhart et al. (1983) used a computer application to compare the gliadin banding pattern from an “unknown” wheat
variety with band patterns obtained from known varieties. By doing so, they could
predict the membership of the unknown kernel to a known variety grouping. This was
an indirect approach to implementing computer analysis, for the acquisition of the
electrophoretic patterns was indirect via a photographic reversal negative that was subsequently scanned by a spectrodensitometer. Sapirstein and Bushuk (1985a) adapted the
imaging technique for electrophoretic analysis to use a digitizer to acquire information
from electrophoregrams for analysis. This technique, along with their modified band
relative mobility algorithms, improved the analysis of electrophoregrams for wheat
cultivar identification (Sapirstein and Bushuk, 1985b).
The application of machine vision systems for the measurement of seed characteristics started with relatively simple measurements. The concept of simple counting was
applied in the earliest imaging systems, some of which required the user manually
to trace the outline of their object of interest, while others incorporated the ability to
segment the object from the background (e.g. the IBAS system, Kontron Electronics, Eching, Germany). The application of automated imaging techniques to assess
attributes of the cereal grains has been reported for almost 30 years. Simple measurements of seed area, length, and either width or height were used to feed discriminant
Computer Vision Technology for Food Quality Evaluation
ISBN: 978-0-12-373642-0
Copyright © 2008, Elsevier Inc.
All rights reserved
402 Quality Evaluation of Corn/Maize
models to identify different UK wheat varieties (Travis and Draper, 1985; Keefe and
Draper, 1986), and different soft red US wheat varieties (Zayas et al., 1985).
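The simple measurements mentioned above (area, length, and width) can be illustrated with a short sketch. It assumes a binary mask in which kernel pixels are `True`, a kernel whose long axis is aligned with the image axes, and a hypothetical calibration factor `mm_per_pixel`; none of these details come from the cited studies.

```python
import numpy as np


def seed_measurements(mask, mm_per_pixel=1.0):
    """Area, length, width and aspect ratio of one axis-aligned kernel.

    `mask` is a 2-D boolean array (True = kernel pixel).
    `mm_per_pixel` is a hypothetical calibration factor.
    """
    mask = np.asarray(mask, dtype=bool)
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    if rows.size == 0:
        raise ValueError("mask contains no kernel pixels")
    # Bounding-box extents in pixels (inclusive of both end pixels).
    extent_rows = rows[-1] - rows[0] + 1
    extent_cols = cols[-1] - cols[0] + 1
    length = max(extent_rows, extent_cols) * mm_per_pixel
    width = min(extent_rows, extent_cols) * mm_per_pixel
    area = mask.sum() * mm_per_pixel ** 2
    return {
        "area": float(area),
        "length": float(length),
        "width": float(width),
        "aspect_ratio": float(length / width),  # a derived parameter
    }
```

Directly measured values such as area and length can then be combined into derived parameters (here, the aspect ratio) before being passed to a discriminant model.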
1.1 Whole seed analysis for type
Using a Quantimet 720 image analyzer, Zayas et al. (1985) were able to directly digitize wheat seeds using a vidicon-type tube camera. Following daily corrections for
background effects and shading, seed morphology parameters were calculated. Using
a set of directly determined parameters, nine derived parameters were used to separate kernels of two wheat types, namely Arthur and Arkan, using a canonical analysis
technique. There was a high degree of identification of each wheat type. A Quantimet
10 image analysis system was used by Keefe and Draper (1986) to measure several
parameters from the side perspective of wheat kernels. Again, a camera, connected
to the computer using an analog-to-digital converter, was used to directly capture
images of the seeds (Draper and Travis, 1984; Travis and Draper, 1985). A total of 16
parameters were used: 10 were directly measured, while the other 6 were derived. An
analytical approach described by Almeida and Bisby (1984) was used for comparing
sets of variables. The method presented by Keefe and Draper (1986) was able to identify whether seed samples were derived from the same seed lot, or were of the same
Analysis of cereal grains required the exact manual positioning and location of the
kernel, since each measurement was orientation specific (Zayas et al., 1985; Keefe and
Draper, 1986; Symons and Fulcher, 1987). These systems also separated image capture,
image processing, and data analysis into stand-alone steps. In the next evolution of
the technology, image capture, image processing, and image and data storage were
integrated by linking two computer systems (Zayas et al., 1986). Keefe and Draper
(1988) reported a system for the automated location of the sample under the camera,
allowing multiple views. In their instrument, the camera moved in both X and Y
dimensions along a gantry that facilitated scanning a large number of samples.
While the initial application of imaging within cereals focused on the separation of
different sub-types or cultivars within a cereal type, such as wheat, corn or barley, this
was eventually expanded to the determination of membership of seeds from several
different seed types. Lai et al. (1986) used discriminant analysis techniques to classify
and predict the membership of seven grain types, namely corn, soybean, grain sorghum,
white rice, brown rice, wheat, and barley. Figure 17.1 illustrates several of the grain
types that are commonly used in imaging studies. Corn, having unique shape and color
(translucency) characteristics in addition to being the largest seed type in the study, gave
100 percent accurate identification. Data models were also created that characterized
each of the seed types. However, caution must be exercised when dealing with the
characterization of biological products such as seeds. While seed characteristics are
highly influenced by the genetics of the parent, their appearance is also subject to
substantial influence from the growing environment. In a study of kernels of wheat
grown in eastern Canada, variations in kernel characteristics for a single variety grown
at multiple locations were as large as the differences found between different varieties
grown in one location (Symons, unpublished). This implies that large sample sets
Figure 17.1 Examples of various commodities imaged: (a) corn; (b) soybean; (c) sorghum; (d) red spring
wheat; (e) malting barley. © Canadian Grain Commission.
exhibiting the total variability anticipated for a seed type or variety are required to
create robust data models.
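As an illustration of classification by discriminant analysis, the sketch below implements a textbook linear discriminant with equal priors and a pooled covariance matrix. It is not the procedure of Lai et al. (1986); the two features (kernel area and length) and the class means are hypothetical values chosen only to generate synthetic data.

```python
import numpy as np


def fit_lda(features, labels):
    """Class means and pooled within-class precision matrix."""
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    classes = np.unique(labels)
    means = np.array([features[labels == c].mean(axis=0) for c in classes])
    # Subtract each sample's own class mean, then pool the scatter.
    centered = features - means[np.searchsorted(classes, labels)]
    pooled = centered.T @ centered / (len(features) - len(classes))
    return classes, means, np.linalg.pinv(pooled)


def predict_lda(model, features):
    """Assign each sample to the class with the highest linear score."""
    classes, means, precision = model
    features = np.atleast_2d(np.asarray(features, dtype=float))
    weights = means @ precision
    offsets = -0.5 * np.einsum("kd,kd->k", weights, means)
    scores = features @ weights.T + offsets
    return classes[np.argmax(scores, axis=1)]


# Synthetic data: (area in mm^2, length in mm); all values are hypothetical.
rng = np.random.default_rng(0)
class_means = {"corn": (60.0, 10.0), "wheat": (18.0, 6.0), "barley": (25.0, 9.0)}
features = np.vstack(
    [rng.normal(m, (3.0, 0.5), size=(30, 2)) for m in class_means.values()]
)
labels = np.repeat(list(class_means), 30)

model = fit_lda(features, labels)
accuracy = float(np.mean(predict_lda(model, features) == labels))
```

Because the synthetic classes are drawn with a fixed spread, the sketch does not reproduce the environmental variability discussed above; a model trained on kernels from a single growing location would be expected to perform worse on samples from other locations.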
1.2 Internal seed characteristics
The functional properties of cereal grains as they relate to processing or milling are
dependent upon more than just the external morphometry of the seed. Indeed, functional
properties are significantly influenced by the internal relationship of tissues during seed
formation and maturation. The internal characteristics of seeds of European
wheats were found to be related to the milling properties (Evers and Withey, 1989),
although the seed preparation methodology was found to be less than optimal and
would be difficult to adapt to a routine fast analysis. The internal tissue relationships
for Canadian wheats were related to variety (Symons and Fulcher, 1988a), suggesting
that measurement of these properties would be beneficial to the overall determination
of seed quality and perhaps variety. Certainly, the inclusion of these properties into a
model to classify varieties had a significant benefit.
1.3 Relating seed morphometry to quality
The rationale for the determination of cereal grain variety is that different varieties
exhibit differing quality characteristics. For oats, a distinct relationship between oat
milling yield and kernel weight was reported (Symons and Fulcher, 1988b). A strong
relationship between oat kernel area and kernel mass was established (Symons and
Fulcher, 1988c), allowing the potential application of imaging measurements to predict
oat milling yield. This solution, however, is not simple, since there is a wide diversity
in oat kernel weight within a sample due to kernel location in the oat plant head.
Furthermore, the growing location also has a significant effect on kernel characteristics
(Symons and Fulcher, 1988c). A model showing the successful prediction of oat milling
quality has yet to appear in the literature.
One of the perceptually easier problems to tackle in cereals is the determination of
kernel vitreousness. Sapirstein and Bushuk (1989) showed that durum wheat has a high
transmittance profile compared to hard red spring wheat, rye, barley and oats. They also
demonstrated that kernels of durum wheat with differing degrees of vitreousness had
substantially differing light transmittance profiles, and thus there was a potential for
using imaging methodology to predict durum wheat hard vitreous kernel (HVK) scores.
The proportion of HVK in a sample is an internationally recognized specification
determining the value of durum wheat, and is a quality factor that is currently visually
assessed, so it remains a prime candidate for machine evaluation. Seed translucency
of durum wheat kernels (Figure 17.2) was able to predict HVK scores in a set of
five prepared samples in the HVK range of 20–100 percent, and was accurate for
the prediction of commercial cargo shipments of Canadian durum wheat (CWAD) to
within 5 percent (Symons et al., 2003). When measurements from reflective images
were combined with those from light transmittance images, a high degree of consistency
was found (Xie et al., 2004). In this work, the machine vision system gave a standard
deviation of 1 percent compared to trained inspectors with 3.5 percent. Xie et al.
(2004) also confirmed the observations of Symons et al. (2003) that mottled (piebald)
kernels and bleached (weathered) kernels were difficult to classify accurately, since
these characteristics are difficult to image and therefore to model numerically.
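The idea of scoring HVK from translucency can be sketched as a simple threshold rule. The example assumes that each kernel has already been reduced to one mean transmitted-light intensity and that a fixed threshold separates vitreous from starchy kernels; both the 0–255 intensity scale and the threshold of 120 are hypothetical, and the published systems use more elaborate models.

```python
import numpy as np


def hvk_percent(kernel_transmittance, threshold):
    """Percentage of kernels whose transmittance reaches `threshold`."""
    values = np.asarray(kernel_transmittance, dtype=float)
    if values.size == 0:
        raise ValueError("no kernels supplied")
    vitreous = values >= threshold  # translucent kernels count as vitreous
    return 100.0 * vitreous.mean()


# Hypothetical mean transmitted-light intensities (0-255) for ten kernels.
sample = [200, 180, 90, 150, 60, 210, 130, 40, 170, 110]
score = hvk_percent(sample, threshold=120)
```

Mottled or bleached kernels, which the text identifies as difficult to classify, are exactly the cases in which a single intensity value per kernel would be expected to fail.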
1.4 Assessing seed quality indirectly
Grain hardness is an important quality characteristic, as it directly affects the milling
characteristics and the level of starch damage that may occur during milling. Hardness is not easily determined from whole grain morphometric analysis, although, as
discussed in section 1.3, it is related to grain vitreousness, and HVK determination
may predict some milling properties of the grain. Zayas et al. (1994) approached the
Figure 17.2 (a) Vitreous Canadian durum wheat; (b) non-vitreous Canadian durum wheat; (c) visual
appearance of the two quality types – transmitted light or backlight images of the two quality types showing
translucent vitreous kernels and opaque starchy kernels. © Canadian Grain Commission.
problem of determining grain hardness by characterizing the isolated starch granules
from samples of both US hard red winter (HRW) and soft red winter (SRW) wheats.
Using starch granule size and aspect ratio measured by imaging, full segregation of
the soft and hard wheats was demonstrated. This separation was confirmed using standard near-infrared analysis to determine sample hardness. There has been no clear
demonstration that the milling properties of common wheats can be determined from
whole grain morphometry, yet the characteristics of flour streams in the mill and flour
refinement (Symons and Dexter, 1991), which are related to both value and functional
properties, can be determined by imaging. The outer seed-coat layers have characteristic fluorescent properties for both the pericarp (Symons and Dexter, 1992, 1996) and
the aleurone (Symons and Dexter, 1993), which relate to traditional flour quality determinants (Kent-Jones color, L* or ash content) used for describing flour refinement.
Again, while there is no report of an imaging system that characterizes the milling
properties of durum wheat as related to semolina quality, imaging methods have been
developed that can characterize the semolina (Symons et al., 1996), particularly the
speckiness, which directly relates to milling quality and market value.
1.5 Adding color into the analysis
Color is an important visual property of agricultural products, since it is correlated in many cases
to other physical, chemical, and sensorial indicators of product quality (Mendoza et al.,
2006). The analysis of images obtained using a color camera is equivalent to the analysis
of three monochrome images obtained through wide-band red (R), green (G), and blue
(B) filters. A difference in reflectance in all three color planes was found between types
of western Canadian wheats (Neuman et al., 1989a). These color differences were used
to identify and classify the wheat classes that differed in color (Neuman et al., 1989b).
The Canadian grain grading system uses differences in wheat kernel size, shape, and
color to distinguish the wheat class (Figure 17.3), which in turn relates to functional
processing characteristics. For seeds in which color forms the primary determinant of
quality, color measurements from images have a significant role in predicting quality.
Five varieties of western Canadian lentils were separated into color classes using color imaging measurement. When combined with size determination, all five varieties were
determined with an accuracy of 99 percent (Shahin and Symons, 2005). Classification
was performed using a neural network model. Similarly, using a back-propagation type
neural network, five rice cultivars were classified and identified with a high degree
of accuracy when color features were incorporated into the classification model (Liu
et al., 2005).
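The equivalence between a color image and three monochrome images can be made concrete with a short sketch. It assumes an image stored as a height × width × 3 array in R, G, B order and an optional boolean mask marking kernel pixels; the function name and the mask convention are illustrative choices, not part of any cited system.

```python
import numpy as np


def channel_means(rgb_image, mask=None):
    """Mean intensity of the R, G and B planes over the masked pixels."""
    rgb = np.asarray(rgb_image, dtype=float)
    if rgb.ndim != 3 or rgb.shape[2] != 3:
        raise ValueError("expected an array of shape (height, width, 3)")
    if mask is None:
        mask = np.ones(rgb.shape[:2], dtype=bool)
    mask = np.asarray(mask, dtype=bool)
    # Each plane is treated as an independent monochrome image.
    red, green, blue = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    return {
        "R": float(red[mask].mean()),
        "G": float(green[mask].mean()),
        "B": float(blue[mask].mean()),
    }
```

Per-plane means of this kind are among the simplest color features that could be supplied, together with size measurements, to a classifier such as a neural network.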
1.6 The analysis is only as good as the sample
When there was a high degree of sample variability, such as in durum wheat rail-car
samples, there was less agreement between trained inspectors and a machine vision
system (Symons et al., 2003). This result has been experienced multiple times by
the current authors in their research program (unpublished). Seed samples containing
clearly defined groups remain relatively easily classified using measurements derived
from images. However, commercial samples, which may be somewhat heterogeneous
in composition, pose challenges in obtaining a high degree of classification accuracy.
The use of computer vision can give a measurement error of less than 0.1 mm for
samples of no less than 300 kernels. This could be used to characterize high-quality
commercial grain shipments (Sapirstein and Kohler, 1999). However, it was found
that as the overall sample quality declined, the sample size required to maintain this
accuracy of analysis increased, reflecting a lower uniformity in kernel morphometry.
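The link between sample uniformity and the sample size needed can be illustrated with the standard error of a mean. The sketch below assumes that the "measurement error" is interpreted as the half-width of a confidence interval for the mean kernel dimension, which is an assumption made here and not a statement of the method of Sapirstein and Kohler (1999); the standard deviations used in the example are hypothetical.

```python
import math


def standard_error(sample_sd_mm, n_kernels):
    """Standard error of the mean kernel dimension, in mm."""
    return sample_sd_mm / math.sqrt(n_kernels)


def kernels_required(sample_sd_mm, max_error_mm=0.1, z=1.96):
    """Smallest n for which z * sd / sqrt(n) <= max_error_mm."""
    return math.ceil((z * sample_sd_mm / max_error_mm) ** 2)


# Hypothetical standard deviations for a uniform and a less uniform sample.
n_uniform = kernels_required(0.8)
n_variable = kernels_required(1.2)
```

Under this interpretation, the required number of kernels grows with the square of the standard deviation, which is consistent with the observation that lower-quality, less uniform samples need more kernels to maintain the same accuracy.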
1.7 Integration and automation of analysis
Early imaging instruments usually separated the image capture procedure from image
data analysis. More complex instruments (Symons and Dexter, 1991; Zayas et al., 1994)
Figure 17.3 Canadian wheat classes differing in kernel size, shape, and color (panel labels: Prairie Spring (Red); Red Spring; Prairie Spring (White); Extra Strong; Red Spring; Soft White Spring; Red Winter). © Canadian Grain Commission.
integrated both the image capture and image processing into a single instrumental process, although data analysis remained a distinctly separate process. This independence
of each step in the imaging process is acceptable in a research environment; however,
it is not acceptable if machine vision is to be deployed as an applied technology in the
industry for grain quality analysis. Research systems typically used hand-positioned
grain kernels to ensure consistency of imaging measurements. However, automation
dictates that this would not be possible and that alternative mechanisms need to be
sought. To this end, an investigation of disconnect algorithms for the separation of touching wheat kernels was undertaken (Shatadal et al., 1995a). This work demonstrated
that in samples presented as a monolayer to an imaging system, better than 80 percent of touching instances could be separated. The effect of the disconnect algorithm
applied to touching kernels was minimal for measured kernel features, although the
area of oats and the radius ratio of barley were adversely affected. Wheat features
remained unaffected by this technique (Shatadal et al., 1995b). Touching seeds of
Canadian pulse crops were singulated in multi-layer samples. The size distributions
of seeds as determined by image analysis matched closely with the size distributions
determined by sieving, the standard industrial sizing method (Shahin and Symons,
2005). These studies indicate that there is potential for automation of seed delivery
to imaging systems, since the resolution of touching seeds is possible using imaging
techniques. The integration of all steps, from image analysis to data analysis and information delivery, has only recently been reported for grain analysis. Shahin and Symons
(2001) report an integrated analysis system for Canadian lentils, while DuPont Acurum
(www.acurum.com) report an integrated instrumental system for cereal grains analysis.
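One generic way to detect and break the connection between two touching kernels is repeated binary erosion. The sketch below is an illustration of that general idea only, and is not the disconnect algorithm of Shatadal et al. (1995a); it uses a 4-neighbour erosion and a flood fill written with NumPy, and all function names are hypothetical.

```python
from collections import deque

import numpy as np


def erode(mask):
    """4-neighbour erosion: keep a pixel only if it and its 4 neighbours are set."""
    padded = np.pad(mask, 1, constant_values=False)
    return (
        padded[1:-1, 1:-1]
        & padded[:-2, 1:-1] & padded[2:, 1:-1]
        & padded[1:-1, :-2] & padded[1:-1, 2:]
    )


def count_components(mask):
    """Count 4-connected foreground regions with a breadth-first flood fill."""
    rows, cols = mask.shape
    seen = np.zeros_like(mask, dtype=bool)
    count = 0
    for r in range(rows):
        for c in range(cols):
            if not mask[r, c] or seen[r, c]:
                continue
            count += 1
            seen[r, c] = True
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < rows and 0 <= nx < cols \
                            and mask[ny, nx] and not seen[ny, nx]:
                        seen[ny, nx] = True
                        queue.append((ny, nx))
    return count


def erosions_until_split(mask, max_iterations=10):
    """Number of erosions after which the blob splits, or None if it never does."""
    current = np.asarray(mask, dtype=bool)
    for n in range(max_iterations + 1):
        parts = count_components(current)
        if parts >= 2:
            return n
        if parts == 0:  # the blob vanished without splitting
            return None
        current = erode(current)
    return None
```

A blob that splits after a few erosions is a candidate pair of touching kernels, whereas a single kernel shrinks and vanishes without splitting. Erosion also shrinks the kernels themselves, so a practical method would need to restore their outlines before measurement.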
2 Corn
The USA is a major – possibly the largest – corn grower and international exporter. Corn
(Zea mays L.) is grown as a food, feed, and industrial feedstock. The commercial value
of corn is based on the seed quality, which in turn determines the end use of the product.
End use of corn varies widely. Approximately 80 percent is consumed as animal feed for
meat, poultry, and milk production, while the remaining 20 percent is used in a variety
of industrial processes for production of oil, starches, high-fructose corn sweetener,
ethanol, cereals, and other food products (Hurburgh, 1989). On average, corn kernels
consist of 71 percent starch, 9 percent protein, and 4 percent oil on a dry weight basis;
however, genetic background and environmental conditions cause significant variations
in constituent contents (Hurburgh, 1989).
2.1 Use of corn
Historically, a major portion of US corn has been grown for animal feed. However, the
fastest growing use of corn today is for food and industrial use. Most of the growth
in industrial use of corn has been in the area of wet milling, which currently accounts
for 75 percent of processed corn (Eckhoff, 1992). Corn wet-milling is an industrial
process that separates the corn kernel into its starch, protein, germ, and fiber fractions.
Growth in the wet-milling industry was especially rapid during the 1970s because of
breakthroughs in the production and subsequent use of high-fructose corn syrup (Leath
and Hill, 1987). Dry milling is the process that separates corn into endosperm, germ,
and fiber fractions. Dry milling has seen some limited growth, primarily because of
increased consumption of breakfast foods and other dry-milling products (Leath and
Hill, 1987).
2.2 Corn grading
For grading purposes, corn is classed as yellow, white, or mixed. Samples of yellow and
white corn containing less than 95 percent of one class are designated Mixed. According
to Canadian standards, primary grade determinants are minimum test weight, degree
of soundness (size uniformity), damaged kernels (heated, mold contamination, etc.),
cracked corn, and foreign material (OGGG). Corn is graded without reference to variety. The class forms part of the grade name – e.g. Corn, Sample CW Yellow Account
Corn quality factors are important for both wet and dry milling. For wet milling, it
is important to ensure that the corn kernels have not been dried at temperatures high
enough to cause protein denaturation or starch gelatinization. Stress crack percentages
are used as an indirect test for these conditions (Rausch et al., 1997). Stress cracks
below 20 percent will enable a high starch recovery from wet milled corn. The primary
factor needed by dry millers is a hard endosperm, which is used to produce large,
flaking grits (Paulsen et al., 1996). Many overseas dry millers prefer true densities in
the range of 1.25–1.28 g/cm3 . High-density kernels are usually more difficult to steep
adequately, resulting in lower starch recovery. The secondary factor needed by dry
millers is low stress cracks, preferably below 20 percent.
Artificial drying of corn can cause two types of damage. Rapid drying causes brittleness. This is the most prevalent damage, and is manifested in the form of stress
cracks leading to breakage. Stress cracks directly affect the ability of millers to salvage
intact endosperms, and generally reduce the number of large, premium grits produced
in dry milling. Stress cracks also contribute to the breakage in corn during its handling.
Scorching and discoloration of corn characterize damage caused by overheating. This
indirectly contributes to the brittleness of the dried grain. Heat damage caused by excessive drying temperatures not only results in physical damage to the kernel that affects
milling properties, but also causes undesirable chemical changes that make starch and
gluten separation difficult in wet milling.
Corn is usually harvested at moisture contents of between 18 and 25 percent. However, periodic early frosts or wet fall weather coupled with a producer’s desire for
timely harvest may necessitate harvest at a higher moisture level followed by high-temperature drying. According to a US Grain Council producer survey (Anonymous,
2001), more than 50 percent of on-farm corn-drying takes place at temperatures well
above the starch gelatinization temperature (>70 °C). Artificial drying at high temperatures is known to induce stress cracks and reduce the germ quality, starch recovery,
starch quality, flaking grit yield, and storage life of corn. This can result in poor characteristics for wet milling, dry milling, handling, and storage (Freeman, 1973; Brooker
et al., 1974). Excessive stress cracking increases the amount of fines and broken corn
during handling, which in turn increases susceptibility to mold and insect damage during storage. In the dry-milling industry, high-temperature drying reduces grit yields
because of increased stress cracks, and reduces germ recovery and grit quality due
to poorer germ–endosperm separation (Paulsen and Hill, 1985). In the wet-milling
industry, high-temperature drying makes corn difficult to steep by altering the characteristics of the protein matrix in the endosperm and increasing the time for adequate
steeping. Inadequate steeping results in poor starch–gluten separation, reducing starch
yield and quality while increasing the yield and decreasing the purity of lower-valued
protein products (Watson and Hirata, 1962).
3 Machine vision determination of corn quality
Quality parameters for corn kernels have been determined using machine vision in
both the densitometric and spatial domains. Machine vision sensing has been used to
develop methods for detecting and quantifying physical properties of corn in order
to develop the basis for on-line grain quality evaluation as a tool for assessing grain
quality and end-use of the grain. Machine vision systems have been developed for
assessing color, size, shape, breakage, stress crack, hardness, fungal contamination,
and seed viability in corn, as described below.
3.1 Color
The color of foods is an important quality factor that greatly influences consumer acceptance (Mendoza et al., 2006). Processors want clean, brightly colored corn for food
products. For grading purposes, trained inspectors visually observe the color of kernels
to determine the class of corn (white or yellow). However, the color of corn kernels can
vary considerably from white to yellow, orange, red, purple, and brown (Watson, 1987;
Figure 17.4). White food corn hybrids vary from a pale, dull white to a gray off-white
appearance, while yellow hybrids can range from a light yellow to a dark reddish yellow
color. Bright, clean yellow and white kernels are desired for food corn. Most of the
methods used in the field are subjective measurements. Objective color measurement
methods are important for processors and breeders developing new corn varieties.
In laboratories, corn color is typically measured using a colorimeter or spectrometer
that records the color values in the CIE Lab color space and its derivatives (Cuevas-Rodrigues et al., 2004). Floyd et al. (1995) used a colorimeter to measure L, a, b
color components in white and yellow corn samples. They observed low correlations
(r ≤ 0.53) between the instrumental measurements and the color grade by a trained
inspector. Differences in endosperm hardness, kernel thickness, and pericarp gloss
between cultivars with the same visual color ratings contributed to the low correlation
values. Liao et al. (1992a) used machine vision to discriminate corn kernels based on
RGB color values. They reported that the values of the red (R) component of the corn
kernel images were higher than the values of the green (G) component. Later studies
found that the kernel color could be quickly determined after deriving HSI (Hue, Saturation, Intensity) from the RGB input image (Liao et al., 1994). The largest difference
between white and yellow maize varieties was found in the intensity component of the
image, while the blue component of the RGB image provided the greatest separation
between white and yellow corn kernels. In each case the standard deviation was low,
allowing for clear separation of the corn types (Liao et al., 1994).
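The derivation of HSI values from an RGB pixel can be sketched as follows. This is a minimal illustration using the common textbook RGB-to-HSI formulas, not necessarily the exact transformation used by Liao et al. (1994); the function name rgb_to_hsi and the convention of reporting a hue of 0 for gray pixels are assumptions made here.

```python
import math


def rgb_to_hsi(r, g, b):
    """Convert RGB components in [0, 1] to (hue in degrees, saturation, intensity)."""
    intensity = (r + g + b) / 3.0
    if intensity == 0:
        return 0.0, 0.0, 0.0  # pure black: hue and saturation are undefined
    saturation = 1.0 - min(r, g, b) / intensity
    numerator = 0.5 * ((r - g) + (r - b))
    # The expression under the root is non-negative in exact arithmetic;
    # max() guards against tiny negative values from rounding.
    denominator = math.sqrt(max(0.0, (r - g) ** 2 + (r - b) * (g - b)))
    if denominator == 0:
        return 0.0, saturation, intensity  # gray pixel: hue undefined, report 0
    ratio = max(-1.0, min(1.0, numerator / denominator))
    hue = math.degrees(math.acos(ratio))
    if b > g:
        hue = 360.0 - hue  # hues in the lower half of the color circle
    return hue, saturation, intensity
```

With these formulas a saturated yellow pixel has a hue of about 60 degrees and a saturation near 1, whereas a near-white pixel has a saturation near 0, which is one way the white and yellow classes could be separated.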
The effectiveness of color-image analysis and classification techniques depends on
the constancy of the scene illumination, whereas scene illumination often changes
over time. Ng et al. (1998a) presented a calibration method to improve color-image
classification for changing illumination. They used a gray reference plate to capture
color changes due to changes in illumination. The extent of the color changes in each
of the RGB channels was calculated based on an equation derived from the spectral
reflectance model. These values formed a transformation matrix to transform the image
RGB values to compensate for the color changes. The color-corrected RGB values were
shown to be within four gray levels of the laboratory measurements for a 1-V change in
the lamp voltage. Liu and Paulsen (2000) successfully used machine vision to quantify
whiteness in corn samples with large color variations.
Figure 17.4 (a) Multicolored varieties of corn. © This digital image was created by Sam Fentress, 25
September 2005. This image is dual-licensed under the GNU Free Documentation License, Version 1.2 or
later, and the Creative Commons Attribution Share-Alike license version 2.0. Attribution is required.
Please direct any questions to User talk:Asbestos. (Sam Fentress). (b) Exotic varieties of corn with
different kernel color. This public domain image is from Wikipedia, the free encyclopedia (http://en.
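The idea of correcting image colors from a gray reference plate can be sketched as follows. This is a simplified illustration that assumes a diagonal transformation (one gain per RGB channel); the equation that Ng et al. (1998a) derived from the spectral reflectance model is not reproduced here, and the function names and the 8-bit value range are assumptions.

```python
def channel_gains(reference_baseline, reference_current):
    """Per-channel gains that map the current gray-plate reading back to baseline."""
    if any(value == 0 for value in reference_current):
        raise ValueError("current gray-plate reading must be non-zero in every channel")
    return tuple(base / cur for base, cur in zip(reference_baseline, reference_current))


def correct_pixel(pixel, gains, max_value=255):
    """Apply the gains to one RGB pixel, rounding and clipping to the valid range."""
    return tuple(
        min(max_value, max(0, round(value * gain)))
        for value, gain in zip(pixel, gains)
    )
```

In use, the gray plate would be imaged once under the calibration lighting and again alongside each sample, and every kernel pixel would then be passed through correct_pixel before classification.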
3.2 Size and shape
Seed corn is marketed by kernel size, making kernel size distribution a very important
characteristic for the seed corn industry. The ability of a mechanical planter to meter
seeds at a consistent spacing improves with uniformly sized seed, which in turn affects
the yield. An ear of corn contains a large number of kernels, each with a slightly varying
physical size and shape reflecting its position on the ear – seeds on the tip of an ear
tend to be small and round, seeds in the middle of an ear tend to be flat, and seeds on
the bottom of an ear tend to be large and triangular. Machine vision systems have been
developed to determine kernel size and shape characteristics. Liao et al. (1992b) identified corn kernel profiles from morphological features, including curvatures along
the kernel perimeter, symmetry ratios along the major axis, aspect ratios, roundness
ratios, and pixel area. Ni et al. (1997a) developed a machine vision system to identify
different types of crown-end shapes of corn kernels. Corn kernels were classified as
convex or dent, based on their crown-end shape (Figure 17.5). Dent corn kernels were
further classified into smooth dent or non-smooth dent kernels. This system provided
an average accuracy of approximately 87 percent compared with human inspection.
Winter et al. (1997) measured morphological features of popping corn kernels using
image analysis. This information along with pixel value statistics was used to predict
the “popability” of popcorn using a neural network. Ni et al. (1998) used a mirror to
capture both top and side views of corn kernels, to determine kernel length, width, and
projected area for size classification. This size-grading vision system performed with
a high degree of accuracy (74–90 percent) when compared with mechanical sieving.
Steenhoek and Precetti (2000) developed an image-analysis system for the classification of corn kernels according to size categories. Kernels were classified into 16 size
categories based on the degree of roundness and flatness determined by morphological
features measured from seed images. Neural network classification accuracies for
round and flat kernels were 96 percent and 80 percent, respectively. These reports
demonstrate that there is considerable potential for a machine vision system for corn
Figure 17.5 Gray-scale and binary images showing different shapes of corn kernel crown.
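Several of the morphological features mentioned in this section (pixel area, aspect ratio, roundness) can be computed directly from a binary kernel image. The sketch below is illustrative only: the function name shape_features is hypothetical, the aspect ratio is taken from the axis-aligned bounding box rather than the true major axis, and the perimeter is counted as exposed pixel edges, which overestimates the length of slanted or curved boundaries.

```python
import math


def shape_features(mask):
    """Return area, bounding-box aspect ratio, and roundness of a binary mask.

    mask is a list of rows of 0/1 values containing one kernel silhouette.
    """
    rows, cols = len(mask), len(mask[0])
    area = 0
    perimeter = 0
    ys, xs = [], []
    for y in range(rows):
        for x in range(cols):
            if not mask[y][x]:
                continue
            area += 1
            ys.append(y)
            xs.append(x)
            # Every side of a kernel pixel that faces background (or the image
            # border) contributes one unit to the perimeter.
            for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                ny, nx = y + dy, x + dx
                outside = not (0 <= ny < rows and 0 <= nx < cols)
                if outside or not mask[ny][nx]:
                    perimeter += 1
    if area == 0:
        raise ValueError("mask contains no kernel pixels")
    height = max(ys) - min(ys) + 1
    width = max(xs) - min(xs) + 1
    return {
        "area": area,
        "aspect_ratio": max(height, width) / min(height, width),
        "roundness": 4.0 * math.pi * area / perimeter ** 2,
    }
```

Feature vectors of this kind are what a classifier, such as the neural networks cited above, would take as input.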
3.3 Breakage
Current practices of harvesting high-moisture corn introduce substantial mechanical
damage to kernels, which is further aggravated by subsequent handling and transportation operations. It is estimated that on-farm mechanical damage to corn kernels ranges
from 20 to 80 percent (Pierce et al., 1991). Such damage includes kernels that have
hairline cracks, as well as those that are broken, chipped, or crushed. Damaged corn is
more difficult to aerate, and has a shorter storage life than undamaged corn. Mechanical damage is frequently measured in laboratories through visual inspection, which
is subjective, tedious, and time-consuming. Large-scale measurement of corn damage
for the grain trade is not practical unless the process is fully automated.
Machine vision systems have been developed for measuring corn kernel breakage,
with promising results (Ding et al., 1990; Zayas et al., 1990). Liao et al. (1993)
developed a machine vision system to measure corn kernel breakage based on the
kernel shape profile. Diffused reflected light illuminated the single kernels for image
capture. A neural network classifier achieved high classification accuracy; 99 percent
for whole flat kernels, 96 percent for broken flat kernels, 91 percent for whole round
kernels, and 95 percent for broken round kernels. Liao et al. (1994) further improved
this system by including a Fourier profile of the kernel. The improved system had an
accuracy of 95 percent in identifying whole kernels as being whole, and 96 percent
accuracy for identifying broken kernels as being broken. Parameters such as projected
area, width, and height of the kernel were determined, in addition to Fourier coefficients
using an FFT (Fast Fourier Transform).
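A Fourier profile of a kernel outline can be illustrated with a radial signature, that is, the distance from the kernel centroid to the boundary sampled at equally spaced angles. The sketch below is a generic shape-descriptor example and not the specific formulation of Liao et al. (1994); it uses a direct discrete Fourier transform for clarity rather than an FFT, and the name fourier_profile and the normalization by the mean radius are assumptions.

```python
import cmath
import math


def fourier_profile(radii, n_harmonics=8):
    """Return harmonic magnitudes 1..n_harmonics of a radial boundary signature.

    radii holds centroid-to-boundary distances at equally spaced angles.
    Magnitudes are divided by the zero-frequency term, so the profile does not
    depend on kernel size.
    """
    n = len(radii)
    magnitudes = []
    for k in range(n_harmonics + 1):
        total = sum(
            r * cmath.exp(-2j * math.pi * k * i / n) for i, r in enumerate(radii)
        )
        magnitudes.append(abs(total))
    dc = magnitudes[0]
    return [m / dc for m in magnitudes[1:]]
```

A perfectly circular outline gives a profile of zeros, while a chipped or broken kernel adds energy to the higher harmonics, which is the kind of information a classifier could exploit.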
Ni et al. (1997b) designed and built a prototype machine vision system for automatically inspecting corn kernels. They used a strobe light to eliminate image blur due
to the motion of corn kernels. Kernel singulation and the synchronization of strobe
firing with the image acquisition were achieved by using optical sensors. The control circuitry was designed to enable synchronization of strobe firing with the vertical
blanking period of the camera. Corn kernels of random orientation were inspected for
whole versus broken percentages, and on-line tests had successful classification rates
of 91 percent and 94 percent for whole and broken kernels, respectively. Ng et al.
(1998b) developed machine vision algorithms for measuring the level of corn kernel
mechanical damage as a percentage of the kernel area. Before imaging, corn samples were dyed with a 0.1% Fast Green FCF dye solution to facilitate the detection
of damaged areas. Mechanical damage was determined by extracting from the kernel
images the damaged area stained by the green dye as a percentage of the projected
kernel area. The vision system demonstrated high accuracy and consistency. Standard
deviation for machine measurements was less than 5 percent of the mean value, which
is substantially smaller than for other damage-measurement methods. This method is,
however, limited in that it introduces a dye into the grain product.
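The damage measure described above, stained area as a percentage of projected kernel area, can be sketched as follows. The rule used to label a pixel as stained (green exceeding both red and blue by a fixed margin) and the margin of 30 gray levels are hypothetical choices for illustration; they are not the algorithm or thresholds of Ng et al. (1998b).

```python
def damaged_area_percent(pixels, kernel_mask, margin=30):
    """Percentage of kernel pixels whose color is dominated by green.

    pixels is a list of rows of (R, G, B) tuples; kernel_mask is a list of rows
    of 0/1 values marking which pixels belong to the kernel.
    """
    kernel_count = 0
    stained_count = 0
    for pixel_row, mask_row in zip(pixels, kernel_mask):
        for (r, g, b), inside in zip(pixel_row, mask_row):
            if not inside:
                continue
            kernel_count += 1
            if g - r > margin and g - b > margin:
                stained_count += 1
    if kernel_count == 0:
        raise ValueError("mask contains no kernel pixels")
    return 100.0 * stained_count / kernel_count
```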
3.4 Stress cracks
Internal damage in corn appears in the form of stress cracks in the endosperm (Thomson and Foster, 1963). These cracks have traditionally been evaluated by candling
and visual assessment. Candling is time-consuming and inconsistent, due to fatigue
of the human eye. Gunasekaran et al. (1985) investigated the size characteristics of
stress cracks using electron microscopy. They observed that a typical stress crack
is about 53 µm in width and half the kernel in depth. Stress cracks originate at
the inner core of the floury endosperm, and propagate rapidly outwards along the
boundary of starch granules. Many cracks do not advance as far as the pericarp
layer. Reflected laser optical imaging failed to provide sufficient light reflectance
differences required for detecting stress cracks (Gunasekaran et al., 1986) whereas
backlighting images provided high contrast between the stress crack and the rest of the
kernel (Gunasekaran et al., 1987). Image-processing algorithms detected the cracks
in the form of lines or streaks with an accuracy of 90 percent. The general principles of backlighting for transmittance imaging are illustrated in Figure 17.6. Backlight
imaging reveals useful information about the internal structure of grain samples by
generally eliminating details from the surface and providing high contrast for edge
Reid et al. (1991) developed a computer vision system to automate the detection
of stress cracks in corn kernels. They used a combination of reflected (diffused) as
well as transmitted light for imaging single kernels. Edge detection followed by Hough
transform was used to detect stress cracks as line features. This system detected stress
cracks with an accuracy approaching 92 percent in comparison to human inspection
with candling. Han et al. (1996) used Fourier transform image features for the inspection of stress cracks. The proposed frequency domain classification method achieved
an average success ratio of 96 percent.
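The use of a Hough transform to find a stress crack as a line feature can be sketched as follows. The example assumes that edge detection has already produced a list of (x, y) edge-pixel coordinates; the function name, the 1-degree angular step, and the rounding of rho to whole pixels are illustrative choices, not details reported by Reid et al. (1991).

```python
import math
from collections import Counter


def hough_strongest_line(edge_points, theta_step_deg=1):
    """Return (rho, theta_deg, votes) for the line supported by most edge points.

    Lines are parameterized as rho = x*cos(theta) + y*sin(theta), with theta
    in whole degrees from 0 to 179 and rho rounded to the nearest pixel.
    """
    votes = Counter()
    for x, y in edge_points:
        for theta_deg in range(0, 180, theta_step_deg):
            theta = math.radians(theta_deg)
            rho = round(x * math.cos(theta) + y * math.sin(theta))
            votes[(rho, theta_deg)] += 1
    (rho, theta_deg), count = votes.most_common(1)[0]
    return rho, theta_deg, count
```

A crack would then be reported when the vote count of the strongest line exceeds a threshold chosen for the image resolution, and several cracks in one kernel could be found by taking more than one peak from the accumulator.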
Figure 17.6 A generalized schematic of backlight imaging set-up. © Canadian Grain Commission.
3.5 Heat damage
Milling processes are designed to separate kernels efficiently into their respective components. Corn that has been heated in the presence of moisture has difficulty during
the starch–gluten separation phase in wet milling. These problems result from protein
denaturation or starch gelatinization. Heat damage was visualized using tetrazolium
dye, which turns pink in living embryos but shows no color in dead tissues (Xie and
Paulsen, 2001). Dehydrogenase enzymes involved in respiration react with the tetrazolium, resulting in an insoluble, red formazan color in living cells. Non-living cells
retain their natural color. Machine vision images of kernels that were heat treated at
60 °C for 3 and 9 hours respectively were compared to check samples that were not
heated. The unheated kernels had a tetrazolium reaction resulting in a bright red stain.
Kernels heated for 3 hours had a purplish color, indicating onset of damage; while the
kernels heated for 9 hours did not stain, indicating a totally dead germ (Paulsen and
Hill, 1985; Litchfield and Shove, 1989).
3.6 Mold and fungal contamination
Ng et al. (1998b) developed machine vision algorithms for measuring corn kernel
mold damage using color images of corn kernels illuminated with diffused reflectance
lighting. Mold damage was determined in terms of percentage of total projected kernel
area by isolating the moldy area on kernel images. A feed-forward neural network
was developed to classify mold and non-mold pixels, based on pixel RGB values. The
system measurements were highly accurate and consistent, with a standard deviation
of less than 5 percent of the mean value. Steenhoek et al. (2001) presented a method
for clustering pixel color information to segment features within corn kernel images.
Features for the blue-eye mold, germ damage, and starch were identified with a probabilistic neural network based on pixel RGB values. Accuracy of the network predictions
on a validation set approached 95 percent.
Aflatoxins are poisons produced by the fungus Aspergillus flavus after it infects agricultural commodities, such as corn. Aflatoxin-contaminated corn is dangerous when
consumed by animals or human beings, and therefore is an undesirable characteristic
for any corn that is going for feed or human consumption. The ability to detect A. flavus
and its toxic metabolite, aflatoxin, is important for health and safety reasons. The ability to detect and measure fungal growth and aflatoxin contamination of corn could
contribute significantly towards the separation of contaminated kernels from healthy
kernels. Dicrispino et al. (2005) have explored the use of hyperspectral imaging to
detect mycotoxin-producing fungi in grain products. Experiments were performed on
A. flavus cultures growing over an 8-day time period to see if the spectral image of
the fungus changed during growth. Results indicate that hyperspectral imaging technology can identify spectral differences associated with growth changes over time.
Further experiments may lead to this technology being used to rapidly and accurately
detect/measure Aspergillus flavus infection/aflatoxin contamination of corn without
destruction of healthy grain. This could provide a useful tool for both growers and
buyers in the corn industry, that could enhance protection of food and feed as well as
increase profits.
The bright greenish-yellow (BGY) presumptive test is widely used by government
agencies as a quick test for monitoring corn aflatoxin to identify lots that should be
tested further. The test is based on the association of the BGY fluorescence in corn under
ultraviolet light (365 nm) with invasion by the molds that produce aflatoxin. Shotwell
and Hesseltine (1981) examined corn samples under ultraviolet light (365 nm) for the
bright greenish-yellow (BGY) fluorescence associated with aflatoxin-producing fungi.
They concluded that the BGY test could be carried out equally well by using the black
light viewer on whole-kernel corn or by inspecting a stream of coarsely ground corn
under ultraviolet light (365 nm). A count of 1 BGY particle per kg of corn appeared
to be an indication that the sample should be tested for aflatoxin by chemical means.
The higher the BGY count in a corn sample, the more likely it is to contain aflatoxin
in levels equal to or exceeding the tolerance limit of 20 ng/g.
Near-infrared spectra, X-ray images, color images, near-infrared images, and physical properties of single corn kernels were studied to determine whether combinations
of these measurements could distinguish fungal-infected kernels from non-infested
kernels (Pearson et al., 2006). Kernels used in this study were inoculated in the field
with eight different fungi: Acremonium zeae, Aspergillus flavus, Aspergillus niger,
Diplodia maydis, Fusarium graminearum, Fusarium verticillioides, Penicillium spp.,
and Trichoderma viride. Results indicate that kernels infected with Acremonium zeae
and Penicillium spp. were difficult to distinguish from non-infested kernels, while all
the other severely infected kernels could be distinguished with greater than 91 percent
accuracy. A neural network was also trained to identify infecting mold species with
good accuracy, based on the near-infrared spectrum. These results indicate that this
technology can potentially be used to separate fungal infected corn using a high-speed
sorter, and to automatically and rapidly identify the fungal species of infested corn
kernels. This will be of assistance to breeders developing fungal-resistant hybrids, as
well as mycologists studying fungal-infected corn.
Pearson and Wicklow (2006) used a high-speed single-kernel sorter to remove mycotoxins from corn. It was found that using spectral absorbance at 750 nm and 1200 nm
could distinguish kernels with aflatoxin contamination greater than 100 ppb from kernels with no detectable aflatoxin, with over 98 percent accuracy. When these two
spectral bands were applied to sorting corn at high speeds, reductions in aflatoxin averaged 82 percent for corn samples with an initial level of aflatoxin over 10 ppb. Most of
the aflatoxin is removed by rejecting approximately 5 percent of the grain. Fumonisin
is also removed along with aflatoxin during sorting. The sorter reduced fumonisin by
an average of 88 percent for all samples. This technology will help ensure the safety of
the US food and feed supply.
3.7 Hardness or vitreousness
Hardness or vitreousness is an important grain quality factor for corn, affecting milling
characteristics. Vitreousness is typically a subjective evaluation, based on candling, to
identify the vitreous phenotypes. Kernels placed on a light box are visually scored and
assigned to arbitrary, discontinuous classes according to the ratio of vitreous to floury
endosperm. Felker and Paulis (1993) proposed an image-analysis approach based on
a non-destructive method for quantification of corn kernel vitreousness. Corn kernels
were viewed on a light box using a monochrome video camera, and the transmitted
light video images were analyzed with commercially available software. For imaging,
kernels were surrounded by modeling clay to avoid light leaks around the kernels.
A high degree of correlation was observed between visual scores and average gray-scale values of captured video images (r² = 0.85). Removing the image background
and correcting for kernel thickness improved the correlation (r² = 0.91).
Erasmus and Taylor (2004) reported a rapid, non-destructive image-analysis method
for determining endosperm vitreousness in corn kernels. For imaging, individual whole
kernels were placed on top of round illuminated areas smaller than the projected areas
of the kernels, to shine light through the kernels. A correction factor to allow constant
illumination of kernels was developed to adjust kernel size variations in relation to
constant light area. Significant correlations were found between corrected translucency
values and endosperm yields determined by hand dissection (r = 0.79). Corrections for
kernel thickness improved the correlation further (r = 0.81); however, the data spread
was rather wide (r² = 0.65).
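The two quantities used in these vitreousness studies, an average gray-scale value over the kernel and its correlation with a reference score, can be sketched as follows. The function names are hypothetical, and the kernel thickness and illumination corrections described above are not included.

```python
import math


def mean_gray(image, mask):
    """Average gray level of the pixels selected by a 0/1 mask (background excluded)."""
    values = [
        value
        for image_row, mask_row in zip(image, mask)
        for value, inside in zip(image_row, mask_row)
        if inside
    ]
    return sum(values) / len(values)


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)
```

Squaring the correlation coefficient gives the fraction of variance explained: r = 0.81 corresponds to r² of about 0.66, in line with the value of 0.65 quoted above, so roughly one-third of the variation in endosperm yield was not accounted for by translucency.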
3.8 Seed viability
Seed viability and vigor are important for the ongoing continuation of a variety. Producers would like to be assured that the corn seeds they plant will all emerge into new
plants. Xie and Paulsen (2001) developed a machine vision system to detect and quantify tetrazolium staining in sectioned corn kernels for corn viability classification. The
machine-vision based tetrazolium test was able to predict viability loss and therefore
detrimental effects of heat on corn to be used for wet milling. Corn harvested at 20
percent and 25 percent moisture was negatively affected by drying at 70 °C. Corn harvested at 30 percent moisture was negatively affected by heat at all drying temperatures
above 25 °C, and was much more severely affected as the drying temperature increased.
Cicero and Banzatto (2003) studied the effects of mechanical damage on corn seed
vigor using image analysis. Fifty seeds from three cultivars were visually selected to
form a sample of whole seeds with varying degrees of mechanical damage. The seeds
were X-rayed, photographed (ventral and dorsal sides), and submitted to a cold test. The
cold test was used to introduce stress and hence assess the ability (vigor) of the seeds
to withstand the stress. Photographs were repeated after the cold test. Images taken
before and after the cold test were examined simultaneously on a computer monitor to
determine the possible relationship between cause and effect. Results indicated that the
method under study permits association of mechanical damage with eventual losses
caused to corn seed vigor.
Mondo and Cicero (2005) studied the effect of the seed position on the ears on seed
quality, in terms of vigor and viability. Images obtained before and after germination
were visually examined on a computer screen simultaneously to make a complete diagnosis for each seed. The results indicated that the seeds in the proximal and intermediate
positions presented a similar quality and were superior to those of the distal position.
It was also reported that spherical seeds presented torsions of the embryonic axes, but
that these neither altered nor reduced quality. However, alterations in the embryonic axes
(dark, undefined stains), presented in a larger quantity in the distal region of the ear,
were responsible for the loss of seed quality.
3.9 Other applications
Separation of shelled corn from residues is an important task in corn harvesting and
processing. Jia et al. (1991) investigated the use of machine vision for monitoring the
separation of shelled corn from residues. Image analysis results showed that spectral
reflectance differences in red and blue bands of the electromagnetic spectrum could be
used to separate corncobs from residues. Jia (1993) proposed an automatic inspection
system for grading seed maize using machine vision. Images of a number of samples
of maize were acquired as the maize cobs passed through the inspection system. The
samples represented the quality of inspected maize at different layers of unloading
maize from a truck. Machine vision algorithms were developed to measure the amount
of residues mixed with maize cobs, and the loss of kernels on cobs. Two parameters,
residue mixture ratio and kernel loss ratio, were introduced as indicators of quantitative
measurement of the amount of residues mixed with cobs, and kernels lost on the cobs.
Seed corn is harvested and delivered on the cob with some husk still attached to
avoid mechanical damage to the seeds. A husk deduction is manually estimated as the
husk/corn weight ratio, for payment purposes. Precetti and Krutz (1993) developed a
color machine vision system to perform real-time husk deduction measurements. They
reported that a linear relationship exists between the weight ratio of the husk deduction
and the surface ratio of the vision system. Variability of the machine vision system was
±1 percent compared to ±4 percent for the manual measurements.
3.10 Changing directions
A number of near-infrared reflectance (NIR) spectroscopy applications have been
reported in the literature for quality evaluation of corn in terms of moisture and amino
acids (Fontaine et al., 2002); protein, oil and starch (Kim and Williams, 1990; Orman
and Schumann, 1991); fungal infection (Pearson et al., 2006); and milling performance
(Wehling et al., 1996; Dijkhuizen et al., 1998). Hyperspectral imaging appears to be
a natural extension to take advantage of both the spectral and spatial information in
NIR and imaging, respectively. Yu et al. (2004) used Synchrotron Fourier Transform
infrared (FTIR) microspectroscopy to image the molecular chemical features of corn
to explore the spatial intensity and distribution of chemical functional groups in corn
tissues. Results of this study showed that FTIR images could help corn breeders in
selecting superior varieties of corn for targeted food and feed markets. Cogdill et al.
(2004) evaluated hyperspectral imaging as a tool to assess the quality of single maize
kernels. They developed calibrations to predict moisture and oil contents in single maize
kernels based on hyperspectral transmittance data in the range of 750 to 1090 nm. The
moisture calibration achieved good results, with a standard error of cross-validation
(SECV) of 1.2 percent and a relative performance determinant (RPD) of 2.74. The
oil calibration did not perform well (SECV = 1.38 percent, RPD = 1.45), and needs
improved methods of single seed reference analysis.
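The two calibration statistics quoted above can be sketched as follows, assuming the usual chemometric definitions: SECV as the root-mean-square difference between reference values and cross-validated predictions, and RPD as the standard deviation of the reference values divided by SECV. Some authors use a bias-corrected SECV, so the exact formula used by Cogdill et al. (2004) may differ; the function names are hypothetical.

```python
import math
import statistics


def secv(reference, predicted):
    """Root-mean-square error between reference values and cross-validated predictions."""
    n = len(reference)
    return math.sqrt(sum((p - r) ** 2 for r, p in zip(reference, predicted)) / n)


def rpd(reference, predicted):
    """Ratio of the sample standard deviation of the reference values to SECV."""
    return statistics.stdev(reference) / secv(reference, predicted)
```

Under these definitions, the reported figures imply a reference standard deviation of roughly 3.3 percent for moisture (2.74 × 1.2) and roughly 2.0 percent for oil (1.45 × 1.38). The oil error was therefore a much larger fraction of the spread in the samples, which is consistent with the weaker oil calibration.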
4 Conclusions
Corn (Zea mays) has undergone extensive investigation with machine vision applications, and many characteristics are shown to have a high degree of detectability. Simple
quality characteristics, such as size and shape, have been shown to be easily measurable using imaging techniques, while others, such as breakage and cracks, may require
the additional use of dyes to reach a high degree of both detection and repeatability.
However, cracked kernels arise for many reasons, and different approaches are required
depending upon their origin. Imaging has the potential for mold detection and, with
the exciting advancements in hyperspectral imaging, for toxin detection.
In concurrence with imaging in cereals generally, the detection of corn quality factors
by imaging has tended to focus on reproduction of the subjective evaluations of quality
characteristics that have been used in the past to describe functionality. With more
advanced imaging technologies emerging, research can be expected to
focus on directly analyzing and describing the properties relating to the
process that corn is destined for, and properties that traditional quality evaluation
methods do not describe. With a growing concern for healthy food sources, there will
be a need to enhance the detection of toxins to ensure safe products.
References
Almeida MT, Bisby FA (1984) A simple method for establishing taxonomic characters from
measurement data. Taxon, 33, 405–409.
Anonymous (2001) 2000–2001 Value Enhanced Grain Quality Report. Washington, DC:
US Grain Council.
Baum BR, Thompson BK (1976) Classification of Canadian oat cultivars by quantifying
the size-shape of their “seeds”: a step towards automatic identification. Canadian
Journal of Botany, 54, 1472–1480.
Brooker DB, Bakker-Arkema FW, Hall CW (1974) Drying Cereal Grains. New York: AVI.
Cicero SM, Banzatto HL Jr (2003) Evaluation of image analysis in determining the relationship between mechanical damage and seed vigor in maize [in Portuguese]. Revista
Brasileira de Sementes, 25 (1), 29–36.
Cogdill RP, Hurburgh CR Jr, Rippke GR (2004) Single-kernel maize analysis by nearinfrared hyperspectral imaging. Transactions of the ASAE, 47 (1), 311–320.
Cuevas-Rodrigues EO, Milan-Carillo J, Mora-Escobedo R, Cardenas-Valenzuela OG,
Reyes-Moreno C (2004) Quality protein maize (Zea Mays L.) temph flour through solid
state fermentation process. Lebensmittel Wissenschaft und Technologie, 37, 59–67.
Dicrispino K, Yao H, Hruska Z, Brabham K, Lewis D, Beach J, Brown RL, Cleveland TE
(2005) Hyperspectral imagery for observing spectral signature change in Aspergillus
420 Quality Evaluation of Corn/Maize
flavus. Proceedings of the SPIE, The International Society of Optical Engineering,
5996, 599606-1-10.
Dijkhuizen A, Dudley JW, Rocheford TR, Haken AE, Eckoff SR (1998) Near-infrared
reflectance correlated to 100-g wet-milling analysis in maize. Cereal Chemistry, 75
(2), 266–270.
Ding K, Morey RV, Wilcke WF, Hansen DJ (1990) Corn quality evaluation with computer
vision. ASAE Paper No. 90-3532, ASAE, St Joseph, MI, USA.
Draper SR, Travis AJ (1984) Preliminary observations with a computer based system for
analysis of the shape of seeds and vegetative structures. Journal of the National
Institute of Agricultural Botany, 16, 387–395.
Eckhoff SR (1992) Converting corn into food and industrial products. Illinois Research,
Erasmus C, Taylor RN (2004) Optimising the determination of maize endosperm vitreousness by a rapid non-destructive image analysis technique. Journal of the Science of
Food and Agriculture, 84, 920–930.
Evers AD, Withey RP (1989) Use of image analysis to predict milling extraction rates of
wheats. Food Microstructure, 8, 191–199.
Felker FC, Paulis JW (1993) Quantitative estimation of corn endosperm vitreosity by video
image analysis. Cereal Chemistry, 70 (6), 685–689.
Floyd CD, Rooney LW, Bockholt AJ (1995) Measuring desirable and undesirable color in
white and yellow food corn. Cereal Chemistry, 72 (5), 488–490.
Fontaine J, Schirmer B, Horr J, (2002) Near-infrared reflectance spectroscopy (NIRS)
enables the fast and accurate prediction of essential amino acid contents. 2. Results for
wheat, barley, corn, triticale, wheat bran/middlings, rice bran, and sorghum. Journal
of Agricultural and Food Chemistry, 50, 3902–3911.
Freeman JE (1973) Quality factors affecting value of corn for wet milling. Transactions of
the ASAE, 16, 671–682.
Gunasekaran S, Deshpande SS, Paulsen MR, Shove DG, (1985) Size characterization of
stress cracks in corn kernels. Transactions of the ASAE, 28 (5), 1668–1672.
Gunasekaran S, Paulsen MR, Shove DG (1986) A laser optical method for detecting corn
kernel defects. Transactions of the ASAE, 29 (1), 294–298.
Gunasekaran S, Cooper T, Berlage A, Krishnan P, (1987) Image processing for stress cracks
in corn kernels. Transactions of the ASAE, 30 (1), 266–271.
Han YJ, Feng Y, Weller CL, (1996) Frequency domain image analysis for detecting stress
cracks in corn kernels. Transactions of the ASAE, 12 (4), 487–492.
Hurburgh CR (1989) The value of quality to new and existing corn uses. Agricultural
Engineering Staff Papers Series FPR 89-2, ASAE-CSAE Meeting Presentation June
Jia J (1993) Seed maize quality inspection with machine vision. Proceedings of the SPIE,
The International Society of Optical Engineering, 1989, 288–295.
Jia J, Krutz GW, Precetti CJ (1991) Harvested corn cob quality evaluation using machine
vision. ASAE Paper No. 91-6537, ASAE, St Joseph, MI, USA.
Keefe PD, Draper SR (1986) The measurement of new characters for cultivar identification in wheat using machine vision. Seed Science and Technology, 14,
References 421
Keefe PD, Draper SR (1988) An automated machine vision system for the morphometry
of new cultivars and plant genebank accessions. Plant Varieties and Seeds, 1, 1–11.
Kim HO, Williams PC (1990) Determination of starch and energy in feed grains by nearinfrared reflectance spectroscopy. Journal of Agricultural and Food Chemistry, 38,
Lai FS, Zayas I, Pomeranz Y (1986) Application of pattern recognition techniques in the
analysis of cereal grains. Cereal Chemistry, 63 (2), 168–172.
Leath MN, Hill LD (1987). Economics of production, marketing, and utilization. In
Corn: Chemistry and Technology (Watson SA, Ramstad PE, eds). St Paul: American
Association of Cereal Chemists, pp. 201–252.
Liao K, Li Z, Reid JF, Paulsen MR, Ni B (1992a) Knowledge-based color discrimination
of corn kernels. ASAE Paper No. 92-3579, ASAE, St Joseph, MI, USA.
Liao K, Paulsen MR, Reid JF, Ni B, Bonificio E (1992b) Corn kernel shape identification
by machine vision using a neural network classifier. ASAE Paper No. 92-7017, ASAE,
St Joseph, MI, ASAE.
Liao K, Paulsen MR, Reid JF, Ni B, Bonificio E (1993) Corn kernel breakage classification
by machine vision using a neural network classifier. Transactions of the ASAE, 36 (6),
Liao K, Paulsen MR, Reid JF (1994) Real-time detection of color and surface defects of
maize kernels using machine vision. Journal of Agricultural Engineering Research,
59, 263–271.
Litchfield JB, Shove GC (1989) Dry milling of US hard-endosperm corn in Japan. ASAE
Paper No. 89-6015, ASAE, St Joseph, MI, USA.
Liu J, Paulsen MR (2000). Corn whiteness measurement and classification using machine
vision. Transactions of the ASAE, 43 (3), 757–763.
Liu C-C, Shaw J-T, Poong K-Y, Hong M-C, Shen M-L (2005) Classifying paddy rice
by morphological and color features using machine vision. Cereal Chemistry, 82,
Lookhart GL, Jones BL, Walker DE, Hall SB, Cooper DB (1983) Computer-assisted
method for identifying wheat cultivars from their gliadin electrophoregrams. Cereal
Chemistry, 60, 111–115.
Mendoza F, Dejmek P, Aguilera JM (2006) Calibrated color measurements of agricultural
foods using image analysis. Postharvest Biology and Technology, 41, 285–295.
Mondo VHV, Cicero SM (2005) Using image analysis to evaluate the quality of maize
seeds located in different positions on the ear [in Portuguese]. Revista brasileira de
semente, 27 (1), 9–18.
Neuman MR, Sapirstein HD, Shwedyk E, Bushuk W (1989a) Wheat grain color analysis
by digital image processing I. Methodology. Journal of Cereal Science, 10, 175–182.
Neuman MR, Sapirstein HD, Shwedyk E, Bushuk W (1989b) Wheat grain color analysis
by digital image processing II. Wheat class discrimination. Journal of Cereal Science,
10, 183–188.
Ni B, Paulsen MR, Reid JF (1997a) Corn kernel crown shape identification using image
analysis. Transactions of the ASAE, 40 (3), 833–838.
Ni B, Paulsen MR, Liar K, Reid JF (1997b) Design of an automated corn kernel inspection
system for machine vision. Transactions of the ASAE, 40 (2), 491–497.
422 Quality Evaluation of Corn/Maize
Ni B, Paulsen MR, Reid JF (1998) Size grading of corn kernels with machine vision.
Applied Engineering in Agriculture, 14 (5), 567–571.
Ng HF, Wilcke WF, Morey RV, Lang JP (1998a) Machine vision color calibration in
assessing corn kernel damage. Transactions of the ASAE, 41 (3), 727–732.
Ng HF, Wilcke WF, Morey RV, Lang JP (1998b) Machine vision evaluation of corn kernel
mechanical and mold damage. Transactions of the ASAE, 41 (2), 415–420.
Official Grain Grading Guide. Canadian Grain Commission, Winnipeg, Manitoba, Canada
(available on-line at www.grainscanada.gc.ca).
Orman BA, Schumann RA (1991) Comparison of near-infrared spectroscopy calibration
methods for the prediction of protein, oil, and starch in maize grain. Journal of
Agricultural and Food Chemistry, 39, 883–886.
Paulsen MR, Hill LD (1985) Corn quality factors affecting dry milling performance.
Journal of Agricultural Engineering Research, 31, 255.
Paulsen MR, Hofing SL, Hill LD, Eckhoff SR (1996) Corn quality characteristics for Japan
markets. Applied Engeering in Agriculture, 12 (6), 731–738.
Pearson TC, Wicklow TG (2006) Properties of corn kernels infected by fungi. Transactions
of the ASAE, 49 (4), 1235–1245.
Pearson TC, Dowell FE, Armstrong PR (2006) Objective grading and end-use property
assessment of single kernels and bulk grain samples. 2005 Annual Progress Reports
(NC-213): Management of Grain Quality and Security in World Markets, pp. 67–71
(available on-line at http://www.oardc.ohio-state.edu/nc213/PR05.pdf).
Pierce RO, Salter KL, Jones D (1991) On-farm broken corn levels. Applied Engineering in
Agriculture, 7 (6), 741–745.
Precetti CJ, Krutz GW (1993) A new seed corn husk deduction system using color machine
vision. ASAE Paper No. 93-1012, ASAE, St Joseph, MI, USA.
Rausch KD, Eckhoff SR, Paulsen MR (1997) Evaluation of the displacement value as a
method to detect reduced corn wet milling quality. Cereal Chemistry, 74 (3), 274–280.
Reid JF, Kim C, Paulsen M (1991) A computer vision sensor for automatic detection of
stress cracks in corn kernels. Transactions of the ASAE, 34 (5), 2236–2244.
Sapirstein HD, Bushuk W (1985a) Computer-aided analysis of gliadin electrophoregrams. I.
Improvement of precision of relative mobility determination by using a three reference
band standardization. Cereal Chemistry, 62, 373–377.
Sapirstein HD, Bushuk W (1985b) Computer-aided analysis of gliadin electrophoregrams.
III. Characterization of the heterogeneity in gliadin composition for a population of
98 common wheats. Cereal Chemistry, 62, 392–398.
Sapirstein HD, Bushuk W (1989) Quantitative determination of foreign material and vitreosity in wheat by digital image analysis. Proceedings ICC Symposium: Wheat End-Use
Properties: Wheat and Flour Characterization for Specific End-Uses, June 13–15,
Lahti Finland. Helsinki: University of Helsinki, Department of Food Chemistry and
Sapirstein HD, Kohler JM (1999) Effects of sampling and wheat grade on precision and
accuracy of kernel features determined by digital image analysis. Cereal Chemistry,
76, 110–115.
Shahin MA, Symons SJ (2001) A machine vision system for grading lentils. Canadian
Biosystems Engineering, 43, 7.7–7.14.
References 423
Shahin MA, Symons SJ (2005) Seed Sizing from images of non-singulated grain samples.
Canadian Biosystems Engineering, 47, 3.49–3.55.
Shatadal P, Jayas DS, Bulley NR (1995a) Digital image analysis for software separation and
classification of touching grains: I. Disconnect algorithm. Transactions of the ASAE,
38, 635–643.
Shatadal P, Jayas DS, Bulley NR (1995b) Digital image analysis for software separation
and classification of touching grains: II. Classification. Transactions of the ASAE, 38,
Shotwell OL, Hesseltine CW (1981) Use of bright greenish yellow fluorescence as a
presumptive test for aflatoxin in corn. Cereal Chemistry, 58 (2), 124–127.
Steenhoek LW, Precetti CJ (2000) Vision sizing of seed corn. ASAE Paper No. 00-3095,
ASAE, St Joseph, MI, USA.
Steenhoek LW, Misra MK, Batchelor WD, Davidson JL (2001) Probabilistic neural networks for segmentation of corn kernel images. Applied Engineering in Agriculture,
17 (2), 225–234.
Symons SJ, Dexter JE (1991) Computer analysis of fluorescence for the measurement of
flour refinement as determined by flour ash content, flour grade color, and tristimulus
color measurements. Cereal Chemistry, 68, 454–460.
Symons SJ, Dexter JE (1992) Estimation of milling efficiency: prediction of flour
refinement by the measurement of pericarp fluorescence. Cereal Chemistry, 69,
Symons SJ, Dexter JE (1993) Relationship of flour aleurone fluorescence to flour
refinement for some Canadian hard common wheat classes. Cereal Chemistry, 70,
Symons SJ, Dexter JE (1996) Aleurone and pericarp fluorescence as estimators of mill
stream refinement for various Canadian wheat classes. Journal of Cereal Science, 23,
Symons SJ, Fulcher RG (1987) The morphological characterization of seeds using digital
image analysis. Proceedings of the 37th Australian Cereal Chemistry Conference, pp.
Symons SJ, Fulcher RG (1988a) Determination of wheat kernel morphological variation
by digital image analysis: II. Variation in cultivars of soft white winter wheats. Journal
of Cereal Science, 8, 219–229.
Symons SJ, Fulcher RG (1988b) Relationship between oat kernel weight and milling yield.
Journal of Cereal Science, 7, 215–217.
Symons SJ, Fulcher RG (1988c) Determination of variation in oat kernel morphology by
digital image analysis. Journal of Cereal Science, 7, 219–228.
Symons SJ, Dexter JE, Matsuo RR, Marchylo BA (1996) Semolina speck counting using
an automated imaging system. Cereal Chemistry, 73, 561–566.
Symons SJ, Van Schepdael L, Dexter JE (2003) Measurement of hard vitreous kernels in
durum wheat by machine vision. Cereal Chemistry, 80, 511–517.
Thompson RA, Foster GH (1963) Stress cracks and breakage in artificially dried corn.
Marketing Research Rep. No. 631, October. Washington, DC: USDA, AMS, TFRD.
Travis AJ, Draper SR (1985) A computer-based system for the recognition of seed shape.
Seed Science Technology, 13, 813–820.
424 Quality Evaluation of Corn/Maize
Watson SA (1987) Structure and composition. In Corn Chemistry and Technology (Watson
SA, Ramstad, PE, eds). St Paul: American Association of Cereal Chemists, pp. 53–78.
Watson SA, Hirata Y (1962) Some wet-milling properties of artificially dried corn. Cereal
Chemistry, 39, 35–44.
Wehling RL, Jackson DS, Hamaker BR (1996) Prediction of corn dry-milling quality by
near-infrared spectroscopy. Cereal Chemistry, 73 (5), 543–546.
Winter P, Wood H, Young W, Sokhansanj S (1997) Neural networks and machine vision
team up to grade corn. Vision Systems Design, October, 28–33.
Xie W, Paulsen MR (2001) Machine vision detection of tetrazolium staining in corn.
Transactions of the ASAE, 44 (2), 421–428.
Xie F, Pearson T, Dowell FE, Zhang N (2004) Detecting vitreous wheat kernels using
reflectance and transmittance image analysis. Cereal Chemistry, 81, 594–597.
Yu P, McKinnon JJ, Christensen CR, Christensen DA (2004) Imaging molecular chemistry
of pioneer corn. Journal of Agricultural and Food Chemistry, 52, 7345–7352.
Zayas I., Pomeranz Y, Lai FS (1985) Discrimination between Arthur and Arkan wheats by
image analysis. Cereal Chemistry, 62, 478–480.
Zayas I., Lai FF, Pomeranz Y (1986) Discrimination between wheat classes and varieties
by image analysis. Cereal Chemistry, 63, 52–56.
Zayas I, Converse H, Steele JL (1990) Discrimination of whole from broken corn kernels
with image analysis. Transactions of the ASAE, 33 (5), 1642–1646.
Zayas I, Bechtel DB, Wilson JD, Dempster RE (1994) Distinguishing selected hard and
soft red wheats by image analysis of starch granules. Cereal Chemistry, 71, 82–86.
Quality Evaluation of Pizzas
Cheng-Jin Du and Da-Wen Sun
Food Refrigeration and Computerised Food Technology, University
College Dublin, National University of Ireland, Dublin 2,
1 Introduction
With pizza being one of the more popular consumer foods, pizza markets in Europe,
America, and other continents have been boosted by the trend towards international
cuisine and convenience foods (Anonymous, 1994). As a result, pizza production has
been increasing at an unprecedented pace, and is expected to increase further in
the next decade in response to a growing world population. For example, the frozen
pizza market increased by almost 24 percent between 1999 and 2002, to €83 million,
according to figures from the Irish food board. Compared with traditional homemade production, modern manufacturing is automated,
and production efficiency is thus greatly increased.
In today’s highly competitive market, quality is a key factor for the modern pizza
industry because the high quality of products is the basis for success. A challenging
problem faced by the manufacturers is how to keep producing consistent products
under variable conditions, especially with the inherent sensitivity of pizza-making.
Manual evaluation methods are tedious, laborious, costly, and time-consuming, and
are easily influenced by physiological factors, thus inducing subjective and inconsistent
evaluation results. For example, the method used by the Green Isle Foods Company, of
Naas in Ireland (a leading pizza-maker in Ireland that had a 58 percent market share in
frozen pizza in 1996), for pizza base evaluation is assessment by a human inspector,
who compares each base with a standard one. Given the huge number of bases that move
along the production line at an appreciable speed, it is hard to believe that such a standard
can be maintained purely by visual inspection by a number of personnel over a period
of several hours. To satisfy the increased awareness, sophistication, and expectations of
consumers, and to achieve success in a growing and competitive market, it is necessary
to improve the methods for quality evaluation of pizza products. If quality evaluation
is achieved automatically using computer vision, the production speed and efficiency
can be improved, as well as evaluation accuracy, with an accompanying reduction in
production costs (Sun and Brosnan, 2003a).
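
To make the idea of comparing each base with a standard one more concrete, the following sketch scores a base against a reference shape automatically. It is only an illustration under assumptions that do not come from the source: both shapes are taken to be same-sized binary masks, the standard is modelled as a centred disc, similarity is measured by intersection-over-union, and the acceptance threshold of 0.9 is an arbitrary placeholder. The names `disc_mask`, `overlap_score`, and `accept_base` are hypothetical.

```python
from typing import List

Mask = List[List[int]]  # assumed format: 1 = base pixel, 0 = background


def disc_mask(size: int, radius: float) -> Mask:
    """Binary mask of a centred disc, standing in for the 'standard' base."""
    c = (size - 1) / 2.0
    return [
        [1 if (x - c) ** 2 + (y - c) ** 2 <= radius ** 2 else 0 for x in range(size)]
        for y in range(size)
    ]


def overlap_score(base: Mask, standard: Mask) -> float:
    """Intersection-over-union of two same-sized masks (1.0 means identical)."""
    inter = 0
    union = 0
    for row_b, row_s in zip(base, standard):
        for b, s in zip(row_b, row_s):
            inter += b & s
            union += b | s
    return inter / union if union else 1.0


def accept_base(base: Mask, standard: Mask, threshold: float = 0.9) -> bool:
    """Accept the base if it overlaps the standard closely enough.

    The threshold is an illustrative assumption, not a published value.
    """
    return overlap_score(base, standard) >= threshold


if __name__ == "__main__":
    standard = disc_mask(41, 15)
    undersized = disc_mask(41, 10)
    print(accept_base(standard, standard))    # identical shape
    print(overlap_score(undersized, standard))  # well below the threshold
```

Unlike a human inspector, such a check applies the same criterion to every base regardless of line speed or shift length, which is the consistency argument made above.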
According to the pizza expert at Green Isle Foods, manufacturing of pizzas can
generally be broken down into three main steps – pizza base production, sauce spread,
and topping application. The basic recipe for the dough used in pizza bases consists
of flour, water, dry yeast, salt, oil, and sodium stearoyl-2-lactylate (Matz, 1989). First,
each ingredient is weighed and they are then mixed together. After the dough has been
allowed to rise, dough units are scaled and rounded before being flattened and rolled.
Finally, sauces are spread on the base and toppings are applied to form the final product.
In this chapter, the application of computer vision for pizza quality evaluation will be
discussed according to these manufacturing stages.
2 Pizza base production
In some literature the pizza base is also called the pizza crust. It comprises 55 percent
of the weight of pizza (Lehmann and Dubois, 1980). Although the crust might not
seem very exciting, it forms the basis upon which all the other parts come together
(Burg, 1998). Furthermore, pizza products are normally categorized according to the
production methods of the crust – for example, depending on the leavening
method used for the base, a pizza can be classified as “yeast-leavened” or “chemically-leavened” (i.e.
with soda added). Perhaps for this reason, the pizza base has attracted more attention
in the literature than have the pizza sauce and the topping.
There are two basic procedures for pizza base production: either the dough is divided,
rounded, and pressed into discs, or it is rolled out in a continuous sheet from which
circles are cut. The latter method produces uniform circles (Matz, 1989). In contrast,
the former method can give a better texture, but at the cost of the fixed size and perfectly
round shape, which it