Video Processing Whitepaper
Teranex Video Processing and HQV – Defined
by Jed Deame
History of the Technology
The roots of Teranex HQV processing go back to the early 1980s, when Lockheed Martin
developed it for military image and video processing. In the 15+ years of development by
Lockheed Martin, over $100 million was invested in the technology, and 13 patents were
issued.
Teranex was founded in 1998 to commercialize the Lockheed Martin technology. Since
then, six additional patents on hardware, software, and algorithms have been filed thanks
to the extensive work of Teranex's premier engineering team.
The company soon realized that the technology was particularly well suited to the
digital-media industry. Teranex's video-processing platforms are used by the leading
broadcasters around the world, including NBC, CBS, ABC, FOX, WB, and Turner
Networks, along with others in Japan, Australia, China, and South Korea. The Teranex
video processor is the preferred choice because it is able to process any type of material,
whether it be video, film, animation, noisy satellite feeds, or clean camera feeds, with the
best depth and clarity.
The post-production community also uses the Teranex box to ensure perfect conversion
between the many different formats they encounter as well as for film-to-video transfer
(telecine). In the telecine process, it is important to remove any dirt and scratches along
with reducing the film grain to provide a pleasing picture and aid in the DVD-encoding
process. The Teranex Image Restore system is a leader in the market and can
significantly clean up the content without adding blurring or smearing.
In 2002, Silicon Optix and Teranex realized that semiconductor technology had advanced
enough that they could take the large Teranex video-processing box and condense it into an
affordable single chip. In September 2004, the Realta HQV video processor, which
matches the performance of Teranex’s original 3RU video processor, was announced to
the world.
The Programmable Advantage
The Teranex architecture has won so much business with broadcasters and post-production
houses because it is fully programmable. With all the evolving formats and
standards in different countries around the world, the Teranex video processor can adapt.
Rather than hard-coding the video-processing algorithms into silicon as competitors do,
Realta HQV executes the algorithms entirely in software on one of the world's most
advanced array-processing engines. This is especially important when buying a format
converter, which is an investment that should last at least 5 to 10 years. It just makes sense
to have a video processor that can adapt to the confusing world of rapidly changing
standards. Teranex customers can finally purchase a low-cost "future-proof" product that
doesn’t become obsolete after a few months. Also, as new formats are standardized and
new features are developed, software updates will be provided to the customer base.
Of course, one of the primary advantages of this programmability is the Teranex video-processing software, which has been refined through 100,000 hours of content
verification over the past six years by hundreds of the most demanding customers
worldwide – the “Golden Eyes” of Hollywood’s post-production and broadcast
companies.
Interlace to Progressive Conversion
The process of interlace to progressive conversion consists of a number of complex
image processing steps. It is important to understand the details of the processing steps in
order to appreciate the many places where quality may be lost in the conversion process.
De-Interlacing
Why De-Interlace?
Most video sources, including standard-definition DV, Betacam, and Digi-Beta, as well as
high-definition HDV, HDCam, and HD-D5, are predominantly interlaced. Instead of
transmitting each video frame in its entirety (what is called progressive scan), most video
sources transmit only half of the image in each frame at any given time. This concept also
applies to recording video images: video cameras and film-transfer devices record only
half of the image in each frame at a time.
The words "interlaced" and "progressive" arise from the days of CRT or "picture-tube"
televisions, which form the image of each frame on the screen by scanning an electron
beam horizontally across the picture tube, starting at the top and working its way down to
the bottom. Each horizontal line "drawn" by the beam includes the part of the picture that
falls within the space occupied by that line. If the scanning is interlaced, the electron
beam starts by drawing every other line (all the odd-numbered lines) for each frame; this
set of lines is called the odd field. Then it resets back to the top of the screen and fills in
the missing information, drawing all the even-numbered lines, which are collectively
called the even field. Together, the odd and even fields form one complete frame of the
video image.
Translating the interlaced video signal from 480i and 1080i sources into progressive
format is required as the first step in DTV format conversion. Until the image is in the
progressive domain, it is not possible to scale it to the desired output resolution without
creating unwanted image artifacts. This interlaced to progressive conversion is the
most important step in the format conversion process and determines the overall
quality of the output video signal.
Basic De-interlace Techniques:
If the objects in the video image are not moving, it is very easy to do the de-interlacing –
the two fields can be weaved together and combined to form a complete frame. However,
if there is motion in the image, this technique can generate significant artifacts. Since the
recording is performed in an interlaced manner, the two source fields that make up a
complete frame are not recorded at the same time. Each frame is recorded as an odd field
at one point in time, and then as an even field recorded 1/50th or 1/60th of a second later.
Hence if an object in the scene has moved in that fraction of a second, simply combining
fields causes the errors in the image called “combing” or “feathering” artifacts.
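To make the weave operation concrete, here is a minimal sketch in Python/NumPy. The field layout and function name are illustrative assumptions, not the Teranex implementation:

```python
import numpy as np

def weave(top_field: np.ndarray, bottom_field: np.ndarray) -> np.ndarray:
    """Interleave two fields (each H/2 x W) into one full frame (H x W).

    Perfect for static scenes; if anything moved between the two field
    capture times, the alternating lines disagree and "combing" appears.
    """
    h, w = top_field.shape
    frame = np.empty((2 * h, w), dtype=top_field.dtype)
    frame[0::2] = top_field     # frame lines 0, 2, 4, ...
    frame[1::2] = bottom_field  # frame lines 1, 3, 5, ...
    return frame
```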
[Figure: An example of feathering or combing]
Simplest Competitor Approach (Non-Motion Adaptive):
The simplest approach to avoid these artifacts is to only process a single field. This is
called a non-motion adaptive approach. In this method, when the two fields reach the
processor, only data from the first field is used and the data from the second field is not
taken into consideration.
The video-processing circuitry recreates or “interpolates” the missing lines by averaging
pixels from above and below. While there are no combing artifacts, image quality is
compromised because half of the detail and resolution have been discarded.
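A sketch of this single-field method, assuming simple linear interpolation of the missing lines (real hardware may use longer vertical filters):

```python
import numpy as np

def bob(field: np.ndarray) -> np.ndarray:
    """Expand one field (H/2 x W) to a frame (H x W) by line averaging."""
    field = field.astype(np.float32)                # avoid integer overflow
    h, w = field.shape
    frame = np.empty((2 * h, w), dtype=np.float32)
    frame[0::2] = field                             # the lines we actually have
    frame[1:-1:2] = 0.5 * (field[:-1] + field[1:])  # average of lines above/below
    frame[-1] = field[-1]                           # bottom edge: replicate
    return frame
```

Because the second field is discarded entirely, no combing can occur, but half of the vertical detail is gone, exactly as described above.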
More-advanced techniques have been adopted by virtually all standard-definition video
processors, but this basic approach is still sometimes used for high-definition signals, due
to the increased computational and data-rate requirements of higher video resolution.
With video processors from some competitors, only 540 lines from a 1080i source are
used to create the image that makes it to the screen. This is true even for video processors
from companies that may have been considered providers of flagship performance in the
standard-definition era.
Advanced Competitor Approach (Frame-based Motion Adaptive):
More advanced de-interlacing techniques available from the competition include a frame-based, motion-adaptive algorithm. By default, these video processors use the same
technique described above. However, by using a simple motion calculation, the video
processor can determine when no movement has occurred in the entire picture.
If nothing in the image is moving, the processor combines the two fields directly. With
this method, still images can have full vertical resolution, but as soon as there is any
motion, half of the data is discarded and vertical resolution is halved. With this technique,
static test patterns look sharp, but moving images tend to look soft.
Teranex HQV Approach (Pixel-Based Motion Adaptive):
HQV processing represents the most advanced de-interlacing technique available: a true
pixel-based motion-adaptive approach. With HQV processing, motion is identified at the
pixel level rather than the frame level. While it is mathematically impossible to avoid
discarding pixels in motion during de-interlacing, HQV processing is careful to discard
only the pixels that would cause combing artifacts. Everything else is displayed with full
resolution.
[Figure: Only the pixels that would cause combing are removed.]
Pixel-based motion-adaptive de-interlacing avoids artifacts in moving objects and
preserves full resolution of non-moving portions of the screen even if neighboring pixels
are in motion.
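In pseudocode terms, the per-pixel decision can be sketched as a blend of the two techniques above. The motion detector itself is proprietary, so the mask here is simply assumed to exist:

```python
import numpy as np

def per_pixel_deinterlace(weaved: np.ndarray, interpolated: np.ndarray,
                          motion_mask: np.ndarray) -> np.ndarray:
    """Keep full-resolution weaved data where pixels are static; fall back
    to interpolated lines only where the mask flags inter-field motion."""
    return np.where(motion_mask, interpolated, weaved)
```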
Motion Compensation
Motion-compensated de-interlacing is often referred to as the “Holy Grail” of de-interlacing. In this technique, pixels in the first field are shifted to align with
corresponding pixels in the second field in order to preserve the full resolution of objects
in motion. This technique is extremely computationally complex, and requires
significant processing horsepower to do properly. Although it has the capability of
providing full resolution conversions of objects in motion, there are a number of practical
limitations to how well this can be achieved. For instance, as the motion between fields
increases, the search area increases exponentially, as does the probability that the objects
will change in shape. This creates a much larger possibility for artifacts due to false
matches in the search process.
It is important that motion compensation be performed on a per-pixel basis and be
utilized in conjunction with a high-quality per-pixel motion-adaptive de-interlace
framework in order to avoid the pitfalls and realize the potential benefits. Although
region-based motion compensation techniques may be used to reduce the computational
complexity, the benefits are often indistinguishable from pure motion-adaptive
approaches. Due to the high compute requirements and associated costs of properly
performing motion compensation on a per-pixel basis, this technique is often relegated to
systems serving the high end of post-production.
“Second Stage” Diagonal Interpolation
To recover some of the detail lost in the areas in motion in the pixel-based motion-adaptive approach, HQV processing implements a multi-direction diagonal filter that
reconstructs some of the lost data at the edges of moving objects, filtering out any
“jaggies.” This operation is called “second-stage” diagonal interpolation because it’s
performed after the de-interlacing, which is the first stage of processing. Since diagonal
interpolation is independent of the de-interlacing process, competitors have used similar
algorithms with their frame-based de-interlacing approaches.
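A simple edge-directed interpolator in the same spirit follows; the direction set and cost measure are assumptions for illustration, and a production filter would examine more directions:

```python
import numpy as np

def diagonal_interp(above: np.ndarray, below: np.ndarray) -> np.ndarray:
    """Rebuild one missing line from the lines above and below it.

    For each pixel, test three directions (left diagonal, vertical, right
    diagonal) and average along whichever pair matches best, so diagonal
    edges stay smooth instead of producing "jaggies".
    """
    w = len(above)
    out = np.empty(w, dtype=np.float32)
    for x in range(w):
        best_val, best_cost = 0.0, np.inf
        for d in (-1, 0, 1):                      # candidate edge directions
            xa, xb = x + d, x - d
            if 0 <= xa < w and 0 <= xb < w:
                cost = abs(float(above[xa]) - float(below[xb]))
                if cost < best_cost:
                    best_cost = cost
                    best_val = 0.5 * (float(above[xa]) + float(below[xb]))
        out[x] = best_val
    return out
```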
[Figures: Diagonal lines resulting from moving-object interpolation (green edges); highlighting the jaggy section of the line, the missing detail is re-created by averaging along the diagonal lines; the new interpolated frame (full resolution).]
Truth in Marketing
Teranex is not the only company to implement pixel-based motion-adaptive de-interlacing, and it is important to recognize that not all such de-interlacing is identical. In
order to implement a true per-pixel motion-adaptive de-interlacer, the video processor
must perform a four-field analysis. In addition to the two fields being analyzed in the
current frame, the two previous fields are required in order to determine which pixels are
in motion. Clearly, if a competing de-interlacer does not evaluate four fields, it simply
does not have the data necessary to perform true per-pixel motion-adaptive analysis.
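A minimal illustration of why four fields are needed: comparing fields of the same parity, which sample the same lines, separates true temporal change from ordinary vertical detail. The threshold test below is an assumption; the actual HQV analysis is far more elaborate:

```python
import numpy as np

def four_field_motion(prev_top, prev_bot, cur_top, cur_bot, thresh=10):
    """Per-pixel motion mask from four consecutive fields.

    Same-parity fields sample identical lines, so a large difference
    between them is genuine motion rather than interlace offset."""
    moved_top = np.abs(cur_top.astype(np.int16) - prev_top.astype(np.int16)) > thresh
    moved_bot = np.abs(cur_bot.astype(np.int16) - prev_bot.astype(np.int16)) > thresh
    return moved_top | moved_bot  # field-height mask; scale to frame height as needed
```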
Some competing products implement region-based analysis, in which motion is
determined by evaluating larger blocks of the image rather than complete frames or
individual pixels. Obviously, then, a claim of “four-field” analysis alone does not imply
per-pixel motion-adaptive de-interlacing.
HQV Processing continues to analyze at the per-pixel level using four-field analysis even
in high-definition.
Film Cadence and Video/Film Detection
Automatic Video/Film Detection
An important step in the interlace to progressive conversion process is to detect whether
the content is video-based (i.e., interlaced fields), film-based (e.g., 3:2, 2:2, or other
assorted cadences), or mixed video and film in the same frame. The processing
requirements are very different depending on the content type and it is imperative that
this detection logic be accurate in order to ensure that the proper conversion algorithms
are used. It is also important that the video/film detection be fast and automatic. If it
takes too long for the logic to make a decision, significant conversion artifacts may result.
If it is not automatic, the operator will need to constantly switch the processing mode to
reflect the content type being processed. In some post production applications where the
content type is known and fixed, this may be desirable, but for broadcast or display
applications where the content type changes often, automatic video/film detection is key.
If the Video/Film detector determines that the content is video based, the techniques
described above are utilized. If the content is film based, an entirely different processing
method must be used.
3:2 Film Mode Detection
Motion picture films are recorded at 24 frames per second. When the film is converted to
video for DVD or television broadcast, those 24 frames per second must be converted into
60 interlaced fields per second. The process is as follows. Consider four frames of film: A, B, C, and D.
The first step is to convert these four frames into eight fields. This transforms 24 frames
per second (fps) into 48 interlaced fields per second. Then, to account for the faster rate
of the NTSC standard (roughly 30 frames per second or 60 interlaced fields per second),
it is necessary to repeat certain fields. This is done by repeating a field every other frame.
That is, both fields of frame A are recorded (A-odd, A-even), but three fields of frame B
are recorded (B-odd, B-even, B-odd). The cycle repeats with frames C and D. This is
called a 2:3 (or 3:2) cadence because two fields of one frame are output followed by three
fields of the next frame.
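The cadence is easy to see in a short sketch that emits fields in the order described above. Field labels follow the A/B/C/D example; this is the standard telecine pattern, not Teranex-specific code:

```python
def pulldown_23(frames):
    """Emit (frame, field) pairs in 2:3 cadence: A-odd A-even | B-odd B-even B-odd | ..."""
    fields = []
    for i, frame in enumerate(frames):
        if i % 2 == 0:       # "2" frames contribute two fields
            fields += [(frame, "odd"), (frame, "even")]
        else:                # "3" frames contribute three fields
            fields += [(frame, "odd"), (frame, "even"), (frame, "odd")]
    return fields

# Four film frames yield ten fields: 48 fields/s x (10 / 8) = 60 fields/s.
print(pulldown_23(["A", "B", "C", "D"]))
```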
When this sequence is played back on a progressive-scan video display, it is possible to
implement the same de-interlacing techniques described earlier (non-motion adaptive vs.
motion adaptive, etc.). However, it is possible to perfectly reconstruct the original frames
without losing any data. Unlike interlaced video, in which the two fields were recorded a
fraction of a second apart, these fields were recorded at the same time in the same film
frame and later separated into fields.
So, to display a video signal that originated as 24fps film, all a video processor needs to
do is analyze the fields and determine that there is a regularly alternating pattern of two
fields followed by three fields, etc. This recognition and reconstruction is called 2:3 (or
sometimes 3:2) pull-down detection, and it is found in all but the most basic de-interlacers. Unfortunately, nothing is quite that simple.
Mixed Video and Film
Sometimes, further editing and post-processing is done on film that has been converted to
video. This includes titles, transitions, and other effects. As a result, simply
reconstructing full frames using 2:3 frame matching results in combing artifacts because
parts of the image are best processed using a standard de-interlacing approach, while
other parts will look better by detecting the right cadence and reconstructing the original
frames.
As with standard de-interlacing, there are many approaches to
dealing with mixed video and film. If the processor interprets the material as film,
feathering artifacts will appear around the video portion; if the processor interprets the
material as video, the film portion will be displayed at half of its resolution. Some
processors determine whether there is more film or more video content and choose the
approach with the greatest benefit. Since this usually means film, the result is feathering
artifacts. Other processors are designed with the idea that these artifacts should never be
seen and use the video de-interlacing techniques in all cases, at the expense of half the
video resolution.
HQV Processing, on the other hand, uses per-pixel calculations for all of its processing.
This means it is possible for the HQV processor to implement cadence-detection
strategies for the pixels that represent film content while implementing pixel-based
motion-adaptive de-interlacing for the video content that has been superimposed.
Other Cadences
The HQV Processing advantage of true per-pixel de-interlacing becomes even more
evident when dealing with other cadences. Although 24 fps film with its associated 2:3
video cadence is the most common format, it isn’t the only cadence in use today.
Sometimes, TV stations accelerate their film-based movies and TV shows by dropping
every twelfth field to make room for more commercials. This speedup is usually too
small to be noticed by the average viewer, but these “vari-speed broadcasts” end up
having unusual cadences such as 3:2:3:2:2. If a de-interlacer is unable to detect this
sequence, as with most of the competition, half the resolution is lost.
The variety of cadences does not end there. Professional DVCPro camcorders are
increasingly used in television and film production. In order to maximize the recording
time, these camcorders use a 2:2:2:4 cadence or a 2:3:3:2 cadence to store the progressive
source signal as 480i on the tape.
Animation is often rendered at 12 fps. Two pull-down cadences can be used to convert
this to the 30 fps broadcast standard. Doubling every frame and then applying 2:3 pull-down to the resultant fields will generate a 5:5 cadence. Applying 3:2 pull-down to the
frames (rather than the fields) will generate a 4:6 (or 6:4) cadence. The Japanese
‘Anime’ format is often rendered at 8 fps. To convert this to 30 fps, each frame of
animation is repeated three times, and then 2:3 conversion is performed resulting in an
effective cadence of 7:8 (or 8:7).
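The field counts behind these cadences can be checked with a few lines of illustrative arithmetic:

```python
def cadence(repeat, pulldown=(2, 3)):
    """Fields per original frame when each frame is shown `repeat` times
    at 24 fps before 2:3 pull-down is applied."""
    counts, i = [], 0
    for _ in range(2):          # two original frames complete one cycle
        total = 0
        for _ in range(repeat):
            total += pulldown[i % 2]
            i += 1
        counts.append(total)
    return counts

print(cadence(1))   # 24 fps film      -> [2, 3]
print(cadence(2))   # 12 fps animation -> [5, 5]
print(cadence(3))   # 8 fps anime      -> [7, 8]
```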
Most competitors’ processors count the incoming fields and try to match them against
known sequences such as 2:3 or 2:2 in order to select the right decoding. This works in
the simplest cases, but there may still be a short delay before the processor is able to
“lock on” and determine the right cadence. In addition, when the video processor
encounters an unusual sequence such as animation or DVCPro, it will resort to discarding
half the data until it can lock to a known sequence.
With HQV processing, there is never any confusion about cadence. Instead of trying to
match the incoming video against known patterns, HQV processing simply identifies
complete frames as they come in. HQV processing is able to identify all known cadences,
no matter how uncommon, and it can also detect cadences that have not yet been
invented.
No matter what type of video you’re watching or where it comes from, HQV processing
will always provide the best reconstruction of the image.
Noise Reduction
Random noise is an inherent problem with all recorded images. Noise can enter the
system in many places along the path from acquisition to consumption. Not only does
noise get introduced during post-production, compression, and transmission, but it is also
present at the source in the form of film grain or imaging-sensor noise. Noise-reduction
algorithms can minimize the noise in a picture.
The simplest approach to noise reduction is to use a spatial filter that removes high-frequency data. In this approach, only a single frame is evaluated at any given time, and
structures in the image that are one or two pixels in size are nearly eliminated. This does
remove the noise, but it also degrades the image quality because there is no way to
differentiate between noise and detail. This approach can also cause an artificial
appearance in which people look like their skin is made of plastic. This represents the
most widely used noise-reduction approach.
A temporal filter takes advantage of the fact that noise is a random element of the image
that changes over time. Instead of simply evaluating individual frames, a temporal noise
filter evaluates several frames at once. By identifying the differences between two frames
and then removing that data from the final image, visible noise can be reduced very
effectively. If there are no objects in motion, this is a virtually perfect noise-reduction
technique that preserves most of the detail. This approach is used by many high-end
competitors.
However, a problem arises if there are moving objects in the image, which also cause
differences from one frame to the next; of course, these differences should be retained. If
moving objects are not distinguished from noise, they will be filtered and a ghosting or
smearing effect is seen.
HQV processing uses a per-pixel motion-adaptive and noise-adaptive temporal filter to
avoid the artificial appearance and artifacts associated with conventional noise filters. To
preserve maximum detail, moving pixels do not undergo unnecessary noise processing.
In static areas, the strength of noise reduction is determined on a per-pixel basis,
depending on the level of noise in the surrounding pixels as well as in previous frames,
allowing the filter to adapt to the amount of noise in the image at any given time. The end
result is a natural-looking picture with minimal noise and grain and maximum
preservation of fine details.
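A toy version of a motion-adaptive temporal filter follows. The recursive blend and threshold are assumptions; the actual HQV weighting also adapts to measured noise levels:

```python
import numpy as np

def temporal_nr(cur: np.ndarray, prev: np.ndarray,
                motion_thresh: float = 12.0, blend: float = 0.5) -> np.ndarray:
    """Average static pixels with the previous frame to cancel random noise;
    pass moving pixels through untouched to avoid ghosting."""
    cur = cur.astype(np.float32)
    prev = prev.astype(np.float32)
    static = np.abs(cur - prev) < motion_thresh   # per-pixel motion decision
    return np.where(static, blend * cur + (1.0 - blend) * prev, cur)
```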
Detail Enhancement
Detail enhancement, also called sharpening, is an important component in format
conversion, for both standard-definition and high-definition signals. Unfortunately, due to the
historically poor implementations of sharpening algorithms, this process has received a
reputation as something to avoid.
All digital video goes through a low-pass anti-aliasing filter to prevent false color and
moiré effects that can occur during the digitization process. The filter improves overall
image quality, but it necessarily blurs some of the detail. The data-compression stage can
also remove some detail. Fortunately, much of the lost detail can be mathematically
recovered.
Because the human visual system perceives sharpness in terms of apparent contrast,
exaggerating the differences between light and dark can produce what appears to be a
sharper image. Unfortunately, due to rudimentary implementations of sharpening in the
past, this process has been associated with artifacts known as “ringing” or “halos” in
which objects are surrounded by a bright white edge. The resulting image appears harsh
and does not reflect what was originally captured. The halos can sometimes be more
distracting than the softness from the uncorrected image. For that reason, it is often
recommended that users turn down the sharpening on video devices.
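The classic technique being described is unsharp masking; a sketch follows, with SciPy's Gaussian blur standing in for whatever filter a given product uses. Setting `amount` too high is exactly what produces the halo overshoot:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def unsharp_mask(img: np.ndarray, radius: float = 1.5, amount: float = 0.8):
    """Boost apparent sharpness by exaggerating local contrast."""
    img = img.astype(np.float32)
    blurred = gaussian_filter(img, sigma=radius)
    detail = img - blurred            # high-frequency residual
    # Adding back an amplified residual raises edge contrast; too large an
    # `amount` overshoots into the bright "halos" described above.
    return np.clip(img + amount * detail, 0.0, 255.0)
```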
HQV Detail Enhancement technology is different. By using a more conservative
algorithm and selectively identifying the area of blur before processing, HQV Detail
Enhancement avoids halo or ringing artifacts at even the highest setting. Of course, it is
also possible to disable HQV Detail Enhancement if the source has already applied
sharpening. A key benefit of HQV Detail Enhancement is that, when used in conjunction
with our 1024-tap scaler, standard-definition TV can be delivered at near high-definition
quality.
1024-tap Scaling
Converting standard-definition video to high-definition video involves resizing an image
to contain as much as six times the number of pixels it had originally. How this is done
determines the quality of the resized image.
The most basic video processors perform their scaling calculations by analyzing no more
than four pixels in the source image to create one pixel in the final image. This represents
what is called a 4-tap scaler. (Without getting too technical, the number of "taps"
determines the number of pixels that are analyzed.) With all other things being equal, a
larger number of taps will result in better scaling quality. The average scaler uses no
more than 16 taps. However, even this level of scaling can still produce blurry images.
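To show what "taps" mean in practice, here is a one-dimensional windowed-sinc (Lanczos) resampler; with a = 3 it uses six taps per output pixel, and widening the window adds taps. This is a generic illustration, not the HQV filter design, which is not published:

```python
import numpy as np

def resample_1d(line: np.ndarray, out_len: int, a: int = 3) -> np.ndarray:
    """Resize one scanline with a 2*a-tap Lanczos filter."""
    n = len(line)
    out = np.empty(out_len, dtype=np.float32)
    for i in range(out_len):
        src = (i + 0.5) * n / out_len - 0.5          # position in source space
        first = int(np.floor(src)) - a + 1
        acc, wsum = 0.0, 0.0
        for t in range(first, first + 2 * a):        # 2*a taps per output pixel
            x = src - t
            w = float(np.sinc(x) * np.sinc(x / a)) if abs(x) < a else 0.0
            acc += w * float(line[min(max(t, 0), n - 1)])  # clamp at borders
            wsum += w
        out[i] = acc / wsum
    return out
```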
HQV processing uses an image scaler with an unprecedented 1024 taps. This level of
quality reflects the fact that HQV processing has its roots in Teranex algorithms, which
were developed for defense and military image analysis. For every pixel, the HQV
processor evaluates the surrounding 1024 pixels in order to provide the best image
quality when scaling the image up from standard definition. Again, when this advanced
upsampling technology is combined with HQV Detail Enhancement, standard-definition
television sources are transformed to near-HD quality.
10-bit 4:4:4 Internal Data Paths
Not only does HQV processing implement some of the most advanced algorithms for
video processing, but the internal data paths support 10 bits per channel with full 4:4:4
color sampling. (The term "4:4:4" refers to the fact that the color information can be input
at full horizontal resolution, and 10-bit data paths provide 1024 steps of brightness and
color.) The result is the ability to render over 1 billion colors. In comparison,
conventional video processors with only 8-bit data paths can render just 16 million
colors. Simply put, by maintaining more bits in the data, HQV
products can preserve all the fine detail and dynamic range found in the original source.
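The color counts quoted above follow directly from the bit depths; a quick check:

```python
# 2**bits brightness steps per channel, cubed across three channels at 4:4:4.
for bits in (8, 10):
    levels = 2 ** bits
    print(f"{bits}-bit: {levels} steps/channel -> {levels ** 3:,} colors")
# 8-bit:  256 steps/channel  -> 16,777,216 colors     (~16 million)
# 10-bit: 1024 steps/channel -> 1,073,741,824 colors  (over 1 billion)
```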
Summary
As you can now understand, HQV processing represents an enormous leap in video
processing, with true flagship performance in de-interlacing, noise reduction, and scaling
with both standard-definition and high-definition signals. Silicon Optix and Teranex
designed HQV processing as a no-compromise solution.
Most competing video processors only have enough computational horsepower to handle
the simple cases of de-interlacing and 3:2 detection. Since they are not programmable, it
is not possible for them to handle all of the “corner cases” common in today’s video
content. With HQV processing, you can be assured you are getting the highest possible
conversion quality, preserving the full resolution of the source.