Xerox Scan to PC Desktop User guide

USER’S GUIDE
LEGAL NOTICES
Copyright © 2008 Nuance Communications, Inc. All rights reserved. No part of this
publication may be transmitted, transcribed, reproduced, stored in any retrieval
system or translated into any language or computer language in any form or by any
means, mechanical, electronic, magnetic, optical, chemical, manual, or otherwise,
without prior written consent from Nuance Communications, Inc., 1 Wayside Road,
Burlington, Massachusetts 01803-4609. Printed in the United States of America and
in Ireland.
The software described in this book is furnished under license and may be used or
copied only in accordance with the terms of such license.
IMPORTANT NOTICE
Nuance Communications, Inc. provides this publication "As Is" without warranty of
any kind, either express or implied, including but not limited to the implied
warranties of merchantability or fitness for a particular purpose. Some states or
jurisdictions do not allow disclaimer of express or implied warranties in certain
transactions; therefore, this statement may not apply to you. Nuance reserves the
right to revise this publication and to make changes from time to time in the content
hereof without obligation of Nuance to notify any person of such revision or changes.
TRADEMARKS AND CREDITS
Nuance, ScanSoft, OmniPage, PaperPort, True Page, Direct OCR, Logical Form Recognition,
and RealSpeak are registered trademarks or trademarks of Nuance Communications, Inc.,
in the United States of America and/or other countries. All other company names or
product names referenced herein may be the trademarks of their respective holders.
THIRD PARTY LICENSES/NOTICES
Please see acknowledgements/notices at the end of this guide.
Nuance Communications, Inc.
1 Wayside Road
Burlington, MA 01803-4609
U.S.A.
Nuance Communications International BVBA
International Headquarters
Guldensporenpark 32
Building D
9820 Merelbeke
Belgium
Part Number: 50-281A-10220
CONTENTS
WELCOME
  New features in OmniPage 16
INSTALLATION AND SETUP
  System requirements
  Installing OmniPage
  Setting up your scanner with OmniPage
  How to start the program
  Registering your software
  Activating OmniPage
  Uninstalling the software
USING OMNIPAGE
  OmniPage Documents
  The OmniPage Desktop and Views
  Basic Processing Steps
  How to use OmniPage with PaperPort
PROCESSING DOCUMENTS
  Processing methods
  Defining the source of page images
  Describing the layout of the document
  Preprocessing Images
  Zones and backgrounds
PROOFING AND EDITING
  The editor display and views
  Proofreading OCR results
  Verifying text
  The Character Map
  User dictionaries
  Languages
  Training
  Text and image editing
  On-the-fly editing
  Marking and redacting
  Reading text aloud
  Creating and editing forms
SAVING AND EXPORTING
  Saving and Exporting
  Saving original images
  Saving recognition results
  Sending pages by mail
  Other export targets
WORKFLOWS
  Workflow Assistant
  Batch Manager
  Creating new jobs
  Watched folders
  Watched mailboxes
  Barcode processing
  File-it Assistant
TECHNICAL INFORMATION
  Troubleshooting
INDEX
Welcome
Welcome to this OmniPage® 16 text recognition program, and thank
you for choosing our software! The following documentation has
been provided to help you get started and give you an overview of
the program.
This User’s Guide
This guide introduces you to using OmniPage 16. It includes
installation and setup instructions, a description of the program’s
commands and working areas, task-oriented instructions, ways to
customize and control processing, and technical information.
Descriptions are based on the Windows Vista™ operating system.
This guide is written with the assumption that you know how to
work in the Microsoft Windows environment. Please refer to your
Windows documentation if you have questions about how to use
dialog boxes, menu commands, scroll bars, drag and drop
functionality, shortcut menus, and so on.
We also assume you are familiar with your scanner and its
supporting software, and that the scanner is installed and working
correctly before it is set up with OmniPage 16. Please refer to the
scanner’s own documentation as necessary.
How-to-Guides
The How-to-Guides display on first program launch. They are a
series of mini-guides that help you get started easily by providing
concise overviews of key program areas, such as getting input,
image improvement, zoning, recognition, editing, proofreading, new
features, and the like.
Online Help
OmniPage online Help contains information on features, settings,
and procedures. It also has a comprehensive glossary, with its own
alphabetical index and a table of contents. The online Help is
provided as HTML help, and has been designed for quick and easy
information retrieval. Online Help is available after you install
OmniPage.
Comprehensive context-sensitive help aims to provide just
enough assistance to let you keep working without delay.
It is available from dialog boxes. Press F1 in any dialog box
to access it, or click the help button if the dialog box has one.
Readme File
The Readme file contains last-minute information about the
software. Please read it before using OmniPage. To open this HTML
file, choose Readme in the OmniPage Installer or afterwards in the
Help menu.
Scanning and other information
The Nuance® web site at www.nuance.com provides timely
information on the program. The Scanner Guide
(http://www.nuance.com/scannerguide/) contains updated
information about supported scanners and related issues; Nuance
tests the 25 most widely used scanner models. Access Nuance’s web
site from the OmniPage 16 Installer or afterwards from the Help
menu.
Tech Notes
The web site at www.nuance.com contains Tech Notes on
commonly reported issues using OmniPage 16. Web pages may also
offer assistance on the installation process and troubleshooting.
New features in OmniPage 16
Here are some main areas of innovation compared to OmniPage 15.
If you are upgrading, you may not need to consult this guide very
much.
• Three screen views: Choose from Classic (as in OmniPage
15), Flexible and Quick Convert View (all main controls on
a single panel). See Chapter 2.
• Multiple documents: In Classic or Flexible view you can
have two or more documents open at one time, for easy
cross-document editing.
• Digital camera processing: Perform OCR on digital
camera images with special algorithms. See Chapter 3.
• 2007 programs: OmniPage 16 supports the latest Word
and Excel inside Office 2007 (DOCX and XLSX), and also
provides links for SharePoint 2007 and Outlook 2007.
• PDF Enhancements: These include support for PDF
version 1.6, faster processing speed, higher accuracy,
improved output quality, and the MRC high compression
technology for certain PDF flavors.
• Legal documents: OmniPage 16 offers high-quality
handling and recognition of legal documents.
• Customizable shortcut menus in Windows Explorer:
Send image files or PDFs directly to major Windows
programs, process them with your own workflows, or use
the Convert Now Wizard for easy conversion control.
• General improvements: These include faster processing,
better quality output page layout (font matching, table
detection, etc.), and a new, intuitive Workflow Assistant.
New features unique to OmniPage Professional 16
• Extracting data from filled forms: A new workflow step
allows data to be extracted from sets of forms and exported
to databases, based on a PDF form template. The forms can
be active PDF forms, static forms in a range of image
formats or scanned paper forms.
• Marking and redacting: Text can be highlighted,
struck out or redacted (made unreadable) in the Text
Editor. Redacting is useful for legal documents or for those
with confidential content.
• File-it Assistant: A more efficient aid for creating and
using barcode cover page workflows. These allow for
automatic processing and storage of documents driven by
the push of just one scanner button.
A more complete list of features, and of the differences between
the various OmniPage versions, appears in online Help.
This icon is used throughout the guide to denote features
that are available only in OmniPage Professional 16.
OmniPage 16 is supplied in Enterprise versions for network use. It is
also supplied in Special Editions for selected scanner manufacturers
and other resellers. The feature set in these editions may vary, in line
with each vendor's requirements.
Installation and setup
This chapter provides information on installing and starting
OmniPage.
System requirements
The minimum requirements to install and run OmniPage 16 are:
• A computer with an Intel® Pentium® III processor or
equivalent. Intel Core Duo, Intel Core 2 Duo or AMD X2
Dual Core 3600+ recommended.
• Windows 2000 (from Service Pack 4), Windows XP 32-bit
(from Service Pack 2), Windows XP 64-bit, or
Windows Vista 32-bit or 64-bit.
• Microsoft Internet Explorer 5.5.
• 256MB of memory (RAM), 1GB recommended.
• 150MB of free hard disk space for application and sample
files plus 70MB working space during installation.
Additionally:
  • 175MB for all RealSpeak® modules (80MB for the
  RealSpeak® Solo American English language
  module, plus 9-11MB for each additional RealSpeak Solo
  language module)
  • 20MB for ScanSoft PDF Create! *
  • 5MB for Microsoft Installer (MSI) if not present (it is
  included in most Windows operating systems).
• 1024x768 pixel color monitor with 16-bit color or greater
video card.
• A sound card and speaker for reading text aloud.
• A CD-ROM drive for installation.
• A Windows compatible pointing device.
• 4 megapixel digital camera or higher for digital camera
text capture.
• A compatible scanner with its own scanner driver
software, if you plan to scan documents. See the Scanner
Guide at Nuance’s web site (www.nuance.com) for a list
of supported scanners.
• Web access is needed for product registration, Scanner
Wizard database updating and obtaining live updates for
the program.
To save DOCX and XLSX files (for Microsoft Office 2007
Word and Excel) or to load and save XPS files (XML Paper
Specification), you should have or install Microsoft .NET
Framework 3.0. The link to the Microsoft download page
can be found in the Release Notes, or in the application
About box. Alternatively, click the OmniPage .Net
Framework balloon tooltip.
* Supplied with OmniPage Professional 16 only.
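The disk space figures listed in the system requirements can be added up to estimate how much free space a complete installation may need. The following is a small sketch in Python for that arithmetic only; the dictionary keys and function name are illustrative, and the figures are simply those quoted in the list above.

```python
# Disk space figures (in MB) as quoted in the system requirements.
# The labels are descriptive only; they are not names used by the installer.
DISK_SPACE_MB = {
    "application and sample files": 150,
    "working space during installation": 70,
    "all RealSpeak modules": 175,
    "ScanSoft PDF Create! (Professional only)": 20,
    "Microsoft Installer, if not present": 5,
}

# Items that apply to every installation, whatever options are chosen.
BASE_ITEMS = ("application and sample files", "working space during installation")


def total_disk_space_mb(items=None):
    """Sum the space needed for the named items (all items by default)."""
    names = DISK_SPACE_MB.keys() if items is None else items
    return sum(DISK_SPACE_MB[name] for name in names)


base_mb = total_disk_space_mb(BASE_ITEMS)  # application plus working space
full_mb = total_disk_space_mb()            # every optional item included
print(f"Base installation: {base_mb}MB, with all options: {full_mb}MB")
```

With the figures above, the base installation comes to 220MB and an installation with every optional item comes to 420MB.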
Installing OmniPage
OmniPage 16’s installation program takes you through installation
with instructions on every screen.
Before installing OmniPage:
• Close all other applications, especially anti-virus
programs.
• Log into your computer with administrator privileges if
you are installing on Windows 2000, XP or Vista.
• If you own a previous version of OmniPage, or if you are
upgrading from demonstration software or an OmniPage
Special Edition, the installer asks your consent to uninstall
that product.
To install OmniPage:
1. Insert the OmniPage CD-ROM in the CD-ROM drive. The
installation program should start automatically. If it does not
start, locate your CD-ROM drive in Windows Explorer and
double-click the Autorun.exe program at the top-level of the
CD-ROM.
2. Choose a language to use during installation. Accept the End-User
License Agreement and enter the serial number shown on
the CD envelope.
3. Choose a complete or a custom installation. A complete
installation installs all RealSpeak™ Text-to-Speech language
modules (currently 9). Custom installation lets you exclude or
add modules. To exclude a module, click its down arrow and
select ‘This feature will not be available’.
4. Follow the instructions on each screen to install the software.
All files needed for scanning are copied automatically during
installation.
Setting up your scanner with OmniPage
All files needed for scanner setup and support are copied
automatically during the program’s installation, but no scanner
setup occurs at installation time. Before using OmniPage 16 for
scanning, your scanner should be installed with its own scanner
driver software and tested for correct functionality. Scanner driver
software is not included with OmniPage.
Scanner setup is done through the Scanner Setup Wizard.
You can start this yourself, as described below. Otherwise,
it appears when you first attempt to perform scanning.
Proceed as follows:
• Choose Start > All Programs > ScanSoft OmniPage 16 >
Scanner Setup Wizard
or click the Setup button in the Scanner panel of the
Options dialog box
or choose Scan in the Get Page drop-down list in the
OmniPage Toolbox and click the Get Page button.
• The Scanner Setup Wizard starts. If you have a web
connection, the first panel invites you to update the
scanner database supplied with the wizard. Choose Yes or
No and click on Next.
• Choose ‘Select and test scanner or digital camera’, then
click Next. If you have a single installed scanner, it
appears, along with any scanners previously set up with
OmniPage. If the required scanner is not listed, click Add
Scanner...
• You see a list of all detected scanner drivers in the
checkmarked categories. This can include network
devices. Select one and click OK. To install a second
device, you must run the Scanner Wizard again.
• The wizard reports whether the chosen scanner model
already has settings in the scanner database. If it does, you
do not need to test it. If it does not, you should test it.
Click on Next.
• If you chose not to test, click Finish. If you chose testing,
click Next to have the scanner connection tested. If the
connection is in order, you see a menu of further tests.
Choose which testing steps you want to run. The Basic
test scan is recommended.
• By default OmniPage uses its own scanning interface,
located in the Scanner panel of the Options dialog box. If
you want to use your scanner’s own interface instead,
choose Advanced settings... and select this. Click Hint
editor... and choose Edit hints... only if you are experienced
in configuring scanners or have been advised by Technical
Support to do so.
• Click Next to start the tests. For the Basic scan test, insert
a test page into your scanner. The wizard will scan using
your scanner manufacturer’s software. Click on Next. Your
scanner’s native user-interface will appear.
• Click on Scan to begin the sample scan.
• If necessary, click on Missing Image… or Improper
Orientation... and make the appropriate selections.
• Once the image appears correctly in the window, click on
Next.
• Move through the remaining requested tests, following the
instructions on the screen.
• When all the requested tests have been completed
successfully, the Scanner Wizard reports and invites you
to click on Finish.
You have successfully configured your scanner to work
with OmniPage 16!
To change the scanner settings at a later time, or to set up or remove
a scanner, reopen the Scanner Setup Wizard from the Windows
Start menu or from the Scanner panel of the Options dialog box.
To test and repair an improperly functioning scanner, open the
wizard and select ‘Test the current scanner or digital camera’ in the
second panel, then work through the procedure described above,
using any advice received from Technical Support.
To specify a different default scanner, open the wizard to reach the
list of scanners already set up. Move the highlight to the desired scanner and
be sure to close the wizard with Finish.
To get updated settings for your current scanner, open the wizard,
request a fresh database download in the first screen, then choose
‘Use current settings with current device’, click Next and then
Finish.
How to start the program
To start OmniPage 16 do one of the following:
• Click Start in the Windows taskbar and choose All
Programs > ScanSoft OmniPage 16 > OmniPage
[Professional] 16.
• Double-click the OmniPage icon in the
program’s installation folder or on the Windows
desktop if placed there.
• Double-click an OmniPage Document (OPD)
icon or file name; the clicked document is loaded
into the program. See “OmniPage Documents” in
the next chapter.
• Right-click one or more image file icons or file names for a
shortcut menu. Select Open With... OmniPage application.
The images are loaded into the program.
On opening, OmniPage’s title screen is displayed and then a view
selection panel. OmniPage has three basic view types. For details,
see “The OmniPage Desktop and Views” in the next chapter, which
provides an introduction to the program’s main working areas.
There are several ways of running the program with a limited
interface:
• Use the Batch Manager program. Click Start in the
Windows taskbar and choose All Programs > ScanSoft
OmniPage 16 > OmniPage Batch Manager. See the
Workflows chapter.
• Click Acquire Text from the File menu of an application
registered with the Direct OCR™ facility. See “How to set
up Direct OCR” in the Processing Documents chapter.
• Right-click on one or more image file icons or file names
for a shortcut menu. Select OmniPage 16 and choose a
target format, or the Convert Now Wizard or a workflow
from its sub-menu. The files will be processed according to
the workflow instructions. See the Workflows chapter.
• Click the OmniPage Agent icon on the taskbar. Choose a
workflow to start the program and run the workflow.
• Use OmniPage 16 with Nuance’s PaperPort® document
management product, to add OCR services. See “How to
use OmniPage with PaperPort” in the Using OmniPage
chapter.
Registering your software
Nuance’s online registration runs at the end of installation. Please
ensure web access is available. We provide an easy electronic form
that can be completed in less than five minutes. When the form is
filled, click Submit. If you did not register the software during
installation, you will be periodically invited to register later. You
can go to www.nuance.com to register online. Click on Support and
from the main support screen choose Register in the left-hand
column. For a statement on the use of your registration data, please
see Nuance’s Privacy Policy.
Activating OmniPage
You will be invited to activate the product at the end of installation.
Please ensure that web access is available. Provided your serial
number is found at its storage location and has been correctly
entered, no user interaction is required and no personal information
is transmitted. If you do not activate the product at installation
time, you will be invited to do this each time you invoke the
program. OmniPage 16 can be launched only five times without
activation. We recommend Automatic Activation.
Uninstalling the software
Sometimes uninstalling and then reinstalling OmniPage will solve a
problem. The OmniPage Uninstall program will not remove files
containing recognition results or any of the following user-created
files:
Zone templates (*.zon)
Image enhancement templates (*.ipp)
Training files (*.otn)
User dictionaries (*.ud)
OmniPage Documents (*.opd)
Job files (*.opj)
Workflow files (*.xwf)
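Although the uninstaller leaves these files in place, you may still want to gather copies of them before reinstalling. The following is a minimal sketch in Python, assuming your files sit somewhere under a folder you choose; the function names and folder paths are hypothetical, and only the file extensions come from the list above.

```python
import shutil
from pathlib import Path

# Extensions of the user-created file types listed above.
USER_FILE_EXTENSIONS = {".zon", ".ipp", ".otn", ".ud", ".opd", ".opj", ".xwf"}


def find_user_files(root):
    """Return all files under `root` whose extension is in the list above."""
    root = Path(root)
    return sorted(
        path
        for path in root.rglob("*")
        if path.is_file() and path.suffix.lower() in USER_FILE_EXTENSIONS
    )


def back_up(root, destination):
    """Copy the matching files to `destination`, keeping their subfolders."""
    root = Path(root)
    destination = Path(destination)
    copied = []
    # The file list is built in full before anything is copied.
    for source in find_user_files(root):
        target = destination / source.relative_to(root)
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(source, target)  # copy2 also preserves timestamps
        copied.append(target)
    return copied
```

For example, `back_up("C:/Users/me/Documents", "D:/OmniPageBackup")` would copy every matching file it finds under the first folder into the second; both paths are placeholders to replace with your own.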
To uninstall from Windows 2000, XP or Vista you must be logged
into your computer with administrator privileges.
To uninstall or reinstall OmniPage:
• Close OmniPage.
• Click Start in the Windows taskbar and choose the
Control Panel and then Uninstall a program (in earlier
Windows versions: Add/Remove Programs).
• Select OmniPage and click Uninstall (in earlier Windows
versions: Remove).
• Click Yes in the dialog box that appears to confirm
removal.
• Select Yes to restart your computer immediately, or No if
you plan to restart later.
Follow instructions until the process is finished.
When you uninstall OmniPage, the link to your scanner is also
uninstalled. You must set up your scanner again with OmniPage if
you reinstall the program. All RealSpeak modules that were
installed with the program will also be uninstalled.
ScanSoft PDF Create! 4 needs to be uninstalled separately.
With OmniPage Professional 16, PaperPort must be installed and
uninstalled separately.
Using OmniPage
OmniPage 16 uses optical character recognition (OCR)
technology to transform text from scanned pages or image files into
editable text for use in your favorite computer applications.
In addition to text recognition, OmniPage can retain the following
elements and attributes of a document through the OCR process:
Graphics (photos, logos)
Form elements (checkboxes, radio buttons, text fields)
Text formatting (character and paragraph)
Page formatting (column structures, table formats, headings,
placing of graphics).
Documents in OmniPage
A document in OmniPage consists of one image for each document
page. After you perform OCR, the document will also contain
recognized text, displayed in the Text Editor, possibly along with
graphics, tables and form elements.
OmniPage Documents
An OmniPage Document (.opd) contains the original page
images (optionally pre-processed) with any zones placed
on them. After recognition, the OPD also contains the
recognition results.
An OmniPage Document can contain an embedded user dictionary,
training file, zone template file, or an image enhancement template
file. This can increase file size considerably but makes the OPD
more portable. To embed a file, open the relevant dialog box from
the Tools menu, select the desired file and click Embed. Use the
Extract button to get a local copy of an embedded file inside an OPD
you have received.
When you open an OmniPage Document, its settings are applied,
replacing those existing in the program.
The OmniPage Desktop and Views
OmniPage comes with three different views, so you can choose the one
that best suits your task.
• Classic View - This view has a similar look and feel to
previous versions of OmniPage.
• Flexible View - This new view offers an alternative layout, with the
OmniPage function panels stacked in a tabbed view to give
each panel more space.
• Quick Convert View - This view is designed for quick and
easy document conversion with very little to learn.
The most important conversion options are clearly visible
on one screen.
Use the Window menu to switch between views and to save your
own custom view. For a custom view, arrange the panels and
toolbars as you wish, then choose Window > Custom Views >
Manage. Click Add and name your view. Your screen layouts will be
displayed in the Custom Views submenu with a checkmark beside
the active one.
Classic View
In Classic View, the OmniPage Desktop has four main working
areas, separated by splitters: the Document Manager, the Page
Image, Thumbnails and the Text Editor. The Page Image has an
Image toolbar and the Text Editor has a Formatting toolbar.
[Figure: the Classic View desktop, with callouts for the Standard toolbar, OmniPage Toolbox, Formatting toolbar, Thumbnails, Image toolbar, Document Manager, Page Image and Text Editor.]
OmniPage toolbox: This Toolbox lets you drive the processing.
Thumbnails panel: This displays page thumbnails.
Document Manager: This provides an overview of your document
with a table. Each row represents one page. Columns present
statistical or status information for each page, and (where
appropriate) document totals.
Page Image: This displays the image of the current page, together
with its zones. When a page is displayed, the Image toolbar is
available.
Text Editor: This displays the recognition results from the current
page.
Flexible View
Use this view to set up the OmniPage workspace so that it fits your
task optimally. Suggested scenarios:
Maximizing workspace (single screen)
Load a document. Open the panels you want to
use. Grab them by their captions one by one, and
drag them so that they dock behind the active
one as tabs. You can also dock online Help to
avoid handling two separate windows.
Working with recognition results (single screen)
Load a document and have it recognized. Close
all panels except the Document Manager and the
Text Editor. Maximize both horizontally, scale
down the Document Manager and dock it to the
top or bottom. You can now step through the
pages double-clicking them one by one in the Document Manager,
inspecting recognition results in the Text Editor. The number of
suspect words and reject characters in the Document Manager will
help you identify problematic pages.
Handling large documents (dual-screen)
Load the document you want to work on. Move
its Thumbnail View to your second monitor and
maximize it for a large scale overview of your
document and far more space for thumbnail
operations.
Verifying (dual-screen)
Place the Page Image on one screen and the Text
Editor on the other. This gives you more space for
editing and proofing.
The Page Image is always available for verifying
recognition and for performing on-the-fly zoning
and editing.
The scenarios presented above are only examples to give you
an idea of what you can do in Flexible View.
Quick Convert View
Use the Quick Convert View for fast recognition and saving. You
can switch to Quick Convert View only when no document is open,
and it can handle only one document at a time.
[Figure: the Quick Convert View, with callouts for the Quick Convert toolbar, the processing buttons, the Page Image, and the settings: source document; output text format and formatting level; output folder and file name; saving options; page range.]
The Toolbars
The program has eleven main toolbars. Use the View menu to show,
hide or customize them. Status bar texts at the bottom edge of the
OmniPage program window explain the purpose of all tools.
Standard toolbar: Performs basic functions.
Image toolbar: Performs image, zoning and table operations. Three
of its tool groups can now be handled separately (mini-toolbars):
• Zones toolbar: Offers zoning tools.
• Rotate toolbar: Provides rotating tools.
• Table toolbar: Inserts, moves and removes row and column
dividers.
Formatting toolbar: Formats recognized text in the Text Editor.
Verifier toolbar: Controls the location and appearance of the
verifier.
Reorder toolbar: Modifies the order of elements in recognized
pages.
Mark Text toolbar: Performs text marking and redacting.
Form Drawing toolbar: Creates new form elements.
Form Arrangement toolbar: Arranges and aligns form elements.
All toolbars can be moved and customized in each view to your
particular needs, including use of a secondary monitor.
The Form toolbars and the Mark Text toolbar (for details
see Chapter 4) appear only in OmniPage Professional 16.
Program Panels
OmniPage has six panels that can be handled (docked, floated,
resized) separately: Thumbnails, Page Image, Text Editor,
Document Manager, Workflow Status, and Online Help.
To float a panel anywhere on the screen, hold down CTRL while
dragging it. To dock it, drag the panel over the OmniPage main
window and, still holding down the left mouse button, press the
spacebar repeatedly to see all possible docking positions. To select
a given position, release the mouse button.
Basic Processing Steps
There are three ways of handling documents: with automatic,
manual or workflow processing. The basic steps for all processing
methods are broadly the same:
1. Bring a set of images into OmniPage. You can scan a
paper document with or without an Automatic
Document Feeder (ADF) or load one or more image files.
2. Perform OCR to generate editable text. After OCR, you
can check and correct errors in the document using the
OCR Proofreader and edit the document in the Text
Editor.
3. Export the document to the desired location. You can
save your document to a specified file name and type,
place it on the Clipboard, send it as a mail attachment or publish it.
You can save the same document repeatedly to different
destinations, different file types, with different settings and levels of
formatting.
Using OmniPage, you can choose from the following processing
methods: Automatic, Manual, Combined, or Workflow. You can
start recognition from other applications, using Direct OCR and can
also schedule processing to run at a later time.
Processing methods are detailed in the next chapter and in Online
Help.
Settings
The Options dialog box is the central location for OmniPage
settings. Access it from the Standard toolbar or the Tools
menu. Context-sensitive help provides information on each setting.
How to use OmniPage with PaperPort
The PaperPort® program is a paper management
software product from Nuance. It lets you link pages
with suitable applications. Pages can contain pictures,
text or both. If PaperPort exists on a computer with
OmniPage, its OCR services become available and
amplify the power of PaperPort. You can choose an
OCR program by right-clicking on a text application’s
PaperPort link, selecting Preferences and then
selecting OmniPage 16 as the OCR package. OCR settings can be
specified, as with Direct OCR.
PaperPort provides the easiest way to turn paper into organized
digital documents that everybody in an office can quickly find and
use. PaperPort works with scanners, multifunction printers, and
networked digital copiers to turn paper documents into digital
documents. It then helps you to manage them along with all other
electronic documents in one convenient and easy-to-use filing
system.
PaperPort’s large, clear item thumbnails allow you to visually
organize, retrieve and use your scanned documents, including
Word files, spreadsheets, PDF files and even digital photos.
PaperPort’s Scanner Enhancement Technology tools ensure that
scanned documents will look great while the annotation tools let
you add notes and highlights to any scanned image.
PaperPort is included in the OmniPage Professional
package. For application information, refer to PaperPort’s
own documentation.
Processing documents
This tutorial chapter describes different ways you can process
a document and also provides information on key parts of
this processing.
Processing methods
Using OmniPage, you can choose from the following processing
methods:
Automatic
A fast and easy way to process documents is to let
OmniPage do it automatically for you. Select
settings in the Options dialog box and in the
OmniPage Toolbox drop-down lists and then click Start. It will take
each page through the whole process from beginning to end, when
possible running in parallel. It will typically auto-zone the pages.
Manual
Manual processing gives you more precise control
over the way your pages are handled. You can
process the document page-by-page with different
settings for each page. The program also stops
between each step: acquiring images, performing
recognition, exporting. This lets you, for instance, draw zones
manually or change recognition language(s). You start each step by
clicking the three buttons on the OmniPage Toolbox.
1. Use button one to get a set of images.
2. Manually zone pages where you want to process only part of
the page or if you want to give precise zoning instructions. Use
ignore backgrounds or zones to exclude areas from processing.
Use process backgrounds or zones to specify areas to be auto-zoned.
3. Use button two to have the pages recognized.
4. Do proofing and editing as desired.
5. Use button three to save your results.
The default for manual processing is to have all entered pages
automatically selected. This way you can have all new pages
recognized by a single mouse click. You can remove this default in
the Process panel of the Options dialog box.
Combined
You can process a document automatically and view results in the
Text Editor. If most pages are in order, but a few have not turned
out as expected, you can switch to manual processing to adjust
settings and re-recognize just those problem pages. Alternatively,
you can acquire images with manual processing, draw zones on
some or all of them, and then send all pages to automatic processing
by pressing the Start button and choosing to process existing pages.
Workflow
A workflow consists of a series of steps and their
settings. Typically it will include a recognition step,
but it does not have to. It does not have to conform
to the 1-2-3 pattern of traditional processing. Workflows are listed
in the Workflow drop-down list – sample workflows plus any you
create. Workflows allow you to handle recurring tasks more
efficiently, because all the steps and their settings are pre-defined.
You can choose to place the OmniPage Agent icon on your taskbar.
Its shortcut menu lists your workflows. Click a workflow to launch
OmniPage and have it run.
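As an illustration only, a workflow can be pictured as an ordered list of steps, each carrying its own settings. The Python sketch below is not part of OmniPage; Step, run_workflow and the step names and settings are hypothetical. It only shows that steps run in the listed order and that a recognition step is optional.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    """One workflow step with its pre-defined settings (hypothetical model)."""
    name: str
    settings: dict = field(default_factory=dict)


def run_workflow(steps):
    """Run the steps in order and return a log of what was done."""
    log = []
    for step in steps:
        # A real program would perform the step here; the sketch only records it.
        log.append(f"{step.name}: {sorted(step.settings.items())}")
    return log


# A workflow without a recognition step: load images, enhance them, save them.
image_only = [
    Step("load files", {"folder": "inbox"}),
    Step("enhance images", {"template": "cleanup"}),
    Step("save images", {"format": "tif"}),
]
workflow_log = run_workflow(image_only)
```

Because every setting is stored with its step, running the same list again repeats the task without any further choices.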
Let the Workflow Assistant guide you in creating new workflows.
It provides a choice of steps and the settings they need. Click Next
after each step to add another one. You can use the Assistant just to
get more guidance when doing automatic processing. See
“Workflow Assistant” in Chapter 6.
At a later time
You can schedule OCR jobs or other processing jobs in
OmniPage Batch Manager to be performed automatically at a
later time, when you may not even be present at your
computer. It does not
matter if your computer is turned off after the job is set up, so long as
it is running at job start time. If you are scanning pages, your scanner
must be functioning at job start time, with the pages loaded in the
ADF.
When you choose New Job, the Job Wizard appears first, followed by
the Workflow Assistant, the latter with a slightly modified set
of choices and settings. In the first panel of the Job Wizard, you
define your job type and name your job; next you specify a
starting time, a recurring job or watched folder instructions.
A job incorporates a workflow with timing instructions added. See
“Batch Manager” in Chapter 6.
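The idea that a job is a workflow plus timing instructions can be sketched as follows. This is a hypothetical model, not the Batch Manager's own logic: the Job class, its field names and the rule that a missed one-time job is not rerun are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Job:
    """A workflow plus timing instructions (hypothetical model)."""
    workflow_name: str
    start_time: datetime
    repeat_every: Optional[timedelta] = None  # None means the job runs once

    def next_run(self, now: datetime) -> Optional[datetime]:
        """Return the next start time at or after `now`, or None if none remains."""
        if now <= self.start_time:
            return self.start_time
        if self.repeat_every is None:
            return None  # assumed: a one-time job whose start time has passed
        # Whole intervals elapsed since the first start, rounded up.
        elapsed = now - self.start_time
        intervals = -(-elapsed // self.repeat_every)
        return self.start_time + intervals * self.repeat_every


nightly = Job("scan inbox", datetime(2008, 1, 1, 22, 0), timedelta(days=1))
once = Job("convert archive", datetime(2008, 1, 1, 22, 0))
```

The computer only needs to be running at the time next_run returns, which matches the note above that it may be turned off between setting up the job and the job start time.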
Processing from other applications
You can use the Direct OCR™ feature to call on the recognition
services of OmniPage while you work in the following applications:
Microsoft Office 2000 or higher, Corel WordPerfect 12 or X3. First
you must check the Enable Direct OCR check box under
Tools > Options > General. Then two items in the application’s
Add-Ins menu (File menu in applications other than MS Office 2007)
open the door to OCR facilities.
How to set up Direct OCR
Start the application you want connected to OmniPage. Start
OmniPage, open the Options dialog box at the General panel and
select Enable Direct OCR.
In the target application, go to Add-Ins (or the File menu in
applications other than Office 2007) > OmniPage > Acquire Text
Settings > Direct OCR, and specify OCR, Scanner, Output Format
and Direct OCR settings. Select process options for proofing and
zoning. These function for future Direct OCR work until you
change them again; they are not applied when OmniPage is used on
its own.
How to use Direct OCR
1. Open your application and work in a document. To acquire
recognition results from scanned pages, place them correctly in
the scanner.
2. Use the target application’s Add-Ins (or File) Menu item
Acquire Text Settings... to review your recognition settings, if
necessary.
3. Use the Add-Ins (File) Menu item Acquire Text to acquire
images from scanner or file.
4. If you selected Draw zones automatically in the Direct OCR panel
of the Options dialog box, under Acquire Text Settings...,
recognition proceeds immediately.
5. If Draw zones automatically is not selected, each page image will
be presented to you, allowing you to draw zones manually.
Click the Perform OCR button to continue with recognition.
6. If proofing was specified, this follows recognition. Then the
recognized text is placed at the cursor position in your
application, with the formatting level specified by Acquire
Text Settings... .
Defining the source of page images
There are three possible image sources: from image files, from a
digital camera and from a scanner. There are two main types of
scanners: flatbed or sheetfed. A scanner may have a built-in or
added Automatic Document Feeder (ADF), which makes it easier to
scan multi-page documents. The images from scanned documents
can be input directly into OmniPage or may be saved with the
scanner’s own software to an image file, which OmniPage can later
open.
Input from image files
You can create image files from your own scanner, or receive them
by e-mail or as fax files. OmniPage 16 can open a wide range of
image file types. Select Load Files in the Get Pages drop-down list.
Files are specified in the Load Files dialog box. This appears when
you start automatic processing. In manual processing, click the Get
Page button or use the Process menu. The lower part of the dialog
box provides advanced settings, and can be shown or hidden.
The minimum size for an image file is 16 by 16 pixels; the maximum
width or height is 8400 pixels (71 cm or 28 inches at the resolution 201 to
600 dpi). See online Help for pixel limits.
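The relation between pixel limits and paper size is simple arithmetic: physical length equals pixels divided by resolution. The sketch below uses the 16 and 8400 pixel limits stated above; the function names are hypothetical, and 300 dpi is an assumed example value, chosen because it reproduces the 28-inch figure.

```python
MIN_PIXELS = 16     # minimum width and height stated above
MAX_PIXELS = 8400   # maximum width or height stated above
CM_PER_INCH = 2.54


def within_limits(width_px, height_px):
    """True if both sides fall inside the stated pixel limits."""
    return all(MIN_PIXELS <= side <= MAX_PIXELS for side in (width_px, height_px))


def physical_size(pixels, dpi):
    """Return (inches, centimeters) covered by `pixels` at `dpi` dots per inch."""
    inches = pixels / dpi
    return inches, inches * CM_PER_INCH


# At an assumed 300 dpi, the maximum side corresponds to 28 inches (about 71 cm).
max_inches, max_cm = physical_size(MAX_PIXELS, 300)
```

At a higher resolution the same pixel limit covers a shorter length, so a page that fits at 300 dpi may exceed the limit when scanned at 600 dpi.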
In OmniPage Professional 16, files can also be imported
from FTP locations, Microsoft SharePoint 2003 or 2007,
or ODMA sources.
Input from digital camera
You can bring digital camera photos of documents for
recognition into OmniPage. First, make sure that your
device driver is installed properly. Then connect the camera and
download images. Click Load Digital Camera Files in the Get Page
drop-down list. If you use this, 3D Deskew, resolution enhancement
and straightening text lines are automatically performed on images.
You can also do manual 3D deskewing; see the section “Image
Enhancement tools” later in this chapter.
To acquire digital camera photos containing text from Direct OCR
or PaperPort, mark the Load as digital camera image checkbox. The
above mentioned automatic enhancements will apply.
For tips and advice on working with digital camera images see the
How-to-Guides.
Input from scanner
You must have a functioning, supported scanner correctly installed
with OmniPage 16. You have a choice of scanning modes. In making
your choice, there are two main considerations:
• Which type of output do you want in your export document?
• Which mode will yield best OCR accuracy?
Scan black and white
Select this to scan in black-and-white. Black-and-white
images can be scanned and handled more quickly than others
and occupy less disk space.
Scan grayscale
Select this to use grayscale scanning. For best OCR
accuracy, use this for pages with varying or low contrast
(not much difference between light and dark) and with text on
colored or shaded backgrounds.
Scan color
Select this to scan in color. This will function only with
color scanners. Choose this if you want colored graphics,
texts or backgrounds in the output document. For OCR
accuracy, it offers no more benefit than grayscale
scanning, but will require much more time, memory
resources and disk space.
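The guidance on scanning modes can be summarized as a small decision rule. The function below is only a restatement of the advice above, with hypothetical parameter names; it is not a setting or function of OmniPage.

```python
def choose_scan_mode(want_color_output, low_contrast_or_shaded_background):
    """Pick a scanning mode from the two considerations described above."""
    if want_color_output:
        # Color is needed only when the output document must keep its colors.
        return "color"
    if low_contrast_or_shaded_background:
        # Grayscale helps OCR on low-contrast pages and shaded backgrounds.
        return "grayscale"
    # Otherwise the fastest and most compact mode is sufficient.
    return "black and white"
```

For example, choose_scan_mode(False, True) returns "grayscale", since color scanning would add time and disk space without improving OCR accuracy.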
Brightness and contrast
Good brightness and contrast settings play an important role in
OCR accuracy. Set these in the Scanner panel of the Options dialog
box or in your scanner’s interface. After loading an image, check its
appearance. If characters are thick and touching, lighten the
brightness. If characters are thin and broken, darken it. Then rescan
the page.
If your scanning results are still not satisfactory, open the scanned
image in the Image Enhancement window to edit it using a range of
different tools.
Scanning with an ADF
The best way to scan multi-page documents is with an Automatic
Document Feeder (ADF). Simply load pages in the correct order
into the ADF. You can scan double-sided documents with an ADF.
A duplex scanner will manage this automatically.
Scanning without an ADF
Using OmniPage’s scanner interface, you can scan multi-page
documents efficiently from a flatbed scanner, even without an ADF.
Select Automatically scan pages in the Scanner panel of the Options
dialog box, and define a pause value in seconds. Then the scanner
will make scanning passes automatically, pausing between each
scan by the defined number of seconds, giving you time to place the
next page.
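The timed scanning described above amounts to a loop with a pause between passes. In the sketch below, scan_page is a hypothetical callable standing in for one scanner pass, and a fixed page count is an assumption made for the example; the guide does not say how the program decides when the document is complete.

```python
import time


def scan_with_pauses(scan_page, page_count, pause_seconds):
    """Scan `page_count` pages, pausing between consecutive scans.

    `scan_page` is a hypothetical callable that performs one scanning pass
    for the given page number and returns the resulting image.
    """
    images = []
    for page_number in range(1, page_count + 1):
        images.append(scan_page(page_number))
        if page_number < page_count:
            # Time for the user to place the next page on the flatbed.
            time.sleep(pause_seconds)
    return images
```

A longer pause value gives more time to change pages but lengthens the whole scanning session by the same amount per page.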
Document-to-document conversion
In OmniPage Professional 16 you can open not only
image files, but also documents created in word-processing
and similar applications. Supported file
types include .doc, .xls, .ppt, .rtf, .wpd and others. Click the Load
Files button in the OmniPage Toolbox or select the Load Files
command under Get Page, in the File menu. In the Load Files
dialog box, choose Documents.
When you are finished, you can choose from a wide variety of
document file types for saving.
Describing the layout of the document
Before starting recognition you are requested to describe the layout
of the incoming pages to assist the auto-zoning process. When you
do automatic processing, auto-zoning always runs unless you
specify a template that does not contain a process zone or
background. When you do manual processing, auto-zoning
sometimes runs. See online Help: When does auto-zoning run? Here are
your input description choices:
Automatic
Choose this to let the program make all auto-zoning
decisions. It decides whether text is in columns or not,
whether an item is a graphic or text to be recognized and
whether to place tables or not.
Single column, no table
Choose this setting if your pages contain only one column of
text and no table. Business letters or pages from a book are
normally like this.
Multiple columns, no table
Choose this if some of your pages contain text in columns
and you want this decolumnized or kept in separate
columns, similar to the original layout.
Single column with table
Choose this if your page contains only one column of text
and a table.
Spreadsheet
Choose this if your whole page consists of a table which you
want to export to a spreadsheet program, or have treated as
a single table.
Form
Choose this if your whole page consists of a form and you
want form elements auto-recognized. After recognition, you
can modify form element properties, create new ones, or edit
form layout. This option is available in OmniPage
Professional 16 only.
Legal pleading
Choose this to recognize legal documents. Legal headers are
detected and removed. Choose to have pleading numbers
retained or dropped.
Custom
Choose this for maximum control over auto-zoning. You can
prevent or encourage the detection of columns, graphics and
tables. Make your settings in the OCR panel of the Options
dialog box.
Template
Choose a zone template file if you wish to have its
background value, zones and properties applied to all
acquired pages from now on. The template zones are also
applied to the current page, replacing any existing zones.
If auto-zoning yielded unexpected recognition results, use manual
processing to rezone individual pages and re-recognize them.
Preprocessing Images
To improve OCR results, you can enhance your images before
zoning and recognition using the Image Enhancement tools. To
open the Image Enhancement window, click the SET - Enhance
Image button in the Image Toolbar, or click Tools and choose SET -
Enhance Image. You can also build Image Enhancement steps into
your workflows by choosing the Enhance Images step.
The input for Image Enhancement is the Primary image.
We must distinguish three types of image:
Original image: The image created by your scanner or contained in
a file before it enters the program.
Primary image: The state of the original image after it has been
loaded into OmniPage, possibly modified by automatic or manual
pre-processing operations.
OCR image: A black-and-white image derived from the primary
image, optimized for good OCR results.
Some tools affect the Primary image, others the OCR image. Be sure
you know which image you are editing.
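To make the difference between the Primary image and the OCR image concrete, the sketch below derives a black-and-white image from a grayscale one. The fixed threshold is an assumption chosen for simplicity; this guide does not describe how OmniPage itself produces the OCR image, and the function name is hypothetical.

```python
def to_black_and_white(gray_rows, threshold=128):
    """Convert grayscale pixels (0 = black, 255 = white) to a 1-bit image.

    Returns rows of 0 (black) and 1 (white). A single global threshold is
    an illustrative assumption, not the program's actual method.
    """
    return [[1 if pixel >= threshold else 0 for pixel in row] for row in gray_rows]


# A tiny grayscale "primary image": light background with dark strokes.
primary = [
    [250, 40, 245],
    [30, 35, 240],
]
ocr_image = to_black_and_white(primary)
```

The primary image is left unchanged; the black-and-white version is a separate image, which is why a tool can affect one without affecting the other.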
Good brightness and contrast settings play an important role in
OCR accuracy. Set these in the Scanner panel of the Options dialog
box or in your scanner’s interface. The diagram illustrates an
optimum brightness setting. After loading an image, check its
appearance. If characters are thick and touching, lighten the
brightness. If characters are thin and broken, darken it. Use the
OCR Brightness tool to optimize the image.
(Diagram: brightness scale reading Unsuitable - Tolerable - Good -
Best - Good - Tolerable - Unsuitable, with the optimum setting in
the middle.)
Image Enhancement Tools
The Image Enhancement tools can also be used to edit images to
save and use them as image files. Note that some of these tools work
on the Primary image, others on the one used for OCR (OCR
image). Click the Primary/OCR Image button in the Image
Enhancement window, to see the current state of either image.
The Image Enhancement window has two panels. The left panel
shows the starting image. Your changes are shown in the right
preview panel. When you click Accept, the right image is moved to
the left panel to become the new starting image for further
enhancement.
The following tools are accessible on the toolbar:
Pointer (F5) - the Pointer is a neutral tool carrying out
different operations under different circumstances (for
example, to pick a color for the Fill operation, or to catch the
deskew line.)
Zoom (F6) - click the tool then use the left mouse button to
zoom in on your image or the right mouse button to zoom out.
You can also use the mouse wheel for zooming in and out,
even in the inactive view. In the active view the "+" and "-"
buttons serve the same purpose.
Select Area (F7) - click and draw your selection on the image
to use a tool only on the selected area. (Image Enhancement
Tools, by default, work on the whole page.) Selection has
three modes (in the View menu): Normal, Additive, and
Subtractive.
Primary/OCR Image - click this tool to switch between the
primary and the OCR image in the active view. Primary
images can be of any image mode, while an OCR image is its
black-and-white version, generated purely for OCR purposes.
Synchronize Views - click this tool to zoom and scroll the
inactive view to the same zoom value and scroll position as
the active view. To make the inactive view dynamically follow
the focus of the active one, click View then choose the Keep
Synchronized command.
Brightness and Contrast - click this tool to adjust the
brightness and contrast of your primary image or a selected
part of it. Use the sliders in the tool area to achieve the desired
effect.
Hue / Saturation / Lightness - click this tool then use the
sliders to modify the hue, saturation and lightness of your
primary image.
Crop - if you decide to use only a given part of your image,
click the Crop tool then select the area to keep and the rest of
the image will be removed.
Rotate - click this tool to rotate (by 90, 180 or 270 degrees)
and/or flip your image, or its selected area.
Despeckle - click this tool to remove stray dots from your
image. Despeckle works on the OCR image at 4 levels. You
can also use this tool not to remove noise from the page but to
strengthen letter outlines: to do this mark the checkbox
Inverse despeckling.
OCR Brightness - use this tool to set Brightness and
Contrast of your OCR image. See the diagram showing
optimum brightness under Preprocessing Images above.
Dropout color - click this tool and pick a color. Sections of
the scanned image in this color will be set transparent. The
tool has its effect on the OCR image.
Resolution - use this tool to decrease the resolution of your
primary image by a percentage. Note that you cannot set a
resolution higher than that of the original image.
Deskew - sometimes pages are scanned crookedly. To
straighten the lines of text manually, use the Deskew tool.
(Auto-deskew is also available in the Process panel of
Options.)
3D Deskew - use this tool to remove perspective distortion
from digital camera images. This is particularly useful when
you want to check the results of automatic 3D Deskew or you
prefer to do 3D deskew manually after a Load Files step.
Fill - use this tool to apply uniform coloring to selected areas.
3D Deskew works by snapping the distorted image to a grid. All
you need to do is to manually straighten this grid, and image
coordinates will follow - see illustration below (before - after 3D
Deskew).
Using Image Enhancement History
To commit or undo your image edits (one by one or all the steps),
use the History panel in the Image Enhancement window. Once you
have modified the original image, its preview displays the changes,
but they are not done until you click the Apply button next to the
History list. Modifications not added to the History by clicking the
Add button will not be applied.
Any time you want to see what output a certain step resulted in,
double click it in the History list.
To discard changes you have made with a given tool before
applying them, select the step in the list, then click the Reset button.
To restore the image as it was before you started the current
enhancement session, click the Discard all changes button.
Saving and applying templates
If you have a number of similar images to enhance, you can build up
a list of enhancement steps to apply to all of them.
To create and store an image enhancement template, first bring an
image file into the Image Enhancement window, then carry out your
preprocessing steps and add them to the History by clicking the Apply
button. When you are done, choose Save Enhancement Template
from the File menu. Browse to your preferred destination and save
the template file (with the extension .ipp).
To carry out the set of modifications saved in the template file on
another image, simply open the new image in the Image
Enhancement window and choose Load Enhancement Template
from the File menu.
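Conceptually, an enhancement template is a saved list of steps that is replayed on each new image. The sketch below stores such a list as JSON and applies it to small images held as nested lists. The JSON layout, the tool names and the function names are assumptions made for the example; they do not describe the actual .ipp file format.

```python
import json


def rotate(image, degrees):
    """Rotate a row-major image clockwise by a multiple of 90 degrees."""
    for _ in range((degrees // 90) % 4):
        image = [list(row) for row in zip(*image[::-1])]
    return image


def crop(image, top, left, bottom, right):
    """Keep rows top..bottom-1 and columns left..right-1."""
    return [row[left:right] for row in image[top:bottom]]


# Hypothetical registry mapping step names in the template to functions.
TOOLS = {"rotate": rotate, "crop": crop}


def apply_template(image, template_json):
    """Replay every saved step, in order, on the given image."""
    for step in json.loads(template_json):
        image = TOOLS[step["tool"]](image, **step["params"])
    return image


template = json.dumps([
    {"tool": "rotate", "params": {"degrees": 90}},
    {"tool": "crop", "params": {"top": 0, "left": 0, "bottom": 2, "right": 1}},
])
first_result = apply_template([[1, 2], [3, 4]], template)
second_result = apply_template([[5, 6], [7, 8]], template)
```

The same template produces the same sequence of edits on every image, which is what makes it useful for a set of similar pages.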
Image Enhancement in Workflows
To incorporate image enhancement in a workflow
choose its icon in the Workflow Assistant. The
following options are available:
Display images for manual enhancement - during the execution of a
workflow, each loaded image will be displayed for manual editing.
Apply enhancement template - an already saved enhancement
template will be applied automatically to the image while being
processed by the workflow.
Apply enhancement template and display - the workflow will apply
the selected image enhancement template, and will also display the
image so that you can make further edits to it.
Zones and backgrounds
Zones define areas on the page to be processed or ignored. Zones are
rectangular or irregular, with vertical and horizontal sides. Page
images in a document have a background value: process or ignore
(the latter is more typical). Background values can be changed with
the tools shown. Zones can be drawn on page backgrounds with
the tools shown under Zone Types and Properties (see later).
Process areas (in process zones or backgrounds) are auto-zoned
when they are sent to recognition.
Ignore areas (in ignore zones or backgrounds) are dropped from
processing. No text is recognized and no image is transferred.
Automatic zoning
Automatic zoning allows the program to detect blocks of text,
headings, pictures and other elements on a page and draw zones to
enclose them.
You can auto-zone a whole page or a part of it. Automatically
drawn zones and template zones have solid borders. Manually
drawn or modified zones have dotted borders.
Auto-zone a page background
Acquire a page. It appears with a process background. Draw a
zone. The background changes to ignore. Draw text, table or
graphic zones to enclose areas you want manually zoned. Click the
Process background tool (shown) to set a process background.
Draw ignore zones over parts of the page you do not need. After
recognition the page will return with an ignore background and
new zones round all elements found on the background.
Zone types and properties
Each zone has a zone type. Zones containing text can also have a
zone contents setting: alphanumeric or numeric. The zone type and
zone contents together constitute the zone properties. Right-click
in a zone for a shortcut menu allowing you to change the zone’s
properties. Select multiple zones with Shift+clicks to change their
properties in one move.
The Image toolbar provides six zone drawing tools, one for each
type.
Process zone
Use this to draw a process zone, to define a page area where
auto-zoning will run. After recognition, this zone will be
replaced by one or more zones with automatically
determined zone types.
Ignore zone
Use this to draw an ignore zone, to define a page area you do
not want transferred to the Text Editor.
Text zone
Use this to draw a text zone. Draw it over a single block of
text. Zone contents will be treated as flowing text, without
columns being found.
Table zone
Use this to have the zone contents treated as a table. Table
grids can be automatically detected, or placed manually.
Graphic zone
Use this to enclose a picture, diagram, drawing, signature or
anything you want transferred to the Text Editor as an
embedded image, and not as recognized text.
Form zone
Use this to enclose an area of your document containing
form elements such as a checkbox, radio button, text field or
anything you want transferred to the Text Editor as a form
element. Afterwards, in True Page view, you can edit form
layout, and modify the properties of form elements. Form
zones are available in OmniPage Professional 16 only.
Working with zones
The Image toolbar provides zone editing tools. One is always
selected. When you no longer want the service of a tool, click a
different tool. Some tools on this toolbar are grouped. If docked as
a single tool, only the last selected tool from the group is visible.
To select a visible tool, click it. Grouped tools can be undocked
(floated) and redocked as a separate mini toolbar for convenience.
To draw a single zone select the zone drawing tool of the desired
type, then click and drag the cursor.
To resize a zone, select it by clicking in it, move the cursor to a side
or corner, catch a handle and move it to the desired location. It
cannot overlap another zone.
To make an irregular zone by addition draw a partially overlapping
zone of the same type.
To join two zones of the same type draw an overlapping zone of the
same type (drawn zones on the left, resulting zone on the right).
To make an irregular zone by subtraction draw an overlapping zone
of the same type as the background.
To split a zone draw a splitting zone of the same type as the
background.
A full set of zoning diagrams appears in the Online Help.
When you draw a new zone that partly overlaps an existing zone of
a different type, it does not really overlap it; the new zone replaces
the overlapped part of the existing zone.
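The overlap rules for rectangular zones can be sketched with simple rectangle arithmetic. The Zone class and function names below are hypothetical, the sketch covers rectangular zones only, and it leaves out the background-type cases (subtraction and splitting) described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Zone:
    """A rectangular zone; right and bottom edges are exclusive."""
    kind: str
    left: int
    top: int
    right: int
    bottom: int


def overlap(a, b):
    """Return the shared rectangle as (left, top, right, bottom), or None."""
    left, top = max(a.left, b.left), max(a.top, b.top)
    right, bottom = min(a.right, b.right), min(a.bottom, b.bottom)
    if left < right and top < bottom:
        return (left, top, right, bottom)
    return None


def effect_of_drawing(new, existing):
    """Describe what drawing `new` does to `existing`, per the rules above."""
    if overlap(new, existing) is None:
        return "no interaction"
    if new.kind == existing.kind:
        return "joined into one zone"
    return "new zone replaces the overlapped part"


text = Zone("text", 0, 0, 100, 50)
graphic = Zone("graphic", 80, 20, 150, 90)
more_text = Zone("text", 90, 0, 200, 50)
```

Here drawing graphic over text takes the shared rectangle away from the text zone, while drawing more_text extends the text zone instead.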
The following zone types are prohibited:
Speed zoning lets you do manual zoning quickly. Activate the zone
selection cursor, then move the cursor over the page image. Shaded
areas will appear showing the auto-detected zones. Double-click to
transform a shaded area into a zone.
Table grids in the image
After automatic processing you may see table zones placed
on a page. They are denoted with a table zone icon in the top
left corner of the zone. To change a rectangular zone to or
from a table zone, use its shortcut menu. You can also draw
table type zones, but they must remain rectangular.
You draw or move table dividers to determine where
gridlines will appear when the table is placed in the Text
Editor. You can draw or resize a table zone (provided it stays
rectangular) to discard unneeded columns or rows from the outer
edges of a table.
Using the table tools you can insert row and column dividers; move
and remove dividers. Click the Place/Remove all dividers tool to
have dividers in a table auto-detected and placed.
You can specify line formatting for table borders and grids from a
shortcut menu. You will have greater choice for editing borders and
shading in the Text Editor after recognition.
Using zone templates
A template contains a page background value and a set of zones and
their properties, stored in a file. A zone template file can be loaded
to have template zones used during recognition. Load a template file
in the Layout Description drop-down list or from the Tools menu.
You can browse to network locations to load templates created by
others.
When you load a template, its background and zones are placed:
• on the current page, replacing any zones already there
• on all further acquired pages
• on pre-existing pages sent to (re-)recognition without any
zones.
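The three cases listed above can be written as one rule that decides whether a loaded template is placed on a given page. The function and parameter names are hypothetical; the sketch only restates the list.

```python
def template_applies(is_current_page, acquired_after_load,
                     sent_to_recognition, has_zones):
    """Return True if a loaded template's background and zones are placed."""
    if is_current_page:
        return True   # existing zones on the current page are replaced
    if acquired_after_load:
        return True   # every page acquired after loading gets the template
    # Older pages get it only when (re-)recognized while having no zones.
    return sent_to_recognition and not has_zones
```

An older page that already has zones therefore keeps them, even when it is sent to recognition again while the template is loaded.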
With manual processing the template zones in the first two cases
can be viewed and modified before recognition.
With automatic processing the template zones can be viewed and
modified only after recognition.
With workflow processing, use the zone images step. This
combines two steps: load templates and manual zoning. To use a
zone template, click the Add button in the appropriate panel of the
Workflow Assistant, and select the zone template file to use. Then
make your choice between displaying images for manual zoning;
applying the zone template; or applying it and displaying the images.
Templates accept ignore and process zones and backgrounds. They
can therefore be useful to define which parts of the pages to process
with auto-zoning, and which parts to ignore. Process zones or
process background areas from a template may be replaced during
recognition by a set of smaller zones; specific zone types will be
assigned to these zones.
How to save a zone template
Select a background value and prepare zones on a page. Check their
locations and properties. Click Zone Template... in the Tools menu.
In the dialog box, select [zones on page] and click Save, then assign a
name and optionally a different path. Choose a network location to
share the template file. Click OK. The new zone template remains
loaded.
How to modify a zone template
Load the template and acquire a suitable image with manual
processing. The template zones appear. Modify the zones and/or
properties as desired. Open the Zone Template Files dialog box.
The current template is selected. Click Save and then Close.
How to unload a template
Select a non-template setting in the Layout Description drop-down
list. The template zones are not removed from the current or
existing pages, but template zones will no longer be used for future
processing. You can also open the Zone Template Files dialog box,
select [none] and click the Set As Current button. In this case, the
layout description setting returns to Automatic.
How to replace one template with another
Select a different template in the Layout Description drop-down
list, or open the Zone Template Files dialog box, select the desired
template and click the Set As Current button. Zones from the new
template are applied to the current page, replacing any existing
zones. They are also applied as explained above.
How to remove a template file
Open the Zone Template Files dialog box. Select a template and
click the Remove button. Zones already placed by this template are
not removed. Template files can be deleted only from the operating
system.
How to include a template file in an OPD
Open a document, then click Tools and choose Zone Template.
Select the one you want to include and click Embed. Then save the
document to the OPD format. This means the template will travel
with the OPD if it is sent to a new location. When the OPD file is
opened later, the included zone template will be shown in the Zone
Template Files dialog box as [embedded] and can be saved to a new
named template file at the new location by using the Extract
button.
Proofing and editing
Recognition results are placed in the Text Editor. These can be
recognized texts, tables, forms and embedded graphics. This
WYSIWYG (What You See Is What You Get) editor is detailed in
this chapter.
The editor display and views
The Text Editor displays recognized texts and can mark words that
were suspected during recognition with red, wavy underlines. They
are displayed with red characters in the OCR Proofreader.
A word may be suspect because it was not found in any active
dictionary: standard, user or professional. It may also be suspect as a
result of the OCR process, even if it is found in the dictionary. If the
uncertainty stems from certain characters in the word, these are
shown with a yellow highlight, both in the Editor and the OCR
Proofreader.
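The dictionary part of this check can be sketched as a lookup against every active dictionary. The sketch below covers only non-dictionary words; it does not model words that are suspect because of the OCR process itself, and the function name, word lists and word-splitting rule are assumptions made for the example.

```python
import re


def non_dictionary_words(text, *dictionaries):
    """Return the words of `text` found in none of the given dictionaries."""
    known = set()
    for dictionary in dictionaries:
        known.update(word.lower() for word in dictionary)
    # Assumed, simplified notion of a word: letters and apostrophes only.
    words = re.findall(r"[A-Za-z']+", text)
    return [word for word in words if word.lower() not in known]


standard = {"the", "scanner", "is", "ready"}
user = {"omnipage"}
flagged = non_dictionary_words("The scanner is reddy", standard, user)
```

Adding a word to the user dictionary removes it from the flagged list the next time the same text is checked.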
Choose to have non-dictionary words marked or not in the Proofing
panel of the Options dialog box. All markers can be shown or
hidden as selected in the Text Editor panel of the Options dialog
box. You can also show or hide non-printing characters and header/
footer indicators. The Text Editor panel also lets you define a unit of
measurement for the program and a word wrap setting for use in all
Text Editor views except Plain Text view.
OmniPage 16 can display pages with three levels of formatting. You
can switch freely between them with the three buttons at the
bottom left of the Text Editor or from the View menu.
Plain Text view
This displays plain decolumnized left-aligned text in a single
font and font size, with the same line breaks as in the original
document.
Formatted Text view
This displays decolumnized text with font and paragraph
styling.
True Page view
True Page® view tries to conserve as much of the formatting of
the original document as possible. Character and paragraph
styling is retained. Reading order can be displayed by arrows.
Proofreading OCR results
After a page is recognized, the recognition results appear in the
Text Editor. Proofreading starts automatically if that was requested
in the Proofing panel of the Options dialog box. You can start
proofing manually any time. Work as follows:
1. Click the Proofread OCR tool in the Standard toolbar, or
choose Proofread OCR... in the Tools menu.
2. Proofing starts from the current page, but skips text already
proofed. If a suspected error is detected, the OCR Proofreader
dialog box colors the suspect word in its context, adds a yellow
highlight to any suspect characters and provides a picture of
how the word originally looked in the image. The explanation
says ‘Suspect word’ or ‘Non-dictionary word’.
3. If the recognized word is correct, click Ignore or Ignore All to
move to the next suspect word. Click Add to add it to the
current user dictionary and move to the next suspect word.
4. If the recognized word is not correct, modify the word in the
Edit panel or select a dictionary suggestion. Click Change or
Change All to implement the change and move to the next
suspect word. Click Add to add the changed word to the
current user dictionary and move to the next suspect word.
5. Color markers are removed from words in the Text Editor as
they are proofread. You can switch to the Text Editor during
proofing to make corrections there. Use the Resume button to
restart proofing. Click Page Ready to skip to the next page and
Document Ready or Close to stop proofreading before the end
of the document is reached.
6. A page is marked with the proofed icon on its thumbnail
and in the Document Manager if proofing ran to the end of the
page. Choose Recheck Current Page... from the Tools menu to
re-proof a page.
Verifying text
After performing OCR, you can compare any part of the recognized
text against the corresponding part of the original image, to verify
that the text was recognized correctly.
The verifier tool is in the Formatting toolbar. The verifier can
also be controlled from the Tools menu. Hover the cursor
over a verifier display to obtain the verifier toolbar. Use it as
follows:
• Choose how much context the dynamic verifier shows: one word,
three words (current + neighbors) or the whole image line.
• Zoom in or out.
To turn the Verifier on, click the Verifier tool or press F9. To turn it
off, click the Verifier tool again, press F9 again, or press Esc.
A full list of verifier keyboard shortcuts is available in the Online
Help.
The Character Map
The Character Map is a dockable tool that helps with proofing. It
has essentially two purposes:
• to insert characters during proofing and editing that are
missing from your keyboard or hard to reach on it. In this
respect, it is very similar to the system Character Map.
• to show all characters validated by the current recognition
languages.
To access the Character Map, click its button in the Formatting
Toolbar, or choose Character Map from the View menu and click
Show.
Under the Character Map menu item, you can also choose to display
recent characters only, or different character sets.
You can access the Character Map in other ways, such as:
• Click Tools > Options and choose the OCR tab. Click the
Additional Characters button to select characters to be
included in proofing. Similarly, you can modify the Reject
Character by using the Character Map.
• Select Train Character under the Tools menu. Click the
(...) button beside the Correct field.
• Select Train Character from the shortcut menu of a
suspect or non-dictionary word in the Text Editor.
User dictionaries
The program has built-in dictionaries for many languages. These
assist during recognition and may offer suggestions during proofing.
They can be supplemented by user dictionaries. You can save any
number of user dictionaries, but only one can be loaded at a time. A
dictionary called Custom is the default user dictionary for
Microsoft Word.
Starting a user dictionary
Click Add in the OCR Proofreader dialog box with no user dictionary
loaded or open the User Dictionary Files dialog box from the Tools
menu and click New.
Loading or unloading a user dictionary
Do this from the OCR panel of the Options dialog box or from the
User Dictionary Files dialog box.
Editing or removing a user dictionary
Add words by loading a user dictionary and then clicking Add in the
OCR Proofreader dialog box. You can add and delete words by
clicking Edit in the User Dictionary Files dialog box. You can also
import words from OmniPage user dictionaries (*.ud). While editing
a user dictionary, you can import a word list from a plain text file to
add words to the dictionary quickly. Each word must be on a
separate line with no punctuation at the start or end of the word. The
Remove button lets you remove the selected user dictionary from the
list.
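The word-list rule above (one word per line, no punctuation at the start or end of a word) can be checked before import. The following is a minimal sketch, not part of OmniPage: the function names, the UTF-8 encoding and the decision to skip entries containing spaces are all assumptions made for illustration.

```python
import string

# Characters removed from both ends of each entry (assumption: ASCII
# punctuation plus whitespace covers the cases you need to clean).
_EDGE_CHARS = string.punctuation + string.whitespace


def clean_word_list(raw_words):
    """Return unique words with no punctuation at either end."""
    cleaned = []
    seen = set()
    for raw in raw_words:
        word = raw.strip(_EDGE_CHARS)
        # Skip blanks, entries that still contain spaces, and duplicates.
        if not word or " " in word or word in seen:
            continue
        seen.add(word)
        cleaned.append(word)
    return cleaned


def write_word_list(path, raw_words):
    """Write the cleaned words to a plain text file, one word per line."""
    with open(path, "w", encoding="utf-8") as handle:  # encoding is an assumption
        for word in clean_word_list(raw_words):
            handle.write(word + "\n")
```

Punctuation inside a word, such as the hyphen in "co-operate", is left untouched; only the ends of each entry are trimmed.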
To embed a user dictionary in an OmniPage Document, load your
input file, choose Tools > User Dictionary; select the user dictionary
you want to use, click Embed, and name it. Then save to the file type
OmniPage Document.
Languages
The program can read over 110 languages with three alphabets:
Latin, Greek and Cyrillic. See the list in the OCR panel of the
Options dialog box. It shows which languages have dictionary
support. A listing is also provided on the Nuance web site.
In addition to user dictionaries, specialized dictionaries are
available for certain professions (currently medical, legal and
financial) for some languages. See the list and make selections in the
OCR panel of the Options dialog box.
Training
Training is the process of changing the OCR solutions assigned to
character shapes in the image. It is useful for uniformly degraded
documents or when an unusual typeface is used throughout a
document. OmniPage 16 offers two types of training: manual
training and automatic training (IntelliTrain). Data coming from
both types of training are combined and available for saving to a
training file.
When you leave a page on which training data was generated, you
will be asked how to apply it to other existing pages in the
document.
Manual training
To do manual training, place the insertion point in front of the
character you want to train, or select a group of characters (up to
one word) and choose Train Character... from the Tools menu or the
shortcut menu. You will see an enlarged view of the character(s) to
be trained, along with the current OCR solution. Change this to the
desired solution and click OK. The program takes this training and
examines the rest of the page. If it finds candidate words to change,
the Check Training dialog box lists these. Incorrect words should
be re-trained before the list is approved.
IntelliTrain
IntelliTrain is an automated form of training. It takes input from the
corrections you make during proofing. When you make a change, it
remembers the character shape involved, and your proofing change.
It searches other similar character shapes in the document,
especially in suspect words. It assesses whether to apply the user
correction or not.
You can turn IntelliTrain on or off in the OCR panel of the Options
dialog box.
IntelliTrain remembers the training data it collects, and adds it to
any manual training you have done. This training can be saved to a
training file for future use with similar documents.
For examples of IntelliTrain, see the Online Help.
Training files
Whenever you close a document or switch to another one when
unsaved training data exists, a dialog box appears allowing you to
save it. To save a training file into an OPD, load it from Tools >
Training File, click Embed, and save to the file type OmniPage
Document.
Saving training to file, loading, editing and unloading training files
are all done in the Training Files dialog box.
Unsaved training can be edited in the Edit Training dialog box; an
asterisk is displayed in the title bar in place of a training file name.
Save it in the Training Files dialog box.
A training file can also be edited; its name appears in the title bar. If
it has unsaved training added to it, an asterisk appears after its
name. Both the unsaved and the modified training are saved when
you close the dialog box.
The Edit Training dialog box displays frames containing a character
shape and an OCR solution assigned to that shape. Click a frame to
select it. Then you can delete it with the Delete key, or change the
assignment. Use arrow keys to move to the next or previous frame.
Figure callouts (Edit Training dialog box, editing unsaved training):
• Top part of a frame: image shape. Bottom part: OCR solution.
• Double-click a frame or press Enter to change its OCR solution.
• To undelete a deleted frame, select it again and press the
Delete key.
Text and image editing
OmniPage has a WYSIWYG Text Editor, providing many editing
facilities. These work very similarly to those in leading word
processors.
Editing character attributes
In all views except Plain Text view, you can change the font type, size
and attributes (bold, italic, underlined) for selected text.
Editing paragraph attributes
In all views except Plain Text view, you can change the alignment of
selected paragraphs and apply bulleting to paragraphs.
Paragraph styles
Paragraph styles are auto-detected during recognition. A list of styles
is built up and presented in a selection box on the left of the
Formatting toolbar. Use this to assign a style to selected paragraphs.
Graphics
You can edit the contents of a selected graphic if you have an image
editor in your computer. Click Edit Picture With in the Format
menu. Here you can choose to use the image editor associated with
BMP files in your Windows system, and load the graphic.
Alternatively, you can use the Choose Program... item to select
another program. This will replace the Default Image Editor item.
Edit the graphic, then close the editor to have it re-embedded in the
Text Editor. Do not change the graphic’s size, resolution or type,
because this will prevent the re-embedding. You can also edit images
before recognition using the Image Enhancement tools.
Tables
Tables are displayed in the Text Editor in grids. Move the cursor into
a table area. It changes appearance, allowing you to move gridlines.
You can also use the Text Editor’s rulers to modify a table. Modify the
placement of text in table cells with the alignment buttons in the
Formatting toolbar and the tab controls in the ruler.
Hyperlinks
Web page and e-mail addresses can be detected and placed as links in
recognized text. Choose Hyperlink... in the Format menu to edit an
existing link or create a new one.
Editing in True Page
Page elements are contained in text boxes, table boxes and picture
boxes. These usually correspond to text, table and graphic zones in
the image. Click inside an element to see the box border; they have
the same coloring as the corresponding zones. The online Help topic
True Page provides details on the operations summarized here.
Frames have gray borders and enclose one or more boxes. They are
placed when a visible border is detected in an image. Format frame
and table borders and shading with a shortcut menu or by choosing
Table... in the Format menu. Text box shading can be specified from
its shortcut menu.
Multicolumn areas have orange borders and enclose one or more
boxes. They are auto-detected and show which text will be treated
as flowing columns when exported with the Flowing Page
formatting level.
Reading order can be displayed and changed. Click the Show
reading order tool in the Formatting toolbar to have the order
shown by arrows. Click again to remove the arrows.
Click the Change reading order tool for a set of reordering
buttons in place of the Formatting toolbar. A changed order is
applied in Plain Text and Formatted Text views. It modifies
the way the cursor moves through a page when it is exported
as True Page.
On-the-fly editing
This allows you to modify a recognized page through re-zoning,
without having to re-process the whole page. When on-the-fly
editing is enabled, zone changes (deleting, drawing, resizing,
changing type) immediately make changes in the recognized page.
Conversely, when you modify elements in the Text Editor’s True
Page view, this changes the zones on that page.
Two linked tools on the Image toolbar control on-the-fly zoning.
One of these tools is always active whenever no recognition is in
progress.
Click this to activate on-the-fly editing. The red signal shows
there are no stored zoning changes.
Click this to turn on-the-fly editing off. Your zoning changes
are stored; the on-the-fly tool displays a green signal to show
there are stored changes. To activate these changes, do one of
the following:
Click the on-the-fly tool with a green signal. The zoning
changes will cause changes in the Text Editor.
Click the Perform OCR button to have the whole page
(re)recognized, including your zone changes.
For details on how changes are handled in on-the-fly zoning and their
effects in the Text Editor views, see On-the-fly processing in online Help.
Marking and redacting
The Mark Text toolbar gives you tools to mark (highlight or strike
out) and to redact text. Use the View menu to have this toolbar
displayed. You can float or dock this tool group. Each tool has its
equivalent menu item in the Format menu or the Text Editor
shortcut menu.
Redacting means blacking out confidential information so that it is
unreadable and unsearchable. To mark and redact text manually, click the
Mark for Redacting tool and use its cursor to select all the text parts
you want to redact. They appear with a gray highlight. When you
are ready, click the Redact Document tool. Choose to do redaction
in a copy (safer) or the original document. If you choose to redact a
copy, both the copy and the original remain open in OmniPage,
ready to be saved.
WARNING: If you redact the original document, you cannot
retrieve the information you have blacked out.
To find and redact text by searching, select Find and Mark Text
from the Edit menu to display the Find, Replace and Mark Text
dialog box. Search for text to be marked for redaction. Step through
all occurrences and decide for each case whether to redact
immediately or mark for redaction. In the latter case, perform the
redaction by choosing Close and Redact Document in the Mark
Text dialog box or later click the Redact Document button.
You can apply highlighting and striking out either by selection or
searching.
Reading text aloud
The ScanSoft RealSpeak® speech facility is provided for the visually
impaired, but it can also be useful to anyone during text checking
and verification. The speaking is controlled by movements of the
insertion point in the Text Editor which can be mouse or keyboard
driven.
To hear text:                                Use these keys:
One character at a time, forward or back     Right or left arrow. Letter, number or
                                             punctuation names are spoken.
Current word                                 Ctrl + Numpad 1
One word to the right                        Ctrl + right arrow
One word to the left                         Ctrl + left arrow
A single line                                Place the insertion point in the line
Next line                                    Down arrow
Previous line                                Up arrow
Current sentence                             Ctrl + Numpad 2
From insertion point to end of sentence      Ctrl + Numpad 6
From start of sentence to insertion point    Ctrl + Numpad 4
Current page                                 Ctrl + Numpad 3
From top of current page to insertion point  Ctrl + Home
From insertion point to end of current page  Ctrl + End
Previous, next or any page                   Ctrl + PgUp, PgDown or navigation buttons
Typed characters                             Each typed character is pronounced separately.
The Text-to-Speech facility is enabled or disabled with the Tools
menu item Speech Mode or with the F10 key. A second menu item
Speech Settings... allows you to select a voice (for example, male or
female for a given language), a reading speed and the volume. You
must ensure the language selection is appropriate for the text you
want to hear.
You also have the following keyboard controls:
To do this:        Use this:
Pause/Resume       Ctrl + Numpad 5
Set speed higher   Ctrl + Numpad +
Set speed lower    Ctrl + Numpad –
Restore speed      Ctrl + Numpad *
All speech systems will be installed with OmniPage 16 if you choose
a complete installation. If you perform a custom installation, you
can choose the languages you need.
Creating and editing forms
You can bring paper or electronic forms (distributed mainly
as PDF in an office environment) into OmniPage
Professional 16, recognize them and edit their content,
layout or both - in True Page view. Draw form zones over the
relevant areas of your image before recognition, or choose Form as
recognition layout, then use the two toolbars: Form Drawing and
Form Arrangement to make modifications and produce a fillable
form and save it in the following formats: PDF, RTF, or XSN
(Microsoft Office InfoPath 2003 format). Static forms can be saved
to HTML. OmniPage Professional 16 uses the Logical Form
Recognition™ technology to process forms.
Please note that OmniPage supports form creation and editing;
however, the tools available here are not designed to fill in forms.
The Form Drawing Toolbar
This is a dockable toolbar, displayed in the Text Editor that allows
you to create a range of form elements using the following tools:
Selection: Click this tool to be able to select, move, or resize
elements in your form.
Text: Use the text tool to add fixed text descriptions on your
form such as titles, labels and headers.
Line: The Line tool is mainly used in layout design: click it and
draw lines to separate distinct sections in your form.
Rectangle: Click this tool to create rectangles in your form for
design purposes.
Graphic: Use this tool to select areas of your form that are to be
treated as graphics.
Fill text: Click this tool to create fillable text fields. These are
fields where you want people to enter text.
Comb: Use this tool to create a text field consisting of boxes.
This is typically used for information such as ZIP codes.
Checkbox: Click this tool and draw Checkboxes - typically for
Yes/No questions and marking one or more choices.
Circle text: Its function is similar to the Checkbox element
(above): the Circle text tool creates elements that get encircled
when selected.
Table: This tool creates tables in your form.
You can also create form elements by right-clicking an existing form
element in your recognized form and choosing the Insert Form Object
menu item.
The Form Arrangement Toolbar
The tools on this toolbar can be used to line up form elements or to
set which one is on top of the others when they overlap. This latter
function is useful for example if you want to create a background
graphic design for your form.
To set the order of overlapping elements, use the
“Bring to Front” and “Send to Back” buttons.
To align the right/left, top/bottom edges or the centers of the
selected form elements
horizontally - use the horizontal alignment tools
vertically - use the vertical arrangement tools.
The commands of the Form Arrangement toolbar are also accessible
from the shortcut menu of any form element.
Editing Form object properties
To edit a form object directly, select it, then right-click it to
display its shortcut menu. You can edit the appearance or the
properties of any form element here. Use the following commands:
Form Object Appearance - use the tabs Borders, Shading and
Shadow to design the look of your form elements much as you
would in a text-editing application.
Form Object Properties - this command gives you access to the
element properties such as size, position, name. Note that
properties dynamically vary depending on what type of element you
select.
Extracting Form Data
Form data extraction is a new workflow step. Data is
extracted from elements such as fillable fields, check boxes,
and option buttons.
To create a workflow that contains form data extraction:
• Define the input and its settings. Input types include:
image PDF, PDF form, image files and forms scanned from
paper.
• Choose Extract Form Data in place of recognition, and
specify its settings. Set an active PDF form as template. It
can be single or multi-page, filled or unfilled. The program
determines the location and type of the form fields based
on this form template.
• Finish the workflow with a saving step.
OmniPage will extract data from the form, using the specified
template. Export is to a comma-separated value text file (.csv) ready
to be loaded into a spreadsheet.
Once you select Form Data Extraction in a workflow, only saving
steps will follow.
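If you prefer to process the extracted data with a script rather than a spreadsheet, the exported .csv file can be read with any CSV library. The sketch below uses Python's standard csv module. It assumes, purely for illustration, that the first row of the file holds the field names; the field names and values in the sample are hypothetical, so check an actual export before relying on this layout.

```python
import csv
import io


def read_form_data(csv_text):
    """Parse extracted form data into a list of dicts, one per CSV row.

    Assumption: the first row contains the form field names.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return [dict(row) for row in reader]


# Hypothetical sample standing in for an exported file's contents.
sample = "Name,Subscribe,ZIP\nA. Reader,Yes,01803\n"
rows = read_form_data(sample)
for row in rows:
    print(row["Name"], row["ZIP"])
```

Every value is read as text, so entries such as ZIP codes keep their leading zeros.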
Saving and exporting
Once you have acquired at least one image for a document,
you can export the image(s) to file. Once you have recognized at
least one page, you can export recognition results – a single page,
selected pages or the whole document – to a target application by
saving to file, copying to Clipboard or sending to a mailing
application. Saving as an OmniPage Document is always possible.
OmniPage provides comprehensive support for Office 2007
applications and formats.
A document remains in OmniPage after export. This allows you to
save, copy or send its pages repeatedly, for example with different
formatting levels, using different file types, names or locations. You
can also add or re-recognize pages or modify the recognized text.
With automatic processing and in Batch Manager jobs, you specify
where to save before processing starts.
A workflow may contain one or more saving steps, even to different
targets (for instance, to file and to mail). A Batch Manager job must
contain at least one saving step. See Chapter 6, “Workflows”.
Saving and Exporting
If you want to work with your document again in OmniPage in a
later session, save it as an OmniPage Document. This is a special
output file type. It saves the original images together with the
recognition results, settings and training.
Exporting is done through button 3 on the OmniPage Toolbox. It
lists available export targets. Some appear only if access to the
target is detected on your computer. Select the desired target then
click the Export Results button to begin export. You can also
perform exporting through the Process menu.
Saving original images
You can save original images to disk in a wide variety of file types
with or without image enhancement (using the Image
Enhancement Tools).
1. Choose Save to File in the Export Results drop-down list. In
the dialog box that appears, select Image under Save as.
2. Choose a folder location and a file type. Type in a file name.
3. Select to save the selected zone image(s) only, the current page
image, selected page images or all images in the document. For
multiple zones or multiple pages, you can have all images in a
single multi-page image file, providing you set TIFF, MAX,
DCX, JB2 or Image-only PDF as file type. Otherwise each image
is placed in a separate file. OmniPage adds numerical suffixes to
the file name you provide, to generate unique file names.
4. Click Converter Options... if you want to specify a saving mode
(black-and-white, grayscale, color or ‘As is’), a maximum
resolution and other settings. For TIFF files, you specify the
compression method here.
5. Click OK to save the image(s) as specified. Zones and
recognized text are not saved with the file.
Saving recognition results
You can save recognized pages to disk in a wide variety of file types.
1. Choose Export Results... in the File menu, or click the Export
Results button in the OmniPage Toolbox with Save to File
selected in the drop-down list.
2. The Save to File dialog box appears. Select Text under Save as.
3. Select a folder location and a file type for your document. Select
a page range, file options, naming options and a formatting level
for the document. See “Selecting a formatting level” on this
page.
4. Type in a file name. Click Converter Options... if you want to
specify precise settings for the export. See “Selecting converter
options” later in this chapter.
5. Click OK. The document is saved to disk as specified. If View
Result is selected, the exported file will appear in its target
application; that is the one associated with the selected file
type in your Windows system or in the advanced saving
options for your selected file type converter.
Selecting a formatting level
The formatting level for export is defined at export time, in the
saving dialog box (Save to File, Copy to Clipboard, Send in Mail or
other dialog box). Three of the levels correspond to the format
views of the same name in the Text Editor. However, the level to be
applied for saving is independent of the formatting view displayed
in the Text Editor. When exporting to file or mail, first specify a file
type. This determines which formatting levels are available.
The formatting levels are:
Plain Text
This exports plain decolumnized left-aligned text in a
single font and font size. When exporting to Text or
Unicode file types, graphics and tables are not
supported. You can export plain text to nearly all file
types and target applications; in these cases graphics,
tables and bullets can be retained.
Formatted Text
This exports decolumnized text with font and
paragraph styling, along with graphics and tables.
This is available for nearly all file types.
Flowing Page
This keeps the original layout of the pages, including
columns. This is done wherever possible with column
and indent settings, not with text boxes or frames.
Text will then flow from one column to the other,
which does not happen when text boxes are used.
True Page
This keeps the original layout of the pages, including
columns. This is done with text, picture and table
boxes and frames. This is offered only for target
applications capable of handling these. True Page
formatting is the only choice for XML export and for
all PDF export, except to the file type ‘PDF Edited’.
Spreadsheet
This exports recognition results in tabular form,
suitable for use in spreadsheet applications. This
places each document page onto a separate
worksheet.
When exporting to Microsoft Excel, 'Spreadsheet' is good for saving
whole-page tables. Prefer 'Formatted Text' if your document
contains smaller tables: each table will be placed on a separate
worksheet with non-table parts placed in an index worksheet with
hyperlinks to each relevant worksheet.
Selecting converter options
Click the Converter Options... button in a saving dialog box to have
precise control over the export. This brings up a dialog box with the
name of the converter associated with the current file type. It
presents a series of options tailored to this file type. First, confirm or
change the formatting level, because this influences which other
options are presented. Select options as desired. Online Help details
how to do this.
Using multiple converters
Multiple converters allow you to export to two or more file types in
one export step. Choose Multiple in the saving dialog box.
To make your own multiple converter, open the Save Preferences
dialog box from the Tools menu. Choose the heading Multiple
converters. Select a converter and click Create from... . This will
make a copy of the selected converter that you can freely modify
without overwriting the original one.
The new converter appears in the list. Select it and click Options...
to specify its settings. You receive a list of all text converters,
followed by all image converters. Checkmark the desired ones.
Optionally specify sub-folder paths for each file type.
You can save pages with different formatting levels or file options to
the different file types, as defined in their simple converters. A few
saving operations cannot be done with multiple converters. These
are:
Saving OmniPage Documents
Use a workflow with two saving steps, or perform two separate
saves.
Saving to two targets
For instance, you cannot use a multiple converter to save a
document to file and also send it in mail. Use a workflow with two
saving steps, or perform two separate saves.
Saving different page ranges
You cannot save different page ranges to different file types, because
only one set of selected pages can exist at saving time. For the same
reason, a single workflow cannot be used either. Perform two
separate saves or use two workflows.
Saving to PDF
You have five choices when saving to Portable Document Format
(PDF) files. The first four are presented as Text converters, the last
one is listed among the Image converters.
PDF (Normal):
Pages are exported as they appeared in the Text Editor in True Page
view. The PDF file can be viewed and searched in a PDF viewer and
edited in a PDF editor.
PDF Edited:
Use this if you have made significant editing changes in the
recognition results. You have three formatting level choices,
including True Page. The PDF file can be viewed, searched and
edited.
PDF Searchable Image (formerly PDF Image on Text):
The PDF file is viewable only and cannot be modified in a PDF
editor. The original images are exported, but there is a linked text
file behind each image, so the text can be searched. A found word is
highlighted in the image.
PDF with image substitutes:
As for PDF (Normal), but words containing reject and suspect
characters have image overlays, so these uncertain words display as
they were in the original document. The PDF file can be viewed,
searched and edited.
PDF Image (formerly PDF, image only):
The original images are exported. The PDF file is viewable only and
cannot be modified in a PDF editor and text cannot be searched.
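The five PDF flavors differ mainly in whether the result can be searched and edited. The lookup table below is only a compact restatement of the descriptions above, written as a small Python sketch; the names PdfFlavor, PDF_FLAVORS and flavors_for are invented for this illustration and are not part of OmniPage.

```python
from typing import NamedTuple


class PdfFlavor(NamedTuple):
    searchable: bool      # can text be searched in a PDF viewer?
    editable: bool        # can the file be modified in a PDF editor?
    converter_group: str  # where the flavor is listed: "text" or "image"


# Values taken from the flavor descriptions in this section.
PDF_FLAVORS = {
    "PDF (Normal)": PdfFlavor(True, True, "text"),
    "PDF Edited": PdfFlavor(True, True, "text"),
    "PDF Searchable Image": PdfFlavor(True, False, "text"),
    "PDF with image substitutes": PdfFlavor(True, True, "text"),
    "PDF Image": PdfFlavor(False, False, "image"),
}


def flavors_for(searchable, editable):
    """List the flavors that match the requested capabilities."""
    return [
        name
        for name, flavor in PDF_FLAVORS.items()
        if flavor.searchable == searchable and flavor.editable == editable
    ]


print(flavors_for(searchable=True, editable=False))
```

For example, asking for a searchable file that cannot be edited returns only PDF Searchable Image.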
Besides the above flavors, you can use other parameters in defining
your PDF output:
PDF 1.6
Save to PDF version 1.6 for enhanced security, markup and
attachment embedding functionality.
PDF-A
Choose to create a PDF-A compliant file to make sure that your PDF
displays identically, regardless of the computer environment.
Tagged PDF
Create a tagged PDF file to preserve its structure. This will ensure
logical reading order, correct table structure and more.
PDF MRC
Use this high compression technology for good quality and smaller
file size. Available for color and grayscale PDF Images or PDF
Searchable Images.
Converting from PDF
To extract text content from a PDF file, load it into OmniPage,
recognize it, and save the results to a text format.
A variety of outputs is also available from a PDF file shortcut menu:
Word, Excel, RTF, WordPerfect or text. For more options, use the
Convert Now Wizard.
Sending pages by mail
You can send page images or recognized pages as one or more files
attached to a mail message if you have installed a MAPI-compliant
mail application, such as Microsoft Outlook. To send pages by
e-mail:
• With automatic processing, select Send in Mail as the
setting in the Export Results drop-down list on the
OmniPage Toolbox. The Export Options dialog box
appears as soon as the last available page in the document
is recognized or proofed.
• With manual processing, select Send in Mail as the setting
in the Export Results drop-down list and then click its
button. The dialog box appears immediately.
• Workflows and jobs accept a Send in Mail export step.
Other export targets
Turn recognized text into an audio wave file for later listening,
using ScanSoft RealSpeak. A multiple converter is useful for this,
allowing you to save the document to file and generate the wave file
in one saving step. You must specify the reading language in the
converter options for the wave file type.
In OmniPage Professional 16 you can export files to other
targets. You can save files to a central server (an FTP site) or
to Microsoft SharePoint 2003 and 2007. Exporting choices
are made in the Export Options dialog box. When you click OK you
are directed to FTP or SharePoint log-in and invited to specify the
required path.
If an ODMA-compliant Document Management System (DMS) is
detected in your computing environment, it will be offered. If you
have access to more than one DMS, the system default will apply.
The ODMA server must be pre-configured to accept the file types to
be exported from OmniPage Professional, as defined by their
extensions.
See the Online help for more information on these targets.
Workflows
A workflow contains a series of processing steps and their settings.
It can be saved for repeated use whenever you have a task needing
the same processing. Workflows usually begin with a scanning or
loading step, but they can also start from the document currently
open in OmniPage. After that, they do not have to conform to the
traditional 1-2-3 processing pattern. Usually a workflow will
include a recognition step, but this is not compulsory. For instance,
page images can be saved to image files in a different file type or to
an OmniPage Document. With or without OCR, any number of
saving steps are possible, even to different targets, each with their
own export settings.
Workflows are designed for efficient whole-document processing.
They can also handle recognizing or saving single or selected pages
from a document.
Some workflows run without user interaction. Workflows needing
interaction are those with a manual image enhancement step, a
manual zoning step or a proofing/editing step, those that request
run-time prompting for input or output file names and paths, and
scanning workflows that prompt for more pages.
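The rule for when a workflow needs user interaction can be modeled as a simple check over its steps. This is an illustrative sketch only: OmniPage does not expose workflows as Python objects, and the Step class, its field names and the step labels are hypothetical.

```python
from dataclasses import dataclass

# Step kinds the text names as requiring user interaction.
INTERACTIVE_KINDS = {
    "manual image enhancement",
    "manual zoning",
    "proofing/editing",
}


@dataclass
class Step:
    kind: str  # hypothetical label, e.g. "load files" or "recognize"
    # True when the step prompts at run time for file names or paths,
    # or when a scanning step prompts for more pages.
    prompts_at_run_time: bool = False


def needs_interaction(steps):
    """Return True if any step would stop and wait for the user."""
    return any(
        step.kind in INTERACTIVE_KINDS or step.prompts_at_run_time
        for step in steps
    )


unattended = [Step("load files"), Step("recognize"), Step("save to file")]
print(needs_interaction(unattended))
print(needs_interaction(unattended + [Step("proofing/editing")]))
```

A workflow for which this check is false is a candidate for unattended running, for instance as a Batch Manager job.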
Batch Manager jobs are closely related to workflows. Jobs are
created in the Job Wizard which uses the Workflow Assistant in
the creation process. Jobs run workflows according to the job
parameters and it is more typical of them to run unattended.
Sample workflows
Sample workflows are provided with OmniPage 16 to offer you
typical work processes. They are available in the Workflow
drop-down list. Choose one then click the Workflow Assistant
button to see its steps and settings.
Running workflows
Here is how to run a sample workflow or one you have created:
1. If your workflow takes input from scanner, place your
document in its ADF or its first page on the scanner bed.
2. Select the desired workflow from the
Workflow drop-down list.
3. Press the Start button. The OmniPage Toolbox displays the
steps in the workflow and acts as a progress monitor. To stop
the workflow before it completes, press the Stop button.
4. If run-time input selection is specified, the Load Files dialog
box awaits your choice of files.
5. If you requested a step requiring interaction (image
enhancement, manual zoning, or proofing) the program
presents pages for attention.
6. When a page is enhanced, zoned or
proofed, click the Page Ready button in the
Toolbox or appropriate dialog box to move to
the next page.
7. When the last page is enhanced, zoned or
proofed, or when you no longer want to do
zoning or proofing, press the appropriate
Document Ready button on the Toolbox. Any pages without
zones will be auto-zoned.
8. The After Completion menu under Process / Workflows gives
you three options to end a workflow. You can choose to close
the document, close OmniPage, or shut down your computer.
These settings are typically applied when the workflow runs
unattended. If yours does, remember to include a saving step.
You can also run workflows from an OmniPage Agent icon on the
Windows taskbar. Right-click it for a shortcut menu listing your
workflows. Select one to run it. OmniPage will be launched if
necessary. If it is already running with a document loaded, the Start
Workflow dialog box appears, where you can choose what to
process from the current document: only the Workflow-defined
pages, all pages, selected pages, or the current page.
If you do not see the OmniPage Agent icon, enable it in the General
panel of the Options dialog box or choose Start > All Programs >
ScanSoft OmniPage 16 > OmniPage Agent.
You can launch some workflows from your desktop or from
Windows Explorer. Right-click an image file icon or file name for
a shortcut menu. Multiple file selection is possible. Choose
OmniPage 16 and a workflow name from the sub-menu. This sub-menu
also provides quick access to six target formats using default
settings: Word, Excel, PDF, RTF, TXT and WordPerfect. To
customize which workflows you would like to see here, click the
Add and Remove Workflows menu item. Only workflows with run-time
prompting for input files are listed here.
Pressing Stop while a workflow is running pauses it. Click Start to
resume processing. If you pause a workflow, optionally do some
manual processing, and then save the document as an OmniPage
Document, the interrupted workflow will resume when you later open
that OmniPage Document.
Workflow Assistant
The Workflow Assistant allows you to create and modify workflows.
The Job Wizard also uses it to create or modify the workflows that
jobs execute - see the next section. The Assistant offers one or
more steps, each with a drop-down list. The left panel of the
Workflow Assistant dialog box lets you build your workflow.
Figure: the Workflow Assistant dialog box. Its callouts read:
• The workflow diagram shows the steps you have chosen.
• The drop-down list shows the possible steps at any given
  workflow position. Use it to add a new step to your workflow.
• Specify settings for the current step in the settings area.
• Click the Close button to delete a workflow step. All
  subsequent, dependent steps will also be removed.
• To change a step, click its arrow and select from the ones in
  the list.
At any moment in the process, the Assistant dropdown menu offers
all steps that are logically possible at that point.
In OmniPage 16 Professional, additional steps are available: Extract
Form Data and Mark Text.
Creating workflows
Select New Workflow... in the Workflow drop-down list, or
from the Process menu. Or click the Workflow Assistant
button in the Standard toolbar when no workflow is selected.
The opening Assistant panel offers two starting points:
Choose Fresh Start to begin with no steps in the workflow diagram
on the right. Accept or change the default workflow name. Then
click Next and choose your first step. Choose an image loading step
that can take input from file, scanner or digital camera files. Specify
settings on the right. Then move on to build your workflow: it can
include a variety of different steps. When done, click Finish.
Choose Existing Workflows to see a list of existing workflows.
These are the sample workflows plus any you have created. Select
one as source. Its steps will appear in the workflow diagram on the
right. Enter a name for your new workflow. Click Next to proceed;
modify its steps and settings as described in the next section. The
changed settings apply to the new workflow only and are not
written back to the workflow used as the source. Any changed
settings enter the new workflow, but do not affect the settings in
the program. Finally, select Finish to complete your new workflow.
Modifying workflows
Select the workflow you want to modify in the Workflow
drop-down list and click the Workflow Assistant button in
the standard toolbar. Or choose Workflows... in the Tools
menu, select the desired workflow and click Modify... . The first
panel of the Workflow Assistant appears with the workflow
loaded. Click the icon in the workflow diagram that represents the
step you want to modify. Click the downward pointing arrow
under the icon to replace this step with another one. Continue
modifying steps and/or settings as desired. Remember that deleting
or modifying a step may result in later, dependent steps being
removed. Click Next to replace removed steps or to add new ones.
Click Finish to confirm the changes to your workflow.
Batch Manager
The Batch Manager is a separate but integrated program
that lets you create jobs to be processed immediately or at
some time in the future. By choosing steps carefully, you
can set up jobs that run unattended. A job executes a
workflow according to the job settings. Jobs are created in
the Job Wizard.
In OmniPage Professional 16 you have the following
additional Batch Manager capabilities:
• Setting job timing and recurrence
• Folder watching for incoming image files
• E-mail inbox watching for incoming attachments
  (Outlook and Lotus Notes)
• E-mail notification of job completion to specified
  recipients
• Driving workflows with barcodes.
Creating new jobs
Open the Batch Manager from the Process Menu or from your
system, by choosing Start > All Programs > ScanSoft OmniPage 16 >
OmniPage Batch Manager or from the OmniPage Agent on the
taskbar.
Creating a job essentially means scheduling a workflow. To do this,
start the Batch Manager (as described above) and click the Create
Job icon or choose Create Job from the File menu.
The Job Wizard starts. First you need to define your job type. You
can create five different job types, which fall into two basic
categories: Normal and Watch type.
Normal and Watch type jobs may have a recurrence pattern. The
latter are tailored to monitor a specified folder or e-mail inbox for
incoming images to be processed in OmniPage. A specific type
within this category is Barcode cover page jobs, where barcode
cover pages are used to identify which workflow to carry out.
Normal job: Set starting time and specify or create the
Workflow to be run. If you select ‘Do not start now’ use the
Activate button in the Batch Manager to start it.
Job types available in OmniPage Professional 16 only:
Barcode cover page job: This is a special type of folder
watching job (see below). It monitors a folder for incoming
barcode pages, then processes subsequently incoming
images with the workflow identified by the barcode. For
details, see Barcode processing later in this chapter.
Folder watching job: Select this job type and browse to the
folder(s) to be watched for incoming image files.
Outlook mailbox watching job: This job watches an
Outlook e-mail inbox for incoming image attachments of
a specified type.
Lotus Notes mailbox watching job: Same as above, but a
Lotus Notes inbox is watched.
Name your job and click Next.
The next panel shows Start and Stop Options. Specify the Start and
End Time here, the recurrence pattern (for recurrent jobs), and
whether the input files are to be deleted when the job is completed.
If you wish, you can set e-mail notification as well (OmniPage
Professional only).
From the next panel onwards, you can construct your job (except
for barcode cover page jobs) as you normally do with Workflows.
Set your starting point (Fresh Start or Existing Workflows) and
proceed as described above.
The Options dialog box in the Batch Manager is in the Tools menu.
Its General panel has an option Enable OmniPage Agent on system tray at
system startup. By default it is on. It must remain selected for jobs to
run at their scheduled time. The option is provided so it is possible
to prevent all jobs from running without having to disable them
individually. Its state also governs the running of barcode cover
page jobs.
The General panel lets you limit the number of pages allowed in an
output document, even if the file option Create one file for all pages is
selected. When the limit is reached, a new file is started,
distinguished by a numerical suffix.
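The effect of the page limit can be pictured with a short sketch. This is an illustration, not OmniPage code: the function name is hypothetical, and the suffix format ("_2", "_3" and so on, with the first file keeping the plain name) is an assumption, since the guide only states that a numerical suffix is used.

```python
def split_into_files(page_count, page_limit, base_name, extension):
    """Return (file_name, first_page, last_page) tuples; pages are numbered from 1."""
    if page_limit < 1:
        raise ValueError("page_limit must be at least 1")
    files = []
    first = 1
    index = 1
    while first <= page_count:
        last = min(first + page_limit - 1, page_count)
        # Assumed naming scheme: the first file keeps the base name,
        # later files are distinguished by a numerical suffix.
        suffix = "" if index == 1 else f"_{index}"
        files.append((f"{base_name}{suffix}.{extension}", first, last))
        first = last + 1
        index += 1
    return files
```

For example, a 25-page document with a limit of 10 pages would be split into three files holding pages 1-10, 11-20 and 21-25.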
Click Finish to confirm job creation.
Modifying jobs
Jobs with an inactive status can be modified. Select the job
in the left panel of the Batch Manager and choose Modify
from the Edit menu or click the Modify Job button. First,
modify timing instructions as desired. Then the Workflow
Assistant appears with the workflow steps and settings loaded.
Make the desired changes as already described for workflows. See
“Modifying workflows” above.
Managing and running jobs
This is done with the Batch Manager. It presents two panels. The
left panel lists each job, its next run, status and history. The status
will be:
Waiting: Scheduled, but the job start time is in the future.
Running: Processing is currently underway.
Watching: Watching is in progress but there is no processing.
Inactive: Created with the timing instruction Do not start now; or
  any deactivated job.
Expired: Scheduled job, but the start time is in the past.
Collecting: Watching is in progress but the job is waiting for all
  incoming files to arrive.
Paused: The user has paused the job and not yet resumed it.
Closing: A Watch type job is saving its result.
Starting: The status right before Running. Displays when a job is
  just being started or when more jobs are about to run than the
  number of jobs Batch Manager can simultaneously run.
Click on a job and a step-by-step analysis of all pages in the job
appears in the right panel. It shows where input was taken from,
the page status and where output was directed to. Click on a plus
icon to see more information about the page. Click on a minus icon
to hide details. For jobs with the error or warning status, the listing
shows which pages failed or what problems occurred.
Activate Job in the File menu serves to activate any inactive
job immediately.
Deactivate Job in the File menu deactivates any active job. If
the job is running, this will stop it before deactivating.
Choose this to close a Watch type job immediately to save its
result.
Stop Job in the File menu stops a job with status Starting,
Running, or Paused.
Pause Job is available for jobs with status Running or
Starting. To modify such a job’s timing instructions you must
stop it.
Resume Job lets the job continue from its state when it was
paused.
Delete Job in the Edit menu serves to delete the currently
selected job. Only Inactive jobs can be deleted.
Rename Job serves to modify the name of any job.
Use the Edit menu to send a copy of a job’s status report to
Clipboard.
Use Save OPD As... in the File menu to save any intermediate result
of a paused job to an OPD file.
To remove data files storing data of any previously run job, click
Edit, then choose Clear Occurrence. Clear All Occurrences removes
all data for all job occurrences. These two options are useful to free
disk space, but cleared occurrences cannot be viewed any more, so
use these with caution.
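The rules above about which command applies to which job status can be summarized in a small lookup. The sketch below is an illustration only and is not a Batch Manager interface. The guide does not define exactly which statuses count as "active" for Deactivate Job, so the sketch assumes every status except Inactive and Expired; that assumption is marked in the code.

```python
from enum import Enum


class Status(Enum):
    WAITING = "Waiting"
    RUNNING = "Running"
    WATCHING = "Watching"
    INACTIVE = "Inactive"
    EXPIRED = "Expired"
    COLLECTING = "Collecting"
    PAUSED = "Paused"
    CLOSING = "Closing"
    STARTING = "Starting"


# Assumption: "active" means every status except Inactive and Expired.
ASSUMED_ACTIVE = set(Status) - {Status.INACTIVE, Status.EXPIRED}


def available_commands(status):
    """Return the set of job commands the guide describes for a given status."""
    commands = {"Rename Job"}                     # renaming works for any job
    if status is Status.INACTIVE:
        # Only inactive jobs can be activated, modified or deleted.
        commands |= {"Activate Job", "Modify Job", "Delete Job"}
    if status in ASSUMED_ACTIVE:
        commands.add("Deactivate Job")
    if status in {Status.STARTING, Status.RUNNING, Status.PAUSED}:
        commands.add("Stop Job")
    if status in {Status.RUNNING, Status.STARTING}:
        commands.add("Pause Job")
    if status is Status.PAUSED:
        # A paused job can be resumed, or its intermediate result saved.
        commands |= {"Resume Job", "Save OPD As..."}
    return commands
```

Read this way, a running job must first be stopped or deactivated before it can be modified or deleted, because those commands are listed only for the Inactive status.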
The Workflow viewer
The Workflow viewer is integrated into the Batch Manager to the
right of the list of your jobs. Use it to get comprehensive and
detailed information about the processing of each occurrence of the
job. The viewer shows the process in a step-by-step fashion,
following the steps of the workflow. It displays input and output at
each stage. Job results are marked by icons. Drop-down lists give
you information about processing steps.
Watched folders
In OmniPage Professional 16, you can specify watched
folders and e-mail inboxes (Outlook and Lotus Notes) as job
input. These allow processing to be started automatically
whenever image files are placed in pre-defined folders or
arrive in inboxes as e-mail attachments. This is useful for
having sets of files with predictable content that arrive from
remote locations processed automatically on arrival, even if no-one
is in attendance. Typically these are reports or form-like documents
that are delivered repeatedly or at recurring intervals, for example
each week or month.
To use this facility, prepare a set of folders or e-mail folders to be
watched. You should not use these folders for other purposes, not
even for barcode cover page jobs. When setting up such a job,
choose Folder watching job, name it and click Next. In the dialog
box that appears, browse to the folders.
Figure: the watched folder panel. Add a watched folder to the list
using the Browse for Folder dialog box, and specify an image file
type.
Add the desired folders and file types (one type or all types). Click
the checkbox in front of a selected folder to watch its subfolders
as well. To enable a number of file types, add the folder
repeatedly, once for each type.
When you reach the next panel of the Job Wizard, you set the
timing instructions: a starting time and an end time for the
watching to occur. You can specify recurrences, for instance to have
the folder(s) watched only during your lunch hour (Start 12.15, End
13.05) every Monday, Wednesday and Friday, or overnight in the
last three days of each month, when you keep your computer
running to collect and process monthly reports arriving from afar.
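The recurrence examples above amount to a test of whether a given moment falls inside a watch window. The sketch below is an illustration only and is not how the Job Wizard is implemented. The function name is hypothetical, and treating an overnight window as belonging to the weekday on which it opens is an assumption.

```python
from datetime import datetime, time


def in_watch_window(moment, start, end, weekdays):
    """Return True if moment falls inside the watch window.

    weekdays uses datetime.weekday() numbering: Monday is 0, Sunday is 6.
    The end time itself is treated as outside the window.
    """
    now = moment.time()
    if start <= end:
        # Same-day window, for instance a lunch hour.
        return moment.weekday() in weekdays and start <= now < end
    # Overnight window: it opens on a listed weekday and closes the next morning.
    if now >= start:
        return moment.weekday() in weekdays
    if now < end:
        previous_day = (moment.weekday() - 1) % 7
        return previous_day in weekdays
    return False
```

With the lunch-hour example, the call would use time(12, 15), time(13, 5) and the weekday set {0, 2, 4} for Monday, Wednesday and Friday.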
When files enter a watched folder, the program waits for
approximately the interval specified in Batch Manager Options
for more files to arrive, so that they can be processed together.
When files cease to arrive, processing starts.
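The waiting behavior can be pictured as grouping file arrivals into batches separated by quiet periods. The sketch below is a simplified illustration, not the program's actual logic: the function name is hypothetical, arrival times are plain numbers of seconds, and the real interval is only approximate.

```python
def group_arrivals(arrival_times, quiet_interval):
    """Group arrival times (in seconds) into batches that are processed together.

    A batch is closed when no further file arrives within quiet_interval
    seconds of the previous arrival.
    """
    batches = []
    current = []
    for moment in sorted(arrival_times):
        if current and moment - current[-1] > quiet_interval:
            batches.append(current)       # the quiet period elapsed: close the batch
            current = []
        current.append(moment)
    if current:
        batches.append(current)
    return batches
```

With a 30-second interval, files arriving at 0, 5 and 8 seconds would form one batch, and files arriving at 100 and 104 seconds a second one.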
To finish the watching early, choose Deactivate Job. Then you can
modify the job freely.
Watched mailboxes
In OmniPage Professional 16, you can specify watched
mailboxes as job input. These allow processing to be
started automatically whenever image files of specified
file types are placed in pre-defined e-mail folders. This
is useful for having sets of files with predictable content
processed automatically on arrival, even if no-one is in attendance.
The program supports watching Microsoft Outlook and Lotus
Notes mailboxes.
Barcode processing
In OmniPage Professional 16, you can run workflows (sets
of steps and their settings) using barcode cover pages that
define which workflow should run. A barcode cover page
identifies a workflow (with workflow identifier, workflow
name and workflow steps) and contains information on
workflow creation (name of the creator, date of creation,
etc.). Note that barcode processing cannot be recurrent.
There are two ways of doing barcode processing:
Scanner input: Workflow processing is driven by placing the cover
page on top of a document to be scanned and pushing the scanner's
Start button.
Image file input: Job processing is driven by copying the barcode
cover page image into a watched folder that will receive the
document images to be processed.
For scanner input you have to:
1. Create a workflow that contains the processing steps you need
with Scan Images as first step.
2. Print a barcode page that identifies the workflow.
3. Start barcode processing from scanner.
To scan with a barcode page:
1. Place the barcode cover page on the top of the document in the
ADF.
2. Press the Start button on the scanner.
3. Select “Barcode cover page workflow” as Scanner button
default action on the Scanner tab of Options. You can also set it
to Prompt for workflow. If Prompt for workflow is selected in
the Scanner panel, a dialog box appears with the available
choices: Scanning, Barcode cover page workflow, and all
scanning workflows.
The specified workflow processes all available pages, or continues
until a new barcode page is encountered. The result will be saved as
specified by the workflow.
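The way cover pages divide a stack of scanned pages can be sketched as follows. This is an illustration only and not OmniPage code. The function name and the (kind, value) page representation are hypothetical, and skipping pages that arrive before any cover page is an assumption, since the guide does not describe that case.

```python
def split_by_cover_pages(pages):
    """Split a page sequence into (workflow_name, page_ids) batches.

    Each item in pages is either ("cover", workflow_name) for a barcode
    cover page or ("page", page_id) for an ordinary document page.
    """
    batches = []
    current_workflow = None
    current_pages = []
    for kind, value in pages:
        if kind == "cover":
            # A new cover page ends the previous batch and starts another.
            if current_workflow is not None:
                batches.append((current_workflow, current_pages))
            current_workflow = value
            current_pages = []
        elif current_workflow is not None:
            current_pages.append(value)
        # Assumption: pages seen before any cover page are skipped.
    if current_workflow is not None:
        batches.append((current_workflow, current_pages))
    return batches
```

A stack made of an "Invoices" cover page, two pages, a "Reports" cover page and one more page would yield two batches, one per workflow.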
For image input you must create a barcode cover page job.
A barcode cover page job uses a special kind of watched folder.
Always use a separate folder for barcode processing. The starting
time for the workflow is defined by the moment the barcode cover
page enters a watched folder.
For barcode cover page job processing you need to:
1. Create a workflow that contains the processing steps you need.
Select Load Files as input with “Select files for loading each
time this workflow is started” selected.
2. Save a barcode cover page that identifies the workflow...
3. Define timing instructions for barcode folder watching in the
Batch Manager by creating a barcode cover page job.
To process with a barcode cover page job:
1. Make sure that the job is running at the required time.
2. The folder is being monitored and the workflow will be started
as soon as a barcode cover page is placed in the specified
watched folder.
3. The workflow will process image files arriving in the folder
after the cover page.
4. The workflow will be completed at the specified end time of
the job, or each time a new barcode cover page is detected.
You can copy the barcode cover page image and the image files into
the watched barcode folder yourself, or direct others to do this. You
can also place just a barcode cover page image file in the watched
folder, then have a network scanner make and send image files
there.
File-it Assistant
The File-it Assistant lets you create scanning workflows for
repeated document conversion tasks. The Assistant is for scanning
jobs that require no user interaction during the processing. In a
typical scenario operators at a scanning station prepare documents,
applying the appropriate cover page to each, without needing to
know anything about the later processing or destination of the
documents, because all that is pre-determined. Associate a button
on your scanner with OmniPage and print a barcode cover page to
identify your workflow. As a result, you can scan, convert and save
without interaction beyond pressing the scanner button.
Create the workflow:
1. Select File-it Assistant from the Tools menu.
2. Name your workflow, choose an output file type, location and
file name.
3. Review and optionally change the workflow settings.
4. Print the barcode cover page.
5. Associate OmniPage with a scanner button (must be done only
once) in the Control Panel.
Use the workflow:
1. Place the printed barcode cover page on top of a document in
your scanner.
2. Push the OmniPage-associated scanner button. The document
will be converted using workflow settings and sent to the
location you defined.
It is possible to use barcode cover pages stored as image files to
drive jobs from watched folders. Such jobs permit interactive steps
like manual zoning and proofing that are not available via the
File-it Assistant.
Technical information
This chapter provides troubleshooting and other technical
information about using OmniPage 16. Please also read the online
Readme file and other help topics, or visit the Nuance web pages.
Troubleshooting
Although OmniPage is designed to be easy to use, problems
sometimes occur. Many of the error messages contain
self-explanatory descriptions of what to do: check connections,
close other applications to free up memory, and so on.
Please see your Windows documentation or OmniPage online Help
for information on optimizing your system and application
performance.
On supported file formats, see the Online Help.
Solutions to try first
Try these solutions if you experience problems starting or using
OmniPage:
• Make sure that your system meets all the listed
  requirements. See the Installation and setup chapter.
• Make sure that your scanner is plugged in and that all
  cable connections are secure.
• Visit the support section of Nuance’s web site at
  www.nuance.com. It contains Tech Notes on commonly
  reported issues using OmniPage. Our web pages may also
  offer assistance on the installation process and
  troubleshooting.
• Use the software that came with your scanner to verify
  that the scanner works properly before using it with
  OmniPage.
• Make sure you have the correct drivers for your scanner,
  printer, and video card. Visit Nuance’s web page through
  the Help menu and consult its scanner section for more
  information.
• Defragment your hard disk. See Windows online Help for
  more information.
• Uninstall and reinstall OmniPage, as described in the
  section “Uninstalling the software” in the Installation and
  setup chapter.
Testing OmniPage
Restarting Windows 2000, XP or Vista in its safe mode allows you
to test OmniPage on a simplified system. This is recommended
when you cannot resolve crashing problems or if OmniPage has
stopped running altogether. See Windows online Help for more
information.
To test OmniPage in safe mode:
1. Restart your computer in safe mode by pressing F8
immediately after you see the ‘Starting Windows’ message.
2. Launch OmniPage and try performing OCR on an image. Use a
known image file, for instance one of the supplied sample image
files.
• If OmniPage does not launch or run properly in safe mode,
then there may be a problem with the installation.
Uninstall and reinstall OmniPage, and then run it in
Windows safe mode.
• If OmniPage runs in safe mode, then a device driver on
your system may be interfering with OmniPage operation.
Troubleshoot the problem by restarting Windows in
Step-by-Step Confirmation mode. See Windows online
Help for more information.
Text does not get recognized properly
Try these solutions if any part of the original document is not
converted to text properly during OCR:
• Look at the original page image and ensure that all text
  areas are enclosed by text zones. If an area is not enclosed
  by a zone, it is generally ignored during OCR. See the
  section on creating and modifying zones, in the
  “Processing documents” Chapter.
• Make sure text zones are identified correctly. Reidentify
  zone types and contents, if necessary, and perform OCR on
  the document again. See “Zone types and properties” in the
  “Processing documents” Chapter.
• Be sure you do not have an unsuitable template loaded by
  mistake. If zone borders cut through text, recognition is
  impaired.
• Adjust the brightness and contrast sliders in the Scanner
  panel of the Options dialog box. You may need to
  experiment with different settings combinations to get the
  desired results.
• Use the Image Enhancement Tools to optimize your image
  for OCR.
• Check the resolution of the original image. Hover the
  cursor over a page thumbnail for a popup display. If the
  resolution is significantly above or below 300 dpi,
  recognition is likely to suffer.
• Make sure the correct document languages are selected in
  the OCR panel of the Options dialog box. Only languages
  included in the document should be selected.
• Turn IntelliTrain on and make some proofing corrections.
  This is most likely to help with stylized fonts or uniformly
  degraded documents. If IntelliTrain was running, try
  turning it off. On some types of degraded documents it
  may not be able to help.
• Do some manual training, or edit existing training to
  remove unsuccessful training.
• If you use True Page as the Text Editor view or for export,
  recognized text is put into text boxes or frames. Some text
  may be hidden if a text box is too small. To view the text,
  place the cursor in the text box and use the arrow keys on
  your keyboard to scroll to the top, bottom, left, or right of
  the box.
• Check the glass, mirrors, and lenses on your scanner for
  dust, smudges or scratches. Clean if necessary.
Problems with fax recognition
Try these solutions to improve OCR accuracy on fax images:
• Ask senders to use clean, original documents if possible.
• Ask senders to select Fine or Best mode when they send
  you a fax. This produces a resolution of 200 x 200 dpi.
• Ask senders to transmit files directly to your computer via
  fax modem if you both have one. You can save fax images as
  image files and then load them into OmniPage. See “Input
  from image files” in the Processing documents Chapter.
System or performance problems during OCR
Try these solutions if a crash occurs during OCR or if processing
takes a very long time:
• Check image quality. Consult your scanner documentation
  on ways to improve the quality of scanned images.
• Break complex page images (lots of text and graphics or
  elaborate formatting) into smaller jobs. Draw zones
  manually or modify automatically created zones and
  perform OCR on one page area at a time. See “Working
  with zones” in the Processing documents Chapter.
• Restart Windows 2000, XP or Vista in safe mode and test
  OmniPage by performing OCR on the included sample
  image files.
If you are performing multiple tasks at once, such as recognizing
and printing, OCR may take longer.
Supported file types
Supported image file formats for loading are TIFF, PCX, DCX, BMP,
JPEG, JB2, JP2, GIF, PNG, XIFF, MAX, PDF, XPS.
Supported file types for saving recognition results as text are:
HTML 3.2, 4.0
Microsoft Excel 97, 2000, XP, 2003, 2007
Microsoft PowerPoint 97
Microsoft Publisher 98
Microsoft Word 97, 2000, XP, 2003 (WordML), 2007
OmniPage Documents
PDF (Normal), Edited, with image on text, with image
substitutes
RTF Word 6.0/95, RTF Word 97, RTF Word 2000, RTF 2000
ExactWord
WordPad
WordPerfect 12, X3
Text, Text with line breaks, Text - Formatted, Text - Comma
Separated
Unicode Text, Unicode Text with line breaks, Unicode Text -
Formatted, Unicode Text - Comma Separated
Wave Audio Converter (to save recognized text being read aloud)
In OmniPage Professional 16 there is also support for: eBook,
Microsoft InfoPath (for forms), Microsoft Reader, and XML.
92
Chapter 7
Index
Symbols
Auto-zoning
69
Numerics
3D Deskew
B
Backgrounds for zoning
37
A
Accuracy
improvement 30, 52,
89
influence of brightness
31
influence of training 52
scanning influence 30
Acquire Text menu items
28
Activating OmniPage 15
Adding
to zones 42
training to training files
53
words to user dictionary
48
ADF 29, 31
Advanced saving options
67
Advice on problems 87
Alphanumeric zones 40
Attachments to mail 70
Auto-detect layout 32
Automatic Document
Feeder (ADF) 29,
31
32, 41
Automatic training 53
Auto-sending by mail 70
39
Basic processing steps
18
Batch Manager 76
Black-and-white
images 64
scanning 30
Blacking out confidential
words 57
Bold text 54
Boxes 55
Boxes for recognized text
90
Brightness 31, 89
Brightness / Contrast (E)
36
Bring to Front tool (F)
C
61
Color
images 64
markers 49
scanning 31
Comb tool (F) 61
Comparing recognized
words with originals
49
Composition of workflows
71
Contrast 31, 89
Convert Now Wizard 69
Converters multiple 67
Converting from PDF 69
Converting image files 73
Copying pages to Clipboard 70
Creating
training data 53
workflows 75
Crop (E) 37
Custom Layout 33
Custom views 18
Customizing export converters 67
Changing
part of a page 57
reading order 56
Changing views 18
D
Character attributes 54
Deleting
Character Map 50
jobs 80
Characters, suspect 47
training files 53
Checkbox tool (F) 61
user dictionaries 51
Checking OCR results 49
Describing document layCircle text tool (F) 61
out 32
Classic View 18
Deskew (E) 37
Clipboard 70
OmniPage 16 User’s Guide
93
Deskewing digital camera
37
Desktop 18
Desktop launching of
workflows 73
Despeckle (E) 37
Dictionaries 48
Digital camera input 30,
37
Direct OCR 27
Disabling job running 78
Disk space 9
Document Layout, Form
33
Document Manager
19
18,
Document Ready button
72
Document-to-document
conversion 32, 38
Documents
copying to Clipboard 70
double-sided 32
exporting 63
in OmniPage 17
layout description 32
saving 63
with varied layout 32
Double-sided documents
31
Drawing zones in Direct
OCR 29
Dropout color (E) 37
Dropping graphics from
export 65
Dual screens 20
Duplex scanners 31
94
Index
Dynamic verifier
49
E
Extracting items from
OPDs 17
Editing
character attributes 54
form objects 61
graphics 55
in True Page 55
on-the-fly 57
paragraph attributes 54
PDF output 69
recognized text 54
tables 43, 55
training files 54
user dictionaries 51
E-mail notification 76
Embedding items in OPDs
F
Embedding templates in
OPD files 44
Enabling OmniPage taskbar icon 73
Error messages for jobs
suspect words 48
Finishing
proofing in a workflow
17
79, 80
Excel 2007 (XLSX) 91
Export converters 67
Export Results button 65
Exporting
graphics 65
in Flowing Page 66
in True Page 66
repeated 63
to Clipboard 70
to file 65
to mail 70
to PDF 69
Extracting form data 62
Fax recognition 90
Features, new 7
File-it Assistant 85
Files
as export target 64
as image source 29
retained on uninstall 15
separation options 65
types for export 65
Fill (E) 38
Fill text tool (F) 60
Finding
non-dictionary words
48
72
workflows 75
zoning in a workflow 72
Flexible View 18, 20
Flowing Page 66
Form Arrangement Toolbar 61
Form data, extracting 62
Form Drawing Toolbar
60
Form zone 41
Formatted Text view 48,
66
Formatting levels
65
48,
Formatting toolbar
18
Frames
55, 66, 90
G
Graphic tool (F) 60
Graphic zone 41
Graphics
editing 55
in export 65
Grayscale
images 64
scanning 31
Grouping elements 55
H
Header/footer indicators
47
Hearing texts read aloud
59
Hiding / showing markers
47
Image Panel 18
Image toolbar 18
Images
backgrounds 39
black-and-white 64
color 64
editing 55
grayscale 64
quality 31
resolution 64, 89
saving 64
substitutes in PDF 69
Improving accuracy 30,
53, 89
Increasing memory 89
Input
from image file 29
from PDF files 29
Input from digital camera
30
Highlighting text 57
Horizontal Alignment tools Installing
OmniPage 10
(F) 61
scanners 11
Hue / Saturation (E) 37
IntelliTrain 53, 90
Hyperlinks 55
Italic text 54
I
Ignore backgrounds 39
Ignore zones 41
Image Enhancement
history 38
in workflows 39
tools 35
Image files
conversion 73
input 29
reading order 29
samples 88
J
Jobs
disabling 78
error messages
80
79,
managing 79, 80
modifying 78
page limit 78
recurring 82
running 79, 80
status 79, 80
timing instructions
Joining zones 42
82
L
Languages 52, 90
Launch
target application 65
workflows from desktop
73
Layout description 32
Layout retention 48
Layout, auto-detect 32
Legal dictionaries 48
Legal documents 33
Line tool (F) 60
Links to web pages 55
Loading
Image Enhancement
templates 38
image files 29
training files 53
user dictionaries 51
zone templates 34,
44
M
Mail 70
Mailbox watching 82
Managing jobs 79, 80
Manual training 52
Manual zoning 39
Marked words in Editor
47
Markers 47, 49
Marking text 57
Medical dictionaries 48
Memory requirements 9,
89
OmniPage 16 User’s Guide
95
Microsoft Outlook 70
Microsoft Word, opening
PDF files in 69
Minimum system requirements 9
Modifying
image quality 34
jobs 78
tables 43, 55
zone templates 45
zones 42
MRC compression 69
Multicolumn areas 55
Multi-page image files 64
Multiple column pages
33
Multiple converters
67
N
New features 7
Non-dictionary words 47
Non-printing characters
47
Numeric zones
40
O
OCR
Batch Manager 76
checking OCR results 49
Direct OCR 27
poor performance in 91
proofreading results 48
settings for Direct OCR 27
OCR Brightness (E) 37
OCR image 34
OmniPage
activating 15
documents in 17
earlier versions 10
installing 10
new features 8
reinstalling 15
starting 11
testing 88
uninstalling 15
OmniPage Desktop 18
OmniPage Documents 17
saving as 63
OmniPage Professional 8
OmniPage Toolbox 19
OmniPage Workflow Starter 73
On-the-fly editing 57
OPD files
embedding items 17
extracting items 17
template embedding 44
Opening image files 29
Optimizing brightness 31
Options dialog box 24
Options for proofing 48
Options for saving 67
Order of page elements 56
Original image saving 64
Overview of processing steps 18
P
Page limit for jobs 78
Page Ready button 72
Pages
copying to Clipboard 70
multi-page image files 64
navigation 19
sending as mail 70
PaperPort 16, 24
Paragraph
editing attributes 55
styles 55, 65
Pausing workflows 73
PDF Edited 68
PDF file input 29
PDF Flavors 68
PDF, converting from/to 69
Pending pages 57
Performance problems during OCR 91
Plain Text view 66
Pleading numbers 33
Pointer (E) 36
PowerPoint 2007 (PPTX) 91
Preprocessing images 34
Primary image 34
Primary/OCR Image (E) 36
Problems with faxes 90
Process backgrounds 39
Process zones 41
Processing
basic steps of 18
from other applications 27
manual 27
step-by-step 27
steps, overview 18
with workflows 72
Professional dictionaries 48
Professional version 8
Proofing
in a workflow 72
options 48
Properties of zones 40
Purpose of training 52
Purpose of workflows 71
Q
Quality of images 31
Quick Convert View 18, 21
R
Reading order 56
Reading text aloud 58
RealSpeak 58
Recognition
accuracy 31, 52, 89
languages 52, 90
problems with faxes 90
saving results 65
speeding up 90
Rectangle tool (F) 60
Recurring jobs 82
Redacting text 57
Registering applications for Direct OCR 28
Reinstalling OmniPage 15
Removing zone templates 45
Repeated exporting 63
Replacing zone templates 45
Resolution 64, 89
Resolution (E) 37
Retaining paragraph styles 65
Re-training 52
Rotate (E) 37
Running
Batch Manager jobs 78
workflows 72
S
Safe mode 88
Sample image files 88
Saving
and launching 65
as OmniPage Document 63
documents 63
options 67
original images 64
recognition results 65
text 65
to file 64
to multiple file types 67
training files 53
user dictionaries 51
zone templates 45
Saving and applying Image Enhancement templates 38
Scanners 90
drivers 12
duplex 31
setting up 11
Scanning 30, 31
input from 31
pictures 31
Wizard 11
Scheduled processing 76
Searchable PDF 68
Searching PDF output 69
Select Area (E) 36
Selection tool (F) 60
Send to Back tool (F) 61
Sending pages by mail 70
Setting up a scanner 11
Setting up Direct OCR 28
Settings
Acquire Text 28
for Direct OCR 28
Options dialog box 24
zone types 43
Simplified UI 21
Single-column pages with tables 33
Slow recognition 91
Smart folders 81, 82
Solutions for poor performance 87
Spreadsheet pages 33
Standard toolbar 19
Starting a user dictionary 51
Starting Batch Manager 76
Starting the program 11
Status of jobs 79, 80
Step-by-step processing 18
Stopping workflows 73
Storing zoning changes 57
Striking out text 57
Suggestions in proofing 48
Suspect words 47
Synchronize Views (E) 36
System or performance problems during OCR 91
System requirements 9
T
Table tool (F) 61
Tables
editing 55
editing dividers 43
in single column pages 33
in Text Editor 55
removing dividers 43
rows in 43
zones 41, 43
Taskbar workflow icon 73
Technical information 87
Template zones 34, 44, 89
Template, form 62
Templates in OPDs 44
Testing OmniPage 88
Text Editor 18, 47
Text saving 65
Text tool (F) 60
Text-to-Speech facility 59
Thumbnails 18, 19
Timing of jobs 82
Toolbar docking / floating 49
Training 52
automatic 53
IntelliTrain 53
manual 52
training files 54
Troubleshooting 87
True Page editing 55
True Page export 66
True Page view 48
TWAIN scanner drivers 12
Types of zones 40
U
Underlined text 54
Ungrouping elements 55
Uninstalling the software 15
Unloading
training files 53
user dictionaries 51
zone templates 45
URLs 55
User dictionaries 48, 51
Using Direct OCR 28
V
Verifying text 49
Vertical Arrangement Tools (F) 61
Views 18
Formatted text 47
Plain Text 47
True Page 48
W
Watched folders 81, 82
Watched mailboxes 82
Web page links 55
Wizard for scanner setup 11
Word 2007 (DOCX) 91
Workflow Assistant 27, 74
Workflows
composition 71
creating 75
finishing 75
for form data extraction 62
pausing and stopping 73
running 72
taskbar icon 73
user interaction 72
Working with zones 42
X
XPS 91
Z
Zones
adding to 42
alphanumeric 40
changing types 41
deleting templates 44
graphic 41
ignore 41
in Direct OCR 29
irregular 42
joining 42
manual 39, 89, 91
modifying templates 44
numeric 40
process 41
properties 40
replacing templates 44
saving templates 44
table 41, 43
templates 34, 44, 89
types 40, 89
unloading templates 45
working with 42
Zoning in a workflow 72
Zoning on-the-fly 57
Zoom (E) 36
Zooming displays 19, 49
(E)=Image Enhancement Tool
(F)=Form Drawing or Arrangement Tool (Professional only)
THIRD PARTY LICENSES/NOTICES
The Independent JPEG Group's software, copyright © 1991-1995, Thomas G. Lane.
This software is based, in part, on the work of the Independent JPEG Group,
Colosseum Builders, Inc., the FreeType Team, and Catharon Productions, Inc.
Zlib copyright © 1995-1998 Jean-loup Gailly and Mark Adler.
This product was developed using Kakadu software.
The word verification, spelling and hyphenation portions of this product are based in
part on Proximity Linguistic Technology.
The Proximity Hyphenation System © Copyright 1988. All Rights Reserved.
Proximity Technology Inc.
The Proximity/Merriam-Webster American English Linguibases. © Copyright 1982,
1983, 1987, 1988 Merriam-Webster Inc. © Copyright 1982, 1983, 1987, 1988 Proximity
Technology Inc. Words are checked against the 116,000, 80,821, 92,641, 106,713,
118,533, 91,928, 103,792, 130,690, and 140,713 word Proximity/Merriam-Webster
Linguibases. The Proximity/Collins British English Linguibases. © Copyright 1985
William Collins Sons & Co. Ltd. Legal and Medical Supplements © Copyright 1982
Merriam-Webster Inc. © Copyright 1982, 1985 Proximity Technology, Inc. Words
are checked against the 80,307, 90,406, 105,785, and 115,784 word Proximity/Collins
Linguibases. The Proximity/Collins French, German, Italian, Portuguese (Brazilian),
Portuguese (Continental), Spanish Linguibases. © Copyright 1984, 1985, 1986, 1988
William Collins Sons & Co. Ltd. © Copyright 1984, 1985, 1986, 1988 Proximity
Technology, Inc. Words are checked against the 136,771, 150,893, 178,839, 207,119,
212,565, and 194,393 word Proximity/Collins Linguibases. The Proximity/Van Dale
Dutch Linguibase. © Copyright 1987 Van Dale Lexicografie bv. © Copyright 1987
Proximity Technology, Inc. Words are checked against the 119,614 word Proximity/
Van Dale Linguibase. The Proximity/Munksgaard Danish Linguibase. © Copyright
1988 Munksgaard International Publishers Ltd. © Copyright 1988 Proximity
Technology Inc. Words are checked against the 113,000 word Proximity/
Munksgaard Linguibase. The Proximity/IDE Norwegian and Swedish Linguibases.
© Copyright 1988 IDE a.s. © Copyright 1988 Proximity Technology Inc. Words are
checked against the 126,123 and 150,000 word Proximity/IDE Linguibases. Esperanto
dictionary based on compilation by Toon Witkam and Stefan MacGill.
Part of this software is derived from the RSA Data Security, Inc. MD5 Message-Digest
Algorithm. AES encryption/decryption copyright © 2001, Dr Brian Gladman,
Worcester, UK.
© Nuance Communications, Inc., 2008. All rights reserved. Subject to change without
prior notice.