HP Autonomy OCR for WorkSite - data sheet
Data sheet
HP Autonomy OCR
for WorkSite
Find all hidden information across
the enterprise.
In today’s enterprise environment, often the most critical
information is the 10% of information that has not been indexed.
There are thousands of valuable documents such as contracts,
signed agreements, court documents, and other scanned content
that are not full text searchable because they were created by
processes that do not include character recognition capabilities.
Image-only scanned documents can be generated by:
• Ad-hoc desktop scanners that do not have the ability to generate
text for indexing
• 3rd parties who provide non-OCRed scanned content as email
• Desktop OCR processes that lack enterprise throughput, failover,
and error handling
• Internet research downloads or imports
These image-only documents create a huge challenge, as this
information is only visible via navigation and metadata searches
in WorkSite; image-only documents are not returned in full text
search result sets because their content has not been indexed by
IDOL. Not only does this represent a large, unquantifiable risk, it
also renders many critical documents, a rich source of business
information, underutilized.
Powered by HP Autonomy IDOL, the Optical Character Recognition
(OCR) module extracts the content from these documents into
the IDOL index collection, making it searchable and enabling your
organization to fully leverage the benefits of this once-lost content.
Low cost of ownership
The plug-in nature of the module allows organizations to leverage
existing IDOL infrastructure, eliminating the need for workstations
or software on the desktop. Zero desktop footprint also means less
maintenance for IT overall.
Powerful server-side processing
Installed as a back end service, the OCR module performs two
important functions:
• Back file OCR
The OCR module identifies image-only WorkSite documents,
generates OCR text and indexes the content
• OCR all incoming documents
As part of the indexing process, the OCR module continuously
extracts text from new and revised WorkSite documents
No more hidden documents
Extracting the content from the image files, including email
attachments, enables smart business decisions by providing the
correct users with search access to this critical content.
Important enterprise knowledge captured in these documents can
be found, regardless of how users search within WorkSite.
Fast and powerful seamless automation
Since the HP Autonomy OCR module is an IDOL plugin leveraging
the existing IDOL indexing process, no middleware processes are
involved, resulting in fast OCR processing. Documents can be made
available for searching in minutes.
Image files added to WorkSite, and any files already present there,
automatically become searchable as part of the normal IDOL
indexing process, without any additional input or work from the end
user or IT staff.
Key benefits
• Boosts efficiency by finding hidden content, including email attachments
• Seamless server-side integration with WorkSite; no desktop footprint
• Powered by IDOL, with support for over 1000 file types and over 120 languages
• Shares existing IDOL infrastructure
About HP Autonomy
HP Autonomy is a global leader in software that processes human
information, or unstructured data, including social media, email,
video, audio, text, and web pages. Autonomy’s powerful
management and analytic tools for structured information, together
with its ability to extract meaning in real time from all forms of
information, regardless of format, are a powerful resource for companies
seeking to get the most out of their data. Autonomy’s product
portfolio helps power companies through enterprise search analytics,
business process management and OEM operations. Autonomy also
offers information governance solutions in areas such as eDiscovery,
content management and compliance, as well as marketing solutions
that help companies grow revenue, such as web content management,
online marketing optimization and rich media management.
Please visit autonomy.com to find out more.
Accuracy across formats and languages
IDOL’s ability to understand content in over 120 languages gives the
HP Autonomy OCR module the power to extract information from
practically any document regardless of its origin or language with an
unparalleled level of accuracy.
OCR is performed on all graphic files and documents (PDF, TIFF, JPEG
or GIF) regardless of size - a document can be one page or a collated
set. Also, the OCR process is performed in place so document
integrity is always maintained.
Copyright © 2013 HP Autonomy. All rights reserved. Other trademarks
are registered trademarks and the properties of their respective owners.
20130204_RL_DS_HP_ Autonomy_OCR_WorkSite