Fulltext SDK vs. Data Capture SDK

Intro

This is a feature and usage scenario comparison between FineReader Engine and FlexiCapture Engine.

  • FineReader Engine is a traditional OCR toolkit, designed for converting images and image PDFs into plain text, Office formats, HTML, searchable PDFs or XML. Several options control how the layout of the original document is analyzed and restored.
  • FlexiCapture Engine is based on the same core ABBYY OCR technology, but the purpose of this SDK is document separation, classification and data extraction. FlexiLayout technology uses both the document layout and the textual information to locate the relevant areas, so that the data required for a business process can be extracted.
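
The idea of locating data through layout and text together can be sketched in a few lines of Python. This is only an illustration of the principle, not FlexiLayout itself and not any ABBYY API: the `Word` type, the pixel coordinates and the `find_value_right_of` helper are hypothetical names invented for this example. A label such as "Total" serves as the key element, and the value is taken from the nearest matching word to its right on the same text line. If the label occurs more than once, each occurrence is tried in turn.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Pattern


@dataclass
class Word:
    """One recognized word with its position (hypothetical OCR output)."""
    text: str
    x: int  # left edge in pixels
    y: int  # vertical position of the text line in pixels


# An amount such as "129,90" or "129.90" (assumed format for this example).
AMOUNT = re.compile(r"^\d+[.,]\d{2}$")


def find_value_right_of(words: List[Word], label: str,
                        pattern: Pattern[str], max_dy: int = 5) -> Optional[str]:
    """Return the nearest word right of `label` on the same line that matches `pattern`."""
    anchors = [w for w in words if w.text.lower().rstrip(":") == label.lower()]
    for anchor in anchors:  # the label may not be unique, so try every occurrence
        candidates = [
            w for w in words
            if abs(w.y - anchor.y) <= max_dy   # same text line
            and w.x > anchor.x                 # to the right of the label
            and pattern.match(w.text)          # looks like the expected value
        ]
        if candidates:
            return min(candidates, key=lambda w: w.x - anchor.x).text
    return None


words = [
    Word("Total", 40, 50), Word("pages:", 100, 50), Word("2", 160, 50),
    Word("Invoice", 40, 100), Word("No:", 120, 100), Word("4711", 180, 100),
    Word("Total:", 40, 400), Word("EUR", 120, 401), Word("129,90", 170, 402),
]
total = find_value_right_of(words, "Total", AMOUNT)
```

The first "Total" (from "Total pages") has no amount next to it, so the search falls through to the second occurrence. A real data capture engine weighs many such hypotheses at once; this sketch simply takes the first one that fits.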

Here is a brief overview of how the document analysis makes a difference:

Detailed Feature Comparison

Usage Scenarios

Application Areas
  FineReader Engine: Intelligent document conversion into text, Microsoft Office formats, HTML, XML and PDF; full text recognition
  FlexiCapture Engine: Intelligent data capture, data extraction of specific fields of interest

Document Types
  FineReader Engine: Any kind of documents, e.g. books, magazines, letters, manuals, business documents etc.
  FlexiCapture Engine:
    - Structured documents, e.g. fixed forms, multiple-choice tests, questionnaires
    - Semi-structured documents, e.g. invoices, shipping documents, passports
    - Unstructured documents, e.g. letters, contracts

Further Processing
  FineReader Engine:
    - Access to full textual information (search)
    - Document conversion for further editing
    - Document conversion for business process streamlining (searchable PDFs instead of paper)
    - Document archiving
  FlexiCapture Engine:
    - Extracted data is used in further business processes and applications
    - Document routing and workflow integration
    - Document archiving

Core Recognition

Image Pre-Processing (noise removal, deskew, despeckle, etc.)
  FineReader Engine: Yes
  FlexiCapture Engine: Yes

OCR in 190 Languages
  FineReader Engine: Yes
  FlexiCapture Engine: Yes

Chinese, Japanese, Korean OCR
  FineReader Engine: Add-on
  FlexiCapture Engine: Not available in FCE V8.0; available as add-on in FCE V9.0

ICR in 114 Languages
  FineReader Engine: Yes, with manual definition of the recognition area
  FlexiCapture Engine: Yes, recognition areas are defined via Fixed Form Templates and/or found by FlexiLayouts

Barcode Recognition
  FineReader Engine: 1D barcodes included; 2D: PDF 417 barcode included
  FlexiCapture Engine: 1D and 2D support included

Document Processing Technologies

Document Separation
  FineReader Engine:
    - No built-in document separation
    - Document splitting through own logic and code possible
  FlexiCapture Engine:
    - "Simple" automatic separation, based on blank pages and barcodes
    - "Advanced" separation, based on multi-page document definitions: Fixed Form Templates and FlexiLayouts

Document Classification
  FineReader Engine:
    - No built-in document classification
    - Own logic required, based on OCR results
  FlexiCapture Engine: Documents can be classified automatically via Fixed Form Templates and FlexiLayouts

Document and Layout Analysis
  FineReader Engine: Focus: detection of all text on the document pages, layout retention. Automatic analysis of the page/document layout; location and identification of text blocks, text columns, tables, images or barcodes. Full access ...
  FlexiCapture Engine: Focus: "own" document analysis, based on FlexiLayout technology. Key elements for orientation are detected (specific text strings, lines, white space, barcodes), then the relevant data is located and extracted. Identification based on regular ...

Access to Required Data
  FineReader Engine:
    - Own logic, based on OCR results
    - No built-in data access logic
  FlexiCapture Engine: FlexiLayouts internally deal with different hypotheses and uncertainties, for example when the search elements are not unique or OCR errors occur
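
To make the separation rows above more concrete, here is a minimal sketch of what "own logic" for splitting a scanned batch by blank pages and barcodes might look like. It assumes the pages have already been recognized; the `Page` type and `split_documents` function are hypothetical names for this example and do not belong to either SDK. The rules chosen here (a blank page ends the current document and is dropped, a page carrying a barcode starts a new document) are one possible convention, not the engines' documented behavior.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Page:
    """One recognized page (hypothetical structure, filled from OCR results)."""
    text: str
    barcode: Optional[str] = None


def split_documents(pages: List[Page]) -> List[List[Page]]:
    """Split a batch of pages into documents using blank pages and barcodes."""
    documents: List[List[Page]] = []
    current: List[Page] = []
    for page in pages:
        is_blank = not page.text.strip() and page.barcode is None
        if is_blank:
            # A blank page acts as a separator sheet and is not kept.
            if current:
                documents.append(current)
                current = []
            continue
        if page.barcode is not None and current:
            # A barcode marks the first page of a new document.
            documents.append(current)
            current = []
        current.append(page)
    if current:
        documents.append(current)
    return documents


batch = [
    Page("Invoice 1001", barcode="INV"),
    Page("Invoice 1001, page 2"),
    Page("   "),                      # blank separator sheet
    Page("Dear Sir or Madam ..."),
    Page("Order 77", barcode="ORD"),
    Page("Order 77, page 2"),
]
docs = split_documents(batch)
page_counts = [len(d) for d in docs]
```

With this batch, the invoice, the letter and the order end up as three separate documents. Classification could then follow as a second step on each document's text, again with logic of one's own.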

Verification

Verification
  FineReader Engine:
    - Full text verification
    - Layout optimization
  FlexiCapture Engine:
    - Data verification
    - Group and context verification

Visual Components
  FineReader Engine:
    - V8.0: No, only data via API
    - V9.0: Yes
    - V10: Yes
  FlexiCapture Engine:
    - V8.0: No, only data via API
    - V9.0: Yes

Export

Export Formats to Backend Systems
  FineReader Engine: Document export for digital editing
    - Text, RTF, HTML
    - DOC(X), XLS(X), PPT(X)
    - XML with character, layout and formatting info
  FlexiCapture Engine: Data export to backend applications
    - Text, CSV, XLS, DBF
    - XML with data, field types, field position, errors, character recognition quality information
    - ODBC export via own application

Export Formats for Archiving
  FineReader Engine:
    - BMP, TIFF, JPEG, JPEG2000, PNG
    - Searchable PDF, PDF/A, MRC PDFs
  FlexiCapture Engine:
    - BMP, TIFF, JPEG, JPEG2000, PNG
    - Searchable PDF, PDF/A, MRC PDFs

Development

SDK Integration and Project Setup
  FineReader Engine: Set of DLLs that are used to integrate FineReader Engine functionality into custom solutions or applications
  FlexiCapture Engine: Two-tier development:
    a) Development of the document description and data extraction logic in FlexiCapture Standalone Pro (= Fixed Form Templates and FlexiLayouts)
    b) FlexiCapture Engine DLLs, for integration and document processing in custom solutions or applications

API Extension
  FineReader Engine: FineReader Engine API only (no FCE extension possible)
  FlexiCapture Engine: It is possible to license the FineReader Engine API within FlexiCapture Engine. The new adjusted licensing scheme of ABBYY Europe includes the FineReader Engine API in Version 10. 1)
Summary

FineReader Engine:
  - Access to the layout and text recognition results
  - Sophisticated document export to a variety of formats
FlexiCapture Engine:
  - Technology for separation and classification of documents and extraction of relevant information
  - Data export and easy-to-use document export

1) The licensing scheme can differ in other regions. Please contact your local ABBYY office or distributor for details.