
IDP Description

Overview

Retarus Intelligent Document Processing (IDP) is an advanced document processing platform designed to streamline the conversion of unstructured business documents (such as PDF, PNG, JPEG or TIFF) into structured XML format.

The IDP platform leverages cutting-edge technologies, including OCR (Optical Character Recognition), trained AI models, and a combination of ML (Machine Learning) and DL (Deep Learning) to extract relevant data from incoming business documents such as invoices and orders, thus making them available for further automated processing.

IDP offers a robust and efficient solution, whether you are dealing with a high volume of unstructured incoming documents or need to integrate seamlessly with your ERP (Enterprise Resource Planning) system.

System Architecture

[Figure: IDP system architecture diagram]

Target Audience

The IDP solution caters to organizations across various industries that handle a significant volume of unstructured invoices, orders, and order confirmations, automating data entry and reducing manual effort in document processing.

Benefits

  • Time Savings: Minimize manual data entry and accelerate document processing.

  • Accuracy: Reduce errors associated with manual input.

  • Scalability: Handle large volumes of documents effortlessly.

  • Cost Efficiency: Decrease reliance on human resources for data extraction.

  • Easy Integration with Customer Processes: Validate and enhance processing output with the customer's master data.

Supported incoming document formats

  • PDF

  • PNG

  • JPEG

  • TIFF

Supported document types

  • Invoice

  • Order

  • Order confirmation

  • Delivery Note

Key Features

Document processing

  • Standardized Import. Customers upload unstructured business documents to the IDP platform through the standard channels (mail, fax, API call) or by manual upload. Each incoming document is uniquely identified on the platform and associated with a customer. Security mechanisms protect against misuse and keep data confidential, preventing unauthorized third parties from accessing customer information.

  • Automated Data Extraction (AI). IDP employs OCR technology to extract text, numeric, and date information from uploaded images and PDFs. Document type-specific AI models enhance data capture by inferring the meaning of the extracted data, even from complex or imperfect layouts. Each automatically captured document element is assigned a confidence level, expressed as a percentage, that indicates how certain the platform is that the captured value is accurate.

  • Document Validation. Extracted data is checked against a set of configurable business rules to ensure data quality, e.g., whether a field value makes sense in the context of the surrounding fields in the document, or whether the value is possible at all. Validators can be customized for each processing flow “preset” to determine whether a document requires human review.

  • Manual Document Validation (Human-in-the-Loop). For ambiguous data, data captured with confidence below the configured threshold, or poorly legible documents, the Retarus IDP Portal application supports manual correction and validation of incoming documents (the “Human-in-the-Loop” step). Human reviewers use the portal's user interface to verify and adjust extracted information, or to reject erroneous documents.

  • Master Data Enrichment. Customers can import their master data via API call or SFTP to enrich and verify extracted information. By referencing existing master data records, the platform improves the context and accuracy of the processed data. Master data can also be used as an aid in manual document validation.

  • Document Parking. Document Parking automatically sets aside documents that do not meet processing prerequisites, e.g., when no up-to-date master data is available at the time of document ingestion. A scheduled job checks these prerequisites hourly and releases the document for validation once they are met. A document can remain parked only for a limited time, after which it is automatically released and put back into processing.

  • Enhanced Keyword Detection. This feature lets customers automate document handling based on keyword detection. Customers prepare a custom list of keywords and associate each keyword with a predefined action. The list is imported into the system as a special type of master data using the same upload mechanism. When keyword functionality is enabled, the IDP platform automatically scans ingested documents for these keywords and triggers the corresponding action upon detection. When multiple keywords are detected within a single document, the system prioritizes actions based on the order of keywords in the list, with higher-ranking keywords taking precedence.
    Detected keywords are displayed in the Human-in-the-Loop interface, showing instance counts and locations within the document. Each instance of a detected keyword is also included in the XML output.

  • Document Review. The External Review feature enhances document quality through seamless collaboration. Users can flag documents needing external review and add detailed notes about identified issues. Reviewers access these documents in a separate overview and provide direct feedback to the customer, creating a structured and efficient review process.

  • Structured XML Output. The processed data is transformed into a structured XML format, making it compatible with various systems. This ensures seamless integration with the customer's existing software, such as ERP. Processed documents can be downloaded via API. Rejected documents are stored separately.

  • Archiving. This background task automatically archives downloaded documents after a set period, optimizing system performance and storage capacity. Processing information (metadata) remains permanently available, while the documents and their associated files eventually become unavailable for viewing and downloading.
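
Two of the rules described in the list above can be sketched in a few lines of Python: routing a document to the Human-in-the-Loop step when a captured field falls below a confidence threshold, and choosing one action when several keywords are detected. This is only an illustrative sketch, not the platform's implementation. The names `ExtractedField`, `needs_human_review`, and `select_keyword_action`, the sample field names and action names, and the simple substring matching are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ExtractedField:
    """Hypothetical captured element with its confidence in percent (0-100)."""
    name: str
    value: str
    confidence: float


def needs_human_review(fields: list[ExtractedField], threshold: float) -> bool:
    """Return True if any field was captured with confidence below the threshold."""
    return any(f.confidence < threshold for f in fields)


def select_keyword_action(
    text: str, keyword_actions: list[tuple[str, str]]
) -> Optional[str]:
    """Return the action of the highest-ranking keyword found in the text.

    keyword_actions is ordered by priority, so the first match wins.
    Matching is a plain case-insensitive substring test (an assumption).
    """
    lowered = text.lower()
    for keyword, action in keyword_actions:
        if keyword.lower() in lowered:
            return action
    return None


# Example data, invented for illustration only.
fields = [
    ExtractedField("invoice_number", "INV-1001", 98.5),
    ExtractedField("total_amount", "1250.00", 72.0),
]
print(needs_human_review(fields, threshold=80.0))  # True: total_amount is below 80

keyword_actions = [("urgent", "escalate"), ("reminder", "route_to_accounting")]
print(select_keyword_action("Payment REMINDER - urgent", keyword_actions))  # escalate
```

In the second call both keywords occur in the text, and "urgent" wins because it appears first in the list, which mirrors the priority rule described for keyword detection.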

Dashboard and Reporting

  • Overview and Monitoring. The Retarus IDP Portal application offers insight into the system workload for each configured processing flow “preset”. Customers can review both processed and rejected documents in the Retarus IDP Portal overview, and track each document and its status across the platform.

  • Reporting. A set of predefined reports is available on the IDP dashboard, e.g., the number of automatically vs. manually processed documents per period, ingested documents per preset or overall, rejected documents, average processing duration, and reports per user.

Administration and Configuration

  • Processing Configuration. Presets define the sequence of actions performed on an input document. They encompass rules, mappings, and transformations that tailor IDP to the customer's specific document types and business needs. Customers can configure more than one processing preset for each document type, either in the IDP administrative interface or through the exposed API. A preset configuration defines supported and mandatory fields, the minimum confidence level per field or overall, data completion based on business rules and master data, and validation rules.
    Presets can be published, updated, and deactivated.

  • User Administration. Customers can manage their users, user roles, permissions, and access levels within the Retarus IDP product. Alternatively, IDP supports Single Sign-On integration.

  • Multitenancy and Data Security. The IDP system is designed as a secure multitenant system ensuring data security.

  • Multilingual Support for IDP Portal. The IDP application is available in English (EN) and German (DE).

  • Flexible Integration. Integration options allow seamless integration into your existing processes. In addition to exposing all main functionalities through secure APIs, the IDP solution offers pragmatic integration alternatives based on proven industry standards (e.g., EDI).
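
To make the preset settings listed above more concrete, the following Python sketch models a preset with mandatory fields, an overall minimum confidence, and optional per-field overrides. It is a hypothetical data structure for illustration only. The class `PresetSketch`, its attribute and method names, and the sample values are assumptions and do not reflect the actual IDP configuration format or API.

```python
from dataclasses import dataclass, field


@dataclass
class PresetSketch:
    """Hypothetical model of a processing preset (not the real IDP schema)."""
    document_type: str
    mandatory_fields: set[str]
    min_confidence_overall: float  # percent, used when no per-field value exists
    min_confidence_per_field: dict[str, float] = field(default_factory=dict)

    def threshold_for(self, field_name: str) -> float:
        """Per-field minimum confidence, falling back to the overall value."""
        return self.min_confidence_per_field.get(
            field_name, self.min_confidence_overall
        )

    def missing_mandatory(self, extracted: dict[str, str]) -> set[str]:
        """Mandatory fields that are absent or empty in the extracted data."""
        return {name for name in self.mandatory_fields if not extracted.get(name)}


# Example values, invented for illustration only.
preset = PresetSketch(
    document_type="invoice",
    mandatory_fields={"invoice_number", "total_amount"},
    min_confidence_overall=80.0,
    min_confidence_per_field={"total_amount": 95.0},
)
print(preset.threshold_for("total_amount"))    # 95.0 (per-field override)
print(preset.threshold_for("invoice_number"))  # 80.0 (overall fallback)
print(preset.missing_mandatory({"invoice_number": "INV-1001"}))  # {'total_amount'}
```

Keeping the per-field thresholds as overrides on top of one overall value is just one possible design. It keeps a preset short when most fields share the same minimum confidence.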
