Service Overview

Document Understanding is a serverless, multitenant service that you can use to detect and classify text, tables, and other key data in document files that you upload.

You can access the service through the Console, REST APIs, SDK, and CLI. You can process individual files or batches of documents by using the ProcessorJob API endpoint.

The following pretrained models are supported:

  • Optical Character Recognition (OCR): Detects and recognizes text in a document.
  • Text extraction: Provides word-level and line-level text, along with the bounding box coordinates of where the text is found.
  • Key-value extraction: Extracts a predefined list of key-value pairs from receipts, invoices, passports, and driver IDs.
  • Table extraction: Extracts content in tabular format, maintaining the row and column relationships of cells.
  • Document classification: Classifies documents into different types based on visual appearance, high-level features, and extracted keywords. Some example document types are invoice, receipt, and resume.
  • Optical Character Recognition (OCR) PDF: Generates a searchable PDF file in the OCI Object Storage service.
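
To make the text extraction and table extraction descriptions more concrete, the following is a minimal Python sketch of how word-level results with bounding boxes might roll up into a line-level box, and how extracted cells might be arranged back into rows and columns. It does not call the service. The Box, Word, and Cell types, their field names, the normalized 0..1 coordinates, and the sample values are all assumptions made for illustration, not the service's actual response schema.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box (hypothetical: coordinates normalized to 0..1)."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float


@dataclass(frozen=True)
class Word:
    """One word-level result (hypothetical shape)."""
    text: str
    box: Box


@dataclass(frozen=True)
class Cell:
    """One table cell with zero-based row and column indexes (hypothetical shape)."""
    row: int
    col: int
    text: str


def line_box(words: List[Word]) -> Box:
    """Return the smallest box that encloses every word on a line."""
    if not words:
        raise ValueError("a line needs at least one word")
    return Box(
        min(w.box.x_min for w in words),
        min(w.box.y_min for w in words),
        max(w.box.x_max for w in words),
        max(w.box.y_max for w in words),
    )


def cells_to_grid(cells: List[Cell]) -> List[List[str]]:
    """Rebuild a row-major grid from cells; positions with no cell stay empty."""
    if not cells:
        return []
    n_rows = max(c.row for c in cells) + 1
    n_cols = max(c.col for c in cells) + 1
    grid = [["" for _ in range(n_cols)] for _ in range(n_rows)]
    for c in cells:
        grid[c.row][c.col] = c.text
    return grid


# Made-up sample data, not real service output.
words = [
    Word("Total", Box(0.10, 0.50, 0.20, 0.53)),
    Word("42.00", Box(0.22, 0.50, 0.30, 0.54)),
]
cells = [
    Cell(1, 1, "1.50"),
    Cell(0, 0, "Item"),
    Cell(1, 0, "Pen"),
    Cell(0, 1, "Price"),
]

line = line_box(words)
grid = cells_to_grid(cells)
print(line)
for row in grid:
    print(row)
```

Keeping row and column indexes on each cell, rather than relying on the order in which cells arrive, is one way to preserve the row and column relationships even when cells are listed out of order.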