Introduction

Gini provides a system to extract payment information from documents such as invoices, reminders, and remittance slips. This API is part of the Gini Pay product. The system can handle photos taken with Android or iOS devices as well as PDF files uploaded from a web application.

File types

The API supports PDF, GIF, JPEG, PNG, and TIFF files.
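As a minimal sketch, a client could check that a file is one of the types listed above before uploading it. The helper below is not part of the Gini API; the function name `detect_file_type` and the idea of a client-side pre-check are assumptions made for illustration. It relies only on the well-known leading bytes of each format.

```python
from typing import Optional

# Well-known leading ("magic") bytes of the file types listed above.
SIGNATURES = [
    (b"%PDF-", "pdf"),
    (b"GIF87a", "gif"),
    (b"GIF89a", "gif"),
    (b"\xff\xd8\xff", "jpeg"),
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"II*\x00", "tiff"),  # little-endian TIFF
    (b"MM\x00*", "tiff"),  # big-endian TIFF
]


def detect_file_type(data: bytes) -> Optional[str]:
    """Return the detected type name, or None if the type is not in the list.

    Hypothetical client-side helper; the API itself performs its own
    verification after upload.
    """
    for magic, name in SIGNATURES:
        if data.startswith(magic):
            return name
    return None
```

A client might call `detect_file_type` on the first few bytes of a file and skip the upload when it returns `None`.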

Document classification

Information extraction starts when the document is sent to the extraction system. There it is first verified and then classified as either native (a PDF file generated by software) or scanned (a camera photo or a PDF that contains a scanned image).

Native and scanned PDF files differ in how they are produced. Native PDF files are generated directly by software such as Microsoft Word, Excel, or Illustrator, so their text and layout are stored in the PDF source itself. Scanned PDF files are produced by scanning devices from paper documents.

Native PDF files already contain the text and layout information in the document source and are processed accordingly. Scanned files lack this source, so the system cannot read their contents directly. For these files, the extraction system applies Optical Character Recognition (OCR) and various computer vision techniques to obtain the document contents.
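The routing described above can be sketched roughly as follows. This is an illustration only, not Gini's implementation: how the system actually decides between native and scanned is not documented here, so the `has_embedded_text` flag and both function names are assumptions.

```python
def classify_document(file_type: str, has_embedded_text: bool) -> str:
    """Classify an upload as "native" or "scanned".

    Assumption: a PDF counts as native when it carries text in its
    source. Photos and image-only PDFs count as scanned.
    """
    if file_type == "pdf" and has_embedded_text:
        return "native"
    return "scanned"


def needs_ocr(classification: str) -> bool:
    """Scanned documents have no readable source, so they go through OCR."""
    return classification == "scanned"
```

Under these assumptions, a PDF exported from a word processor would skip OCR, while a JPEG photo or a PDF produced by a scanner would be sent through it.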

Data extraction

Once the layout and textual contents of the uploaded document are available, the system starts extracting semantic information, such as the document sender (name, address), and meta information, such as the document type (invoice, contract).

The system might not always extract the information correctly. This is most likely to happen because of OCR errors caused by a poor-quality scan, incomplete textual data, or an unusual document layout. In such cases, it is still possible to correct the extractions by manually selecting the correct amount to pay on the document and submitting it back to the API. The extraction system receives this feedback, which helps us improve its self-learning algorithms over time.

If you have any questions about the Gini Pay API and the functionality it provides, don't hesitate to get in touch with us at api@gini.net.