Document Extractions
An extraction contains an entity that describes the general semantic type of the extraction, such as iban, bic, or amount. The entity also determines the format of the value, which holds the extracted text. An optional box element may describe the position of the extraction value on the document. We refer to it as the bounding box. In most cases, extractions without a bounding box are considered meta information, such as doctype.
Name | Type | Description |
---|---|---|
entity | string | A key (primary identification) of an entity type, for example, amount. |
value | string | A normalized textual representation of the text or information provided by the extraction value, for example, an IBAN without spaces between the digits. |
box | bounding-box | Optional: Bounding box containing the position of the extraction value on the document. |
//document extraction
{
  "entity": "amount",
  "value": "20.00:EUR",
  "box": { ... }
}
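As a rough illustration of how a client might consume such an object, the following Python sketch models an extraction and splits an amount value. The class and function names are hypothetical, and the `<number>:<currency>` layout is only inferred from the `20.00:EUR` example above. The box is left out of the sample because its fields are not shown in this section.

```python
import json
from dataclasses import dataclass
from decimal import Decimal
from typing import Any, Optional


@dataclass(frozen=True)
class Extraction:
    """Hypothetical client-side model of a document extraction."""
    entity: str
    value: str
    box: Optional[dict[str, Any]] = None  # optional bounding box

    @property
    def has_box(self) -> bool:
        # Extractions without a box are usually meta information.
        return self.box is not None


def parse_extraction(raw: dict[str, Any]) -> Extraction:
    # "entity" and "value" are required, "box" is optional.
    return Extraction(entity=raw["entity"], value=raw["value"], box=raw.get("box"))


def split_amount(value: str) -> tuple[Decimal, str]:
    # Assumption: amount values look like "<number>:<currency>".
    number, _, currency = value.partition(":")
    return Decimal(number), currency


sample = json.loads('{"entity": "amount", "value": "20.00:EUR"}')
extraction = parse_extraction(sample)
amount, currency = split_amount(extraction.value)
```

Using `Decimal` rather than `float` keeps the monetary value exact, which is usually what a payment workflow wants.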
Specific extractions
A specific extraction assigns a semantic property to the extraction. It also has an additional candidates field:
Name | Type | Description |
---|---|---|
candidates | string | Optional: A reference to extraction candidates. See Extraction Entities for possible values. |
//specific extractions
{
  "amountToPay": {
    "entity": "amount",
    "value": "20.00:EUR",
    "box": { ... },
    "candidates": "amounts"
  }
}
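The candidates field is a reference, not a list, so a client has to resolve it against the candidate lists delivered elsewhere in the response. The sketch below shows one way this could look in Python. The top-level `extractions` and `candidates` keys, the helper name `alternatives`, and the second amount value are all assumptions made up for illustration; only the `amountToPay` object mirrors the example above.

```python
from typing import Any

# Hypothetical response fragment. The surrounding layout and the
# "3.80:EUR" candidate are invented sample data, not documented output.
response = {
    "extractions": {
        "amountToPay": {
            "entity": "amount",
            "value": "20.00:EUR",
            "candidates": "amounts",
        },
        "paymentPurpose": {"entity": "text", "value": "Invoice 123"},
    },
    "candidates": {
        "amounts": [
            {"entity": "amount", "value": "20.00:EUR"},
            {"entity": "amount", "value": "3.80:EUR"},
        ],
    },
}


def alternatives(resp: dict[str, Any], name: str) -> list[dict[str, Any]]:
    """Return the candidate extractions a specific extraction refers to."""
    extraction = resp["extractions"][name]
    key = extraction.get("candidates")  # the field is optional
    if key is None:
        return []
    return resp.get("candidates", {}).get(key, [])


amount_values = [c["value"] for c in alternatives(response, "amountToPay")]
purpose_candidates = alternatives(response, "paymentPurpose")
```

Returning an empty list when no candidates reference exists lets callers treat both cases uniformly.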
Available specific extractions
Name | Description | Entity | Candidates |
---|---|---|---|
amountToPay | The amount that has yet to be paid. | amount | amounts |
| | The BIC of a payment recipient. | | bics |
| | The document type of a given document. | | n/a |
| | The IBAN of a document sender. | | ibans |
paymentPurpose | The payment purpose text. | | n/a |
| | The payment recipient, that is, the beneficiary of a money transfer. | | paymentRecipients |
| | The payment reference. | | n/a |
We differentiate between paymentPurpose and paymentReferences on remittance slips only (Verwendungszweck vs. Zahlungsreferenz).
On invoices, you only find the paymentPurpose field.