Document AI uses Enterprise Knowledge Graph to normalize and enrich entity extraction results for supported fields. For example, the addresses "123 Main St Apt 1" and "123 Main street # 1" could be normalized to the same standardized address.

For each supported field, Document AI returns a normalizedValue in addition to the raw extracted text. The normalizedValue contains the data in a standardized format, which reduces the post-processing you need to do.
Most normalized values belong to one of the following categories:
- Money
- Date
- Timestamp
- Address
- Boolean
- Integer
- Float
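
As a rough sketch of how these categories might be consumed, the following Python helper inspects a single normalizedValue object taken from a JSON response and reports which typed field is populated. The key names (moneyValue, dateValue, datetimeValue, addressValue, booleanValue, integerValue, floatValue) are an assumption based on the JSON form of the Document.Entity.NormalizedValue message, so verify them against the API reference for the version you use. The helper name structured_value and the example input are hypothetical.

```python
from typing import Any, Optional, Tuple

# Assumed JSON key for each category listed above (check the API reference).
CATEGORY_KEYS = {
    "Money": "moneyValue",
    "Date": "dateValue",
    "Timestamp": "datetimeValue",
    "Address": "addressValue",
    "Boolean": "booleanValue",
    "Integer": "integerValue",
    "Float": "floatValue",
}


def structured_value(normalized: dict) -> Optional[Tuple[str, Any]]:
    """Return (category, value) for the first typed field present, else None."""
    for category, key in CATEGORY_KEYS.items():
        if key in normalized:
            return category, normalized[key]
    return None


# Illustrative input only, not real API output.
example = {"text": "2021-03-04", "dateValue": {"year": 2021, "month": 3, "day": 4}}
print(structured_value(example))
```

A value that carries only a text key, such as a normalized name, returns None here, so callers can fall back to the text field.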
Sample response
The enriched values can be found in the entities.normalizedValue field, as shown in the following truncated sample:
{
  "entities": [
    {
      "textAnchor": {
        "textSegments": [
          ...
        ],
        "content": "Google Singapore"
      },
      "type": "employer_name",
      "mentionText": "Google Singapore",
      "confidence": 0.69933707,
      "pageAnchor": {
        "pageRefs": [
          {
            "boundingPoly": {
              "normalizedVertices": [
                ...
              ]
            }
          }
        ]
      },
      "id": "9",
      "normalizedValue": {
        "text": "Google Asia Pacific, Singapore"
      }
    }
  ]
}
In the sample, the original employer_name "Google Singapore" has been normalized to "Google Asia Pacific, Singapore".
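
To make the relationship between the raw and enriched values concrete, here is a minimal Python sketch that reads a response like the sample and pairs each entity's mentionText with its normalizedValue.text. It works on plain parsed JSON rather than a client library, the embedded JSON is a trimmed copy of the sample with the elided anchor fields left out, and the function name enriched_entities is hypothetical.

```python
import json

# Trimmed copy of the sample response; the elided anchor fields are omitted.
response_json = """
{
  "entities": [
    {
      "type": "employer_name",
      "mentionText": "Google Singapore",
      "confidence": 0.69933707,
      "id": "9",
      "normalizedValue": {"text": "Google Asia Pacific, Singapore"}
    }
  ]
}
"""


def enriched_entities(document: dict) -> list:
    """Collect (type, raw text, normalized text) for entities with a normalizedValue."""
    rows = []
    for entity in document.get("entities", []):
        normalized = entity.get("normalizedValue")
        if normalized is None:
            continue  # this field was not enriched
        rows.append(
            (entity.get("type"), entity.get("mentionText"), normalized.get("text"))
        )
    return rows


rows = enriched_entities(json.loads(response_json))
for entity_type, raw, normalized in rows:
    print(f"{entity_type}: {raw!r} -> {normalized!r}")
```

Entities without a normalizedValue are skipped, so the result lists only the fields that were actually enriched.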
In the Google Cloud console, the enriched and normalized fields are annotated with G.

Supported processors
The following processors and fields support entity enrichment.
Bank Statement Parser
Category | Pretrained |
---|---|
Solution type | Lending |
Functions | OCR, Entity Extraction |
Release stage | General availability |
Access status | Public |
Full processor details | Detailed entry |
- bank_address
- bank_name
W2 Parser
Category | Pretrained |
---|---|
Solution type | Lending |
Functions | OCR, Entity Extraction |
Release stage | General availability |
Access status | Public |
Full processor details | Detailed entry |
- EmployerNameAndAddress
- EIN
Pay Slip Parser
Category | Pretrained |
---|---|
Solution type | Lending |
Functions | OCR, Entity Extraction |
Release stage | General availability |
Access status | Public |
Full processor details | Detailed entry |
- employer_address
- employer_name
Expense Parser
Category | Pretrained |
---|---|
Solution type | Procurement |
Functions | OCR, Entity Extraction |
Release stage | General availability |
Access status | Public |
Full processor details | Detailed entry |
- supplier_address
- supplier_name
- supplier_phone
Invoice Parser
Category | Pretrained |
---|---|
Solution type | Procurement |
Functions | OCR, Entity Extraction |
Release stage | General availability |
Access status | Public |
Full processor details | Detailed entry |
- supplier_address
- supplier_name
- supplier_phone
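
If you need to check in code whether a given field is expected to be enriched, the processor and field lists above can be captured in a small lookup table. The following Python sketch copies the processor and field names from this page. The dictionary layout and the helper name supports_enrichment are hypothetical, and the table has to be updated by hand if the supported fields change.

```python
# Processor names and field names copied from the lists above.
ENRICHED_FIELDS = {
    "Bank Statement Parser": {"bank_address", "bank_name"},
    "W2 Parser": {"EmployerNameAndAddress", "EIN"},
    "Pay Slip Parser": {"employer_address", "employer_name"},
    "Expense Parser": {"supplier_address", "supplier_name", "supplier_phone"},
    "Invoice Parser": {"supplier_address", "supplier_name", "supplier_phone"},
}


def supports_enrichment(processor: str, field: str) -> bool:
    """Return True if the field is listed as enriched for the processor."""
    return field in ENRICHED_FIELDS.get(processor, set())


print(supports_enrichment("Invoice Parser", "supplier_name"))
print(supports_enrichment("W2 Parser", "supplier_name"))
```

Unknown processor names fall through to an empty set, so the helper returns False instead of raising an error.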