The schema defines the output that a processor produces for a processed document.
| JSON representation |
|---|
{ "displayName" : string , "description" : string , "entityTypes" : [ { object ( |
| Fields | |
|---|---|
| displayName | Display name to show users. |
| description | Description of the schema. |
| entityTypes[] | Entity types of the schema. |
| metadata | Metadata of the schema. |
EntityType
EntityType is the wrapper of a label of the corresponding model with detailed attributes and limitations for entity-based processors. Multiple types can also compose a dependency tree to represent nested types.
| JSON representation |
|---|
{ "displayName" : string , "name" : string , "description" : string , "baseTypes" : [ string ] , "properties" : [ { object ( |
displayName
string
User defined name for the type.
name
string
Name of the type. It must be unique within the schema file and cannot be a "Common Type". The following naming conventions are used:
- Use snake_casing.
- Name matching is case-sensitive.
- Maximum 64 characters.
- Must start with a letter.
- Allowed characters: ASCII letters [a-z0-9_-]. (For backward compatibility, internal infrastructure and tooling can handle any ASCII character.)
- The / is sometimes used to denote a property of a type. For example, line_item/amount. This convention is deprecated, but will still be honored for backward compatibility.
description
string
The description of the entity type. Could be used to provide more information about the entity type for model calls.
baseTypes[]
string
The entity type that this type is derived from. For now, one and only one should be set.
properties[]
object (Property)
Describes the nested structure, or composition, of an entity.
entityTypeMetadata
object (EntityTypeMetadata)
Metadata for the entity type.
Union field value_source. value_source can be only one of the following:
enumValues
object (EnumValues)
If specified, lists all the possible values for this entity. This should not be more than a handful of values. If there are more than 10 values, or the values could change frequently, use the EntityType.value_ontology field and specify a list of all possible values in a value ontology file.
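The naming conventions listed for name can be checked mechanically. The following is a minimal sketch, not part of the API: the helper is_valid_type_name and its allow_legacy_slash flag are hypothetical names, and the reading of the rules (characters from [a-z0-9_-], first character a letter, at most 64 characters in total) is an assumption drawn from the list of conventions.

```python
import re

# Assumed reading of the conventions: lowercase ASCII letters, digits,
# "_" and "-", with a letter as the first character.
_SEGMENT = re.compile(r"[a-z][a-z0-9_-]*")
MAX_NAME_LENGTH = 64


def is_valid_type_name(name: str, allow_legacy_slash: bool = False) -> bool:
    """Return True if `name` follows the EntityType naming conventions."""
    if not name or len(name) > MAX_NAME_LENGTH:
        return False
    # The deprecated "type/property" form is accepted only on request;
    # each part around the "/" must then be a valid name on its own.
    segments = name.split("/") if allow_legacy_slash else [name]
    return all(_SEGMENT.fullmatch(segment) for segment in segments)
```

With this sketch, line_item passes, Line_Item fails because matching is case-sensitive, and line_item/amount passes only when allow_legacy_slash=True.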
EnumValues
Defines a list of enum values.
| JSON representation |
|---|
{ "values" : [ string ] } |
| Fields | |
|---|---|
| values[] | The individual values that this enum values type can include. |
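As an illustration of the guidance on enumValues, the sketch below builds an EnumValues object and rejects lists that exceed the 10-value threshold mentioned in the field description. The helper make_enum_values and the example entity type are hypothetical; only the field names come from the reference.

```python
MAX_INLINE_ENUM_VALUES = 10  # threshold taken from the enumValues description


def make_enum_values(values):
    """Build an EnumValues object from a short list of strings."""
    values = list(values)
    if len(values) > MAX_INLINE_ENUM_VALUES:
        # Longer or frequently changing lists belong in a value ontology file.
        raise ValueError(
            f"{len(values)} enum values given; consider a value ontology instead"
        )
    return {"values": values}


# Hypothetical entity type that restricts its value to three currency codes.
currency_type = {
    "name": "currency_code",
    "displayName": "Currency code",
    "enumValues": make_enum_values(["USD", "EUR", "JPY"]),
}
```

Raising an error is one possible policy; a caller could equally log a warning, since the reference states the limit as a recommendation rather than a hard rule.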
Property
Defines properties that can be part of the entity type.
| JSON representation |
|---|
{ "name" : string , "description" : string , "displayName" : string , "valueType" : string , "occurrenceType" : enum ( |
| Fields | |
|---|---|
| name | The name of the property. Follows the same guidelines as the EntityType name. |
| description | The description of the property. Could be used to provide more information about the property for model calls. |
| displayName | User defined name for the property. |
| valueType | A reference to the value type of the property. This type is subject to the same conventions as the |
| occurrenceType | Occurrence type limits the number of times an entity type can appear in the document. |
| method | Specifies how the entity's value is obtained. |
| propertyMetadata | Any additional metadata about the property can be added here. |
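Because valueType is a reference to another type, nested types form a dependency tree that can be checked for dangling references. The sketch below assumes that a valueType resolves either to the name of an entity type in the same schema or to a type the caller declares as known; the function unresolved_value_types, the known_types parameter, and the example type names (including "money") are hypothetical.

```python
def unresolved_value_types(schema, known_types=frozenset()):
    """Return (entity type, property, valueType) triples that do not resolve.

    A valueType resolves if it names an entity type defined in `schema`
    or appears in `known_types` (for example, common types).
    """
    entity_types = schema.get("entityTypes", [])
    defined = {entity_type["name"] for entity_type in entity_types}
    missing = []
    for entity_type in entity_types:
        for prop in entity_type.get("properties", []):
            value_type = prop.get("valueType")
            if value_type not in defined and value_type not in known_types:
                missing.append((entity_type["name"], prop["name"], value_type))
    return missing


# Hypothetical two-level schema: an invoice that contains line items.
schema = {
    "displayName": "Invoice schema",
    "entityTypes": [
        {
            "name": "invoice",
            "properties": [
                {"name": "line_item", "valueType": "line_item"},
                {"name": "total", "valueType": "money"},
            ],
        },
        {
            "name": "line_item",
            "properties": [{"name": "amount", "valueType": "money"}],
        },
    ],
}
```

Calling unresolved_value_types(schema, known_types={"money"}) returns an empty list, while omitting known_types reports the two properties whose valueType is "money".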
PropertyMetadata
Metadata about a property.
| JSON representation |
|---|
{ "inactive" : boolean , "fieldExtractionMetadata" : { object ( |
| Fields | |
|---|---|
| inactive | Whether the property should be considered as "inactive". |
| fieldExtractionMetadata | Field extraction metadata on the property. |
FieldExtractionMetadata
Metadata for how this field value is extracted.
| JSON representation |
|---|
{ "summaryOptions" : { object ( |
| Fields | |
|---|---|
| summaryOptions | Summary options config. |
SummaryOptions
Metadata for document summarization.
| JSON representation |
|---|
{ "length" : enum ( |
| Fields | |
|---|---|
| length | How long the summary should be. |
| format | The format the summary should be in. |
EntityTypeMetadata
Metadata about an entity type.
| JSON representation |
|---|
{ "inactive" : boolean } |
| Fields | |
|---|---|
| inactive | Whether the entity type should be considered inactive. |
Metadata
Metadata for global schema behavior.
| JSON representation |
|---|
{ "documentSplitter" : boolean , "documentAllowMultipleLabels" : boolean , "prefixedNamingOnProperties" : boolean , "skipNamingValidation" : boolean } |
| Fields | |
|---|---|
| documentSplitter | If true, a |
| documentAllowMultipleLabels | If true, on a given page, there can be multiple |
| prefixedNamingOnProperties | If set, all the nested entities must be prefixed with the parents. |
| skipNamingValidation | If set, this will skip the naming format validation in the schema. So the string values in |
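Since every Metadata field is a boolean flag, a small constructor can catch misspelled flag names and non-boolean values before the schema is serialized. This is a sketch under assumptions: make_metadata is a hypothetical helper, and flags that are not passed are simply omitted from the result rather than given a default.

```python
import json

# Flag names as listed in the Metadata JSON representation.
METADATA_FLAGS = (
    "documentSplitter",
    "documentAllowMultipleLabels",
    "prefixedNamingOnProperties",
    "skipNamingValidation",
)


def make_metadata(**flags):
    """Build a Metadata object, rejecting unknown flags and non-boolean values."""
    unknown = sorted(set(flags) - set(METADATA_FLAGS))
    if unknown:
        raise KeyError(f"unknown metadata flags: {unknown}")
    for key, value in flags.items():
        if not isinstance(value, bool):
            raise TypeError(f"{key} must be a boolean")
    return dict(flags)


# Attach the metadata to a hypothetical, otherwise empty schema and serialize it.
schema_json = json.dumps(
    {
        "displayName": "Example schema",
        "entityTypes": [],
        "metadata": make_metadata(skipNamingValidation=True),
    },
    indent=2,
)
```

Keeping the flag names in one tuple means a typo such as skipNameValidation fails at construction time instead of being sent along unnoticed.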

