REST Resource: documents

Resource: Document

Represents the input to API methods.

JSON representation

{
  "type": enum (Type),
  "language": string,
  "referenceWebUri": string,
  "boilerplateHandling": enum (BoilerplateHandling),

  // Union field source can be only one of the following:
  "content": string,
  "gcsContentUri": string
  // End of list of possible types for union field source.
}

Fields

type

enum (`Type`)

Required. If the type is not set or is `TYPE_UNSPECIFIED`, the method returns an `INVALID_ARGUMENT` error.

language

string

The language of the document (if not specified, the language is automatically detected). Both ISO and BCP-47 language codes are accepted.
Language Support lists currently supported languages for each API method. If the language (either specified by the caller or automatically detected) is not supported by the called API method, an INVALID_ARGUMENT error is returned.

referenceWebUri

string

The web URI where the document comes from. This URI is not used for fetching the content, but as a hint for analyzing the document.

boilerplateHandling

enum (`BoilerplateHandling`)

Indicates how detected boilerplate (e.g. advertisements, copyright declarations, banners) should be handled for this document. If not specified, boilerplate is treated the same as content.

Union field `source`. The source of the document: a string containing the content or a Google Cloud Storage URI. `source` can be only one of the following:

content

string

The content of the input in string format. This field is exempt from Cloud audit logging because it is based on user data.

gcsContentUri

string

The Google Cloud Storage URI where the file content is located. This URI must be of the form `gs://bucketName/object_name`. For more details, see https://cloud.google.com/storage/docs/reference-uris. NOTE: Cloud Storage object versioning is not supported.
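
The field rules above can be illustrated with a minimal client-side sketch. The helper `make_document` is hypothetical and not part of any official client library; it only builds a dictionary shaped like the JSON representation and enforces that the union field `source` carries one value. Rejecting a document with no source at all is an assumption of this sketch, not something the reference states.

```python
import json

# Values taken from the Type enum. TYPE_UNSPECIFIED is left out because
# the reference says it results in an INVALID_ARGUMENT error.
VALID_TYPES = {"PLAIN_TEXT", "HTML"}


def make_document(doc_type, content=None, gcs_content_uri=None, language=None):
    """Return a dict shaped like the Document JSON representation.

    Hypothetical helper for illustration only.
    """
    if doc_type not in VALID_TYPES:
        raise ValueError(f"type must be one of {sorted(VALID_TYPES)}")
    # Union field `source`: content and gcsContentUri are mutually exclusive.
    if content is not None and gcs_content_uri is not None:
        raise ValueError("set only one of content or gcs_content_uri")
    # Assumption: a document with no source at all is treated as a mistake.
    if content is None and gcs_content_uri is None:
        raise ValueError("set one of content or gcs_content_uri")
    if gcs_content_uri is not None and not gcs_content_uri.startswith("gs://"):
        raise ValueError("gcs_content_uri must start with gs://")

    document = {"type": doc_type}
    if language is not None:
        # Omitting language leaves detection to the service.
        document["language"] = language
    if content is not None:
        document["content"] = content
    else:
        document["gcsContentUri"] = gcs_content_uri
    return document


example = make_document("PLAIN_TEXT", content="Hello, world.", language="en")
print(json.dumps(example, indent=2))
```

Checking the union constraint before sending a request keeps obviously malformed documents from reaching the service at all.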

Type

The document types enum.

Enums
`TYPE_UNSPECIFIED`	The content type is not specified.
`PLAIN_TEXT`	Plain text
`HTML`	HTML

BoilerplateHandling

Ways of handling boilerplate detected in the document.

Enums
`BOILERPLATE_HANDLING_UNSPECIFIED`	The boilerplate handling is not specified.
`SKIP_BOILERPLATE`	Do not analyze detected boilerplate. Reference web URI is required for detecting boilerplate.
`KEEP_BOILERPLATE`	Treat boilerplate the same as content.
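
The interaction between `boilerplateHandling` and `referenceWebUri` can be sketched as a small pre-flight check. The function `effective_boilerplate_handling` is a hypothetical name introduced for illustration. Treating `BOILERPLATE_HANDLING_UNSPECIFIED` the same as an absent field is an assumption, based on the statement that unspecified handling treats boilerplate like content.

```python
HANDLING_VALUES = {
    "BOILERPLATE_HANDLING_UNSPECIFIED",
    "SKIP_BOILERPLATE",
    "KEEP_BOILERPLATE",
}


def effective_boilerplate_handling(document):
    """Return the handling a document dict would effectively get.

    Hypothetical client-side check; the service performs its own validation.
    """
    handling = document.get(
        "boilerplateHandling", "BOILERPLATE_HANDLING_UNSPECIFIED"
    )
    if handling not in HANDLING_VALUES:
        raise ValueError(f"unknown boilerplateHandling: {handling!r}")
    if handling == "SKIP_BOILERPLATE":
        # The enum description says a reference web URI is required
        # for detecting boilerplate.
        if not document.get("referenceWebUri"):
            raise ValueError("SKIP_BOILERPLATE requires referenceWebUri")
        return "SKIP_BOILERPLATE"
    # Unspecified (assumed equivalent to absent) or KEEP_BOILERPLATE:
    # boilerplate is treated the same as content.
    return "KEEP_BOILERPLATE"


page = {
    "type": "HTML",
    "content": "<p>Article text</p>",
    "referenceWebUri": "https://example.com/article",
    "boilerplateHandling": "SKIP_BOILERPLATE",
}
print(effective_boilerplate_handling(page))
```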

Methods
`analyzeEntities`	Finds named entities (currently proper names and common nouns) in the text along with entity types, salience, mentions for each entity, and other properties.
`analyzeEntitySentiment`	Finds entities in the text, similar to `analyzeEntities`, and analyzes the sentiment associated with each entity and its mentions.
`analyzeSentiment`	Analyzes the sentiment of the provided text.
`analyzeSyntax`	Analyzes the syntax of the text and provides sentence boundaries and tokenization along with part-of-speech tags, dependency trees, and other properties.
`annotateText`	A convenience method that provides all syntax, sentiment, entity, and classification features in one call.
`classifyText`	Classifies a document into categories.
`moderateText`	Moderates a document for harmful and sensitive categories.
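
As a rough sketch of how a method name from the table maps to a request, the snippet below assembles a URL and a request body without sending anything. Several details are assumptions not stated in this section: the host `https://language.googleapis.com`, the `documents:<method>` path shape, the version string (passed in by the caller), and the top-level `"document"` key in the body. The helper names `method_url` and `request_body` are hypothetical. Consult the individual method pages for the authoritative request format.

```python
# Assumed service host; not stated in this section.
BASE_URL = "https://language.googleapis.com"

# Method names as listed in the Methods table.
METHODS = {
    "analyzeEntities",
    "analyzeEntitySentiment",
    "analyzeSentiment",
    "analyzeSyntax",
    "annotateText",
    "classifyText",
    "moderateText",
}


def method_url(method, version):
    """Build the assumed URL for a documents method (hypothetical helper)."""
    if method not in METHODS:
        raise ValueError(f"unknown method: {method!r}")
    return f"{BASE_URL}/{version}/documents:{method}"


def request_body(document):
    """Wrap a Document dict under an assumed top-level "document" key."""
    return {"document": document}


doc = {"type": "PLAIN_TEXT", "content": "The weather is lovely today."}
print(method_url("analyzeSentiment", "v1"))
print(request_body(doc))
```

Some methods accept further request fields alongside the document; those are outside the scope of this resource description and are omitted here.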
