The Vision API can detect any Vision API feature from PDF and TIFF files stored in Cloud Storage.
Feature detection from PDF and TIFF files must be requested using the files:asyncBatchAnnotate function, which performs an offline (asynchronous) request and reports its status through the operations resource.
Output from a PDF/TIFF request is written to a JSON file created in the specified Cloud Storage bucket.
Limitations
The Vision API accepts PDF/TIFF files up to 2000 pages. Larger files will return an error.
Authentication
API keys are not supported for files:asyncBatchAnnotate requests. See Using a service account for instructions on authenticating with a service account.
The account used for authentication must have access to the Cloud Storage bucket that you specify for the output (roles/editor, roles/storage.objectCreator, or above).
You can use an API key to query the status of the operation; see Using an API key for instructions.
Feature detection requests
Currently PDF/TIFF document detection is only available for files stored in Cloud Storage buckets. Response JSON files are similarly saved to a Cloud Storage bucket.
Command-line
To perform PDF/TIFF document text detection, make a POST request and provide the appropriate request body:
curl -X POST \
  -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
  -H "Content-Type: application/json; charset=utf-8" \
  https://vision.googleapis.com/v1/files:asyncBatchAnnotate \
  -d '{
    "requests": [
      {
        "inputConfig": {
          "gcsSource": {
            "uri": "gs://your-source-bucket-name/folder/multi-page-file.pdf"
          },
          "mimeType": "application/pdf"
        },
        "features": [
          { "type": "DOCUMENT_TEXT_DETECTION" }
        ],
        "outputConfig": {
          "gcsDestination": {
            "uri": "gs://your-bucket-name/folder/"
          },
          "batchSize": 1
        }
      }
    ]
  }'
Where:
- inputConfig - replaces the image field used in other Vision API requests. It contains two child fields:
  - gcsSource.uri - the Cloud Storage URI of the PDF or TIFF file (accessible to the user or service account making the request).
  - mimeType - one of the accepted file types: application/pdf or image/tiff.
- outputConfig - specifies output details. It contains two child fields:
  - gcsDestination.uri - a valid Cloud Storage URI. The bucket must be writeable by the user or service account making the request. The filename will be output-x-to-y, where x and y represent the PDF/TIFF page numbers included in that output file. If the file exists, its contents will be overwritten.
  - batchSize - specifies how many pages of output should be included in each output JSON file.
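The request fields described above can also be assembled programmatically. The following is a minimal Python sketch using only the standard library; the helper names build_async_request and expected_output_uris are hypothetical, and the output naming (destination prefix followed by output-x-to-y.json) is an assumption based on the filename pattern described here, not a guaranteed contract.

```python
import json


def build_async_request(source_uri, mime_type, destination_uri, batch_size=1,
                        feature_type="DOCUMENT_TEXT_DETECTION"):
    """Return a JSON-serializable body for a files:asyncBatchAnnotate request."""
    # Only the two file types named in this document are accepted.
    if mime_type not in ("application/pdf", "image/tiff"):
        raise ValueError(f"unsupported mimeType: {mime_type}")
    return {
        "requests": [
            {
                "inputConfig": {
                    "gcsSource": {"uri": source_uri},
                    "mimeType": mime_type,
                },
                "features": [{"type": feature_type}],
                "outputConfig": {
                    "gcsDestination": {"uri": destination_uri},
                    "batchSize": batch_size,
                },
            }
        ]
    }


def expected_output_uris(destination_uri, page_count, batch_size):
    """List the output objects one would expect for a file of page_count pages.

    Assumption: each file is named <prefix>output-<x>-to-<y>.json, where x and y
    are the first and last page numbers covered by that file.
    """
    uris = []
    for first in range(1, page_count + 1, batch_size):
        last = min(first + batch_size - 1, page_count)
        uris.append(f"{destination_uri}output-{first}-to-{last}.json")
    return uris


body = build_async_request(
    "gs://your-source-bucket-name/folder/multi-page-file.pdf",
    "application/pdf",
    "gs://your-bucket-name/folder/",
)
print(json.dumps(body, indent=2))
print(expected_output_uris("gs://your-bucket-name/folder/", 5, 2))
```

The printed body can be saved to a file and passed to curl with -d @request.json in place of the inline JSON.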
Response:
A successful asyncBatchAnnotate request returns a response with a single name field:
{
  "name": "projects/usable-auth-library/operations/1efec2285bd442df"
}
This name represents a long-running operation with an associated ID (for example, 1efec2285bd442df), which can be queried using the v1.operations API.
To retrieve your Vision annotation response, send a GET request to the v1.operations endpoint, passing the operation ID in the URL.
curl -X GET \
  -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
  -H "Content-Type: application/json" \
  https://vision.googleapis.com/v1/operations/1efec2285bd442df
If the operation is in progress:
{
  "name": "operations/1efec2285bd442df",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.vision.v1.OperationMetadata",
    "state": "RUNNING",
    "createTime": "2019-05-15T21:10:08.401917049Z",
    "updateTime": "2019-05-15T21:10:33.700763554Z"
  }
}
Once the operation has completed, the state shows as DONE and your results are written to the Cloud Storage file you specified:
{
  "name": "operations/1efec2285bd442df",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.vision.v1.OperationMetadata",
    "state": "DONE",
    "createTime": "2019-05-15T20:56:30.622473785Z",
    "updateTime": "2019-05-15T20:56:41.666379749Z"
  },
  "done": true,
  "response": {
    "@type": "type.googleapis.com/google.cloud.vision.v1.AsyncBatchAnnotateFilesResponse",
    "responses": [
      {
        "outputConfig": {
          "gcsDestination": {
            "uri": "gs://your-bucket-name/folder/"
          },
          "batchSize": 1
        }
      }
    ]
  }
}
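The status check above is usually repeated until the operation reports done. The following Python sketch shows one way to structure that loop. It is illustrative only: wait_for_operation, operation_id, and status_url are hypothetical helper names, and fetch stands in for whatever authenticated HTTP GET you use (here it is replaced by a stub that replays the two states shown above, so the example runs without network access).

```python
import time


def operation_id(name):
    """Return the trailing ID of an operation name such as
    'projects/usable-auth-library/operations/1efec2285bd442df'."""
    return name.rstrip("/").rsplit("/", 1)[-1]


def status_url(name):
    """Build the v1.operations URL used in the curl example above."""
    return "https://vision.googleapis.com/v1/operations/" + operation_id(name)


def wait_for_operation(fetch, name, interval=5.0, timeout=600.0, sleep=time.sleep):
    """Poll until the operation has 'done' set to true, then return it.

    fetch is a caller-supplied callable (hypothetical) that takes a URL and
    returns the parsed JSON response as a dict.
    """
    deadline = time.monotonic() + timeout
    url = status_url(name)
    while True:
        op = fetch(url)
        if op.get("done"):
            # A finished long-running operation carries either a response or an error.
            if "error" in op:
                raise RuntimeError(f"operation failed: {op['error']}")
            return op
        if time.monotonic() >= deadline:
            raise TimeoutError(f"operation {operation_id(name)} still running")
        sleep(interval)


# Stub replies standing in for real HTTP responses.
replies = iter([
    {"name": "operations/1efec2285bd442df", "metadata": {"state": "RUNNING"}},
    {"name": "operations/1efec2285bd442df", "metadata": {"state": "DONE"}, "done": True},
])
result = wait_for_operation(
    lambda url: next(replies),
    "projects/usable-auth-library/operations/1efec2285bd442df",
    sleep=lambda seconds: None,  # skip real waiting in this demonstration
)
print(result["metadata"]["state"])
```

Passing fetch and sleep as parameters keeps the loop independent of any particular HTTP client and makes it easy to exercise without calling the service.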
The JSON in your output file is similar to that of an image's document text detection request, with the addition of a context field showing the location of the PDF or TIFF that was specified and the number of pages in the file:
output-1-to-1.json
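Once an output file has been downloaded, its text can be pulled out with a few lines of Python. The sketch below is a hedged illustration: the sample document is hand-written for this example (it is not real service output), and the field names responses, fullTextAnnotation.text, context.uri, and context.pageNumber are assumptions about the output layout that you should verify against your own output-1-to-1.json. The helper name page_texts is hypothetical.

```python
import json

# Hand-written stand-in for the contents of an output file (assumed layout).
SAMPLE_OUTPUT = """
{
  "responses": [
    {
      "fullTextAnnotation": {"text": "Page one text"},
      "context": {
        "uri": "gs://your-source-bucket-name/folder/multi-page-file.pdf",
        "pageNumber": 1
      }
    }
  ]
}
"""


def page_texts(output_json):
    """Map each page number found in an output file to its detected text."""
    document = json.loads(output_json)
    pages = {}
    for response in document.get("responses", []):
        # Missing fields fall back to None / empty text instead of raising.
        page = response.get("context", {}).get("pageNumber")
        text = response.get("fullTextAnnotation", {}).get("text", "")
        pages[page] = text
    return pages


for page, text in page_texts(SAMPLE_OUTPUT).items():
    print(f"page {page}: {text}")
```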
Go
Before trying this sample, follow the Go setup instructions in the Vision quickstart using client libraries. For more information, see the Vision Go API reference documentation.
To authenticate to Vision, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Java
Before trying this sample, follow the Java setup instructions in the Vision quickstart using client libraries. For more information, see the Vision Java API reference documentation.
To authenticate to Vision, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Node.js
Before trying this sample, follow the Node.js setup instructions in the Vision quickstart using client libraries. For more information, see the Vision Node.js API reference documentation.
To authenticate to Vision, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Python
Before trying this sample, follow the Python setup instructions in the Vision quickstart using client libraries. For more information, see the Vision Python API reference documentation.
To authenticate to Vision, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
gcloud
The gcloud command you use depends on the file type.
- To perform PDF text detection, use the gcloud ml vision detect-text-pdf command as shown in the following example:
  gcloud ml vision detect-text-pdf gs://my_bucket/input_file gs://my_bucket/out_put_prefix
- To perform TIFF text detection, use the gcloud ml vision detect-text-tiff command as shown in the following example:
  gcloud ml vision detect-text-tiff gs://my_bucket/input_file gs://my_bucket/out_put_prefix
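If you script these commands, the choice between the two subcommands can be made from the input file name. The Python sketch below is one possible approach, not part of the gcloud tooling: gcloud_command is a hypothetical helper, and choosing by file extension (.pdf, .tif, .tiff) is an assumption made for this example, as is the input_file.pdf object name.

```python
import shlex


def gcloud_command(input_uri, output_prefix):
    """Return the gcloud argument list for PDF or TIFF text detection.

    Assumption: the file type is inferred from the extension of input_uri.
    """
    lowered = input_uri.lower()
    if lowered.endswith(".pdf"):
        subcommand = "detect-text-pdf"
    elif lowered.endswith((".tif", ".tiff")):
        subcommand = "detect-text-tiff"
    else:
        raise ValueError(f"cannot infer file type from: {input_uri}")
    return ["gcloud", "ml", "vision", subcommand, input_uri, output_prefix]


argv = gcloud_command("gs://my_bucket/input_file.pdf", "gs://my_bucket/out_put_prefix")
print(shlex.join(argv))
```

The returned list can be handed to subprocess.run(argv, check=True) to execute the command, which avoids shell quoting issues.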
Additional languages
C#: Please follow the C# setup instructions on the client libraries page and then visit the Vision reference documentation for .NET.
PHP: Please follow the PHP setup instructions on the client libraries page and then visit the Vision reference documentation for PHP.
Ruby: Please follow the Ruby setup instructions on the client libraries page and then visit the Vision reference documentation for Ruby.