Batch file annotation offline

The Vision API can detect any Vision API feature from PDF and TIFF files stored in Cloud Storage.

Feature detection from PDF and TIFF must be requested using the files:asyncBatchAnnotate function, which performs an offline (asynchronous) request and provides its status using the operations resources.

Output from a PDF/TIFF request is written to a JSON file created in the specified Cloud Storage bucket.

Limitations

The Vision API accepts PDF/TIFF files up to 2000 pages. Larger files will return an error.

Authentication

API keys are not supported for files:asyncBatchAnnotate requests. See Using a service account for instructions on authenticating with a service account.

The account used for authentication must have access to the Cloud Storage bucket that you specify for the output (roles/editor or roles/storage.objectCreator or above).

You can use an API key to query the status of the operation; see Using an API key for instructions.

Feature detection requests

Currently PDF/TIFF document detection is only available for files stored in Cloud Storage buckets. Response JSON files are similarly saved to a Cloud Storage bucket.

Command-line

To perform PDF/TIFF document text detection, make a POST request and provide the appropriate request body:

curl -X POST \
  -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
  -H "Content-Type: application/json; charset=utf-8" \
  https://vision.googleapis.com/v1/files:asyncBatchAnnotate -d "{
  'requests':[
    {
      'inputConfig': {
        'gcsSource': {
          'uri': 'gs://your-source-bucket-name/folder/multi-page-file.pdf'
        },
        'mimeType': 'application/pdf'
      },
      'features': [
        {
          'type': 'DOCUMENT_TEXT_DETECTION'
        }
      ],
      'outputConfig': {
        'gcsDestination': {
          'uri': 'gs://your-bucket-name/folder/'
        },
        'batchSize': 1
      }
    }
  ]
}"

Where:

inputConfig - replaces the image field used in other Vision API requests. It contains two child fields:
- gcsSource.uri - the Cloud Storage URI of the PDF or TIFF file (accessible to the user or service account making the request)
- mimeType - one of the accepted file types: application/pdf or image/tiff.
outputConfig - specifies output details. It contains two child fields:
- gcsDestination.uri - a valid Cloud Storage URI. The bucket must be writeable by the user or service account making the request. The filename will be output-x-to-y, where x and y represent the PDF/TIFF page numbers included in that output file. If the file exists, its contents will be overwritten.
- batchSize - specifies how many pages of output should be included in each output JSON file.
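The fields described above can be sketched in Python as a plain dictionary, which makes it easier to see how inputConfig, features, and outputConfig nest. This is a minimal sketch, not an official client: the helper names build_request_body and expected_output_names are hypothetical, and the filename calculation assumes that pages are grouped into consecutive ranges of batchSize pages, with a shorter final range and a .json suffix.

```python
import json

# The two MIME types the page lists as accepted.
ACCEPTED_MIME_TYPES = {"application/pdf", "image/tiff"}


def build_request_body(source_uri, destination_uri,
                       mime_type="application/pdf",
                       feature_type="DOCUMENT_TEXT_DETECTION",
                       batch_size=1):
    """Return a files:asyncBatchAnnotate request body as a dict (hypothetical helper)."""
    if mime_type not in ACCEPTED_MIME_TYPES:
        raise ValueError(f"unsupported mimeType: {mime_type}")
    return {
        "requests": [
            {
                "inputConfig": {
                    "gcsSource": {"uri": source_uri},
                    "mimeType": mime_type,
                },
                "features": [{"type": feature_type}],
                "outputConfig": {
                    "gcsDestination": {"uri": destination_uri},
                    "batchSize": batch_size,
                },
            }
        ]
    }


def expected_output_names(page_count, batch_size):
    """List output-x-to-y.json names, assuming consecutive page ranges."""
    names = []
    for first in range(1, page_count + 1, batch_size):
        last = min(first + batch_size - 1, page_count)
        names.append(f"output-{first}-to-{last}.json")
    return names


if __name__ == "__main__":
    body = build_request_body(
        "gs://your-source-bucket-name/folder/multi-page-file.pdf",
        "gs://your-bucket-name/folder/",
    )
    print(json.dumps(body, indent=2))
    print(expected_output_names(page_count=5, batch_size=2))
```

Under these assumptions, a five-page file with a batch size of 2 would yield three output files, the last one covering only page 5.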

Response:

A successful asyncBatchAnnotate request returns a response with a single name field:

{
  "name": "projects/usable-auth-library/operations/1efec2285bd442df"
}

This name represents a long-running operation with an associated ID (for example, 1efec2285bd442df), which can be queried using the v1.operations API.
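As a rough illustration, the operation ID can be taken as the last path segment of the returned name and appended to the operations endpoint. The helper names below are hypothetical; the endpoint and the example name are the ones used on this page.

```python
OPERATIONS_ENDPOINT = "https://vision.googleapis.com/v1/operations"


def operation_id(name):
    """Return the last path segment of an operation name."""
    return name.rstrip("/").rsplit("/", 1)[-1]


def operation_status_url(name):
    """Build the URL to GET when checking the operation's status."""
    return f"{OPERATIONS_ENDPOINT}/{operation_id(name)}"


if __name__ == "__main__":
    name = "projects/usable-auth-library/operations/1efec2285bd442df"
    print(operation_id(name))
    print(operation_status_url(name))
```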

To retrieve your Vision annotation response, send a GET request to the v1.operations endpoint, passing the operation ID in the URL.

curl -X GET \
  -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
  -H "Content-Type: application/json" \
  https://vision.googleapis.com/v1/operations/1efec2285bd442df

If the operation is in progress:

{
  "name": "operations/1efec2285bd442df",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.vision.v1.OperationMetadata",
    "state": "RUNNING",
    "createTime": "2019-05-15T21:10:08.401917049Z",
    "updateTime": "2019-05-15T21:10:33.700763554Z"
  }
}

Once the operation has completed, the state shows as DONE and your results are written to the Cloud Storage file you specified:

{
  "name": "operations/1efec2285bd442df",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.vision.v1.OperationMetadata",
    "state": "DONE",
    "createTime": "2019-05-15T20:56:30.622473785Z",
    "updateTime": "2019-05-15T20:56:41.666379749Z"
  },
  "done": true,
  "response": {
    "@type": "type.googleapis.com/google.cloud.vision.v1.AsyncBatchAnnotateFilesResponse",
    "responses": [
      {
        "outputConfig": {
          "gcsDestination": {
            "uri": "gs://your-bucket-name/folder/"
          },
          "batchSize": 1
        }
      }
    ]
  }
}
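The two responses above suggest a simple polling loop: repeat the GET request until the returned JSON contains "done": true, then read the output location from response.responses. The sketch below assumes a caller-supplied fetch_operation callable (hypothetical) that performs the authenticated GET and returns the parsed JSON as a dict, so the loop itself involves no network code; the interval and attempt limit are arbitrary illustrative values.

```python
import time


def wait_until_done(fetch_operation, interval_seconds=5.0, max_attempts=60,
                    sleep=time.sleep):
    """Call fetch_operation() until the returned dict has "done": true."""
    for attempt in range(max_attempts):
        operation = fetch_operation()
        if operation.get("done"):
            return operation
        if attempt < max_attempts - 1:
            sleep(interval_seconds)
    raise TimeoutError(f"operation not done after {max_attempts} attempts")


def output_uris(operation):
    """Collect the gcsDestination URIs from a completed operation."""
    return [
        item["outputConfig"]["gcsDestination"]["uri"]
        for item in operation["response"]["responses"]
    ]


if __name__ == "__main__":
    # Canned replies standing in for real GET responses (trimmed from the
    # examples on this page).
    replies = iter([
        {"name": "operations/1efec2285bd442df",
         "metadata": {"state": "RUNNING"}},
        {"name": "operations/1efec2285bd442df",
         "metadata": {"state": "DONE"},
         "done": True,
         "response": {"responses": [{"outputConfig": {
             "gcsDestination": {"uri": "gs://your-bucket-name/folder/"},
             "batchSize": 1}}]}},
    ])
    finished = wait_until_done(lambda: next(replies), sleep=lambda _: None)
    print(output_uris(finished))
```

Passing the sleep function as a parameter keeps the loop easy to exercise without real delays; in practice the default time.sleep would be used.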

The JSON in your output file is similar to that of an image's document text detection request, with the addition of a context field showing the location of the PDF or TIFF that was specified and the number of pages in the file:

output-1-to-1.json

Full response

{
  "inputConfig": {
    "gcsSource": {
      "uri": "gs://cloud-samples-data/vision/pdf_tiff/census2010.pdf"
    },
    "mimeType": "application/pdf"
  },
  "responses": [
    {
      "fullTextAnnotation": {
        "pages": [
          {
            "property": {
              "detectedLanguages": [
                {
                  "languageCode": "en",
                  "confidence": 0.94
                }
              ]
            },
            "width": 612,
            "height": 792,
            "blocks": [
              {
                "boundingBox": {
                  "normalizedVertices": [
                    {
                      "x": 0.12908497,
                      "y": 0.10479798
                    },
                    ...
                    {
                      "x": 0.12908497,
                      "y": 0.1199495
                    }
                  ]
                },
                "paragraphs": [
                  {
                    ...
                  },
                  "words": [
                    {
                      ...
                    },
                    "symbols": [
                      {
                        ...
                        "text": "C",
                        "confidence": 0.99
                      },
                      {
                        "property": {
                          "detectedLanguages": [
                            {
                              "languageCode": "en"
                            }
                          ]
                        },
                        "text": "O",
                        "confidence": 0.99
                      },
                      ...
                    }
                  ]
                }
              ],
        "text": "CONTENTS\n.\n1-1\nII-1\nIII-1\nList of Statistical Tables...
        \nHow to Use This Census Report ..\nTable Finding Guide .\nUser
        Notes .......\nStatistical Tables.........\nAppendixes
        \nA Geographic Terms and Concepts .........\nB Definitions of
        Subject Characteristics.\nData Collection and Processing Procedures...
        \nQuestionnaire. ........\nE Maps .................\nF Operational
        Overview and accuracy of the Data.......\nG Residence Rule and
        Residence Situations for the \n2010 Census of the United States...
        \nH Acknowledgments .....\nE\n*Appendix may be found in the separate
        volume, CPH-1-A, Summary Population and\nHousing Characteristics,
        Selected Appendixes, on the Internet at
        <www.census.gov\n/prod/cen2010/cph-1-a.pdf>.\nContents\n"
      },
      "context": {
        "uri": "gs://cloud-samples-data/vision/pdf_tiff/census2010.pdf",
        "pageNumber": 1
      }
    }
  ]
}
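To show how the output file might be consumed, the sketch below walks the responses array of an already-parsed output file and maps each context.pageNumber to the text in fullTextAnnotation. The helper name page_texts is hypothetical, and the sample dictionary is a heavily trimmed stand-in for the response above, not real API output.

```python
def page_texts(annotate_file_response):
    """Map each context.pageNumber to that page's fullTextAnnotation text."""
    texts = {}
    for response in annotate_file_response.get("responses", []):
        page_number = response.get("context", {}).get("pageNumber")
        annotation = response.get("fullTextAnnotation", {})
        texts[page_number] = annotation.get("text", "")
    return texts


if __name__ == "__main__":
    # Trimmed stand-in for the contents of output-1-to-1.json.
    sample = {
        "inputConfig": {
            "gcsSource": {
                "uri": "gs://cloud-samples-data/vision/pdf_tiff/census2010.pdf"
            },
            "mimeType": "application/pdf",
        },
        "responses": [
            {
                "fullTextAnnotation": {"pages": [], "text": "CONTENTS\n"},
                "context": {
                    "uri": "gs://cloud-samples-data/vision/pdf_tiff/census2010.pdf",
                    "pageNumber": 1,
                },
            }
        ],
    }
    for number, text in page_texts(sample).items():
        print(f"page {number}: {text!r}")
```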

Go

Before trying this sample, follow the Go setup instructions in the Vision quickstart using client libraries. For more information, see the Vision Go API reference documentation.

To authenticate to Vision, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

import (
	"context"
	"fmt"
	"io"

	vision "cloud.google.com/go/vision/v2/apiv1"
	"cloud.google.com/go/vision/v2/apiv1/visionpb"
)

// detectAsyncDocumentURI performs Optical Character Recognition (OCR) on a
// PDF file stored in GCS.
func detectAsyncDocumentURI(w io.Writer, gcsSourceURI, gcsDestinationURI string) error {
	ctx := context.Background()

	client, err := vision.NewImageAnnotatorClient(ctx)
	if err != nil {
		return err
	}

	request := &visionpb.AsyncBatchAnnotateFilesRequest{
		Requests: []*visionpb.AsyncAnnotateFileRequest{
			{
				Features: []*visionpb.Feature{
					{
						Type: visionpb.Feature_DOCUMENT_TEXT_DETECTION,
					},
				},
				InputConfig: &visionpb.InputConfig{
					GcsSource: &visionpb.GcsSource{Uri: gcsSourceURI},
					// Supported MimeTypes are: "application/pdf" and "image/tiff".
					MimeType: "application/pdf",
				},
				OutputConfig: &visionpb.OutputConfig{
					GcsDestination: &visionpb.GcsDestination{Uri: gcsDestinationURI},
					// How many pages should be grouped into each json output file.
					BatchSize: 2,
				},
			},
		},
	}

	operation, err := client.AsyncBatchAnnotateFiles(ctx, request)
	if err != nil {
		return err
	}

	fmt.Fprintf(w, "Waiting for the operation to finish.")

	resp, err := operation.Wait(ctx)
	if err != nil {
		return err
	}

	fmt.Fprintf(w, "%v", resp)

	return nil
}

Java

Before trying this sample, follow the Java setup instructions in the Vision quickstart using client libraries. For more information, see the Vision Java API reference documentation.

To authenticate to Vision, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

import com.google.api.gax.longrunning.OperationFuture;
import com.google.cloud.storage.Blob;
import com.google.cloud.storage.Bucket;
import com.google.cloud.storage.Storage;
import com.google.cloud.storage.Storage.BlobListOption;
import com.google.cloud.storage.StorageOptions;
import com.google.cloud.vision.v1.AnnotateFileResponse;
import com.google.cloud.vision.v1.AnnotateFileResponse.Builder;
import com.google.cloud.vision.v1.AnnotateImageResponse;
import com.google.cloud.vision.v1.AsyncAnnotateFileRequest;
import com.google.cloud.vision.v1.AsyncAnnotateFileResponse;
import com.google.cloud.vision.v1.AsyncBatchAnnotateFilesResponse;
import com.google.cloud.vision.v1.Feature;
import com.google.cloud.vision.v1.GcsDestination;
import com.google.cloud.vision.v1.GcsSource;
import com.google.cloud.vision.v1.ImageAnnotatorClient;
import com.google.cloud.vision.v1.InputConfig;
import com.google.cloud.vision.v1.OperationMetadata;
import com.google.cloud.vision.v1.OutputConfig;
import com.google.protobuf.util.JsonFormat;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/**
 * Performs document text OCR with PDF/TIFF as source files on Google Cloud Storage.
 *
 * @param gcsSourcePath The path to the remote file on Google Cloud Storage to detect document
 *     text on.
 * @param gcsDestinationPath The path to the remote file on Google Cloud Storage to store the
 *     results on.
 * @throws Exception on errors while closing the client.
 */
public static void detectDocumentsGcs(String gcsSourcePath, String gcsDestinationPath)
    throws Exception {

  // Initialize client that will be used to send requests. This client only needs to be created
  // once, and can be reused for multiple requests. After completing all of your requests, call
  // the "close" method on the client to safely clean up any remaining background resources.
  try (ImageAnnotatorClient client = ImageAnnotatorClient.create()) {
    List<AsyncAnnotateFileRequest> requests = new ArrayList<>();

    // Set the GCS source path for the remote file.
    GcsSource gcsSource = GcsSource.newBuilder().setUri(gcsSourcePath).build();

    // Create the configuration with the specified MIME (Multipurpose Internet Mail Extensions)
    // types
    InputConfig inputConfig =
        InputConfig.newBuilder()
            .setMimeType("application/pdf") // Supported MimeTypes: "application/pdf", "image/tiff"
            .setGcsSource(gcsSource)
            .build();

    // Set the GCS destination path for where to save the results.
    GcsDestination gcsDestination =
        GcsDestination.newBuilder().setUri(gcsDestinationPath).build();

    // Create the configuration for the output with the batch size.
    // The batch size sets how many pages should be grouped into each json output file.
    OutputConfig outputConfig =
        OutputConfig.newBuilder().setBatchSize(2).setGcsDestination(gcsDestination).build();

    // Select the Feature required by the vision API
    Feature feature = Feature.newBuilder().setType(Feature.Type.DOCUMENT_TEXT_DETECTION).build();

    // Build the OCR request
    AsyncAnnotateFileRequest request =
        AsyncAnnotateFileRequest.newBuilder()
            .addFeatures(feature)
            .setInputConfig(inputConfig)
            .setOutputConfig(outputConfig)
            .build();

    requests.add(request);

    // Perform the OCR request
    OperationFuture<AsyncBatchAnnotateFilesResponse, OperationMetadata> response =
        client.asyncBatchAnnotateFilesAsync(requests);

    System.out.println("Waiting for the operation to finish.");

    // Wait for the request to finish. (The result is not used, since the API saves the result to
    // the specified location on GCS.)
    List<AsyncAnnotateFileResponse> result =
        response.get(180, TimeUnit.SECONDS).getResponsesList();

    // Once the request has completed and the output has been
    // written to GCS, we can list all the output files.
    Storage storage = StorageOptions.getDefaultInstance().getService();

    // Get the destination location from the gcsDestinationPath
    Pattern pattern = Pattern.compile("gs://([^/]+)/(.+)");
    Matcher matcher = pattern.matcher(gcsDestinationPath);

    if (matcher.find()) {
      String bucketName = matcher.group(1);
      String prefix = matcher.group(2);

      // Get the list of objects with the given prefix from the GCS bucket
      Bucket bucket = storage.get(bucketName);
      com.google.api.gax.paging.Page<Blob> pageList = bucket.list(BlobListOption.prefix(prefix));

      Blob firstOutputFile = null;

      // List objects with the given prefix.
      System.out.println("Output files:");
      for (Blob blob : pageList.iterateAll()) {
        System.out.println(blob.getName());

        // Process the first output file from GCS.
        // Since we specified batch size = 2, the first response contains
        // the first two pages of the input file.
        if (firstOutputFile == null) {
          firstOutputFile = blob;
        }
      }

      // Get the contents of the file and convert the JSON contents to an AnnotateFileResponse
      // object. If the Blob is small read all its content in one request
      // (Note: the file is a .json file)
      // Storage guide: https://cloud.google.com/storage/docs/downloading-objects
      String jsonContents = new String(firstOutputFile.getContent());
      Builder builder = AnnotateFileResponse.newBuilder();
      JsonFormat.parser().merge(jsonContents, builder);

      // Build the AnnotateFileResponse object
      AnnotateFileResponse annotateFileResponse = builder.build();

      // Parse through the object to get the actual response for the first page of the input file.
      AnnotateImageResponse annotateImageResponse = annotateFileResponse.getResponses(0);

      // Here we print the full text from the first page.
      // The response contains more information:
      // annotation/pages/blocks/paragraphs/words/symbols
      // including confidence score and bounding boxes
      System.out.format("%nText: %s%n", annotateImageResponse.getFullTextAnnotation().getText());
    } else {
      System.out.println("No MATCH");
    }
  }
}

Node.js

Before trying this sample, follow the Node.js setup instructions in the Vision quickstart using client libraries. For more information, see the Vision Node.js API reference documentation.

To authenticate to Vision, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

// Imports the Google Cloud client libraries
const vision = require('@google-cloud/vision').v1;

// Creates a client
const client = new vision.ImageAnnotatorClient();

/**
 * TODO(developer): Uncomment the following lines before running the sample.
 */
// Bucket where the file resides
// const bucketName = 'my-bucket';
// Path to PDF file within bucket
// const fileName = 'path/to/document.pdf';
// The folder to store the results
// const outputPrefix = 'results'

const gcsSourceUri = `gs://${bucketName}/${fileName}`;
const gcsDestinationUri = `gs://${bucketName}/${outputPrefix}/`;

const inputConfig = {
  // Supported mime_types are: 'application/pdf' and 'image/tiff'
  mimeType: 'application/pdf',
  gcsSource: {
    uri: gcsSourceUri,
  },
};
const outputConfig = {
  gcsDestination: {
    uri: gcsDestinationUri,
  },
};
const features = [{type: 'DOCUMENT_TEXT_DETECTION'}];
const request = {
  requests: [
    {
      inputConfig: inputConfig,
      features: features,
      outputConfig: outputConfig,
    },
  ],
};

const [operation] = await client.asyncBatchAnnotateFiles(request);
const [filesResponse] = await operation.promise();
const destinationUri =
  filesResponse.responses[0].outputConfig.gcsDestination.uri;
console.log('Json saved to: ' + destinationUri);

Python

Before trying this sample, follow the Python setup instructions in the Vision quickstart using client libraries. For more information, see the Vision Python API reference documentation.

To authenticate to Vision, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

def async_detect_document(gcs_source_uri, gcs_destination_uri):
    """OCR with PDF/TIFF as source files on GCS"""
    import json
    import re
    from google.cloud import vision
    from google.cloud import storage

    # Supported mime_types are: 'application/pdf' and 'image/tiff'
    mime_type = "application/pdf"

    # How many pages should be grouped into each json output file.
    batch_size = 2

    client = vision.ImageAnnotatorClient()

    feature = vision.Feature(type_=vision.Feature.Type.DOCUMENT_TEXT_DETECTION)

    gcs_source = vision.GcsSource(uri=gcs_source_uri)
    input_config = vision.InputConfig(gcs_source=gcs_source, mime_type=mime_type)

    gcs_destination = vision.GcsDestination(uri=gcs_destination_uri)
    output_config = vision.OutputConfig(
        gcs_destination=gcs_destination, batch_size=batch_size
    )

    async_request = vision.AsyncAnnotateFileRequest(
        features=[feature], input_config=input_config, output_config=output_config
    )

    operation = client.async_batch_annotate_files(requests=[async_request])

    print("Waiting for the operation to finish.")
    operation.result(timeout=420)

    # Once the request has completed and the output has been
    # written to GCS, we can list all the output files.
    storage_client = storage.Client()

    match = re.match(r"gs://([^/]+)/(.+)", gcs_destination_uri)
    bucket_name = match.group(1)
    prefix = match.group(2)

    bucket = storage_client.get_bucket(bucket_name)

    # List objects with the given prefix, filtering out folders.
    blob_list = [
        blob
        for blob in list(bucket.list_blobs(prefix=prefix))
        if not blob.name.endswith("/")
    ]
    print("Output files:")
    for blob in blob_list:
        print(blob.name)

    # Process the first output file from GCS.
    # Since we specified batch_size=2, the first response contains
    # the first two pages of the input file.
    output = blob_list[0]

    json_string = output.download_as_bytes().decode("utf-8")
    response = json.loads(json_string)

    # The actual response for the first page of the input file.
    first_page_response = response["responses"][0]
    annotation = first_page_response["fullTextAnnotation"]

    # Here we print the full text from the first page.
    # The response contains more information:
    # annotation/pages/blocks/paragraphs/words/symbols
    # including confidence scores and bounding boxes
    print("Full text:\n")
    print(annotation["text"])

gcloud

The gcloud command you use depends on the file type.

To perform PDF text detection, use the gcloud ml vision detect-text-pdf command as shown in the following example:

gcloud ml vision detect-text-pdf gs://my_bucket/input_file gs://my_bucket/out_put_prefix

To perform TIFF text detection, use the gcloud ml vision detect-text-tiff command as shown in the following example:

gcloud ml vision detect-text-tiff gs://my_bucket/input_file gs://my_bucket/out_put_prefix
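When both file types are handled by the same script, the choice between the two commands can be made from the input file's extension. The sketch below only builds the argument list and does not run gcloud; the helper name gcloud_command and the extension-to-command mapping are assumptions made for illustration, while the two subcommands are the ones shown above.

```python
import posixpath
import shlex

# Assumed mapping from file extension to the gcloud subcommand.
SUBCOMMANDS = {
    ".pdf": "detect-text-pdf",
    ".tif": "detect-text-tiff",
    ".tiff": "detect-text-tiff",
}


def gcloud_command(input_uri, output_prefix_uri):
    """Return the gcloud argument list for the given input file type."""
    extension = posixpath.splitext(input_uri)[1].lower()
    if extension not in SUBCOMMANDS:
        raise ValueError(f"unsupported file type: {extension or '(none)'}")
    return ["gcloud", "ml", "vision", SUBCOMMANDS[extension],
            input_uri, output_prefix_uri]


if __name__ == "__main__":
    command = gcloud_command("gs://my_bucket/input_file.pdf",
                             "gs://my_bucket/out_put_prefix")
    # shlex.join quotes arguments so the line can be pasted into a shell.
    print(shlex.join(command))
```

The resulting list could be passed to subprocess.run once the gcloud CLI is installed and authenticated.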

Additional languages

C#: Please follow the C# setup instructions on the client libraries page and then visit the Vision reference documentation for .NET.

PHP: Please follow the PHP setup instructions on the client libraries page and then visit the Vision reference documentation for PHP.

Ruby: Please follow the Ruby setup instructions on the client libraries page and then visit the Vision reference documentation for Ruby.
