The Vision API can run offline (asynchronous) detection and annotation services on a large batch of image files using any Vision feature type. For example, you can specify one or more Vision API features (such as `TEXT_DETECTION`, `LABEL_DETECTION`, and `LANDMARK_DETECTION`) for a single batch of images. Output from an offline batch request is written to a JSON file created in the specified Cloud Storage bucket.
- Online (synchronous) requests - An online annotation request (`images:annotate` or `files:annotate`) immediately returns inline annotations to the user. Online annotation requests limit the number of files you can annotate in a single request. With an `images:annotate` request you can only specify a small number of images (<=16) to be annotated; with a `files:annotate` request you can only specify a single file and a small number of pages (<=5) in that file to be annotated.
- Offline (asynchronous) requests - An offline annotation request (`images:asyncBatchAnnotate` or `files:asyncBatchAnnotate`) starts a long-running operation (LRO) and does not immediately return a response to the caller. When the LRO completes, annotations are stored as files in a Cloud Storage bucket you specify. An `images:asyncBatchAnnotate` request allows you to specify up to 2,000 images per request; a `files:asyncBatchAnnotate` request allows you to specify larger batches of files and more pages (<=2000) per file for annotation at a single time than you are able to with online requests.
Limitations
The Vision API accepts up to 2,000 image files per batch. A larger batch of image files returns an error.
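Collections larger than the 2,000-image limit therefore have to be split client-side into multiple requests. A minimal sketch of that batching step (plain Python with no Vision dependencies; the bucket and file names are placeholders, and only the 2,000-image limit is taken from this page):

```python
# Split a list of image URIs into batches that respect the
# 2,000-images-per-request limit for offline annotation requests.
MAX_IMAGES_PER_REQUEST = 2000

def chunk_uris(uris, batch_limit=MAX_IMAGES_PER_REQUEST):
    """Yield successive slices of `uris`, each at most `batch_limit` long."""
    for start in range(0, len(uris), batch_limit):
        yield uris[start:start + batch_limit]

# Example: 4,500 images would need three requests (2000, 2000, 500).
uris = [f"gs://my-bucket/img-{i}.png" for i in range(4500)]
batches = list(chunk_uris(uris))
print([len(b) for b in batches])  # [2000, 2000, 500]
```

Each resulting batch can then be submitted as its own offline annotation request.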
Currently supported feature types
Feature type | Description |
---|---|
`CROP_HINTS` | Determine suggested vertices for a crop region on an image. |
`DOCUMENT_TEXT_DETECTION` | Perform OCR on dense text images, such as documents (PDF/TIFF), and images with handwriting. `TEXT_DETECTION` can be used for sparse text images. Takes precedence when both `DOCUMENT_TEXT_DETECTION` and `TEXT_DETECTION` are present. |
`FACE_DETECTION` | Detect faces within the image. |
`IMAGE_PROPERTIES` | Compute a set of image properties, such as the image's dominant colors. |
`LABEL_DETECTION` | Add labels based on image content. |
`LANDMARK_DETECTION` | Detect geographic landmarks within the image. |
`LOGO_DETECTION` | Detect company logos within the image. |
`OBJECT_LOCALIZATION` | Detect and extract multiple objects in an image. |
`SAFE_SEARCH_DETECTION` | Run SafeSearch to detect potentially unsafe or undesirable content. |
`TEXT_DETECTION` | Perform Optical Character Recognition (OCR) on text within the image. Text detection is optimized for areas of sparse text within a larger image. If the image is a document (PDF/TIFF), has dense text, or contains handwriting, use `DOCUMENT_TEXT_DETECTION` instead. |
`WEB_DETECTION` | Detect topical entities such as news, events, or celebrities within the image, and find similar images on the web using the power of Google Image Search. |
Sample code
Use the following code samples to run offline annotation services on a batch of image files in Cloud Storage.
Java
Before trying this sample, follow the Java setup instructions in the Vision API quickstart using client libraries. For more information, see the Vision API Java reference documentation.
Node.js
Before trying this sample, follow the Node.js setup instructions in the Vision quickstart using client libraries. For more information, see the Vision Node.js API reference documentation.
To authenticate to Vision, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Python
Before trying this sample, follow the Python setup instructions in the Vision quickstart using client libraries. For more information, see the Vision Python API reference documentation.
To authenticate to Vision, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
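As a sketch of what such an offline request contains, the following stdlib-only snippet assembles the JSON body for an `images:asyncBatchAnnotate` REST call. The feature types and the `batchSize` output option come from this page; the bucket and file names are hypothetical placeholders, and actually sending the body (an authenticated POST to the Vision endpoint) is not shown:

```python
import json

def build_async_batch_request(image_uris, feature_types, gcs_output_uri, batch_size=2):
    """Assemble the JSON body for a POST to
    https://vision.googleapis.com/v1/images:asyncBatchAnnotate.
    """
    return {
        "requests": [
            {
                "image": {"source": {"imageUri": uri}},
                "features": [{"type": t} for t in feature_types],
            }
            for uri in image_uris
        ],
        "outputConfig": {
            # Where the output JSON files are written when the LRO completes.
            "gcsDestination": {"uri": gcs_output_uri},
            # Number of responses written per output JSON file.
            "batchSize": batch_size,
        },
    }

body = build_async_batch_request(
    image_uris=["gs://my-bucket/image1.png", "gs://my-bucket/image2.jpg"],
    feature_types=["LABEL_DETECTION", "TEXT_DETECTION"],
    gcs_output_uri="gs://my-bucket/offline_batch_output/",
)
print(json.dumps(body, indent=2))
```

The client libraries build an equivalent request from typed objects; the raw body makes the `batchSize` and `gcsDestination` options easy to see.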
Response
A successful request returns response JSON files in the Cloud Storage bucket you indicated in the code sample. The number of responses per JSON file is dictated by `batch_size` in the code sample.
The returned response is similar to regular Vision API feature responses, depending on which features you request for an image. The following responses show `LABEL_DETECTION` and `TEXT_DETECTION` annotations for `image1.png`, `IMAGE_PROPERTIES` annotations for `image2.jpg`, and `OBJECT_LOCALIZATION` annotations for `image3.jpg`. Each response also contains a `context` field showing the file's URI.
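Once the operation finishes, the output files can be downloaded and the annotations pulled back out. A minimal stdlib-only sketch (the inline sample dict stands in for one downloaded output file; field names follow the response excerpts on this page):

```python
import json

def summarize_output_file(output_json):
    """Return (uri, label descriptions) pairs from one offline-batch output file."""
    data = json.loads(output_json)
    summary = []
    for response in data.get("responses", []):
        uri = response.get("context", {}).get("uri", "")
        labels = [a["description"] for a in response.get("labelAnnotations", [])]
        summary.append((uri, labels))
    return summary

# Stand-in for a file downloaded from the output bucket.
sample = json.dumps({
    "responses": [
        {
            "labelAnnotations": [{"description": "Text", "score": 0.93}],
            "context": {"uri": "gs://my-bucket/image1.png"},
        }
    ]
})
print(summarize_output_file(sample))  # [('gs://my-bucket/image1.png', ['Text'])]
```

Responses without `labelAnnotations` (for example, ones carrying only `imagePropertiesAnnotation`) simply produce an empty label list here; a fuller parser would branch on the annotation types present.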
offline_batch_output/output-1-to-2.json
```json
{
  "responses": [
    {
      "labelAnnotations": [
        { "mid": "/m/07s6nbt", "description": "Text", "score": 0.93413997, "topicality": 0.93413997 },
        { "mid": "/m/0dwx7", "description": "Logo", "score": 0.8733531, "topicality": 0.8733531 },
        ...
        { "mid": "/m/03bxgrp", "description": "Company", "score": 0.5682425, "topicality": 0.5682425 }
      ],
      "textAnnotations": [
        {
          "locale": "en",
          "description": "Google\n",
          "boundingPoly": {
            "vertices": [
              { "x": 72, "y": 40 },
              { "x": 613, "y": 40 },
              { "x": 613, "y": 233 },
              { "x": 72, "y": 233 }
            ]
          }
        },
        ...
      ],
      "fullTextAnnotation": {
        "pages": [
          {
            "blocks": [
              {
                ...
                "blockType": "TEXT"
              }
            ]
          }
        ],
        "text": "Google\n"
      },
      "context": {
        "uri": "gs://cloud-samples-data/vision/document_understanding/image1.png"
      }
    },
    {
      "imagePropertiesAnnotation": {
        "dominantColors": {
          "colors": [
            { "color": { "red": 229, "green": 230, "blue": 238 }, "score": 0.2744754, "pixelFraction": 0.075339235 },
            ...
            { "color": { "red": 86, "green": 87, "blue": 95 }, "score": 0.025770646, "pixelFraction": 0.13109145 }
          ]
        }
      },
      "cropHintsAnnotation": {
        "cropHints": [
          {
            "boundingPoly": {
              "vertices": [
                {},
                { "x": 1599 },
                { "x": 1599, "y": 1199 },
                { "y": 1199 }
              ]
            },
            "confidence": 0.79999995,
            "importanceFraction": 1
          }
        ]
      },
      "context": {
        "uri": "gs://cloud-samples-data/vision/document_understanding/image2.jpg"
      }
    }
  ]
}
```
offline_batch_output/output-3-to-3.json
```json
{
  "responses": [
    {
      "context": {
        "uri": "gs://cloud-samples-data/vision/document_understanding/image3.jpg"
      },
      "localizedObjectAnnotations": [
        {
          "mid": "/m/0bt9lr",
          "name": "Dog",
          "score": 0.9669734,
          "boundingPoly": {
            "normalizedVertices": [
              { "x": 0.6035543, "y": 0.1357359 },
              { "x": 0.98546547, "y": 0.1357359 },
              { "x": 0.98546547, "y": 0.98426414 },
              { "x": 0.6035543, "y": 0.98426414 }
            ]
          }
        },
        ...
        {
          "mid": "/m/0jbk",
          "name": "Animal",
          "score": 0.58003056,
          "boundingPoly": {
            "normalizedVertices": [
              { "x": 0.014534635, "y": 0.1357359 },
              { "x": 0.37197515, "y": 0.1357359 },
              { "x": 0.37197515, "y": 0.98426414 },
              { "x": 0.014534635, "y": 0.98426414 }
            ]
          }
        }
      ]
    }
  ]
}
```