Detect multiple objects

The Vision API can detect and extract multiple objects in an image with Object Localization.

Object localization identifies multiple objects in an image and provides a LocalizedObjectAnnotation for each object. Each LocalizedObjectAnnotation identifies information about the object, the position of the object, and rectangular bounds for the region of the image that contains the object.

Object localization identifies both significant and less-prominent objects in an image.

Object information is returned in English only. The Cloud Translation API can translate English labels into a variety of other languages.

Image with bounding boxes.
Image credit: Bogdan Dada on Unsplash (annotations added).

For example, the API returns the following information and bounding location data for the objects in the preceding image:

Name | mid | Score | Bounds
Bicycle wheel | /m/01bqk0 | 0.89648587 | (0.32076266, 0.78941387), (0.43812272, 0.78941387), (0.43812272, 0.97331065), (0.32076266, 0.97331065)
Bicycle | /m/0199g | 0.886761 | (0.312, 0.6616471), (0.638353, 0.6616471), (0.638353, 0.9705882), (0.312, 0.9705882)
Bicycle wheel | /m/01bqk0 | 0.6345275 | (0.5125398, 0.760708), (0.6256646, 0.760708), (0.6256646, 0.94601655), (0.5125398, 0.94601655)
Picture frame | /m/06z37_ | 0.6207608 | (0.79177403, 0.16160682), (0.97047985, 0.16160682), (0.97047985, 0.31348917), (0.79177403, 0.31348917)
Tire | /m/0h9mv | 0.55886006 | (0.32076266, 0.78941387), (0.43812272, 0.78941387), (0.43812272, 0.97331065), (0.32076266, 0.97331065)
Door | /m/02dgv | 0.5160098 | (0.77569866, 0.37104446), (0.9412425, 0.37104446), (0.9412425, 0.81507325), (0.77569866, 0.81507325)

mid contains a machine-generated identifier (MID) corresponding to a label's Google Knowledge Graph entry. For information on inspecting mid values, see the Google Knowledge Graph Search API documentation.
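The Bounds values are normalized vertices: each (x, y) pair is scaled to the range [0, 1] relative to the image's width and height. The following sketch converts the "Bicycle" row's bounds back to pixel coordinates; the image dimensions used here are made up, so substitute your image's real size:

# A minimal sketch: convert normalized bounds to pixel coordinates.
image_width, image_height = 1280, 850  # hypothetical dimensions of the source image

# Bounds from the "Bicycle" row in the preceding table.
normalized_bounds = [
    (0.312, 0.6616471),
    (0.638353, 0.6616471),
    (0.638353, 0.9705882),
    (0.312, 0.9705882),
]

pixel_bounds = [(round(x * image_width), round(y * image_height)) for x, y in normalized_bounds]
print(pixel_bounds)  # [(399, 562), (817, 562), (817, 825), (399, 825)]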

Try it for yourself

If you're new to Google Cloud, create an account to evaluate how Cloud Vision API performs in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.

Try Cloud Vision API free

Object Localization requests

Set up your Google Cloud project and authentication

Detect objects in a local image

You can use the Vision API to perform feature detection on a local image file.

For REST requests, send the contents of the image file as a base64 encoded string in the body of your request.
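For example, on Linux (GNU coreutils) you can produce the base64 string with the base64 utility; the file name here is a placeholder:

base64 -w 0 bicycle_example.png > image_base64.txt

On macOS, base64 -i bicycle_example.png produces the same output.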

For gcloud and client library requests, specify the path to a local image in your request.

REST

Before using any of the request data, make the following replacements:

  • BASE64_ENCODED_IMAGE : The base64 representation (ASCII string) of your binary image data. This string should look similar to the following string:
    • /9j/4QAYRXhpZgAA...9tAVx/zDQDlGxn//2Q==
    Visit the base64 encode topic for more information.
  • RESULTS_INT : (Optional) The maximum number of results to return. If you omit the "maxResults" field, the API returns a default of 10 results. This field does not apply to the following feature types: TEXT_DETECTION , DOCUMENT_TEXT_DETECTION , or CROP_HINTS .
  • PROJECT_ID : Your Google Cloud project ID.

HTTP method and URL:

POST https://vision.googleapis.com/v1/images:annotate

Request JSON body:

{
  "requests": [
    {
      "image": {
        "content": "BASE64_ENCODED_IMAGE"
      },
      "features": [
        {
          "maxResults": RESULTS_INT,
          "type": "OBJECT_LOCALIZATION"
        }
      ]
    }
  ]
}

To send your request, choose one of these options:

curl

Save the request body in a file named request.json, and execute the following command:

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "x-goog-user-project: PROJECT_ID " \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://vision.googleapis.com/v1/images:annotate"

PowerShell

Save the request body in a file named request.json, and execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred"; "x-goog-user-project" = "PROJECT_ID" }

Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://vision.googleapis.com/v1/images:annotate" | Select-Object -Expand Content

If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format.

Response:
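Abridged to the first object from the example table earlier on this page, the response looks something like the following:

{
  "responses": [
    {
      "localizedObjectAnnotations": [
        {
          "mid": "/m/01bqk0",
          "name": "Bicycle wheel",
          "score": 0.89648587,
          "boundingPoly": {
            "normalizedVertices": [
              {"x": 0.32076266, "y": 0.78941387},
              {"x": 0.43812272, "y": 0.78941387},
              {"x": 0.43812272, "y": 0.97331065},
              {"x": 0.32076266, "y": 0.97331065}
            ]
          }
        }
        ...
      ]
    }
  ]
}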

Go

Before trying this sample, follow the Go setup instructions in the Vision quickstart using client libraries. For more information, see the Vision Go API reference documentation.

To authenticate to Vision, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

import (
	"context"
	"fmt"
	"io"
	"os"

	vision "cloud.google.com/go/vision/apiv1"
)

// localizeObjects gets objects and bounding boxes from the Vision API for an image at the given file path.
func localizeObjects(w io.Writer, file string) error {
	ctx := context.Background()

	client, err := vision.NewImageAnnotatorClient(ctx)
	if err != nil {
		return err
	}

	f, err := os.Open(file)
	if err != nil {
		return err
	}
	defer f.Close()

	image, err := vision.NewImageFromReader(f)
	if err != nil {
		return err
	}
	annotations, err := client.LocalizeObjects(ctx, image, nil)
	if err != nil {
		return err
	}

	if len(annotations) == 0 {
		fmt.Fprintln(w, "No objects found.")
		return nil
	}

	fmt.Fprintln(w, "Objects:")
	for _, annotation := range annotations {
		fmt.Fprintln(w, annotation.Name)
		fmt.Fprintln(w, annotation.Score)
		for _, v := range annotation.BoundingPoly.NormalizedVertices {
			fmt.Fprintf(w, "(%f,%f)\n", v.X, v.Y)
		}
	}

	return nil
}

Java

Before trying this sample, follow the Java setup instructions in the Vision API Quickstart Using Client Libraries. For more information, see the Vision API Java reference documentation.

import com.google.cloud.vision.v1.AnnotateImageRequest;
import com.google.cloud.vision.v1.AnnotateImageResponse;
import com.google.cloud.vision.v1.BatchAnnotateImagesResponse;
import com.google.cloud.vision.v1.Feature;
import com.google.cloud.vision.v1.Feature.Type;
import com.google.cloud.vision.v1.Image;
import com.google.cloud.vision.v1.ImageAnnotatorClient;
import com.google.cloud.vision.v1.LocalizedObjectAnnotation;
import com.google.protobuf.ByteString;
import java.io.FileInputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/**
 * Detects localized objects in the specified local image.
 *
 * @param filePath The path to the file to perform localized object detection on.
 * @throws IOException on Input/Output errors.
 */
public static void detectLocalizedObjects(String filePath) throws IOException {
  List<AnnotateImageRequest> requests = new ArrayList<>();

  ByteString imgBytes = ByteString.readFrom(new FileInputStream(filePath));

  Image img = Image.newBuilder().setContent(imgBytes).build();
  AnnotateImageRequest request =
      AnnotateImageRequest.newBuilder()
          .addFeatures(Feature.newBuilder().setType(Type.OBJECT_LOCALIZATION))
          .setImage(img)
          .build();
  requests.add(request);

  // Initialize client that will be used to send requests. This client only needs to be created
  // once, and can be reused for multiple requests. After completing all of your requests, call
  // the "close" method on the client to safely clean up any remaining background resources.
  try (ImageAnnotatorClient client = ImageAnnotatorClient.create()) {
    // Perform the request
    BatchAnnotateImagesResponse response = client.batchAnnotateImages(requests);
    List<AnnotateImageResponse> responses = response.getResponsesList();

    // Display the results
    for (AnnotateImageResponse res : responses) {
      for (LocalizedObjectAnnotation entity : res.getLocalizedObjectAnnotationsList()) {
        System.out.format("Object name: %s%n", entity.getName());
        System.out.format("Confidence: %s%n", entity.getScore());
        System.out.format("Normalized Vertices:%n");
        entity
            .getBoundingPoly()
            .getNormalizedVerticesList()
            .forEach(vertex -> System.out.format("- (%s, %s)%n", vertex.getX(), vertex.getY()));
      }
    }
  }
}

Node.js

Before trying this sample, follow the Node.js setup instructions in the Vision quickstart using client libraries. For more information, see the Vision Node.js API reference documentation.

To authenticate to Vision, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

// Imports the Google Cloud client libraries
const vision = require('@google-cloud/vision');
const fs = require('fs');

// Creates a client
const client = new vision.ImageAnnotatorClient();

async function localizeObjects() {
  /**
   * TODO(developer): Uncomment the following line before running the sample.
   */
  // const fileName = `/path/to/localImage.png`;

  const request = {
    image: {content: fs.readFileSync(fileName)},
  };

  const [result] = await client.objectLocalization(request);
  const objects = result.localizedObjectAnnotations;
  objects.forEach(object => {
    console.log(`Name: ${object.name}`);
    console.log(`Confidence: ${object.score}`);
    const vertices = object.boundingPoly.normalizedVertices;
    vertices.forEach(v => console.log(`x: ${v.x}, y: ${v.y}`));
  });
}

localizeObjects();

Python

Before trying this sample, follow the Python setup instructions in the Vision quickstart using client libraries. For more information, see the Vision Python API reference documentation.

To authenticate to Vision, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

def localize_objects(path):
    """Localize objects in the local image.

    Args:
        path: The path to the local file.
    """
    from google.cloud import vision

    client = vision.ImageAnnotatorClient()

    with open(path, "rb") as image_file:
        content = image_file.read()
    image = vision.Image(content=content)

    objects = client.object_localization(image=image).localized_object_annotations

    print(f"Number of objects found: {len(objects)}")
    for object_ in objects:
        print(f"\n{object_.name} (confidence: {object_.score})")
        print("Normalized bounding polygon vertices: ")
        for vertex in object_.bounding_poly.normalized_vertices:
            print(f" - ({vertex.x}, {vertex.y})")
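For example, assuming you have downloaded the sample image locally as bicycle_example.png, you could call the function like this:

localize_objects("bicycle_example.png")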

Additional languages

C#: Please follow the C# setup instructions on the client libraries page and then visit the Vision reference documentation for .NET.

PHP: Please follow the PHP setup instructions on the client libraries page and then visit the Vision reference documentation for PHP.

Ruby: Please follow the Ruby setup instructions on the client libraries page and then visit the Vision reference documentation for Ruby.

Detect objects in a remote image

You can use the Vision API to perform feature detection on a remote image file that is located in Cloud Storage or on the Web. To send a remote file request, specify the file's Web URL or Cloud Storage URI in the request body.
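The imageUri field in the following request body accepts either form. For example (the bucket name here is hypothetical):

"imageUri": "gs://my-bucket/bicycle_example.png"
"imageUri": "https://cloud.google.com/vision/docs/images/bicycle_example.png"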

REST

Before using any of the request data, make the following replacements:

  • CLOUD_STORAGE_IMAGE_URI : the path to a valid image file in a Cloud Storage bucket, or a publicly accessible image URL. You must at least have read privileges to the file. Example:
    • https://cloud.google.com/vision/docs/images/bicycle_example.png
  • RESULTS_INT : (Optional) The maximum number of results to return. If you omit the "maxResults" field, the API returns a default of 10 results. This field does not apply to the following feature types: TEXT_DETECTION , DOCUMENT_TEXT_DETECTION , or CROP_HINTS .
  • PROJECT_ID : Your Google Cloud project ID.

HTTP method and URL:

POST https://vision.googleapis.com/v1/images:annotate

Request JSON body:

{
  "requests": [
    {
      "image": {
        "source": {
          "imageUri": "CLOUD_STORAGE_IMAGE_URI"
        }
      },
      "features": [
        {
          "maxResults": RESULTS_INT,
          "type": "OBJECT_LOCALIZATION"
        }
      ]
    }
  ]
}

To send your request, choose one of these options:

curl

Save the request body in a file named request.json, and execute the following command:

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "x-goog-user-project: PROJECT_ID " \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://vision.googleapis.com/v1/images:annotate"

PowerShell

Save the request body in a file named request.json, and execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred"; "x-goog-user-project" = "PROJECT_ID" }

Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://vision.googleapis.com/v1/images:annotate" | Select-Object -Expand Content

If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format.

Response: the structure is the same as for the local image request shown earlier.

Go

Before trying this sample, follow the Go setup instructions in the Vision quickstart using client libraries. For more information, see the Vision Go API reference documentation.

To authenticate to Vision, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

import (
	"context"
	"fmt"
	"io"

	vision "cloud.google.com/go/vision/apiv1"
)

// localizeObjectsURI gets objects and bounding boxes from the Vision API for an image at the given URI.
func localizeObjectsURI(w io.Writer, file string) error {
	ctx := context.Background()

	client, err := vision.NewImageAnnotatorClient(ctx)
	if err != nil {
		return err
	}

	image := vision.NewImageFromURI(file)
	annotations, err := client.LocalizeObjects(ctx, image, nil)
	if err != nil {
		return err
	}

	if len(annotations) == 0 {
		fmt.Fprintln(w, "No objects found.")
		return nil
	}

	fmt.Fprintln(w, "Objects:")
	for _, annotation := range annotations {
		fmt.Fprintln(w, annotation.Name)
		fmt.Fprintln(w, annotation.Score)
		for _, v := range annotation.BoundingPoly.NormalizedVertices {
			fmt.Fprintf(w, "(%f,%f)\n", v.X, v.Y)
		}
	}

	return nil
}

Java

Before trying this sample, follow the Java setup instructions in the Vision API Quickstart Using Client Libraries. For more information, see the Vision API Java reference documentation.

import com.google.cloud.vision.v1.AnnotateImageRequest;
import com.google.cloud.vision.v1.AnnotateImageResponse;
import com.google.cloud.vision.v1.BatchAnnotateImagesResponse;
import com.google.cloud.vision.v1.Feature;
import com.google.cloud.vision.v1.Feature.Type;
import com.google.cloud.vision.v1.Image;
import com.google.cloud.vision.v1.ImageAnnotatorClient;
import com.google.cloud.vision.v1.ImageSource;
import com.google.cloud.vision.v1.LocalizedObjectAnnotation;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/**
 * Detects localized objects in a remote image on Google Cloud Storage.
 *
 * @param gcsPath The path to the remote file on Google Cloud Storage to detect localized objects
 *     on.
 * @throws IOException on Input/Output errors.
 */
public static void detectLocalizedObjectsGcs(String gcsPath) throws IOException {
  List<AnnotateImageRequest> requests = new ArrayList<>();

  ImageSource imgSource = ImageSource.newBuilder().setGcsImageUri(gcsPath).build();
  Image img = Image.newBuilder().setSource(imgSource).build();

  AnnotateImageRequest request =
      AnnotateImageRequest.newBuilder()
          .addFeatures(Feature.newBuilder().setType(Type.OBJECT_LOCALIZATION))
          .setImage(img)
          .build();
  requests.add(request);

  // Initialize client that will be used to send requests. This client only needs to be created
  // once, and can be reused for multiple requests. The try-with-resources block closes the
  // client and safely cleans up any remaining background resources.
  try (ImageAnnotatorClient client = ImageAnnotatorClient.create()) {
    // Perform the request
    BatchAnnotateImagesResponse response = client.batchAnnotateImages(requests);
    List<AnnotateImageResponse> responses = response.getResponsesList();

    // Display the results
    for (AnnotateImageResponse res : responses) {
      for (LocalizedObjectAnnotation entity : res.getLocalizedObjectAnnotationsList()) {
        System.out.format("Object name: %s%n", entity.getName());
        System.out.format("Confidence: %s%n", entity.getScore());
        System.out.format("Normalized Vertices:%n");
        entity
            .getBoundingPoly()
            .getNormalizedVerticesList()
            .forEach(vertex -> System.out.format("- (%s, %s)%n", vertex.getX(), vertex.getY()));
      }
    }
  }
}

Node.js

Before trying this sample, follow the Node.js setup instructions in the Vision quickstart using client libraries. For more information, see the Vision Node.js API reference documentation.

To authenticate to Vision, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

// Imports the Google Cloud client libraries
const vision = require('@google-cloud/vision');

// Creates a client
const client = new vision.ImageAnnotatorClient();

async function localizeObjectsGcs() {
  /**
   * TODO(developer): Uncomment the following line before running the sample.
   */
  // const gcsUri = `gs://bucket/bucketImage.png`;

  const [result] = await client.objectLocalization(gcsUri);
  const objects = result.localizedObjectAnnotations;
  objects.forEach(object => {
    console.log(`Name: ${object.name}`);
    console.log(`Confidence: ${object.score}`);
    const vertices = object.boundingPoly.normalizedVertices;
    vertices.forEach(v => console.log(`x: ${v.x}, y: ${v.y}`));
  });
}

localizeObjectsGcs();

Python

Before trying this sample, follow the Python setup instructions in the Vision quickstart using client libraries. For more information, see the Vision Python API reference documentation.

To authenticate to Vision, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

def localize_objects_uri(uri):
    """Localize objects in the image on Google Cloud Storage.

    Args:
        uri: The path to the file in Google Cloud Storage (gs://...)
    """
    from google.cloud import vision

    client = vision.ImageAnnotatorClient()

    image = vision.Image()
    image.source.image_uri = uri

    objects = client.object_localization(image=image).localized_object_annotations

    print(f"Number of objects found: {len(objects)}")
    for object_ in objects:
        print(f"\n{object_.name} (confidence: {object_.score})")
        print("Normalized bounding polygon vertices: ")
        for vertex in object_.bounding_poly.normalized_vertices:
            print(f" - ({vertex.x}, {vertex.y})")

gcloud

To detect objects in an image, use the gcloud ml vision detect-objects command as shown in the following example:

gcloud ml vision detect-objects https://cloud.google.com/vision/docs/images/bicycle_example.png 
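The command also accepts a local file path or a Cloud Storage URI. For example (the bucket name here is hypothetical):

gcloud ml vision detect-objects gs://my-bucket/bicycle_example.png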

Additional languages

C#: Please follow the C# setup instructions on the client libraries page and then visit the Vision reference documentation for .NET.

PHP: Please follow the PHP setup instructions on the client libraries page and then visit the Vision reference documentation for PHP.

Ruby: Please follow the Ruby setup instructions on the client libraries page and then visit the Vision reference documentation for Ruby.

Try it

Try object detection and localization with the following tool. You can use the image already specified (https://cloud.google.com/vision/docs/images/bicycle_example.png) or specify your own image in its place. To send the request, select Execute.

Image without bounding boxes.
Image credit: Bogdan Dada on Unsplash.

Request body:

{
  "requests": [
    {
      "features": [
        {
          "maxResults": 10,
          "type": "OBJECT_LOCALIZATION"
        }
      ],
      "image": {
        "source": {
          "imageUri": "https://cloud.google.com/vision/docs/images/bicycle_example.png"
        }
      }
    }
  ]
}