Detect handwriting in images

Handwriting detection with Optical Character Recognition (OCR)

The Vision API can detect and extract text from images:

  • DOCUMENT_TEXT_DETECTION extracts text from an image (or file); the response is optimized for dense text and documents. The JSON includes page, block, paragraph, word, and break information.

    Screenshot simulating how an OCR system might identify and extract text, highlighting headings, paragraphs, and icons.

    One specific use of DOCUMENT_TEXT_DETECTION is to detect handwriting in an image.

    Lined paper with Google Cloud Platform written in cursive.

Try it for yourself

If you're new to Google Cloud, create an account to evaluate how Cloud Vision API performs in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.

Try Cloud Vision API free

Document text detection requests

Set up your Google Cloud project and authentication
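
The samples on this page assume that you have a Google Cloud project with the Vision API enabled and that authentication is configured. A minimal sketch using the gcloud CLI (PROJECT_ID is a placeholder for your project ID):

# Authenticate and select your project.
gcloud auth login
gcloud config set project PROJECT_ID

# Enable the Vision API.
gcloud services enable vision.googleapis.com

# Create Application Default Credentials for the client library samples.
gcloud auth application-default login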

Detect document text in a local image

You can use the Vision API to perform feature detection on a local image file.

For REST requests, send the contents of the image file as a base64 encoded string in the body of your request.
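
For example, you might produce the base64 string on the command line before pasting it into the request body. A minimal sketch (image.png is a placeholder file name):

# Linux (GNU coreutils): write the image as a single-line base64 string.
base64 -w 0 image.png > image_base64.txt

# macOS equivalent:
# base64 -i image.png -o image_base64.txt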

For gcloud and client library requests, specify the path to a local image in your request.

REST

Before using any of the request data, make the following replacements:

  • BASE64_ENCODED_IMAGE: The base64 representation (ASCII string) of your binary image data. This string should look similar to the following string:
    • /9j/4QAYRXhpZgAA...9tAVx/zDQDlGxn//2Q==
    Visit the base64 encode topic for more information.
  • PROJECT_ID: Your Google Cloud project ID.

HTTP method and URL:

POST https://vision.googleapis.com/v1/images:annotate

Request JSON body:

{
  "requests": [
    {
      "image": {
        "content": "BASE64_ENCODED_IMAGE"
      },
      "features": [
        {
          "type": "DOCUMENT_TEXT_DETECTION"
        }
      ]
    }
  ]
}

To send your request, choose one of these options:

curl

Save the request body in a file named request.json, and execute the following command:

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "x-goog-user-project: PROJECT_ID" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://vision.googleapis.com/v1/images:annotate"

PowerShell

Save the request body in a file named request.json, and execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred"; "x-goog-user-project" = "PROJECT_ID" }

Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://vision.googleapis.com/v1/images:annotate" | Select-Object -Expand Content

If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format.

Go

Before trying this sample, follow the Go setup instructions in the Vision quickstart using client libraries. For more information, see the Vision Go API reference documentation.

To authenticate to Vision, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

import (
	"context"
	"fmt"
	"io"
	"os"
	"strings"

	vision "cloud.google.com/go/vision/apiv1"
)

// detectDocumentText gets the full document text from the Vision API for an image at the given file path.
func detectDocumentText(w io.Writer, file string) error {
	ctx := context.Background()

	client, err := vision.NewImageAnnotatorClient(ctx)
	if err != nil {
		return err
	}

	f, err := os.Open(file)
	if err != nil {
		return err
	}
	defer f.Close()

	image, err := vision.NewImageFromReader(f)
	if err != nil {
		return err
	}
	annotation, err := client.DetectDocumentText(ctx, image, nil)
	if err != nil {
		return err
	}

	if annotation == nil {
		fmt.Fprintln(w, "No text found.")
	} else {
		fmt.Fprintln(w, "Document Text:")
		fmt.Fprintf(w, "%q\n", annotation.Text)

		fmt.Fprintln(w, "Pages:")
		for _, page := range annotation.Pages {
			fmt.Fprintf(w, "\tConfidence: %f, Width: %d, Height: %d\n", page.Confidence, page.Width, page.Height)
			fmt.Fprintln(w, "\tBlocks:")
			for _, block := range page.Blocks {
				fmt.Fprintf(w, "\t\tConfidence: %f, Block type: %v\n", block.Confidence, block.BlockType)
				fmt.Fprintln(w, "\t\tParagraphs:")
				for _, paragraph := range block.Paragraphs {
					fmt.Fprintf(w, "\t\t\tConfidence: %f", paragraph.Confidence)
					fmt.Fprintln(w, "\t\t\tWords:")
					for _, word := range paragraph.Words {
						symbols := make([]string, len(word.Symbols))
						for i, s := range word.Symbols {
							symbols[i] = s.Text
						}
						wordText := strings.Join(symbols, "")
						fmt.Fprintf(w, "\t\t\t\tConfidence: %f, Symbols: %s\n", word.Confidence, wordText)
					}
				}
			}
		}
	}

	return nil
}

Java

Before trying this sample, follow the Java setup instructions in the Vision API Quickstart Using Client Libraries. For more information, see the Vision API Java reference documentation.

public static void detectDocumentText(String filePath) throws IOException {
  List<AnnotateImageRequest> requests = new ArrayList<>();

  ByteString imgBytes = ByteString.readFrom(new FileInputStream(filePath));

  Image img = Image.newBuilder().setContent(imgBytes).build();
  Feature feat = Feature.newBuilder().setType(Type.DOCUMENT_TEXT_DETECTION).build();
  AnnotateImageRequest request =
      AnnotateImageRequest.newBuilder().addFeatures(feat).setImage(img).build();
  requests.add(request);

  // Initialize client that will be used to send requests. This client only needs to be created
  // once, and can be reused for multiple requests. After completing all of your requests, call
  // the "close" method on the client to safely clean up any remaining background resources.
  try (ImageAnnotatorClient client = ImageAnnotatorClient.create()) {
    BatchAnnotateImagesResponse response = client.batchAnnotateImages(requests);
    List<AnnotateImageResponse> responses = response.getResponsesList();
    client.close();

    for (AnnotateImageResponse res : responses) {
      if (res.hasError()) {
        System.out.format("Error: %s%n", res.getError().getMessage());
        return;
      }

      // For full list of available annotations, see http://g.co/cloud/vision/docs
      TextAnnotation annotation = res.getFullTextAnnotation();
      for (Page page : annotation.getPagesList()) {
        String pageText = "";
        for (Block block : page.getBlocksList()) {
          String blockText = "";
          for (Paragraph para : block.getParagraphsList()) {
            String paraText = "";
            for (Word word : para.getWordsList()) {
              String wordText = "";
              for (Symbol symbol : word.getSymbolsList()) {
                wordText = wordText + symbol.getText();
                System.out.format(
                    "Symbol text: %s (confidence: %f)%n", symbol.getText(), symbol.getConfidence());
              }
              System.out.format(
                  "Word text: %s (confidence: %f)%n%n", wordText, word.getConfidence());
              paraText = String.format("%s %s", paraText, wordText);
            }
            // Output Example using Paragraph:
            System.out.println("%nParagraph: %n" + paraText);
            System.out.format("Paragraph Confidence: %f%n", para.getConfidence());
            blockText = blockText + paraText;
          }
          pageText = pageText + blockText;
        }
      }
      System.out.println("%nComplete annotation:");
      System.out.println(annotation.getText());
    }
  }
}
 

Node.js

Before trying this sample, follow the Node.js setup instructions in the Vision quickstart using client libraries. For more information, see the Vision Node.js API reference documentation.

To authenticate to Vision, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

// Imports the Google Cloud client library
const vision = require('@google-cloud/vision');

// Creates a client
const client = new vision.ImageAnnotatorClient();

/**
 * TODO(developer): Uncomment the following line before running the sample.
 */
// const fileName = 'Local image file, e.g. /path/to/image.png';

// Read a local image as a text document
const [result] = await client.documentTextDetection(fileName);
const fullTextAnnotation = result.fullTextAnnotation;
console.log(`Full text: ${fullTextAnnotation.text}`);
fullTextAnnotation.pages.forEach(page => {
  page.blocks.forEach(block => {
    console.log(`Block confidence: ${block.confidence}`);
    block.paragraphs.forEach(paragraph => {
      console.log(`Paragraph confidence: ${paragraph.confidence}`);
      paragraph.words.forEach(word => {
        const wordText = word.symbols.map(s => s.text).join('');
        console.log(`Word text: ${wordText}`);
        console.log(`Word confidence: ${word.confidence}`);
        word.symbols.forEach(symbol => {
          console.log(`Symbol text: ${symbol.text}`);
          console.log(`Symbol confidence: ${symbol.confidence}`);
        });
      });
    });
  });
});

Python

Before trying this sample, follow the Python setup instructions in the Vision quickstart using client libraries. For more information, see the Vision Python API reference documentation.

To authenticate to Vision, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

def detect_document(path):
    """Detects document features in an image."""
    from google.cloud import vision

    client = vision.ImageAnnotatorClient()

    with open(path, "rb") as image_file:
        content = image_file.read()

    image = vision.Image(content=content)

    response = client.document_text_detection(image=image)

    for page in response.full_text_annotation.pages:
        for block in page.blocks:
            print(f"\nBlock confidence: {block.confidence}\n")

            for paragraph in block.paragraphs:
                print("Paragraph confidence: {}".format(paragraph.confidence))

                for word in paragraph.words:
                    word_text = "".join([symbol.text for symbol in word.symbols])
                    print(
                        "Word text: {} (confidence: {})".format(word_text, word.confidence)
                    )

                    for symbol in word.symbols:
                        print(
                            "\tSymbol: {} (confidence: {})".format(
                                symbol.text, symbol.confidence
                            )
                        )

    if response.error.message:
        raise Exception(
            "{}\nFor more info on error messages, check: "
            "https://cloud.google.com/apis/design/errors".format(response.error.message)
        )

Additional languages

C#: Please follow the C# setup instructions on the client libraries page and then visit the Vision reference documentation for .NET.

PHP: Please follow the PHP setup instructions on the client libraries page and then visit the Vision reference documentation for PHP.

Ruby: Please follow the Ruby setup instructions on the client libraries page and then visit the Vision reference documentation for Ruby.

Detect document text in a remote image

You can use the Vision API to perform feature detection on a remote image file that is located in Cloud Storage or on the Web. To send a remote file request, specify the file's Web URL or Cloud Storage URI in the request body.

REST

Before using any of the request data, make the following replacements:

  • CLOUD_STORAGE_IMAGE_URI: the path to a valid image file in a Cloud Storage bucket. You must have at least read privileges for the file. Example:
    • gs://cloud-samples-data/vision/handwriting_image.png
  • PROJECT_ID: Your Google Cloud project ID.

HTTP method and URL:

POST https://vision.googleapis.com/v1/images:annotate

Request JSON body:

{
  "requests": [
    {
      "image": {
        "source": {
          "imageUri": "CLOUD_STORAGE_IMAGE_URI"
        }
      },
      "features": [
        {
          "type": "DOCUMENT_TEXT_DETECTION"
        }
      ]
    }
  ]
}

To send your request, choose one of these options:

curl

Save the request body in a file named request.json, and execute the following command:

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "x-goog-user-project: PROJECT_ID" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://vision.googleapis.com/v1/images:annotate"

PowerShell

Save the request body in a file named request.json, and execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred"; "x-goog-user-project" = "PROJECT_ID" }

Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://vision.googleapis.com/v1/images:annotate" | Select-Object -Expand Content

If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format.

Go

Before trying this sample, follow the Go setup instructions in the Vision quickstart using client libraries. For more information, see the Vision Go API reference documentation.

To authenticate to Vision, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

import (
	"context"
	"fmt"
	"io"
	"strings"

	vision "cloud.google.com/go/vision/apiv1"
)

// detectDocumentTextURI gets the full document text from the Vision API for an image at the given URI.
func detectDocumentTextURI(w io.Writer, file string) error {
	ctx := context.Background()

	client, err := vision.NewImageAnnotatorClient(ctx)
	if err != nil {
		return err
	}

	image := vision.NewImageFromURI(file)
	annotation, err := client.DetectDocumentText(ctx, image, nil)
	if err != nil {
		return err
	}

	if annotation == nil {
		fmt.Fprintln(w, "No text found.")
	} else {
		fmt.Fprintln(w, "Document Text:")
		fmt.Fprintf(w, "%q\n", annotation.Text)

		fmt.Fprintln(w, "Pages:")
		for _, page := range annotation.Pages {
			fmt.Fprintf(w, "\tConfidence: %f, Width: %d, Height: %d\n", page.Confidence, page.Width, page.Height)
			fmt.Fprintln(w, "\tBlocks:")
			for _, block := range page.Blocks {
				fmt.Fprintf(w, "\t\tConfidence: %f, Block type: %v\n", block.Confidence, block.BlockType)
				fmt.Fprintln(w, "\t\tParagraphs:")
				for _, paragraph := range block.Paragraphs {
					fmt.Fprintf(w, "\t\t\tConfidence: %f", paragraph.Confidence)
					fmt.Fprintln(w, "\t\t\tWords:")
					for _, word := range paragraph.Words {
						symbols := make([]string, len(word.Symbols))
						for i, s := range word.Symbols {
							symbols[i] = s.Text
						}
						wordText := strings.Join(symbols, "")
						fmt.Fprintf(w, "\t\t\t\tConfidence: %f, Symbols: %s\n", word.Confidence, wordText)
					}
				}
			}
		}
	}

	return nil
}

Java

Before trying this sample, follow the Java setup instructions in the Vision API Quickstart Using Client Libraries. For more information, see the Vision API Java reference documentation.

public static void detectDocumentTextGcs(String gcsPath) throws IOException {
  List<AnnotateImageRequest> requests = new ArrayList<>();

  ImageSource imgSource = ImageSource.newBuilder().setGcsImageUri(gcsPath).build();
  Image img = Image.newBuilder().setSource(imgSource).build();
  Feature feat = Feature.newBuilder().setType(Type.DOCUMENT_TEXT_DETECTION).build();
  AnnotateImageRequest request =
      AnnotateImageRequest.newBuilder().addFeatures(feat).setImage(img).build();
  requests.add(request);

  // Initialize client that will be used to send requests. This client only needs to be created
  // once, and can be reused for multiple requests. After completing all of your requests, call
  // the "close" method on the client to safely clean up any remaining background resources.
  try (ImageAnnotatorClient client = ImageAnnotatorClient.create()) {
    BatchAnnotateImagesResponse response = client.batchAnnotateImages(requests);
    List<AnnotateImageResponse> responses = response.getResponsesList();
    client.close();

    for (AnnotateImageResponse res : responses) {
      if (res.hasError()) {
        System.out.format("Error: %s%n", res.getError().getMessage());
        return;
      }

      // For full list of available annotations, see http://g.co/cloud/vision/docs
      TextAnnotation annotation = res.getFullTextAnnotation();
      for (Page page : annotation.getPagesList()) {
        String pageText = "";
        for (Block block : page.getBlocksList()) {
          String blockText = "";
          for (Paragraph para : block.getParagraphsList()) {
            String paraText = "";
            for (Word word : para.getWordsList()) {
              String wordText = "";
              for (Symbol symbol : word.getSymbolsList()) {
                wordText = wordText + symbol.getText();
                System.out.format(
                    "Symbol text: %s (confidence: %f)%n", symbol.getText(), symbol.getConfidence());
              }
              System.out.format(
                  "Word text: %s (confidence: %f)%n%n", wordText, word.getConfidence());
              paraText = String.format("%s %s", paraText, wordText);
            }
            // Output Example using Paragraph:
            System.out.println("%nParagraph: %n" + paraText);
            System.out.format("Paragraph Confidence: %f%n", para.getConfidence());
            blockText = blockText + paraText;
          }
          pageText = pageText + blockText;
        }
      }
      System.out.println("%nComplete annotation:");
      System.out.println(annotation.getText());
    }
  }
}

Node.js

Before trying this sample, follow the Node.js setup instructions in the Vision quickstart using client libraries. For more information, see the Vision Node.js API reference documentation.

To authenticate to Vision, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

// Imports the Google Cloud client libraries
const vision = require('@google-cloud/vision');

// Creates a client
const client = new vision.ImageAnnotatorClient();

/**
 * TODO(developer): Uncomment the following lines before running the sample.
 */
// const bucketName = 'Bucket where the file resides, e.g. my-bucket';
// const fileName = 'Path to file within bucket, e.g. path/to/image.png';

// Read a remote image as a text document
const [result] = await client.documentTextDetection(
  `gs://${bucketName}/${fileName}`
);
const fullTextAnnotation = result.fullTextAnnotation;
console.log(fullTextAnnotation.text);

Python

Before trying this sample, follow the Python setup instructions in the Vision quickstart using client libraries. For more information, see the Vision Python API reference documentation.

To authenticate to Vision, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

def detect_document_uri(uri):
    """Detects document features in the file located in Google Cloud
    Storage."""
    from google.cloud import vision

    client = vision.ImageAnnotatorClient()
    image = vision.Image()
    image.source.image_uri = uri

    response = client.document_text_detection(image=image)

    for page in response.full_text_annotation.pages:
        for block in page.blocks:
            print(f"\nBlock confidence: {block.confidence}\n")

            for paragraph in block.paragraphs:
                print("Paragraph confidence: {}".format(paragraph.confidence))

                for word in paragraph.words:
                    word_text = "".join([symbol.text for symbol in word.symbols])
                    print(
                        "Word text: {} (confidence: {})".format(word_text, word.confidence)
                    )

                    for symbol in word.symbols:
                        print(
                            "\tSymbol: {} (confidence: {})".format(
                                symbol.text, symbol.confidence
                            )
                        )

    if response.error.message:
        raise Exception(
            "{}\nFor more info on error messages, check: "
            "https://cloud.google.com/apis/design/errors".format(response.error.message)
        )

gcloud

To perform handwriting detection, use the gcloud ml vision detect-document command as shown in the following example:

gcloud ml vision detect-document gs://cloud-samples-data/vision/handwriting_image.png 

Additional languages

C#: Please follow the C# setup instructions on the client libraries page and then visit the Vision reference documentation for .NET.

PHP: Please follow the PHP setup instructions on the client libraries page and then visit the Vision reference documentation for PHP.

Ruby: Please follow the Ruby setup instructions on the client libraries page and then visit the Vision reference documentation for Ruby.

Specify the language (optional)

Both types of OCR requests support one or more languageHints that specify the language of any text in the image. However, an empty value usually yields the best results, because omitting a value enables automatic language detection. For languages based on the Latin alphabet, setting languageHints is not needed. In rare cases, when the language of the text in the image is known, setting a hint helps get better results (although it can be a significant hindrance if the hint is wrong). Text detection returns an error if one or more of the specified languages is not one of the supported languages.

If you choose to provide a language hint, modify the body of your request (the request.json file) to specify one of the supported language strings in the imageContext.languageHints field, as shown in the following sample:

{
  "requests": [
    {
      "image": {
        "source": {
          "imageUri": "IMAGE_URL"
        }
      },
      "features": [
        {
          "type": "DOCUMENT_TEXT_DETECTION"
        }
      ],
      "imageContext": {
        "languageHints": ["en-t-i0-handwrit"]
      }
    }
  ]
}
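
If you use the gcloud CLI instead of REST, the same hint can be passed with the --language-hints flag of gcloud ml vision detect-document; a sketch:

gcloud ml vision detect-document gs://cloud-samples-data/vision/handwriting_image.png --language-hints=en-t-i0-handwrit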

Multi-regional support

You can now specify continent-level data storage and OCR processing. The following regions are currently supported:

  • us: USA country only
  • eu: The European Union

Locations

Cloud Vision offers you some control over where the resources for your project are stored and processed. In particular, you can configure Cloud Vision to store and process your data only in the European Union.

By default, Cloud Vision stores and processes resources in a Global location, which means that Cloud Vision doesn't guarantee that your resources will remain within a particular location or region. If you choose the European Union location, Google will store your data and process it only in the European Union. You and your users can access the data from any location.

Setting the location using the API

The Vision API supports a global API endpoint (vision.googleapis.com) as well as two region-based endpoints: a European Union endpoint (eu-vision.googleapis.com) and a United States endpoint (us-vision.googleapis.com). Use these endpoints for region-specific processing. For example, to store and process your data in the European Union only, use the URI eu-vision.googleapis.com in place of vision.googleapis.com for your REST API calls:

  • https://eu-vision.googleapis.com/v1/projects/PROJECT_ID/locations/eu/images:annotate
  • https://eu-vision.googleapis.com/v1/projects/PROJECT_ID/locations/eu/images:asyncBatchAnnotate
  • https://eu-vision.googleapis.com/v1/projects/PROJECT_ID/locations/eu/files:annotate
  • https://eu-vision.googleapis.com/v1/projects/PROJECT_ID/locations/eu/files:asyncBatchAnnotate

To store and process your data in the United States only, use the US endpoint (us-vision.googleapis.com) with the preceding methods.

Setting the location using the client libraries

The Vision API client libraries access the global API endpoint (vision.googleapis.com) by default. To store and process your data in the European Union only, you need to explicitly set the endpoint (eu-vision.googleapis.com). The following code samples show how to configure this setting.

REST

Before using any of the request data, make the following replacements:

  • REGION_ID: One of the valid regional location identifiers:
    • us: USA country only
    • eu: The European Union
  • CLOUD_STORAGE_IMAGE_URI: the path to a valid image file in a Cloud Storage bucket. You must have at least read privileges for the file. Example:
    • gs://cloud-samples-data/vision/handwriting_image.png
  • PROJECT_ID: Your Google Cloud project ID.

HTTP method and URL:

POST https://REGION_ID-vision.googleapis.com/v1/projects/PROJECT_ID/locations/REGION_ID/images:annotate

Request JSON body:

{
  "requests": [
    {
      "image": {
        "source": {
          "imageUri": "CLOUD_STORAGE_IMAGE_URI"
        }
      },
      "features": [
        {
          "type": "DOCUMENT_TEXT_DETECTION"
        }
      ]
    }
  ]
}

To send your request, choose one of these options:

curl

Save the request body in a file named request.json, and execute the following command:

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "x-goog-user-project: PROJECT_ID" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https:// REGION_ID -vision.googleapis.com/v1/projects/ PROJECT_ID /locations/ REGION_ID /images:annotate"

PowerShell

Save the request body in a file named request.json, and execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred"; "x-goog-user-project" = "PROJECT_ID" }

Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https:// REGION_ID -vision.googleapis.com/v1/projects/ PROJECT_ID /locations/ REGION_ID /images:annotate" | Select-Object -Expand Content

If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format.

Go

Before trying this sample, follow the Go setup instructions in the Vision quickstart using client libraries. For more information, see the Vision Go API reference documentation.

To authenticate to Vision, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

import (
	"context"
	"fmt"

	vision "cloud.google.com/go/vision/apiv1"
	"google.golang.org/api/option"
)

// setEndpoint changes your endpoint.
func setEndpoint(endpoint string) error {
	// endpoint := "eu-vision.googleapis.com:443"
	ctx := context.Background()

	client, err := vision.NewImageAnnotatorClient(ctx, option.WithEndpoint(endpoint))
	if err != nil {
		return fmt.Errorf("NewImageAnnotatorClient: %w", err)
	}
	defer client.Close()

	return nil
}

Java

Before trying this sample, follow the Java setup instructions in the Vision API Quickstart Using Client Libraries. For more information, see the Vision API Java reference documentation.

ImageAnnotatorSettings settings =
    ImageAnnotatorSettings.newBuilder().setEndpoint("eu-vision.googleapis.com:443").build();

// Initialize client that will be used to send requests. This client only needs to be created
// once, and can be reused for multiple requests. After completing all of your requests, call
// the "close" method on the client to safely clean up any remaining background resources.
ImageAnnotatorClient client = ImageAnnotatorClient.create(settings);

Node.js

Before trying this sample, follow the Node.js setup instructions in the Vision quickstart using client libraries. For more information, see the Vision Node.js API reference documentation.

To authenticate to Vision, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

// Imports the Google Cloud client library
const vision = require('@google-cloud/vision');

async function setEndpoint() {
  // Specifies the location of the api endpoint
  const clientOptions = {apiEndpoint: 'eu-vision.googleapis.com'};

  // Creates a client
  const client = new vision.ImageAnnotatorClient(clientOptions);

  // Performs text detection on the image file
  const [result] = await client.textDetection('./resources/wakeupcat.jpg');
  const labels = result.textAnnotations;
  console.log('Text:');
  labels.forEach(label => console.log(label.description));
}
setEndpoint();

Python

Before trying this sample, follow the Python setup instructions in the Vision quickstart using client libraries. For more information, see the Vision Python API reference documentation.

To authenticate to Vision, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

from google.cloud import vision

client_options = {"api_endpoint": "eu-vision.googleapis.com"}

client = vision.ImageAnnotatorClient(client_options=client_options)

Try it

Try text detection and document text detection in the following tool. You can use the image already specified (gs://cloud-samples-data/vision/handwriting_image.png) by clicking Execute, or you can specify your own image in its place.

Lined paper with Google Cloud Platform written in cursive.

Request body:

{
  "requests": [
    {
      "features": [
        {
          "type": "DOCUMENT_TEXT_DETECTION"
        }
      ],
      "image": {
        "source": {
          "imageUri": "gs://cloud-samples-data/vision/handwriting_image.png"
        }
      }
    }
  ]
}
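
If you're working outside the interactive tool, you can send the same request body yourself. A minimal curl sketch, assuming the body above is saved as request.json and the gcloud CLI is authenticated:

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://vision.googleapis.com/v1/images:annotate"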