Toolbox - Convert Document to hOCR

Convert Document output from Document AI to an hOCR XML string.

Code sample

Python

For more information, see the Document AI Python API reference documentation.

To authenticate to Document AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

from google.cloud.documentai_toolbox import document

# TODO(developer): Uncomment these variables before running the sample.
# Given a document.proto or sharded document.proto in path gs://bucket/path/to/folder
# document_path = "path/to/local/document.json"
# document_title = "your-document-title"


def convert_document_to_hocr_sample(document_path: str, document_title: str) -> str:
    wrapped_document = document.Document.from_document_path(document_path=document_path)

    # Converting wrapped_document to hOCR format
    hocr_string = wrapped_document.export_hocr_str(title=document_title)

    print("Document converted to hOCR!")
    return hocr_string
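
The string returned by the sample can be inspected like any other HTML/XML text. The following sketch is not part of the official sample. It uses only the Python standard library and a small hand-written hOCR snippet, which is an assumption standing in for real Toolbox output, to show how word text and bounding boxes might be read back. It relies on the hOCR convention that word elements carry the class `ocrx_word` and a `bbox x0 y0 x1 y1` property in their `title` attribute. The names `HocrWordParser`, `parse_bbox`, and `extract_words` are hypothetical helpers introduced for illustration.

```python
from html.parser import HTMLParser

# Hand-written stand-in for the string returned by export_hocr_str();
# real output will be larger and may differ in detail.
SAMPLE_HOCR = """<html><body>
<div class='ocr_page' title='bbox 0 0 100 50'>
<span class='ocr_line' title='bbox 10 10 90 20'>
<span class='ocrx_word' title='bbox 10 10 40 20'>Hello</span>
<span class='ocrx_word' title='bbox 50 10 90 20; x_wconf 95'>world</span>
</span>
</div>
</body></html>"""


def parse_bbox(title):
    """Return (x0, y0, x1, y1) from an hOCR title attribute, or None."""
    for prop in title.split(";"):
        parts = prop.split()
        if parts and parts[0] == "bbox":
            return tuple(int(p) for p in parts[1:5])
    return None


class HocrWordParser(HTMLParser):
    """Collects (text, bbox) pairs for elements with class 'ocrx_word'.

    Assumes word elements contain plain text only (no nested tags).
    """

    def __init__(self):
        super().__init__()
        self.words = []
        self._in_word = False
        self._bbox = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if "ocrx_word" in (attr.get("class") or "").split():
            self._in_word = True
            self._bbox = parse_bbox(attr.get("title") or "")
            self._text = []

    def handle_data(self, data):
        if self._in_word:
            self._text.append(data)

    def handle_endtag(self, tag):
        if self._in_word:
            self.words.append(("".join(self._text).strip(), self._bbox))
            self._in_word = False


def extract_words(hocr_string):
    """Parse an hOCR string and return a list of (text, bbox) tuples."""
    parser = HocrWordParser()
    parser.feed(hocr_string)
    parser.close()
    return parser.words


if __name__ == "__main__":
    for text, bbox in extract_words(SAMPLE_HOCR):
        print(text, bbox)
```

In practice, `extract_words` would be given the value returned by `convert_document_to_hocr_sample` instead of `SAMPLE_HOCR`. Whether real output matches the snippet's structure exactly should be checked against an actual export.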
 

What's next

To search and filter code samples for other Google Cloud products, see the Google Cloud sample browser.
