Toolbox - Create document batches

Create batches of documents for processing with batch_process_documents().

Code sample

Python

For more information, see the Document AI Python API reference documentation.

To authenticate to Document AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

from google.cloud import documentai
from google.cloud.documentai_toolbox import gcs_utilities

# TODO(developer): Uncomment these variables before running the sample.
# Given unprocessed documents in path gs://bucket/path/to/folder
# gcs_bucket_name = "bucket"
# gcs_prefix = "path/to/folder"
# batch_size = 50


def create_batches_sample(
    gcs_bucket_name: str,
    gcs_prefix: str,
    batch_size: int = 50,
) -> None:
    # Creating batches of documents for processing
    batches = gcs_utilities.create_batches(
        gcs_bucket_name=gcs_bucket_name,
        gcs_prefix=gcs_prefix,
        batch_size=batch_size,
    )

    print(f"{len(batches)} batch(es) created.")
    for batch in batches:
        print(f"{len(batch.gcs_documents.documents)} files in batch.")
        print(batch.gcs_documents.documents)

        # Use as input for batch_process_documents()
        # Refer to https://cloud.google.com/document-ai/docs/send-request
        # for how to send a batch processing request
        request = documentai.BatchProcessRequest(
            name="processor_name", input_documents=batch
        )
        print(request)
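
To make the effect of batch_size concrete, the following is a minimal standalone sketch of the grouping idea, not the Toolbox's implementation. The helper chunk_uris and the doc-N.pdf file names are hypothetical and exist only for illustration; the real create_batches() lists objects from Cloud Storage and returns batch objects rather than plain lists.

```python
from typing import List


def chunk_uris(uris: List[str], batch_size: int = 50) -> List[List[str]]:
    """Split a list of URIs into consecutive groups of at most batch_size.

    Hypothetical helper for illustration only; it is not part of the
    Document AI Toolbox.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [uris[i : i + batch_size] for i in range(0, len(uris), batch_size)]


# Assumed listing: 120 objects under gs://bucket/path/to/folder
uris = [f"gs://bucket/path/to/folder/doc-{n}.pdf" for n in range(120)]
batches = chunk_uris(uris, batch_size=50)

print(f"{len(batches)} batch(es) created.")
print([len(batch) for batch in batches])  # [50, 50, 20]
```

With 120 files and a batch size of 50, the last batch holds the 20 files left over, so each batch can be sent as its own batch processing request.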
 

What's next

To search and filter code samples for other Google Cloud products, see the Google Cloud sample browser.
