Run multiple BigQuery query jobs in parallel

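This workflow runs multiple BigQuery query jobs in parallel, one per Wikipedia pageviews table, and returns the top title from each table. Running the jobs in parallel demonstrates a performance improvement over running them serially, one after the other.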

Code sample

YAML

main:
  steps:
    - init:
        assign:
          - results: {} # result from each iteration keyed by table name
          - tables:
              - 201201h
              - 201202h
              - 201203h
              - 201204h
              - 201205h
    - runQueries:
        parallel:
          shared: [results]
          for:
            value: table
            in: ${tables}
            steps:
              - logTable:
                  call: sys.log
                  args:
                    text: ${"Running query for table " + table}
              - runQuery:
                  call: googleapis.bigquery.v2.jobs.query
                  args:
                    projectId: ${sys.get_env("GOOGLE_CLOUD_PROJECT_ID")}
                    body:
                      useLegacySql: false
                      useQueryCache: false
                      timeoutMs: 30000
                      # Find top 100 titles with most views on Wikipedia
                      query: ${
                        "SELECT TITLE, SUM(views)
                        FROM `bigquery-samples.wikipedia_pageviews." + table + "`
                        WHERE LENGTH(TITLE) > 10
                        GROUP BY TITLE
                        ORDER BY SUM(VIEWS) DESC
                        LIMIT 100"
                        }
                  result: queryResult
              - returnResult:
                  assign:
                    # Return the top title from each table
                    - results[table]: {}
                    - results[table].title: ${queryResult.rows[0].f[0].v}
                    - results[table].views: ${queryResult.rows[0].f[1].v}
    - returnResults:
        return: ${results}
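
For comparison, the serial baseline mentioned in the description might look like the sketch below. It is not part of the original sample. It assumes the only change needed is to drop the parallel wrapper and its shared declaration, so that the for loop runs one iteration at a time. Because a plain for loop runs in the scope of the enclosing workflow, results can be assigned to directly.

```yaml
# Serial variant (illustrative sketch, not from the original sample):
# the same loop body, but without `parallel`, so queries run one after another.
main:
  steps:
    - init:
        assign:
          - results: {} # result from each iteration keyed by table name
          - tables:
              - 201201h
              - 201202h
              - 201203h
              - 201204h
              - 201205h
    - runQueries:
        for:
          value: table
          in: ${tables}
          steps:
            - logTable:
                call: sys.log
                args:
                  text: ${"Running query for table " + table}
            - runQuery:
                call: googleapis.bigquery.v2.jobs.query
                args:
                  projectId: ${sys.get_env("GOOGLE_CLOUD_PROJECT_ID")}
                  body:
                    useLegacySql: false
                    useQueryCache: false
                    timeoutMs: 30000
                    # Same query as the parallel version
                    query: ${
                      "SELECT TITLE, SUM(views)
                      FROM `bigquery-samples.wikipedia_pageviews." + table + "`
                      WHERE LENGTH(TITLE) > 10
                      GROUP BY TITLE
                      ORDER BY SUM(VIEWS) DESC
                      LIMIT 100"
                      }
                result: queryResult
            - returnResult:
                assign:
                  # Keep only the top row (title and view count) for this table
                  - results[table]: {}
                  - results[table].title: ${queryResult.rows[0].f[0].v}
                  - results[table].views: ${queryResult.rows[0].f[1].v}
    - returnResults:
        return: ${results}
```

Running both variants against the same tables and comparing the execution durations is one way to observe the difference. Both variants set useQueryCache to false, so cached results should not skew that comparison.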
 

What's next

To search and filter code samples for other Google Cloud products, see the Google Cloud sample browser.
