Estimate and control costs

This page describes how to estimate costs and lists best practices for controlling costs in BigQuery. BigQuery offers two pricing models: on-demand and capacity-based. For information about pricing, see BigQuery pricing.

With BigQuery, you can estimate the cost of running a query, calculate the bytes processed by various queries, and get a monthly cost estimate based on your projected usage. To control costs, also follow the best practices for optimizing query computation and BigQuery storage. For cost-specific best practices, see Control query costs.

To monitor query costs and BigQuery usage, analyze BigQuery audit logs .

Estimate query costs

BigQuery provides various methods to estimate cost:

On-demand query size calculation

To calculate the number of bytes processed by the various query types under the on-demand billing model, see the following sections:

Query columnar formats on Cloud Storage

If your external data is stored in ORC or Parquet format, the number of bytes charged is limited to the columns that BigQuery reads. Because the query converts the data types from the external data source to BigQuery data types, the number of bytes read is computed based on the size of the BigQuery data types. For information about data type conversions, see the following pages:
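As an illustration of how bytes read depend on BigQuery data type sizes, the following sketch estimates a scan. The per-type sizes follow BigQuery's documented data size rules but should be treated as assumptions to verify against the current documentation, and the helper function itself is hypothetical:

```python
# Rough estimate of bytes processed for a column-projected scan.
# Per-type sizes follow BigQuery's data size calculation rules
# (assumed values; verify against the current documentation).
TYPE_SIZE_BYTES = {
    "INT64": 8,
    "FLOAT64": 8,
    "NUMERIC": 16,
    "BOOL": 1,
    "DATE": 8,
    "TIMESTAMP": 8,
}

def estimate_scanned_bytes(columns, row_count, avg_string_lengths=None):
    """Estimate bytes read for the given (name, type) columns.

    STRING columns are charged 2 bytes plus the UTF-8 encoded length,
    so they need an average length estimate per column.
    """
    avg_string_lengths = avg_string_lengths or {}
    total = 0
    for name, col_type in columns:
        if col_type == "STRING":
            per_row = 2 + avg_string_lengths.get(name, 0)
        else:
            per_row = TYPE_SIZE_BYTES[col_type]
        total += per_row * row_count
    return total

# Example: 1M rows, one INT64 column and one STRING column averaging 10 chars.
estimate = estimate_scanned_bytes(
    [("id", "INT64"), ("name", "STRING")], 1_000_000,
    avg_string_lengths={"name": 10},
)
print(estimate)  # → 20000000
```

Because only the columns the query reads are charged, dropping a column from the SELECT list removes its contribution entirely.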

Use the Google Cloud Pricing Calculator

The Google Cloud Pricing Calculator can help you create an overall monthly cost estimate for BigQuery based on projected usage.

On-demand

To estimate costs in the Google Cloud Pricing Calculator when using the on-demand pricing model, follow these steps:

  1. Open the Google Cloud Pricing Calculator .
  2. Click BigQuery.
  3. Click the On-Demand tab.
  4. For Storage Pricing, enter the estimated size of the table in the storage fields. You only need to estimate either physical storage or logical storage, depending on the dataset storage billing model .
  5. For Query Pricing, enter the estimated bytes read from your dry run or the query validator.
  6. Click Add To Estimate.
  7. The estimate appears to the right. Notice that you can save or email the estimate.

For more information, see on-demand pricing .

Editions

To estimate costs in the Google Cloud Pricing Calculator when using the capacity-based pricing model with BigQuery editions , follow these steps:

  1. Open the Google Cloud Pricing Calculator .
  2. Click BigQuery.
  3. Click the Editions tab.
  4. Choose the location where the slots are used.
  5. Choose your Edition.
  6. Choose the Maximum slots, Baseline slots, optional Commitment, and Estimated utilization of autoscaling.
  7. Choose the location where the data is stored.
  8. Enter your estimates of storage usage for Active storage, Long-term storage, Streaming inserts, and Streaming reads. You only need to estimate either physical storage or logical storage, depending on the dataset storage billing model.
  9. Click Add to Estimate.

For more information, see capacity-based pricing .
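The calculator's compute inputs combine as straightforward slot-hour arithmetic. A minimal sketch, assuming a hypothetical $0.06 per slot-hour rate (actual rates vary by edition and region) and a 730-hour month:

```python
# Back-of-the-envelope monthly compute cost for BigQuery editions.
# The per-slot-hour rate is an illustrative assumption; look up the
# actual rate for your edition and region.
PRICE_PER_SLOT_HOUR = 0.06  # assumed rate

def monthly_compute_cost(baseline_slots, autoscale_max_slots,
                         autoscale_utilization, hours_per_month=730):
    """Baseline slots bill continuously; autoscaled slots bill for the
    fraction of time (estimated utilization) they are in use."""
    baseline_hours = baseline_slots * hours_per_month
    autoscale_hours = ((autoscale_max_slots - baseline_slots)
                       * autoscale_utilization * hours_per_month)
    return (baseline_hours + autoscale_hours) * PRICE_PER_SLOT_HOUR

# 100 baseline slots, autoscaling up to 300 slots at 25% utilization.
print(round(monthly_compute_cost(100, 300, 0.25), 2))  # → 6570.0
```

This mirrors the calculator's Baseline slots, Maximum slots, and Estimated utilization of autoscaling fields; any slot commitment discount is not modeled here.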

Control query costs

To optimize query costs, ensure that you have optimized storage and query computation. For additional methods to control query costs, see the following sections:

Check query costs before running queries

Best practice: Before running queries, preview them to estimate costs.

Queries are billed according to the number of bytes read. To estimate costs before you run a query, use one of the following methods:

Use the query validator

When you enter a query in the Google Cloud console, the query validator verifies the query syntax and provides an estimate of the number of bytes read. You can use this estimate to calculate query cost in the pricing calculator.

  • If your query is not valid, then the query validator displays an error message. For example:

    Not found: Table myProject:myDataset.myTable was not found in location US

  • If your query is valid, then the query validator provides an estimate of the number of bytes required to process the query. For example:

    This query will process 623.1 KiB when run.
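A byte estimate like this translates into on-demand cost with simple arithmetic. A minimal sketch, assuming an illustrative $6.25 per TiB on-demand rate and a simplified 10 MB per-query minimum (BigQuery actually applies the minimum per table referenced, and the first 1 TiB per month is free; check current pricing for your region):

```python
# Convert a dry-run or validator byte estimate into an approximate
# on-demand cost. The $6.25/TiB rate is an illustrative assumption.
PRICE_PER_TIB = 6.25
TIB = 2**40

def estimate_query_cost(bytes_processed):
    """On-demand queries bill by bytes read, with a 10 MB minimum
    (simplified here to a per-query minimum)."""
    billed = max(bytes_processed, 10 * 1000 * 1000)
    return billed / TIB * PRICE_PER_TIB

# The 623.1 KiB estimate from the validator example above:
print(f"${estimate_query_cost(623.1 * 1024):.6f}")  # prints $0.000057
```

Small queries like this one round up to the minimum, so the cost is dominated by the 10 MB floor rather than the actual bytes scanned.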

Perform a dry run

To perform a dry run, do the following:

Console

  1. Go to the BigQuery page.

    Go to BigQuery

  2. Enter your query in the query editor.

    If the query is valid, then a check mark automatically appears along with the amount of data that the query will process. If the query is invalid, then an exclamation point appears along with an error message.

bq

Enter a query like the following using the --dry_run flag:

```sh
bq query \
--use_legacy_sql=false \
--dry_run \
'SELECT
   COUNTRY,
   AIRPORT,
   IATA
 FROM
   `project_id`.dataset.airports
 LIMIT
   1000'
```

For a valid query, the command produces the following response:

```
Query successfully validated. Assuming the tables are not modified,
running this query will process 10918 bytes of data.
```

API

To perform a dry run by using the API, submit a query job with dryRun set to true in the JobConfiguration type.
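For example, a jobs.insert request body with a dry run enabled might look like the following minimal sketch (field names follow the JobConfiguration and JobConfigurationQuery REST types):

```json
{
  "configuration": {
    "dryRun": true,
    "query": {
      "query": "SELECT name FROM `bigquery-public-data.usa_names.usa_1910_2013` LIMIT 100",
      "useLegacySql": false
    }
  }
}
```

When dryRun is true, BigQuery validates the query and returns statistics such as totalBytesProcessed without running the job.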

Go

Before trying this sample, follow the Go setup instructions in the BigQuery quickstart using client libraries . For more information, see the BigQuery Go API reference documentation .

To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for client libraries .

```go
import (
	"context"
	"fmt"
	"io"

	"cloud.google.com/go/bigquery"
)

// queryDryRun demonstrates issuing a dry run query to validate query structure and
// provide an estimate of the bytes scanned.
func queryDryRun(w io.Writer, projectID string) error {
	// projectID := "my-project-id"
	ctx := context.Background()
	client, err := bigquery.NewClient(ctx, projectID)
	if err != nil {
		return fmt.Errorf("bigquery.NewClient: %v", err)
	}
	defer client.Close()

	q := client.Query(`
	SELECT
		name,
		COUNT(*) as name_count
	FROM ` + "`bigquery-public-data.usa_names.usa_1910_2013`" + `
	WHERE state = 'WA'
	GROUP BY name`)
	q.DryRun = true
	// Location must match that of the dataset(s) referenced in the query.
	q.Location = "US"

	job, err := q.Run(ctx)
	if err != nil {
		return err
	}
	// Dry run is not asynchronous, so get the latest status and statistics.
	status := job.LastStatus()
	if err := status.Err(); err != nil {
		return err
	}
	fmt.Fprintf(w, "This query will process %d bytes\n", status.Statistics.TotalBytesProcessed)
	return nil
}
```

Java

Before trying this sample, follow the Java setup instructions in the BigQuery quickstart using client libraries . For more information, see the BigQuery Java API reference documentation .

To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for client libraries .

```java
import com.google.cloud.bigquery.BigQuery;
import com.google.cloud.bigquery.BigQueryException;
import com.google.cloud.bigquery.BigQueryOptions;
import com.google.cloud.bigquery.Job;
import com.google.cloud.bigquery.JobInfo;
import com.google.cloud.bigquery.JobStatistics;
import com.google.cloud.bigquery.QueryJobConfiguration;

// Sample to run a dry run query on the table
public class QueryDryRun {

  public static void runQueryDryRun() {
    String query =
        "SELECT name, COUNT(*) as name_count "
            + "FROM `bigquery-public-data.usa_names.usa_1910_2013` "
            + "WHERE state = 'WA' "
            + "GROUP BY name";
    queryDryRun(query);
  }

  public static void queryDryRun(String query) {
    try {
      // Initialize client that will be used to send requests. This client only needs to be created
      // once, and can be reused for multiple requests.
      BigQuery bigquery = BigQueryOptions.getDefaultInstance().getService();

      QueryJobConfiguration queryConfig =
          QueryJobConfiguration.newBuilder(query).setDryRun(true).setUseQueryCache(false).build();

      Job job = bigquery.create(JobInfo.of(queryConfig));
      JobStatistics.QueryStatistics statistics = job.getStatistics();

      System.out.println(
          "Query dry run performed successfully." + statistics.getTotalBytesProcessed());
    } catch (BigQueryException e) {
      System.out.println("Query not performed \n" + e.toString());
    }
  }
}
```

Node.js

Before trying this sample, follow the Node.js setup instructions in the BigQuery quickstart using client libraries . For more information, see the BigQuery Node.js API reference documentation .

To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for client libraries .

```javascript
// Import the Google Cloud client library
const {BigQuery} = require('@google-cloud/bigquery');
const bigquery = new BigQuery();

async function queryDryRun() {
  // Runs a dry run query of the U.S. given names dataset for the state of Texas.
  const query = `SELECT name
    FROM \`bigquery-public-data.usa_names.usa_1910_2013\`
    WHERE state = 'TX'
    LIMIT 100`;

  // For all options, see https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs/query
  const options = {
    query: query,
    // Location must match that of the dataset(s) referenced in the query.
    location: 'US',
    dryRun: true,
  };

  // Run the query as a job
  const [job] = await bigquery.createQueryJob(options);

  // Print the status and statistics
  console.log('Status:');
  console.log(job.metadata.status);
  console.log('\nJob Statistics:');
  console.log(job.metadata.statistics);
}
```