Write to destination table

Run a query on the natality public dataset and write the results to a destination table.

Explore further

For detailed documentation that includes this code sample, see the following:

Use Managed Service for Apache Spark, BigQuery, and Apache Spark ML for Machine Learning

Code sample

Python

Before trying this sample, follow the Python setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Python API reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for client libraries.

"""Create a Google BigQuery linear regression input table.

In the code below, the following actions are taken:
* A new dataset is created "natality_regression."
* A query is run against the public dataset,
    bigquery-public-data.samples.natality, selecting only the data of
    interest to the regression, the output of which is stored in a new
    "regression_input" table.
* The output table is moved over the wire to the user's default project via
    the built-in BigQuery Connector for Spark that bridges BigQuery and
    Cloud Dataproc.
"""

from google.cloud import bigquery

# Create a new Google BigQuery client using Google Cloud Platform project
# defaults.
client = bigquery.Client()

# Prepare a reference to a new dataset for storing the query results.
dataset_id = "natality_regression"
dataset_id_full = f"{client.project}.{dataset_id}"

dataset = bigquery.Dataset(dataset_id_full)

# Create the new BigQuery dataset.
dataset = client.create_dataset(dataset)

# Configure the query job.
job_config = bigquery.QueryJobConfig()

# Set the destination table to where you want to store query results.
# As of google-cloud-bigquery 1.11.0, a fully qualified table ID can be
# used in place of a TableReference.
job_config.destination = f"{dataset_id_full}.regression_input"

# Set up a query in Standard SQL, which is the default for the BigQuery
# Python client library.
# The query selects the fields of interest.
query = """
    SELECT
        weight_pounds, mother_age, father_age, gestation_weeks,
        weight_gain_pounds, apgar_5min
    FROM
        `bigquery-public-data.samples.natality`
    WHERE
        weight_pounds IS NOT NULL
        AND mother_age IS NOT NULL
        AND father_age IS NOT NULL
        AND gestation_weeks IS NOT NULL
        AND weight_gain_pounds IS NOT NULL
        AND apgar_5min IS NOT NULL
"""

# Run the query.
client.query_and_wait(query, job_config=job_config)  # Waits for the query to finish
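
Once the query finishes, the results are stored in the regression_input table of the natality_regression dataset. The following is a minimal sketch, not part of the original sample, of one way to confirm that the destination table was written. The names summarize_table and main are hypothetical helpers introduced for illustration; the sketch assumes only that the client's get_table method returns a table object exposing num_rows and schema.

```python
def summarize_table(client, table_id):
    """Return (row_count, column_names) for the given table.

    `client` is expected to be a google.cloud.bigquery.Client; only its
    get_table() method is used. summarize_table is a hypothetical helper,
    not part of the client library.
    """
    table = client.get_table(table_id)  # Fetches the table metadata.
    column_names = [field.name for field in table.schema]
    return table.num_rows, column_names


def main():
    """Check the destination table written by the sample above.

    Requires google-cloud-bigquery and Application Default Credentials,
    so the import is kept local to this function.
    """
    from google.cloud import bigquery

    client = bigquery.Client()
    # Assumes the dataset and table names used in the sample above.
    table_id = f"{client.project}.natality_regression.regression_input"
    row_count, columns = summarize_table(client, table_id)
    print(f"{table_id}: {row_count} rows, columns: {', '.join(columns)}")
```

Calling main() after the sample has run should print the row count and the six selected column names. Keeping the check in a function that takes the client as an argument makes it possible to exercise the logic with a stand-in object, without contacting BigQuery.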

What's next

To search and filter code samples for other Google Cloud products, see the Google Cloud sample browser.
