BigQuery to MongoDB template

The BigQuery to MongoDB template is a batch pipeline that reads rows from a BigQuery table and writes them to MongoDB as documents. Currently, each row is stored as a single document.
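
For illustration, assume a hypothetical source table with columns id, name, and score. A row holding the values 1, "Ada", and 99.5 would be written to the target collection as a document along these lines (this sketch shows only simple scalar columns, not nested types such as RECORD):

  { "id": 1, "name": "Ada", "score": 99.5 }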

Pipeline requirements

  • The source BigQuery table must exist.
  • The target MongoDB instance should be accessible from the Dataflow worker machines.
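
Before launching the pipeline, you can sanity-check both requirements from a workstation. This is only a sketch: the table spec and connection string below are placeholders, and reachability from your machine does not guarantee reachability from the Dataflow worker network:

  # Check that the source table exists (hypothetical table spec)
  bq show --format=prettyjson bigquery-project:dataset.input_table

  # Check that the MongoDB deployment answers a ping (hypothetical URI)
  mongosh "mongodb+srv://username:password@host" --eval "db.runCommand({ ping: 1 })"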

Template parameters

Required parameters

  • mongoDbUri: The MongoDB connection URI, in the format mongodb+srv://username:password@host.
  • database: The database in MongoDB in which to store the collection. For example, my-db.
  • collection: The name of the collection in the MongoDB database. For example, my-collection.
  • inputTableSpec: The BigQuery table to read from. For example, bigquery-project:dataset.input_table.
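
For concreteness, these parameters might be collected as shell variables before launching the template; every value here is hypothetical:

  # Hypothetical parameter values for the examples that follow
  export INPUT_TABLE_SPEC="bigquery-project:dataset.input_table"
  export MONGO_DB_URI="mongodb+srv://myuser:mypassword@cluster0.example.mongodb.net"
  export DATABASE="my-db"
  export COLLECTION="my-collection"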

Run the template

Console

  1. Go to the Dataflow Create job from template page in the Google Cloud console.
  2. In the Job name field, enter a unique job name.
  3. Optional: For Regional endpoint, select a value from the drop-down menu. The default region is us-central1.

    For a list of regions where you can run a Dataflow job, see Dataflow locations.

  4. From the Dataflow template drop-down menu, select the BigQuery to MongoDB template.
  5. In the provided parameter fields, enter your parameter values.
  6. Click Run job.

gcloud

In your shell or terminal, run the template:

  
gcloud dataflow flex-template run JOB_NAME \
    --project=PROJECT_ID \
    --region=REGION_NAME \
    --template-file-gcs-location=gs://dataflow-templates-REGION_NAME/VERSION/flex/BigQuery_to_MongoDB \
    --parameters \
       inputTableSpec=INPUT_TABLE_SPEC,\
       mongoDbUri=MONGO_DB_URI,\
       database=DATABASE,\
       collection=COLLECTION

Replace the following:

  • PROJECT_ID: the Google Cloud project ID where you want to run the Dataflow job
  • JOB_NAME: a unique job name of your choice
  • REGION_NAME: the region where you want to deploy your Dataflow job, for example us-central1
  • VERSION: the version of the template that you want to use

    You can use the following values:

      • latest to use the latest version of the template
      • a specific version name, to pin the job to that version of the template

  • INPUT_TABLE_SPEC: your source BigQuery table name
  • MONGO_DB_URI: your MongoDB connection URI
  • DATABASE: your MongoDB database
  • COLLECTION: your MongoDB collection
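
For example, a fully substituted command might look like the following. All of the values shown (project, region, table, and connection details) are hypothetical:

  gcloud dataflow flex-template run bq-to-mongodb-example \
      --project=my-project \
      --region=us-central1 \
      --template-file-gcs-location=gs://dataflow-templates-us-central1/latest/flex/BigQuery_to_MongoDB \
      --parameters \
         inputTableSpec=my-project:my_dataset.events,\
         mongoDbUri=mongodb+srv://myuser:mypassword@cluster0.example.mongodb.net,\
         database=my-db,\
         collection=my-collection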

API

To run the template using the REST API, send an HTTP POST request. For more information on the API and its authorization scopes, see projects.locations.flexTemplates.launch.

  
POST https://dataflow.googleapis.com/v1b3/projects/PROJECT_ID/locations/LOCATION/flexTemplates:launch
{
   "launch_parameter": {
      "jobName": "JOB_NAME",
      "parameters": {
         "inputTableSpec": "INPUT_TABLE_SPEC",
         "mongoDbUri": "MONGO_DB_URI",
         "database": "DATABASE",
         "collection": "COLLECTION"
      },
      "containerSpecGcsPath": "gs://dataflow-templates-LOCATION/VERSION/flex/BigQuery_to_MongoDB"
   }
}
 

Replace the following:

  • PROJECT_ID: the Google Cloud project ID where you want to run the Dataflow job
  • JOB_NAME: a unique job name of your choice
  • LOCATION: the region where you want to deploy your Dataflow job, for example us-central1
  • VERSION: the version of the template that you want to use

    You can use the following values:

      • latest to use the latest version of the template
      • a specific version name, to pin the job to that version of the template

  • INPUT_TABLE_SPEC: your source BigQuery table name
  • MONGO_DB_URI: your MongoDB connection URI
  • DATABASE: your MongoDB database
  • COLLECTION: your MongoDB collection
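
As a concrete sketch, you can send this request with curl, authenticating with an access token from the gcloud CLI. All parameter values below are hypothetical:

  curl -X POST "https://dataflow.googleapis.com/v1b3/projects/my-project/locations/us-central1/flexTemplates:launch" \
      -H "Authorization: Bearer $(gcloud auth print-access-token)" \
      -H "Content-Type: application/json" \
      -d '{
            "launch_parameter": {
              "jobName": "bq-to-mongodb-example",
              "parameters": {
                "inputTableSpec": "my-project:my_dataset.events",
                "mongoDbUri": "mongodb+srv://myuser:mypassword@cluster0.example.mongodb.net",
                "database": "my-db",
                "collection": "my-collection"
              },
              "containerSpecGcsPath": "gs://dataflow-templates-us-central1/latest/flex/BigQuery_to_MongoDB"
            }
          }'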

What's next

  • Learn about Dataflow templates.
  • See the list of Google-provided templates.