Schedule pipelines

This page describes how to create a schedule for your pipeline runs. For example, you can schedule a pipeline to run daily at 1:00 AM UTC.

Before you begin

To create a schedule, you need a deployed pipeline in Cloud Data Fusion. If you don't have one, create a pipeline by following the Quickstart.

To create, edit, or suspend a schedule, open your pipeline in Cloud Data Fusion:

Go to your instance:
1. In the Google Cloud console, go to the Cloud Data Fusion page.
2. To open the instance in the Cloud Data Fusion Studio, click Instances, and then click View instance.
  
Go to the Cloud Data Fusion List page.
In the Deployed tab, choose a pipeline.

The Pipeline page opens, where you can create, edit, or suspend a schedule for your pipeline.

Create the schedule

From the Pipeline page in the Cloud Data Fusion Studio, click Schedule.

You can use either the Basic or Advanced tab to define your schedule.

Basic

On the Basic tab, enter the following information about your schedule:
- Frequency, such as Every 5 minutes or Repeats every 30 days.
- Start time, in UTC.
- Maximum concurrent runs: Choose up to ten runs. If there are already ten pipelines running, the scheduled run that you're creating won't run.
- Compute Engine profile: Choose a compute profile. The default is the Managed Service for Apache Spark compute profile.
Click Save and start schedule (or Save schedule, if you want to start it later).

Advanced

On the Advanced tab, define your schedule in cron syntax.
In the Maximum concurrent runs field, choose up to ten runs. If there are already ten pipelines running, the scheduled run that you're creating won't run.
Click Save and start schedule (or Save schedule, if you want to start it later).
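
As a rough illustration of the cron syntax used on the Advanced tab, the following Python sketch builds a few expressions, including the daily 1:00 AM UTC example from the introduction. It assumes the conventional five-field cron layout (minute, hour, day of month, month, day of week); confirm the field order against the fields shown on the Advanced tab before relying on it. The `cron` helper is hypothetical and is not part of Cloud Data Fusion.

```python
# Hypothetical helper (not a Cloud Data Fusion API): joins five cron fields
# in the assumed order minute, hour, day of month, month, day of week.
def cron(minute="*", hour="*", day_of_month="*", month="*", day_of_week="*"):
    fields = [str(f) for f in (minute, hour, day_of_month, month, day_of_week)]
    for field in fields:
        # A cron field must be non-empty and must not contain whitespace,
        # because whitespace separates the fields.
        if not field or any(ch.isspace() for ch in field):
            raise ValueError(f"invalid cron field: {field!r}")
    return " ".join(fields)


# Every day at 1:00 AM (times are interpreted in UTC, as on the Basic tab).
DAILY_AT_1AM = cron(minute=0, hour=1)

# Every 5 minutes.
EVERY_5_MINUTES = cron(minute="*/5")

# Monday through Friday at 1:00 AM.
WEEKDAYS_AT_1AM = cron(minute=0, hour=1, day_of_week="1-5")

if __name__ == "__main__":
    for name, expr in [
        ("daily at 1:00 AM", DAILY_AT_1AM),
        ("every 5 minutes", EVERY_5_MINUTES),
        ("weekdays at 1:00 AM", WEEKDAYS_AT_1AM),
    ]:
        print(f"{name}: {expr}")
```

Note that a fixed interval such as Repeats every 30 days has no exact equivalent in plain five-field cron, since day-of-month fields reset each month; the Basic tab is likely the simpler choice for that kind of schedule.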

Change or suspend the schedule

You can change, start, or suspend a pipeline schedule from the Pipeline page in the Cloud Data Fusion Studio.
To suspend the schedule, click Unschedule.
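
If you prefer to script this step instead of clicking Unschedule, the following Python sketch shows one way it might look. It rests on several assumptions that are not stated on this page: that your instance exposes the CDAP REST API, that schedules are suspended and resumed with a POST to a path of the form `/v3/namespaces/<namespace>/apps/<pipeline>/schedules/<schedule>/suspend` (or `/resume`), and that the schedule created from the Studio is named `dataPipelineSchedule`. The endpoint, pipeline name, and token below are placeholders. Verify all of these against the Cloud Data Fusion REST reference before use. The sketch only builds the request and does not send it.

```python
import urllib.parse
import urllib.request


def schedule_action_url(endpoint, pipeline, action,
                        namespace="default",
                        schedule="dataPipelineSchedule"):
    """Build the URL for suspending or resuming a schedule.

    The path layout and the default schedule name are assumptions;
    check them against the REST reference for your instance.
    """
    if action not in ("suspend", "resume"):
        raise ValueError("action must be 'suspend' or 'resume'")

    def quote(part):
        # Percent-encode each path segment, including any slashes.
        return urllib.parse.quote(part, safe="")

    return (
        f"{endpoint.rstrip('/')}/v3/namespaces/{quote(namespace)}"
        f"/apps/{quote(pipeline)}/schedules/{quote(schedule)}/{action}"
    )


def build_request(url, token):
    """Create (but do not send) an authenticated POST request."""
    return urllib.request.Request(
        url,
        method="POST",
        headers={"Authorization": f"Bearer {token}"},
    )


if __name__ == "__main__":
    # Placeholder values for illustration only.
    url = schedule_action_url(
        "https://example.invalid/api", "my-pipeline", "suspend"
    )
    request = build_request(url, token="PLACEHOLDER_TOKEN")
    print(request.get_method(), request.full_url)
    # To actually send it: urllib.request.urlopen(request)
```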