This page describes how to reuse Dataproc clusters for your pipeline runs in Cloud Data Fusion. For more information, see When to reuse clusters and Run a pipeline against an existing Dataproc cluster.
Before you begin
- You must have a Cloud Data Fusion instance in version 6.5.0 or later.
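If you track instance versions in scripts, the requirement above can be checked with a small helper. The following is a minimal sketch, assuming you already have the instance's version string (for example, copied from the instance details page). The function names `parse_version` and `supports_cluster_reuse` are hypothetical and are not part of any Cloud Data Fusion API.

```python
import re

# Earliest Cloud Data Fusion version that this page names for cluster reuse.
MIN_VERSION = (6, 5, 0)


def parse_version(version: str) -> tuple:
    """Extract the leading major.minor.patch numbers from a version string."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.groups())


def supports_cluster_reuse(version: str) -> bool:
    """Return True if the version is 6.5.0 or later (hypothetical helper)."""
    # Compare as integer tuples so that 6.10.x sorts after 6.5.x,
    # which a plain string comparison would get wrong.
    return parse_version(version) >= MIN_VERSION
```

Comparing integer tuples rather than raw strings avoids misordering versions such as 6.10.1 and 6.5.0.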
Enable cluster reuse
You can enable cluster reuse in a new compute profile, or in a profile that a deployed pipeline already uses.
Enable cluster reuse in a new profile
- Go to your instance:
  - In the Google Cloud console, go to the Cloud Data Fusion page.
  - To open the instance in the Cloud Data Fusion Studio, click Instances, and then click View instance.
- Click System admin > Configuration > System compute profiles.
- Click Create new profile.
- Choose the Dataproc provisioner.
- In the Create a profile for Dataproc window, enter the details about your cluster:
  - In the Profile label and Profile name fields, enter a name to identify the profile, for example, execution_compute-profile.
  - In the Description field, describe the purpose of the profile, for example, Profile used for pipeline execution.
  - In the Max idle time field, enter a value. For more information, see Set max idle time.
  - Set the Skip cluster delete field to True. For more information, see When to reuse clusters.
  - Optional: Configure other fields.
  - Click Create.
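The field values from the steps above can be recorded in a small structure, for instance to review them or keep them under version control before entering them in the web interface. This is a sketch only: the class `ReuseProfile`, its attribute names, the minutes unit, and the 30-minute value are illustrative assumptions, not Cloud Data Fusion property names or recommended settings.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class ReuseProfile:
    """Hypothetical record of the profile fields described on this page."""

    label: str
    name: str
    description: str
    max_idle_time_minutes: int  # assumption: the idle time is given in minutes
    skip_cluster_delete: bool

    def validate(self) -> None:
        """Check the two settings that this page ties to cluster reuse."""
        if not self.skip_cluster_delete:
            raise ValueError("Skip cluster delete must be True for cluster reuse")
        # Assumption of this sketch: require a positive idle time so that
        # an unused cluster does not stay up indefinitely.
        if self.max_idle_time_minutes <= 0:
            raise ValueError("Max idle time must be a positive number")

    def to_json(self) -> str:
        """Serialize the record, for example for review or version control."""
        return json.dumps(asdict(self), indent=2)


profile = ReuseProfile(
    label="execution_compute-profile",
    name="execution_compute-profile",
    description="Profile used for pipeline execution",
    max_idle_time_minutes=30,  # illustrative value only
    skip_cluster_delete=True,
)
profile.validate()
```

Calling `profile.to_json()` returns the same values as a JSON string.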
Enable cluster reuse in a deployed pipeline
- Go to your instance:
  - In the Google Cloud console, go to the Cloud Data Fusion page.
  - To open the instance in the Cloud Data Fusion Studio, click Instances, and then click View instance.
- Click List.
- Click the Deployed tab and click a pipeline name. The deployed pipeline opens on the Studio page in the Cloud Data Fusion web interface.
- Click Configure.
- In the Compute config window, go to the chosen profile and click Customize.
- In the window that opens, enter the following values:
  - In the Max Idle Time field, enter a value. For more information, see Set max idle time.
  - Set Skip cluster delete to True. For more information, see When to reuse clusters.
- Click Done.
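Conceptually, customizing a profile overrides a few of its values and leaves the rest as they were. The following is a minimal sketch of that idea, assuming a profile can be represented as a plain dictionary. The function `customize_profile`, the key names, and the values are hypothetical and do not reflect how Cloud Data Fusion stores profile settings.

```python
def customize_profile(profile: dict, overrides: dict) -> dict:
    """Return a copy of the profile with the overrides applied.

    The original profile dictionary is left unchanged.
    """
    unknown = set(overrides) - set(profile)
    if unknown:
        raise KeyError(f"unknown profile fields: {sorted(unknown)}")
    return {**profile, **overrides}


# Hypothetical profile values before customization.
base_profile = {
    "max_idle_time_minutes": 0,
    "skip_cluster_delete": False,
}

# The two values set in the steps above (30 is an illustrative number).
pipeline_settings = customize_profile(
    base_profile,
    {"max_idle_time_minutes": 30, "skip_cluster_delete": True},
)
```

Rejecting unknown keys is a design choice of this sketch: it catches misspelled field names early instead of silently adding them.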
What's next
- Learn more about configuring clusters.
- Troubleshoot deleting clusters.

