This page describes how to migrate to Batch from Cloud Life Sciences.
Google Cloud announced that Cloud Life Sciences was deprecated on July 17, 2023 and shut down on July 8, 2025. Batch is generally available and is a comprehensive successor that supports all Cloud Life Sciences use cases.
Learn more about Batch, Cloud Life Sciences, and product launch stages.
Cloud Life Sciences versus Batch
Migrating from Cloud Life Sciences to Batch primarily involves understanding how you can use Batch for the workloads that you currently execute by running Cloud Life Sciences pipelines.
To understand how you can execute your Cloud Life Sciences workloads on Batch, see the following sections:
Overview
A Cloud Life Sciences pipeline describes a sequence of actions (containers) to execute and the environment to execute the containers in.
A Batch job describes an array of one or more tasks and the environment to execute those tasks in. You define the workload for a job as one sequence of one or more runnables (containers and/or scripts) to be executed. Each task for a job represents one execution of its sequence of runnables.
Cloud Life Sciences pipelines can be expressed as single-task Batch jobs.
For example, the following samples describe a simple Cloud Life Sciences pipeline and its equivalent Batch job:
Cloud Life Sciences pipeline | Batch job
---|---
`{"actions": [{"imageUri": "bash", "commands": ["-c", "echo Hello, world!"]}]}` | `{"taskGroups": [{"taskSpec": {"runnables": [{"container": {"imageUri": "bash", "commands": ["-c", "echo Hello, world!"]}}]}}]}`
Multiple-task Batch jobs are similar to running multiple copies of a Cloud Life Sciences pipeline.
Unlike Cloud Life Sciences, Batch lets you automatically schedule multiple executions of your workload. You indicate the number of times that you want to execute the sequence of runnables for a job by defining the number of tasks. When a job has multiple tasks, you specify how you want each execution to vary by referencing the task's index in your runnables. Additionally, you can configure the relative schedules for a job's tasks, for example, whether to allow multiple tasks to run in parallel or to require tasks to run in sequential order, one at a time. Batch manages the scheduling of the job's tasks: when a task finishes, the job automatically starts the next task, if any.
For example, see the following Batch job. This example job has 100 tasks that execute on 10 Compute Engine virtual machine (VM) instances, so there are approximately 10 tasks running in parallel at any given time. Each task in this example job only executes one runnable: a script that prints a message and the task's index, which is defined by the BATCH_TASK_INDEX predefined environment variable.
```
{
  "taskGroups": [{
    "taskSpec": {
      "runnables": [{
        "script": {
          "text": "echo Hello world! This is task ${BATCH_TASK_INDEX}."
        }
      }]
    },
    "taskCount": 100,
    "parallelism": 10
  }]
}
```
Workflows that involve the creation and monitoring of multiple similar Cloud Life Sciences pipelines can sometimes be simplified by taking advantage of Batch's built-in scheduling.
Basic operations
This section describes basic operations in Cloud Life Sciences versus Batch.
The following table summarizes the basic operations for Cloud Life Sciences and their Batch equivalents.

Cloud Life Sciences | Batch
---|---
Run a pipeline. | Create and run a job.
List long-running operations. | View a list of your jobs.
Get details for a long-running operation. Poll a long-running operation. | View the details of a job. View a list of a job's tasks. View the details of a task.
Cancel a long-running operation. | Delete (and cancel) a job. Check the status of a job deletion request.
The basic operations for Cloud Life Sciences and Batch have a few key differences.

First, long-running operation resources do not play the same role in Batch that they do in Cloud Life Sciences. In Cloud Life Sciences, long-running operation resources are the primary resource used to list and view your pipelines. In Batch and other Google Cloud APIs, however, long-running operation resources are only used to monitor the status of a request that takes a long time to complete. Specifically, in Batch, the only request that returns a long-running operation resource is the request to delete a job. For more information about long-running operation resources for Batch, see the Batch API reference documentation for the projects.locations.operations REST resource. Instead of long-running operation resources, Batch has job resources that you view and delete for your workloads.
Second, viewing the details of a workload in Batch involves different operations than in Cloud Life Sciences. You can view a job to see both its details and status. Each of a job's tasks also has its own details and status, which you can see by viewing a list of a job's tasks and viewing the details of a task.
To help you further understand the basic operations for Cloud Life Sciences versus Batch, the following sections provide examples of Google Cloud CLI commands and API request paths for some of these basic operations.
Example gcloud CLI commands
For the gcloud CLI, Cloud Life Sciences commands begin with gcloud beta lifesciences and Batch commands begin with gcloud batch.
For example, see the following gcloud CLI commands.
- Cloud Life Sciences example gcloud CLI commands:

  - Run a pipeline:

    ```
    gcloud beta lifesciences pipelines run \
        --project=PROJECT_ID \
        --regions=LOCATION \
        --pipeline-file=JSON_CONFIGURATION_FILE
    ```

  - Get details for a long-running operation:

    ```
    gcloud beta lifesciences operations describe OPERATION_ID
    ```

  Replace the following:

  - PROJECT_ID: the project ID of your project.
  - LOCATION: the location for the pipeline.
  - JSON_CONFIGURATION_FILE: the JSON configuration file for the pipeline.
  - OPERATION_ID: the identifier for the long-running operation, which was returned by the request to run the pipeline.
- Batch example gcloud CLI commands:

  - Create and run a job:

    ```
    gcloud batch jobs submit JOB_NAME \
        --project=PROJECT_ID \
        --location=LOCATION \
        --config=JSON_CONFIGURATION_FILE
    ```

  - View the details of a job:

    ```
    gcloud batch jobs describe JOB_NAME \
        --project=PROJECT_ID \
        --location=LOCATION
    ```

  - View a job's list of tasks:

    ```
    gcloud batch tasks list \
        --project=PROJECT_ID \
        --location=LOCATION \
        --job=JOB_NAME
    ```

  - View the details of a task:

    ```
    gcloud batch tasks describe TASK_INDEX \
        --project=PROJECT_ID \
        --location=LOCATION \
        --job=JOB_NAME \
        --task_group=TASK_GROUP_NAME
    ```

  - Delete (and cancel) a job:

    ```
    gcloud batch jobs delete JOB_NAME \
        --project=PROJECT_ID \
        --location=LOCATION
    ```

  Replace the following:

  - JOB_NAME: the name of the job.
  - PROJECT_ID: the project ID of your project.
  - LOCATION: the location of the job.
  - JSON_CONFIGURATION_FILE: the path of a JSON file with the job's configuration details.
  - TASK_INDEX: the index of the task that you want to view the details of. In a task group, the task index starts at 0 for the first task and increases by 1 with each additional task. For example, a task group that contains four tasks has the indexes 0, 1, 2, and 3.
  - TASK_GROUP_NAME: the name of the task group that you want to view the details of. The value must be set to group0.
Example API request paths
For APIs, Cloud Life Sciences uses lifesciences.googleapis.com request paths and Batch uses batch.googleapis.com request paths. For example, see the following API request paths. Unlike Cloud Life Sciences, Batch does not have an RPC API; it only has a REST API.
- Cloud Life Sciences example API request paths:

  - Run a pipeline:

    ```
    POST https://lifesciences.googleapis.com/v2beta/projects/PROJECT_ID/locations/LOCATION/pipelines:run
    ```

  - Get details for a long-running operation:

    ```
    GET https://lifesciences.googleapis.com/v2beta/projects/PROJECT_ID/locations/LOCATION/operations/OPERATION_ID
    ```

  Replace the following:

  - PROJECT_ID: the project ID of your project.
  - LOCATION: the location for the pipeline.
  - OPERATION_ID: the identifier for the long-running operation, which was returned by the request to run the pipeline.
- Batch example API request paths:

  - Create and run a job:

    ```
    POST https://batch.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/jobs?job_id=JOB_NAME
    ```

  - View the details of a job:

    ```
    GET https://batch.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/jobs/JOB_NAME
    ```

  - View a job's list of tasks:

    ```
    GET https://batch.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/jobs/JOB_NAME/taskGroups/TASK_GROUP_NAME/tasks
    ```

  - Delete a job:

    ```
    DELETE https://batch.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/jobs/JOB_NAME
    ```

  - Check the status of a job deletion request:

    ```
    GET https://batch.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/operations/OPERATION_ID
    ```

  Replace the following:

  - PROJECT_ID: the project ID of your project.
  - LOCATION: the location of the job.
  - JOB_NAME: the name of the job.
  - TASK_GROUP_NAME: the name of the task group that you want to view the details of. The value must be set to group0.
  - OPERATION_ID: the identifier for the long-running operation, which was returned by the request to delete the job.
IAM roles and permissions
This section summarizes the differences in Identity and Access Management roles and permissions for Cloud Life Sciences and Batch. For more information about any roles and their permissions, see the IAM basic and predefined roles reference.
The following table describes the predefined roles and their permissions that are required for users of Cloud Life Sciences.
Roles | Permissions
---|---
Any of the following roles on the project: Cloud Life Sciences Admin (roles/lifesciences.admin), Cloud Life Sciences Editor (roles/lifesciences.editor), or Cloud Life Sciences Workflows Runner (roles/lifesciences.workflowsRunner) | lifesciences.workflows.run, lifesciences.operations.cancel, lifesciences.operations.get, lifesciences.operations.list
Cloud Life Sciences Viewer (roles/lifesciences.viewer) on the project | lifesciences.operations.get, lifesciences.operations.list, resourcemanager.projects.get, resourcemanager.projects.list
The following table describes some of the predefined roles and their permissions for Batch. Unlike Cloud Life Sciences, Batch requires you to grant permissions to users and the service account for a job. For more information about the IAM requirements, see Prerequisites for Batch .
Roles | Permissions
---|---
Batch Job Editor (roles/batch.jobsEditor) on the project | batch.jobs.create, batch.jobs.delete, batch.jobs.get, batch.jobs.list, batch.locations.get, batch.locations.list, batch.operations.get, batch.operations.list, batch.tasks.get, batch.tasks.list, resourcemanager.projects.get, resourcemanager.projects.list
Batch Job Viewer (roles/batch.jobsViewer) on the project | batch.jobs.get, batch.jobs.list, batch.locations.get, batch.locations.list, batch.operations.get, batch.operations.list, batch.tasks.get, batch.tasks.list, resourcemanager.projects.get, resourcemanager.projects.list
Service Account User (roles/iam.serviceAccountUser) on the job's service account | iam.serviceAccounts.actAs, iam.serviceAccounts.get, iam.serviceAccounts.list, resourcemanager.projects.get, resourcemanager.projects.list
Batch Agent Reporter (roles/batch.agentReporter) on the project | batch.states.report
Corresponding features
The following table describes the features for Cloud Life Sciences, the equivalent features for Batch, and details about the differences between them.
Each feature is represented by a description and its JSON syntax. You can use JSON syntax when accessing Batch through the API or when specifying a JSON configuration file through the Google Cloud CLI. However, note that you can also use Batch features through other methods, such as Google Cloud console fields, gcloud CLI flags, and client libraries, which are described in the Batch documentation.
For more information about each feature and its JSON syntax, see the following:
- For Cloud Life Sciences, see the Cloud Life Sciences API reference documentation for the projects.locations.pipelines REST resource.
- For Batch, see the Batch API reference documentation for the projects.locations.jobs REST resource.
A Batch job consists of an array of one or more tasks that each execute the same set of runnables. A Cloud Life Sciences pipeline is similar to a Batch job with one task. However, Cloud Life Sciences does not have an equivalent concept for jobs with multiple tasks, which are somewhat like repetitions of a pipeline.
For more information about jobs and tasks, see Overview for Batch .
A Cloud Life Sciences action describes a container, but a Batch runnable can contain either a container or script.
In Cloud Life Sciences, an action's credentials must be a Cloud Key Management Service-encrypted dictionary with username and password key-value pairs.
In Batch, the username and password for a container runnable are in separate fields. Either field may be specified with plain text or with the name of a Secret Manager secret.
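For example, here's a minimal sketch of a Batch container runnable that supplies registry credentials; the image path, username, and Secret Manager resource name are placeholder values for illustration:

```json
{
  "container": {
    "imageUri": "us-docker.pkg.dev/my-project/my-repo/my-image",
    "username": "my-registry-username",
    "password": "projects/my-project/secrets/my-password-secret/versions/latest"
  }
}
```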
For a Cloud Life Sciences action:

- environment variables (environment)
- encrypted environment variables (encryptedEnvironment)

For a Batch environment:

- variables (variables)
- encrypted variables (encryptedVariables)
- secret variables (secretVariables)

Possible Batch environments:

- a runnable (environment in runnables[])
- all runnables (environment in taskSpec)
- all tasks (taskEnvironments[] in taskGroups[])
Cloud Life Sciences lets you specify the environment variables for an action as plain text or as an encrypted dictionary. In Batch, this is similar to having the environment for a runnable (the environment field in runnables[]) include variables that are formatted as plain text (variables) or as an encrypted dictionary (encryptedVariables).

But Batch also has more options for specifying environment variables:

- As an alternative to specifying variables as plain text or an encrypted dictionary, you can specify variables using Secret Manager secrets by using a secret variable (secretVariables).
- As an alternative to specifying an environment variable for a runnable, you can specify an environment variable for all runnables by using the environment field in taskSpec.
- As an alternative to specifying an environment variable that has the same value for each task, you can specify an environment variable that has a different value for each task by using the taskEnvironments[] field in taskGroups[].
For more information, see Use environment variables .
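For example, the following sketch of a task group combines all three options; the variable names, secret resource name, and values are placeholders for illustration:

```json
{
  "taskGroups": [{
    "taskSpec": {
      "runnables": [{
        "script": { "text": "echo ${GREETING}, ${USER_NAME}!" },
        "environment": { "variables": { "GREETING": "Hello" } }
      }],
      "environment": {
        "secretVariables": {
          "API_KEY": "projects/my-project/secrets/my-api-key/versions/latest"
        }
      }
    },
    "taskCount": 2,
    "taskEnvironments": [
      { "variables": { "USER_NAME": "alice" } },
      { "variables": { "USER_NAME": "bob" } }
    ]
  }]
}
```

Here, GREETING is set for one specific runnable, API_KEY is resolved from Secret Manager for all runnables, and USER_NAME takes a different value for each task.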
Unlike Cloud Life Sciences, Batch does not include a labels field in the request to create a new job. The closest option for Batch is to use labels that are associated only with the job. Batch has multiple types of labels (labels fields) that you can use when creating a job. For more information, see Organize resources using labels.
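As a sketch, a job configuration fragment might set labels at both the job level and the allocation-policy level; the label keys and values shown are placeholders:

```json
{
  "labels": { "department": "research" },
  "allocationPolicy": {
    "labels": { "vm-purpose": "batch-processing" }
  }
}
```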
In Cloud Life Sciences, a pipeline executes on a single VM, which you can specify the desired regions and/or zones for.
In Batch, the equivalent option is the allowed locations for a job, which specifies where the VMs for a job can be created and which you can define as one or more regions or zones. All the VMs for a single Batch job belong to a single managed instance group (MIG), which exists in one particular region; however, individual VMs might be in different zones of that region.
Notably, specifying the allowed locations field for a job is optional because it is separate from the job's location. Unlike the job's location, the allowed location does not affect the location that is used for creating a Batch job and storing job metadata. For more information, see Batch locations .
For a Cloud Life Sciences pipeline's resources (resources):

- the VM (virtualMachine)

For a Batch job's allocation policy (allocationPolicy):

- the VMs (instances)
- service account (serviceAccount)
- labels (labels in allocationPolicy)
- network (network)
In Cloud Life Sciences, you can configure the (one) VM that a pipeline runs on. In Batch, the same options for VMs are available in the fields of a job's allocation policy (allocationPolicy):

- The service account, labels, and network configuration for the VMs are defined in their dedicated fields.
- The VM field (instances), which you can define either directly or by using an instance template, includes the configuration options for the machine type, minimum allowed CPU platform, boot disk and any other attached disks, and any GPUs and GPU drivers.
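For example, a fragment of an allocation policy might look like the following sketch; the machine type, service account email, and network paths are placeholder values:

```json
{
  "allocationPolicy": {
    "instances": [{
      "policy": { "machineType": "e2-standard-4" }
    }],
    "serviceAccount": {
      "email": "my-service-account@my-project.iam.gserviceaccount.com"
    },
    "network": {
      "networkInterfaces": [{
        "network": "projects/my-project/global/networks/default",
        "subnetwork": "projects/my-project/regions/us-central1/subnetworks/default"
      }]
    }
  }
}
```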
For a Cloud Life Sciences action:

- labels (labels in actions[])
- option to time out (timeout)
- option to ignore exit status (ignoreExitStatus)
- option to run in the background (runInBackground)
- option to always run (alwaysRun)

For a Batch runnable:

- labels (labels in runnables[])
- option to time out (timeout)
- option to ignore exit status (ignoreExitStatus)
- option to run in the background (background)
- option to always run (alwaysRun)
These various convenience flags from Cloud Life Sciences are equivalent in Batch except they are specified for each runnable (which can contain a script or container) instead of each action (container).
For a Cloud Life Sciences action:

- option to publish exposed ports (publishExposedPorts)
- option to specify the process ID (PID) namespace (pidNamespace)
- option to specify container-to-host port mappings (portMappings)
These Cloud Life Sciences options (and others) are supported in Batch through the options field (options) for a container runnable. Set the options field to any flags that you want Batch to append to the docker run command, for example, -P --pid mynamespace -p 22:22.
For a Cloud Life Sciences action:

- option to disable early download of container images (disableImagePrefetch)
- option to disable returning the standard error stream (disableStandardErrorCapture)

Batch prefetches images and processes the outputs of all runnables identically in accordance with the job's logs policy (logsPolicy).
The Cloud Life Sciences option to block external networks for an action is similar to the Batch option to block external networks for a container.
Batch also has many other networking options, such as to block external networks for all of a job's VMs. For more information, see Batch networking overview .
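For example, a container runnable that blocks external network access might be sketched as follows; the image and command are placeholders:

```json
{
  "container": {
    "imageUri": "bash",
    "commands": ["-c", "echo offline task"],
    "blockExternalNetwork": true
  }
}
```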
Batch features: volumes for a job (volumes[] in taskSpec) and volume options for a container (volumes[] in container).

In Batch, you can use the volumes[] field in taskSpec to define a job's volumes and their mount paths. Batch mounts storage volumes to the job's VMs, and storage volumes are accessible to all of the job's runnables (scripts or containers). This mounting happens before the VM executes any tasks or runnables.
Additionally, Batch supports explicit volume options on container runnables through the volumes[] field in container. These mount options are passed to the container as options for the --volume flag of the docker run command. For example, the ["/etc:/etc", "/foo:/bar"] value is translated into docker run --volume /etc:/etc --volume /foo:/bar for the container.
For more information about using storage volumes with Batch, see Create and run a job that uses storage volumes .
Batch handles mounting any storage volumes, such as a Cloud Storage bucket, that you specify for a job. As a result, you don't need to enable any mounting tools, such as Cloud Storage FUSE, for Batch; however, you can optionally specify mount options for your storage volumes by using the mountOptions[] field.
For more information about using Cloud Storage buckets with Batch, see Create and run a job that uses storage volumes .
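For example, a task specification that mounts a Cloud Storage bucket with a mount option might be sketched as follows; the bucket name, mount path, and mount option are placeholder values:

```json
{
  "taskGroups": [{
    "taskSpec": {
      "volumes": [{
        "gcs": { "remotePath": "my-bucket" },
        "mountPath": "/mnt/share",
        "mountOptions": ["--implicit-dirs"]
      }]
    }
  }]
}
```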
For a Batch job's notification configurations (notifications[]):

- Pub/Sub topic (pubsubTopic)
- messages (message)
Batch allows greater customization of status updates than Cloud Life Sciences. For example, Batch users can be notified on a Pub/Sub topic when either individual tasks change state or only when the overall job changes state.
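For example, a job's notification configuration might be sketched as follows; the project and topic names are placeholders:

```json
{
  "notifications": [
    {
      "pubsubTopic": "projects/my-project/topics/my-topic",
      "message": { "type": "JOB_STATE_CHANGED" }
    },
    {
      "pubsubTopic": "projects/my-project/topics/my-topic",
      "message": { "type": "TASK_STATE_CHANGED", "newTaskState": "FAILED" }
    }
  ]
}
```

This sketch publishes a message whenever the overall job changes state, and another message only when a task enters the FAILED state.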
Workflow services
If you use a workflow service with Cloud Life Sciences, then your migration process also involves configuring a workflow service to work with Batch. This section summarizes the workflow services that you can use with Batch.
Batch supports Workflows, which is a workflow service from Google Cloud. If you want to use Workflows with Batch, see Run a Batch job using Workflows. Otherwise, the following table describes other workflow services that you might have used with Cloud Life Sciences and that you can also use with Batch. This table lists the key differences for using each workflow service with Batch instead of Cloud Life Sciences and details on where to learn more about using each service with Batch.
To use a Cromwell configuration file for the v2beta Cloud Life Sciences API with the Batch API instead, make the following changes:
- For the actor-factory field, replace cromwell.backend.google.pipelines.v2beta.PipelinesApiLifecycleActorFactory with cromwell.backend.google.batch.GcpBatchLifecycleActorFactory.
- Remove the genomics.endpoint-url field.
- Generate a new configuration file.
To run your dsub pipeline for Cloud Life Sciences with Batch instead, make the following change:

- For the provider field, replace google-cls-v2 with google-batch.
To use a Nextflow configuration file for Cloud Life Sciences with Batch instead, make the following changes:
- For the executor field, replace google-lifesciences with google-batch.
- For any config prefixes, replace google.lifeScience with google.batch.
To use a Snakemake pipeline for the v2beta Cloud Life Sciences API with the Batch API instead, make the following changes:
- Make sure you are using Snakemake version 8 or newer. For more information, see Migration between Snakemake versions .
- Make the following changes to the snakemake command:
  - Replace the --google-lifesciences flag with the --executor googlebatch flag.
  - Replace any additional flags that have the --google-lifesciences- prefix to use the --googlebatch- prefix instead.
What's next
- To configure Batch for new users and projects, see Get started .
- To learn how to execute workloads using Batch, see Create a job .