This page describes how to scale Cloud Composer environments.
Scale vertically and horizontally
In Cloud Composer 1, you don't define specific CPU and memory resources for Cloud Composer and Airflow components such as workers and schedulers. Instead, you specify the number and type of machines for nodes in your environment's cluster.
Options for horizontal scaling:
- Adjust the number of nodes
- Adjust the number of schedulers
Options for vertical scaling:
- Adjust the machine type of the Cloud SQL instance
- Adjust the web server machine type
Adjust scheduler parameters
Your environment can run more than one Airflow scheduler at the same time. Use multiple schedulers to distribute load between several scheduler instances for better performance and reliability.
If your environment uses Airflow 2, you can specify a number of schedulers up to the number of nodes in your environment.
When scaling schedulers, consider the following:
- In Cloud Composer 3 environments, Airflow DAG processors run as environment components that are separate from schedulers. Because the DAG processor offloads the parsing of DAGs from the scheduler, you might want to redistribute resources previously allocated to Airflow schedulers.
  Because schedulers don't parse DAGs in Cloud Composer 3, they have lower resource limits for CPU and memory than in Cloud Composer 2.
- Increasing the number of schedulers doesn't always improve Airflow performance.
  For example, an extra scheduler that isn't utilized consumes resources of your environment without contributing to the overall performance. The actual scheduler performance depends on the number of Airflow workers, the number of DAGs and tasks that run in your environment, and the configuration of both Airflow and the environment.
- We recommend starting with two schedulers and then monitoring the performance of your environment. If you change the number of schedulers, you can always scale your environment back to the original number of schedulers.
For more information about configuring multiple schedulers, see the Airflow documentation.
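The limit described above for Airflow 2 environments (no more schedulers than nodes) can be checked before you submit an update. The following is a minimal sketch; `check_scheduler_count` is a hypothetical helper name, not part of any Google Cloud library, and the lower bound of one scheduler is an assumption of this sketch.

```python
def check_scheduler_count(scheduler_count: int, node_count: int) -> None:
    """Raise ValueError if the requested scheduler count is out of range.

    Hypothetical helper. The upper bound follows the rule that the number
    of schedulers can't exceed the number of nodes; the lower bound of 1
    is an assumption made for this sketch.
    """
    if scheduler_count < 1:
        raise ValueError("scheduler_count must be at least 1 (assumed lower bound)")
    if scheduler_count > node_count:
        raise ValueError(
            f"scheduler_count={scheduler_count} exceeds node_count={node_count}"
        )


# Two schedulers on a three-node environment pass the check silently.
check_scheduler_count(2, 3)
```

The check only covers the count limit. Whether an extra scheduler actually helps still depends on the workload, as noted in the considerations above.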
Console
- In the Google Cloud console, go to the Environments page.
- In the list of environments, click the name of your environment. The Environment details page opens.
- Go to the Environment configuration tab.
- In the Resources > Workloads configuration item, click Edit.
- In the Resources > Number of schedulers item, click Edit.
- In the Scheduler configuration pane, in the Number of schedulers field, specify the number of schedulers for your environment.
- Click Save.
gcloud
The following Airflow scheduler parameters are available:
- --scheduler-count: the number of schedulers in your environment.
Run the following Google Cloud CLI command:
    gcloud composer environments update ENVIRONMENT_NAME \
        --location LOCATION \
        --scheduler-count SCHEDULER_COUNT
Replace the following:
- ENVIRONMENT_NAME: the name of the environment.
- LOCATION: the region where the environment is located.
- SCHEDULER_COUNT: the number of schedulers.
Example:
    gcloud composer environments update example-environment \
        --location us-central1 \
        --scheduler-count 2
API
- Construct an environments.patch API request.
- In this request:
  - In the updateMask parameter, specify the config.workloadsConfig.schedulerCount mask.
  - In the request body, specify the number of schedulers for your environment.
    {
      "config": {
        "workloadsConfig": {
          "scheduler": {
            "count": SCHEDULER_COUNT
          }
        }
      }
    }
Replace the following:
- ENVIRONMENT_NAME: the name of the environment.
- LOCATION: the region where the environment is located.
- SCHEDULER_COUNT: the number of schedulers.
Example:
    // PATCH https://composer.googleapis.com/v1/projects/example-project/
    // locations/us-central1/environments/example-environment?updateMask=
    // config.workloadsConfig.scheduler
    {
      "config": {
        "workloadsConfig": {
          "scheduler": {
            "count": 2
          }
        }
      }
    }
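The example request above is shown as three comment lines that must be joined into one URL. As an illustration, the following Python sketch assembles the URL and the JSON body from their parts. `build_patch` is a hypothetical helper, not part of a Google client library; the URL layout and mask are copied from the example above.

```python
import json
from urllib.parse import urlencode

API_ROOT = "https://composer.googleapis.com/v1"


def build_patch(project, location, environment, update_mask, config):
    """Return (url, body) for an environments.patch request.

    Hypothetical helper: it only builds strings and sends nothing.
    """
    name = f"projects/{project}/locations/{location}/environments/{environment}"
    url = f"{API_ROOT}/{name}?{urlencode({'updateMask': update_mask})}"
    body = json.dumps({"config": config})
    return url, body


url, body = build_patch(
    "example-project",
    "us-central1",
    "example-environment",
    "config.workloadsConfig.scheduler",
    {"workloadsConfig": {"scheduler": {"count": 2}}},
)
print(url)
print(body)
```

Keeping the mask and the body as separate arguments makes it harder to update one and forget the other when you adapt the request to a different parameter.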
Terraform
The following fields in the workloads_config.scheduler block control the Airflow scheduler parameters. Each scheduler uses the specified amount of resources.
- scheduler.count: the number of schedulers in your environment.
    resource "google_composer_environment" "example" {
      provider = google-beta
      name     = "ENVIRONMENT_NAME"
      region   = "LOCATION"
      config {
        workloads_config {
          scheduler {
            count = SCHEDULER_COUNT
          }
        }
      }
    }
Replace the following:
- ENVIRONMENT_NAME: the name of the environment.
- LOCATION: the region where the environment is located.
- SCHEDULER_COUNT: the number of schedulers.
Example:
    resource "google_composer_environment" "example" {
      provider = google-beta
      name     = "example-environment"
      region   = "us-central1"
      config {
        workloads_config {
          scheduler {
            count = 2
          }
        }
      }
    }
Adjust triggerer parameters
You can set the number of triggerers to zero, but you need at least one triggerer instance in your environment (or at least two in highly resilient environments) to use deferrable operators in your DAGs.
Depending on your environment's resilience mode, there are different possible configurations for the number of triggerers:
- Standard resilience: you can run up to 10 triggerers.
- High resilience: at least 2 triggerers, up to a maximum of 10 triggerers.
Even if the number of triggerers is set to zero, a triggerer pod definition is created and visible in your environment's cluster, but no actual triggerer workloads are run.
You can also specify the number of CPUs and the amount of memory used by Airflow triggerers in your environment. In this way, you can increase the performance of your environment, in addition to the horizontal scaling provided by using multiple triggerers.
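The count limits above can be expressed as a small check. The following is a sketch only; `valid_triggerer_count` is a hypothetical function name, and it encodes just the rules stated in this section: zero is allowed in both modes, the maximum is 10, and highly resilient environments need at least two triggerers when triggerers are enabled.

```python
def valid_triggerer_count(count: int, high_resilience: bool) -> bool:
    """Return True if `count` is an allowed number of triggerers.

    Hypothetical helper that mirrors the limits described in this section.
    """
    if count == 0:
        # Zero disables triggerers and is allowed in both resilience modes.
        return True
    minimum = 2 if high_resilience else 1
    return minimum <= count <= 10


print(valid_triggerer_count(1, high_resilience=False))
print(valid_triggerer_count(1, high_resilience=True))
```

A single triggerer is accepted for standard resilience and rejected for high resilience, which is the only case where the two modes differ.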
Console
- In the Google Cloud console, go to the Environments page.
- In the list of environments, click the name of your environment. The Environment details page opens.
- Go to the Environment configuration tab.
- In the Resources > Workloads configuration item, click Edit.
- In the Workloads configuration pane, adjust the parameters for Airflow triggerers:
  - In the Triggerer section, in the Number of triggerers field, enter the number of triggerers in your environment.
    If you set at least one triggerer for your environment, also use the CPU and Memory fields to configure resource allocation for your triggerers.
  - In the CPU and Memory fields, specify the number of CPUs and the amount of memory for Airflow triggerers. Each triggerer uses the specified amount of resources.
- Click Save.
gcloud
The following Airflow triggerer parameters are available:
- --triggerer-count: the number of triggerers in your environment.
  - For standard resilience environments, use a value between 0 and 10.
  - For highly resilient environments, use 0, or a value between 2 and 10.
- --triggerer-cpu: the number of CPUs for an Airflow triggerer.
- --triggerer-memory: the amount of memory for an Airflow triggerer.
Run the following Google Cloud CLI command:
    gcloud composer environments update ENVIRONMENT_NAME \
        --location LOCATION \
        --triggerer-count TRIGGERER_COUNT \
        --triggerer-cpu TRIGGERER_CPU \
        --triggerer-memory TRIGGERER_MEMORY
Replace the following:
- ENVIRONMENT_NAME: the name of the environment.
- LOCATION: the region where the environment is located.
- TRIGGERER_COUNT: the number of triggerers.
- TRIGGERER_CPU: the number of CPUs for a triggerer, in vCPU units.
- TRIGGERER_MEMORY: the amount of memory for a triggerer.
Examples:
- Scale to four triggerer instances:
    gcloud composer environments update example-environment \
        --location us-central1 \
        --triggerer-count 4 \
        --triggerer-cpu 1 \
        --triggerer-memory 1
- Disable triggerers by setting triggerer count to 0. This operation doesn't require specifying CPU or memory for the triggerers.
    gcloud composer environments update example-environment \
        --location us-central1 \
        --triggerer-count 0
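If you generate these commands from a script, the two examples above differ only in whether the CPU and memory flags are present. The following Python sketch builds the argument list and leaves those flags out when the count is zero. `triggerer_update_args` is a hypothetical helper; it uses only the flags shown in this section and doesn't run the command.

```python
import shlex


def triggerer_update_args(environment, location, count, cpu=None, memory=None):
    """Build the argument list for `gcloud composer environments update`.

    Hypothetical helper: it returns a list of strings and runs nothing.
    """
    args = [
        "gcloud", "composer", "environments", "update", environment,
        "--location", location,
        "--triggerer-count", str(count),
    ]
    if count > 0:
        # Resource flags are only meaningful when triggerers are enabled.
        if cpu is not None:
            args += ["--triggerer-cpu", str(cpu)]
        if memory is not None:
            args += ["--triggerer-memory", str(memory)]
    return args


# Print both commands from the examples above as shell-quoted strings.
print(shlex.join(triggerer_update_args("example-environment", "us-central1", 4, 1, 1)))
print(shlex.join(triggerer_update_args("example-environment", "us-central1", 0)))
```

Returning a list rather than a single string means the result can be passed to `subprocess.run` without additional shell quoting.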
API
- In the updateMask query parameter, specify the config.workloadsConfig.triggerer mask.
- In the request body, specify all three parameters for triggerers.
    {
      "config": {
        "workloadsConfig": {
          "triggerer": {
            "count": TRIGGERER_COUNT,
            "cpu": TRIGGERER_CPU,
            "memoryGb": TRIGGERER_MEMORY
          }
        }
      }
    }
Replace the following:
- TRIGGERER_COUNT: the number of triggerers.
  - For standard resilience environments, use a value between 0 and 10.
  - For highly resilient environments, use 0, or a value between 2 and 10.
- TRIGGERER_CPU: the number of CPUs for a triggerer, in vCPU units.
- TRIGGERER_MEMORY: the amount of memory for a triggerer.
Examples:
- Disable triggerers by setting triggerer count to 0. This operation doesn't require specifying CPU or memory for the triggerers.
    // PATCH https://composer.googleapis.com/v1/projects/example-project/
    // locations/us-central1/environments/example-environment?updateMask=
    // config.workloadsConfig.triggerer
    {
      "config": {
        "workloadsConfig": {
          "triggerer": {
            "count": 0
          }
        }
      }
    }
- Scale to four triggerer instances:
    // PATCH https://composer.googleapis.com/v1/projects/example-project/
    // locations/us-central1/environments/example-environment?updateMask=
    // config.workloadsConfig.triggerer
    {
      "config": {
        "workloadsConfig": {
          "triggerer": {
            "count": 4,
            "cpu": 1,
            "memoryGb": 1
          }
        }
      }
    }
Terraform
The following fields in the workloads_config.triggerer block control the Airflow triggerer parameters. Each triggerer uses the specified amount of resources.
- triggerer.count: the number of triggerers in your environment.
  - For standard resilience environments, use a value between 0 and 10.
  - For highly resilient environments, use 0, or a value between 2 and 10.
- triggerer.cpu: the number of CPUs for an Airflow triggerer.
- triggerer.memory_gb: the amount of memory for an Airflow triggerer.
    resource "google_composer_environment" "example" {
      provider = google-beta
      name     = "ENVIRONMENT_NAME"
      region   = "LOCATION"
      config {
        workloads_config {
          triggerer {
            count     = TRIGGERER_COUNT
            cpu       = TRIGGERER_CPU
            memory_gb = TRIGGERER_MEMORY
          }
        }
      }
    }
Replace the following:
- ENVIRONMENT_NAME: the name of the environment.
- LOCATION: the region where the environment is located.
- TRIGGERER_COUNT: the number of triggerers.
- TRIGGERER_CPU: the number of CPUs for a triggerer, in vCPU units.
- TRIGGERER_MEMORY: the amount of memory for a triggerer, in GB.
Example:
    resource "google_composer_environment" "example" {
      provider = google-beta
      name     = "example-environment"
      region   = "us-central1"
      config {
        workloads_config {
          triggerer {
            count     = 1
            cpu       = 0.5
            memory_gb = 0.5
          }
        }
      }
    }
Adjust web server parameters
You can specify the number of CPUs and the amount of memory and disk space used by the Airflow web server in your environment. In this way, you can scale the performance of the Airflow UI, for example, to match the demand coming from a large number of users or a large number of managed DAGs.
Console
- In the Google Cloud console, go to the Environments page.
- In the list of environments, click the name of your environment. The Environment details page opens.
- Go to the Environment configuration tab.
- In the Resources > Workloads configuration item, click Edit.
- In the Workloads configuration pane, adjust the parameters for the web server. In the CPU, Memory, and Storage fields, specify the number of CPUs, memory, and storage for the web server.
- Click Save.
gcloud
The following Airflow web server parameters are available:
- --web-server-cpu: the number of CPUs for the Airflow web server.
- --web-server-memory: the amount of memory for the Airflow web server.
- --web-server-storage: the amount of disk space for the Airflow web server.
Run the following Google Cloud CLI command:
    gcloud composer environments update ENVIRONMENT_NAME \
        --location LOCATION \
        --web-server-cpu WEB_SERVER_CPU \
        --web-server-memory WEB_SERVER_MEMORY \
        --web-server-storage WEB_SERVER_STORAGE
Replace the following:
- ENVIRONMENT_NAME: the name of the environment.
- LOCATION: the region where the environment is located.
- WEB_SERVER_CPU: the number of CPUs for the web server, in vCPU units.
- WEB_SERVER_MEMORY: the amount of memory for the web server.
- WEB_SERVER_STORAGE: the amount of disk space for the web server.
Example:
    gcloud composer environments update example-environment \
        --location us-central1 \
        --web-server-cpu 1 \
        --web-server-memory 2.5 \
        --web-server-storage 2
API
- Construct an environments.patch API request.
- In this request:
  - In the updateMask parameter, specify the config.workloadsConfig.webServer mask to update all web server parameters. You can also update individual web server parameters by specifying a mask for those parameters: config.workloadsConfig.webServer.cpu, config.workloadsConfig.webServer.memoryGb, config.workloadsConfig.webServer.storageGb.
  - In the request body, specify the new web server parameters.
    {
      "config": {
        "workloadsConfig": {
          "webServer": {
            "cpu": WEB_SERVER_CPU,
            "memoryGb": WEB_SERVER_MEMORY,
            "storageGb": WEB_SERVER_STORAGE
          }
        }
      }
    }
Replace the following:
- ENVIRONMENT_NAME: the name of the environment.
- LOCATION: the region where the environment is located.
- WEB_SERVER_CPU: the number of CPUs for the web server, in vCPU units.
- WEB_SERVER_MEMORY: the amount of memory for the web server, in GB.
- WEB_SERVER_STORAGE: the disk size for the web server, in GB.
Example:
    // PATCH https://composer.googleapis.com/v1/projects/example-project/
    // locations/us-central1/environments/example-environment?updateMask=
    // config.workloadsConfig.webServer.cpu,
    // config.workloadsConfig.webServer.memoryGb,
    // config.workloadsConfig.webServer.storageGb
    {
      "config": {
        "workloadsConfig": {
          "webServer": {
            "cpu": 0.5,
            "memoryGb": 2.5,
            "storageGb": 2
          }
        }
      }
    }
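When you update individual web server parameters, the mask must list exactly the fields present in the body. The following Python sketch derives both from one set of keyword arguments so that they can't drift apart. `web_server_patch` and the snake_case argument names are hypothetical; the API field names and mask paths are the ones listed above.

```python
# Maps the sketch's argument names to the API field names used in the body.
FIELD_NAMES = {"cpu": "cpu", "memory_gb": "memoryGb", "storage_gb": "storageGb"}


def web_server_patch(**changes):
    """Return (update_mask, body) covering only the given web server fields.

    Hypothetical helper: it builds the request parts and sends nothing.
    """
    unknown = set(changes) - set(FIELD_NAMES)
    if unknown:
        raise ValueError(f"unknown web server fields: {sorted(unknown)}")
    if not changes:
        raise ValueError("at least one field must be given")
    web_server = {FIELD_NAMES[key]: value for key, value in changes.items()}
    mask = ",".join(
        f"config.workloadsConfig.webServer.{name}" for name in web_server
    )
    return mask, {"config": {"workloadsConfig": {"webServer": web_server}}}


# Update only CPU and memory; storage is left out of both mask and body.
mask, body = web_server_patch(cpu=0.5, memory_gb=2.5)
print(mask)
print(body)
```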
Terraform
The following fields in the workloads_config.web_server block control the web server parameters.
- web_server.cpu: the number of CPUs for the web server.
- web_server.memory_gb: the amount of memory for the web server.
- web_server.storage_gb: the amount of disk space for the web server.
    resource "google_composer_environment" "example" {
      provider = google-beta
      name     = "ENVIRONMENT_NAME"
      region   = "LOCATION"
      config {
        workloads_config {
          web_server {
            cpu        = WEB_SERVER_CPU
            memory_gb  = WEB_SERVER_MEMORY
            storage_gb = WEB_SERVER_STORAGE
          }
        }
      }
    }
Replace the following:
- ENVIRONMENT_NAME: the name of the environment.
- LOCATION: the region where the environment is located.
- WEB_SERVER_CPU: the number of CPUs for the web server, in vCPU units.
- WEB_SERVER_MEMORY: the amount of memory for the web server, in GB.
- WEB_SERVER_STORAGE: the disk size for the web server, in GB.
Example:
    resource "google_composer_environment" "example" {
      provider = google-beta
      name     = "example-environment"
      region   = "us-central1"
      config {
        workloads_config {
          web_server {
            cpu        = 0.5
            memory_gb  = 1.875
            storage_gb = 1
          }
        }
      }
    }
Adjust the environment size
The environment size controls the performance parameters of the managed Cloud Composer infrastructure that includes, for example, the Airflow database.
Consider selecting a larger environment size if you want to run a large number of DAGs and tasks.
Console
- In the Google Cloud console, go to the Environments page.
- In the list of environments, click the name of your environment. The Environment details page opens.
- Go to the Environment configuration tab.
- In the Resources > Workloads configuration item, click Edit.
- In the Resources > Core infrastructure item, click Edit.
- In the Core infrastructure pane, in the Environment size field, specify the environment size.
- Click Save.
gcloud
The --environment-size argument controls the environment size:
    gcloud composer environments update ENVIRONMENT_NAME \
        --location LOCATION \
        --environment-size ENVIRONMENT_SIZE
Replace the following:
- ENVIRONMENT_NAME: the name of the environment.
- LOCATION: the region where the environment is located.
- ENVIRONMENT_SIZE: small, medium, or large.
Example:
    gcloud composer environments update example-environment \
        --location us-central1 \
        --environment-size medium
API
- Create an environments.patch API request.
- In this request:
  - In the updateMask parameter, specify the config.environmentSize mask.
  - In the request body, specify the environment size.
    {
      "config": {
        "environmentSize": "ENVIRONMENT_SIZE"
      }
    }
Replace the following:
- ENVIRONMENT_SIZE: the environment size, ENVIRONMENT_SIZE_SMALL, ENVIRONMENT_SIZE_MEDIUM, or ENVIRONMENT_SIZE_LARGE.
Example:
    // PATCH https://composer.googleapis.com/v1/projects/example-project/
    // locations/us-central1/environments/example-environment?updateMask=
    // config.environmentSize
    {
      "config": {
        "environmentSize": "ENVIRONMENT_SIZE_MEDIUM"
      }
    }
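The gcloud CLI takes `small`, `medium`, or `large`, while the API request body uses the `ENVIRONMENT_SIZE_*` names. If a script accepts the short form and calls the API, it needs to translate between them. The following is a minimal sketch; `api_environment_size` is a hypothetical helper, and the one-to-one correspondence between the two sets of values is inferred from their names.

```python
# Assumed mapping from gcloud CLI values to API enum names.
API_SIZES = {
    "small": "ENVIRONMENT_SIZE_SMALL",
    "medium": "ENVIRONMENT_SIZE_MEDIUM",
    "large": "ENVIRONMENT_SIZE_LARGE",
}


def api_environment_size(cli_size: str) -> str:
    """Translate a gcloud-style size into the API enum name."""
    try:
        return API_SIZES[cli_size.lower()]
    except KeyError:
        raise ValueError(
            f"unknown environment size {cli_size!r}; expected one of {sorted(API_SIZES)}"
        ) from None


# Build the request body for a medium environment.
body = {"config": {"environmentSize": api_environment_size("medium")}}
print(body)
```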
Terraform
The environment_size field in the config block controls the environment size:
    resource "google_composer_environment" "example" {
      provider = google-beta
      name     = "ENVIRONMENT_NAME"
      region   = "LOCATION"
      config {
        environment_size = "ENVIRONMENT_SIZE"
      }
    }
Replace the following:
- ENVIRONMENT_NAME: the name of the environment.
- LOCATION: the region where the environment is located.
- ENVIRONMENT_SIZE: the environment size, ENVIRONMENT_SIZE_SMALL, ENVIRONMENT_SIZE_MEDIUM, or ENVIRONMENT_SIZE_LARGE.
Example:
    resource "google_composer_environment" "example" {
      provider = google-beta
      name     = "example-environment"
      region   = "us-central1"
      config {
        environment_size = "ENVIRONMENT_SIZE_SMALL"
      }
    }
Adjust the number of nodes
You can change the number of nodes in your environment.
This number corresponds to the number of Airflow workers in your environment. In addition to running Airflow workers, your environment nodes also run Airflow schedulers and other environment components.
Console
- In the Google Cloud console, go to the Environments page.
- In the list of environments, click the name of your environment. The Environment details page opens.
- Go to the Environment configuration tab.
- In the Worker nodes > Node count item, click Edit.
- In the Worker nodes configuration pane, in the Node count field, specify the number of nodes in your environment.
- Click Save.
gcloud
The --node-count argument controls the number of nodes in your environment:
    gcloud composer environments update ENVIRONMENT_NAME \
        --location LOCATION \
        --zone NODE_ZONE \
        --node-count NODE_COUNT
Replace the following:
- ENVIRONMENT_NAME: the name of the environment.
- LOCATION: the region where the environment is located.
- NODE_COUNT: the number of nodes. The minimum number of nodes is 3.
- NODE_ZONE: the Compute Engine zone for your environment VMs.
Example:
    gcloud composer environments update example-environment \
        --location us-central1 \
        --zone us-central1-a \
        --node-count 6
API
- Create an environments.patch API request.
- In this request:
  - In the updateMask parameter, specify the config.nodeCount mask.
  - In the request body, specify the number of nodes for your environment.
    {
      "config": {
        "nodeCount": NODE_COUNT
      }
    }
Replace the following:
- NODE_COUNT: the number of nodes. The minimum number of nodes is 3.
Example:
    // PATCH https://composer.googleapis.com/v1/projects/example-project/
    // locations/us-central1/environments/example-environment?updateMask=
    // config.nodeCount
    {
      "config": {
        "nodeCount": 6
      }
    }
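Because the minimum number of nodes is 3, a script can reject smaller values before it builds the request. The following Python sketch returns the mask and body shown above; `node_count_patch` is a hypothetical helper name.

```python
MIN_NODE_COUNT = 3  # minimum stated for NODE_COUNT in this section


def node_count_patch(node_count: int):
    """Return (update_mask, body) for changing the number of nodes.

    Hypothetical helper: it validates the value and sends nothing.
    """
    if node_count < MIN_NODE_COUNT:
        raise ValueError(
            f"node_count must be at least {MIN_NODE_COUNT}, got {node_count}"
        )
    return "config.nodeCount", {"config": {"nodeCount": node_count}}


mask, body = node_count_patch(6)
print(mask)
print(body)
```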
Terraform
The node_count field in the node_config block specifies the number of nodes in your environment.
    resource "google_composer_environment" "example" {
      config {
        node_config {
          node_count = NODE_COUNT
        }
      }
    }
Replace the following:
- NODE_COUNT: the number of nodes.
Example:
    resource "google_composer_environment" "example" {
      name   = "example-environment"
      region = "us-central1"
      config {
        node_config {
          node_count = 4
        }
      }
    }
Adjust the machine type of the Cloud SQL instance
You can change the machine type of the Cloud SQL instance that stores the Airflow database of your environment.
Console
- In the Google Cloud console, go to the Environments page.
- In the list of environments, click the name of your environment. The Environment details page opens.
- Go to the Environment configuration tab.
- In the Resources > Cloud SQL machine type item, click Edit.
- In the Cloud SQL configuration pane, in the Cloud SQL machine type drop-down list, select the machine type for the Cloud SQL instance of your environment.
- Click Save.
gcloud
The --cloud-sql-machine-type argument controls the machine type of the Cloud SQL instance in your environment.
Run the following Google Cloud CLI command:
    gcloud composer environments update ENVIRONMENT_NAME \
        --location LOCATION \
        --cloud-sql-machine-type SQL_MACHINE_TYPE
Replace the following:
- ENVIRONMENT_NAME: the name of the environment.
- LOCATION: the region where the environment is located.
- SQL_MACHINE_TYPE: the machine type for the Cloud SQL instance.
Example:
    gcloud composer environments update example-environment \
        --location us-central1 \
        --cloud-sql-machine-type db-n1-standard-2
API
- Create an environments.patch API request.
- In this request:
  - In the updateMask parameter, specify the config.databaseConfig.machineType mask.
  - In the request body, specify the machine type for the Cloud SQL instance.
    {
      "config": {
        "databaseConfig": {
          "machineType": "SQL_MACHINE_TYPE"
        }
      }
    }
Replace the following:
- SQL_MACHINE_TYPE: the machine type for the Cloud SQL instance.
Example:
    // PATCH https://composer.googleapis.com/v1/projects/example-project/
    // locations/us-central1/environments/example-environment?updateMask=
    // config.databaseConfig.machineType
    {
      "config": {
        "databaseConfig": {
          "machineType": "db-n1-standard-2"
        }
      }
    }
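In this request, the dotted mask and the nesting of the request body follow the same path. The following Python sketch uses that to build the body directly from the mask. `body_from_mask` is a hypothetical helper, and it applies only when the mask path matches the body structure, as it does for config.databaseConfig.machineType.

```python
def body_from_mask(update_mask: str, value):
    """Nest `value` under the keys named by a dotted field mask.

    Hypothetical helper: valid only when the body mirrors the mask path.
    """
    body = value
    # Wrap from the innermost key outwards.
    for key in reversed(update_mask.split(".")):
        body = {key: body}
    return body


body = body_from_mask("config.databaseConfig.machineType", "db-n1-standard-2")
print(body)
```

The same call with `config.nodeCount` or `config.environmentSize` produces the bodies shown in the earlier sections.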
Terraform
The machine_type field in the database_config block specifies the machine type for the Cloud SQL instance.
    resource "google_composer_environment" "example" {
      config {
        database_config {
          machine_type = "SQL_MACHINE_TYPE"
        }
      }
    }
Replace the following:
- SQL_MACHINE_TYPE: the machine type for the Cloud SQL instance.
Example:
    resource "google_composer_environment" "example" {
      name   = "example-environment"
      region = "us-central1"
      config {
        database_config {
          machine_type = "db-n1-standard-2"
        }
      }
    }
Adjust the web server machine type
You can change the machine type for the Airflow web server of your environment.
Console
- In the Google Cloud console, go to the Environments page.
- In the list of environments, click the name of your environment. The Environment details page opens.
- Go to the Environment configuration tab.
- In the Resources > Web server machine type item, click Edit.
- In the Web server configuration pane, in the Web server machine type drop-down list, select the machine type for the Airflow web server.
- Click Save.
gcloud
The --web-server-machine-type argument controls the machine type of the Airflow web server instance in your environment.
Run the following Google Cloud CLI command:
    gcloud composer environments update ENVIRONMENT_NAME \
        --location LOCATION \
        --web-server-machine-type WS_MACHINE_TYPE
Replace the following:
- ENVIRONMENT_NAME: the name of the environment.
- LOCATION: the region where the environment is located.
- WS_MACHINE_TYPE: the machine type for the Airflow web server instance.
Example:
    gcloud composer environments update example-environment \
        --location us-central1 \
        --web-server-machine-type composer-n1-webserver-2
API
- Create an environments.patch API request.
- In this request:
  - In the updateMask parameter, specify the config.webServerConfig.machineType mask.
  - In the request body, specify the machine type for the web server.
    {
      "config": {
        "webServerConfig": {
          "machineType": "WS_MACHINE_TYPE"
        }
      }
    }
Replace the following:
- WS_MACHINE_TYPE: the machine type for the Airflow web server instance.
Example:
    // PATCH https://composer.googleapis.com/v1/projects/example-project/
    // locations/us-central1/environments/example-environment?updateMask=
    // config.webServerConfig.machineType
    {
      "config": {
        "webServerConfig": {
          "machineType": "composer-n1-webserver-2"
        }
      }
    }
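To show how the example above maps onto an HTTP call, the following Python sketch prepares the PATCH request with the standard library. It is an illustration under assumptions: `make_patch_request` is a hypothetical helper, `ACCESS_TOKEN` is a placeholder for an OAuth 2.0 access token that you obtain separately, and the sketch constructs the request without sending it.

```python
import json
import urllib.request


def make_patch_request(url: str, body: dict, access_token: str) -> urllib.request.Request:
    """Prepare (but don't send) a PATCH request with a JSON body."""
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        method="PATCH",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )


request = make_patch_request(
    "https://composer.googleapis.com/v1/projects/example-project/"
    "locations/us-central1/environments/example-environment"
    "?updateMask=config.webServerConfig.machineType",
    {"config": {"webServerConfig": {"machineType": "composer-n1-webserver-2"}}},
    access_token="ACCESS_TOKEN",  # placeholder, not a real token
)
print(request.get_method(), request.full_url)
# Passing `request` to urllib.request.urlopen would send it; omitted here.
```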
Terraform
The machine_type field in the web_server_config block specifies the machine type for the Airflow web server instance.
    resource "google_composer_environment" "example" {
      config {
        web_server_config {
          machine_type = "WS_MACHINE_TYPE"
        }
      }
    }
Replace the following:
- WS_MACHINE_TYPE: the machine type for the Airflow web server instance.
Example:
    resource "google_composer_environment" "example" {
      name   = "example-environment"
      region = "us-central1"
      config {
        web_server_config {
          machine_type = "composer-n1-webserver-2"
        }
      }
    }