DataflowJob


Property Value
Google Cloud Service Name Cloud Dataflow
Google Cloud Service Documentation /dataflow/docs/
Google Cloud REST Resource Name v1b3.projects.jobs
Google Cloud REST Resource Documentation /dataflow/docs/reference/rest/v1b3/projects.jobs
Config Connector Resource Short Names gcpdataflowjob
gcpdataflowjobs
dataflowjob
Config Connector Service Name dataflow.googleapis.com
Config Connector Resource Fully Qualified Name dataflowjobs.dataflow.cnrm.cloud.google.com
Can Be Referenced by IAMPolicy/IAMPolicyMember No
Config Connector Default Average Reconcile Interval In Seconds 600

Custom Resource Definition Properties

Annotations

Fields
cnrm.cloud.google.com/on-delete
cnrm.cloud.google.com/project-id
cnrm.cloud.google.com/skip-wait-on-job-termination
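
These annotations are set on the DataflowJob's metadata. A minimal sketch of how they might be combined (the "drain" value for on-delete, the project ID, and the bucket/template paths are illustrative assumptions; the samples later in this page only demonstrate "cancel"):

```yaml
apiVersion: dataflow.cnrm.cloud.google.com/v1beta1
kind: DataflowJob
metadata:
  name: dataflowjob-annotated
  annotations:
    # Drain (rather than cancel) the job when this resource is deleted.
    cnrm.cloud.google.com/on-delete: "drain"
    # Create the job in this project instead of the namespace's default project.
    cnrm.cloud.google.com/project-id: "my-project-id"  # hypothetical project ID
    # Do not block deletion on the job reaching a terminal state.
    cnrm.cloud.google.com/skip-wait-on-job-termination: "true"
spec:
  tempGcsLocation: gs://my-bucket/tmp  # hypothetical bucket
  templateGcsPath: gs://dataflow-templates/latest/Word_Count
```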

Spec

Schema

additionalExperiments:
- string
enableStreamingEngine: boolean
ipConfiguration: string
kmsKeyRef:
  external: string
  name: string
  namespace: string
machineType: string
maxWorkers: integer
networkRef:
  external: string
  name: string
  namespace: string
parameters: {}
region: string
resourceID: string
serviceAccountRef:
  external: string
  name: string
  namespace: string
subnetworkRef:
  external: string
  name: string
  namespace: string
tempGcsLocation: string
templateGcsPath: string
transformNameMapping: {}
zone: string
Fields

additionalExperiments

Optional

list (string)

List of experiments that should be used by the job. An example value is ["enable_stackdriver_agent_metrics"].

additionalExperiments[]

Optional

string

enableStreamingEngine

Optional

boolean

Indicates if the job should use the streaming engine feature.

ipConfiguration

Optional

string

The configuration for VM IPs. Options are "WORKER_IP_PUBLIC" or "WORKER_IP_PRIVATE".

kmsKeyRef

Optional

object

The name for the Cloud KMS key for the job.

kmsKeyRef.external

Optional

string

Allowed value: The `selfLink` field of a `KMSCryptoKey` resource.

kmsKeyRef.name

Optional

string

Name of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names

kmsKeyRef.namespace

Optional

string

Namespace of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/namespaces/
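
A key can be referenced either through a KMSCryptoKey resource managed in the cluster (name, and optionally namespace) or directly by its selfLink (external); the two forms are mutually exclusive. A hedged sketch (all resource and key names are hypothetical):

```yaml
spec:
  # Option 1: reference a KMSCryptoKey managed by Config Connector.
  kmsKeyRef:
    name: my-crypto-key      # hypothetical KMSCryptoKey resource
    namespace: my-namespace  # omit to default to the DataflowJob's namespace
  # Option 2 (instead of option 1): reference the key by its selfLink.
  # kmsKeyRef:
  #   external: projects/my-project/locations/us-central1/keyRings/my-ring/cryptoKeys/my-key
```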

machineType

Optional

string

The machine type to use for the job.

maxWorkers

Optional

integer

Immutable. The number of workers permitted to work on the job. More workers may improve processing speed at additional cost.

networkRef

Optional

object

networkRef.external

Optional

string

Allowed value: The `selfLink` field of a `ComputeNetwork` resource.

networkRef.name

Optional

string

Name of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names

networkRef.namespace

Optional

string

Namespace of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/namespaces/

parameters

Optional

object

Key/Value pairs to be passed to the Dataflow job (as used in the template).

region

Optional

string

Immutable. The region in which the created job should run.

resourceID

Optional

string

Immutable. Optional. The name of the resource. Used for creation and acquisition. When unset, the value of `metadata.name` is used as the default.
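
Because resourceID is used for both creation and acquisition, it decouples the Kubernetes object name from the name used in Google Cloud; setting it to the name of an existing job lets Config Connector acquire that job rather than create a new one. A hedged sketch (all names and paths are hypothetical):

```yaml
apiVersion: dataflow.cnrm.cloud.google.com/v1beta1
kind: DataflowJob
metadata:
  # The Kubernetes object name, used only within the cluster.
  name: wordcount-k8s-name
spec:
  # The job name used in Google Cloud; when omitted, metadata.name is used.
  resourceID: wordcount-gcp-name
  tempGcsLocation: gs://my-bucket/tmp  # hypothetical bucket
  templateGcsPath: gs://dataflow-templates/latest/Word_Count
```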

serviceAccountRef

Optional

object

serviceAccountRef.external

Optional

string

Allowed value: The `email` field of an `IAMServiceAccount` resource.

serviceAccountRef.name

Optional

string

Name of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names

serviceAccountRef.namespace

Optional

string

Namespace of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/namespaces/

subnetworkRef

Optional

object

subnetworkRef.external

Optional

string

Allowed value: The `selfLink` field of a `ComputeSubnetwork` resource.

subnetworkRef.name

Optional

string

Name of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names

subnetworkRef.namespace

Optional

string

Namespace of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/namespaces/

tempGcsLocation

Required

string

A writeable location on Google Cloud Storage for the Dataflow job to dump its temporary data.

templateGcsPath

Required

string

The Google Cloud Storage path to the Dataflow job template.

transformNameMapping

Optional

object

Only applicable when updating a pipeline. Map of transform name prefixes of the job to be replaced with the corresponding name prefixes of the new job.
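
For example, when updating a streaming pipeline whose transforms were renamed between template versions, a mapping like the following tells Dataflow which old transform state carries over to which new transform (the prefix names are hypothetical):

```yaml
spec:
  transformNameMapping:
    # old transform name prefix: new transform name prefix
    ReadFromPubSub: ReadMessages
    WriteToBigQuery: WriteResults
```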

zone

Optional

string

Immutable. The zone in which the created job should run. If it is not provided, the provider zone is used.

Status

Schema

conditions:
- lastTransitionTime: string
  message: string
  reason: string
  status: string
  type: string
jobId: string
observedGeneration: integer
state: string
type: string
Fields
conditions

list (object)

Conditions represent the latest available observation of the resource's current state.

conditions[]

object

conditions[].lastTransitionTime

string

Last time the condition transitioned from one status to another.

conditions[].message

string

Human-readable message indicating details about last transition.

conditions[].reason

string

Unique, one-word, CamelCase reason for the condition's last transition.

conditions[].status

string

Status is the status of the condition. Can be True, False, Unknown.

conditions[].type

string

Type is the type of the condition.

jobId

string

The unique ID of this job.

observedGeneration

integer

ObservedGeneration is the generation of the resource that was most recently observed by the Config Connector controller. If this is equal to metadata.generation, then that means that the current reported status reflects the most recent desired state of the resource.

state

string

The current state of the resource, selected from the JobState enum.

type

string

The type of this job, selected from the JobType enum.
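
Once the job is created, the controller populates these fields on the object's status. An illustrative (not verbatim) example of what a healthy running batch job might report, with the jobId and timestamp being placeholders:

```yaml
status:
  conditions:
  - lastTransitionTime: "2020-01-01T00:00:00Z"
    message: The resource is up to date
    reason: UpToDate
    status: "True"
    type: Ready
  jobId: 2020-01-01_00_00_00-1234567890123456789  # hypothetical job ID
  observedGeneration: 1
  state: JOB_STATE_RUNNING
  type: JOB_TYPE_BATCH
```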

Sample YAML(s)

Batch Dataflow Job

# Copyright 2020 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
apiVersion: dataflow.cnrm.cloud.google.com/v1beta1
kind: DataflowJob
metadata:
  annotations:
    cnrm.cloud.google.com/on-delete: "cancel"
  labels:
    label-one: "value-one"
  name: dataflowjob-sample-batch
spec:
  tempGcsLocation: gs://${PROJECT_ID?}-dataflowjob-dep-batch/tmp
  # This is a public, Google-maintained Dataflow Job template of a batch job
  templateGcsPath: gs://dataflow-templates/2020-02-03-01_RC00/Word_Count
  parameters:
    # This is a public, Google-maintained text file
    inputFile: gs://dataflow-samples/shakespeare/various.txt
    output: gs://${PROJECT_ID?}-dataflowjob-dep-batch/output
  zone: us-central1-a
  machineType: "n1-standard-1"
  maxWorkers: 3
  ipConfiguration: "WORKER_IP_PUBLIC"
---
apiVersion: storage.cnrm.cloud.google.com/v1beta1
kind: StorageBucket
metadata:
  annotations:
    cnrm.cloud.google.com/force-destroy: "true"
  # StorageBucket names must be globally unique. Replace ${PROJECT_ID?} with your project ID.
  name: ${PROJECT_ID?}-dataflowjob-dep-batch

Streaming Dataflow Job

# Copyright 2020 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
apiVersion: dataflow.cnrm.cloud.google.com/v1beta1
kind: DataflowJob
metadata:
  annotations:
    cnrm.cloud.google.com/on-delete: "cancel"
  labels:
    label-one: "value-one"
  name: dataflowjob-sample-streaming
spec:
  tempGcsLocation: gs://${PROJECT_ID?}-dataflowjob-dep-streaming/tmp
  # This is a public, Google-maintained Dataflow Job template of a streaming job
  templateGcsPath: gs://dataflow-templates/2020-02-03-01_RC00/PubSub_to_BigQuery
  parameters:
    # replace ${PROJECT_ID?} with your project name
    inputTopic: projects/${PROJECT_ID?}/topics/dataflowjob-dep-streaming
    outputTableSpec: ${PROJECT_ID?}:dataflowjobdepstreaming.dataflowjobdepstreaming
  zone: us-central1-a
  machineType: "n1-standard-1"
  maxWorkers: 3
  ipConfiguration: "WORKER_IP_PUBLIC"
---
apiVersion: bigquery.cnrm.cloud.google.com/v1beta1
kind: BigQueryDataset
metadata:
  name: dataflowjobdepstreaming
---
apiVersion: bigquery.cnrm.cloud.google.com/v1beta1
kind: BigQueryTable
metadata:
  name: dataflowjobdepstreaming
spec:
  datasetRef:
    name: dataflowjobdepstreaming
---
apiVersion: pubsub.cnrm.cloud.google.com/v1beta1
kind: PubSubTopic
metadata:
  name: dataflowjob-dep-streaming
---
apiVersion: storage.cnrm.cloud.google.com/v1beta1
kind: StorageBucket
metadata:
  annotations:
    cnrm.cloud.google.com/force-destroy: "true"
  # StorageBucket names must be globally unique. Replace ${PROJECT_ID?} with your project ID.
  name: ${PROJECT_ID?}-dataflowjob-dep-streaming
 