Deployment configuration samples

The config/config.yaml file — typically initialized from the config/config.yaml.example template — serves as the primary configuration for the Cortex Framework deployment. It defines critical parameters including the target Google Cloud execution project, source and destination BigQuery datasets, and Dataform specifications such as repository and workspace names.

The following sections provide a detailed breakdown of the config/config.yaml structure.

Build environment

The build environment project is the Google Cloud project that is billed for build actions, such as BigQuery jobs (for example, reading DD03L).

    buildEnvironment:
      buildProjectId: YOUR_BUILD_PROJECT_ID

The following table describes the build environment parameters.

| Parameter | Meaning | Default value | Description |
|---|---|---|---|
| buildEnvironment.buildProjectId | Build project ID | YOUR_BUILD_PROJECT_ID | Google Cloud Project ID where build operations are executed. |

Data

The data: section of the configuration file defines your data sources, targets, and the specific modules for the data foundation and data products. Its general structure is as follows:

    data:
      # Geographic location for BigQuery datasets (for example: US, EU, us-central1)
      # For full list see: https://docs.cloud.google.com/cortex/docs/supported-locations
      bigQueryLocation: US
      # List of namespaces for data foundation and product modules.
      namespaces:
        - name: cortex
          path: cortex
      # List of source datasets.
      sources:
        - ...
      # List of target datasets.
      targets:
        - ...
      # Configuration for data foundation and product modules.
      modules:
        # List of foundation modules.
        foundation:
          - ...

        # List of data product modules.
        product:
          - ...

Data: BigQuery location

Defines the location of the BigQuery source and target datasets.

| Parameter | Meaning | Default value | Description |
|---|---|---|---|
| data.bigQueryLocation | BigQuery Location | US | BigQuery dataset location (for example, US, us-central1, or europe-west1). |

Data: Cortex namespace

Defines the Cortex Framework namespace.

| Parameter | Meaning | Default value | Description |
|---|---|---|---|
| data.namespaces.name | Namespace name | - | Cortex Framework namespace name. For example, cortex. |
| data.namespaces.path | Namespace path | - | Cortex Framework namespace path for subdirectories used within the src and config folders. For example, cortex. |

Data: BigQuery sources and target datasets

The sources list defines the BigQuery datasets into which the raw data from the source system has been replicated or streamed.

The targets list defines the BigQuery datasets where the data processed by Dataform is stored.

Each source and each target is referenced from the modules by its unique ID.

    # Data source and target mapping
    sources:
      - id: sap_raw
        projectId: YOUR_SOURCE_PROJECT_ID
        datasetId: cortex_sap_raw
    targets:
      - id: sap_foundation
        projectId: YOUR_TARGET_PROJECT_ID
        datasetId: cortex7_sap_data_foundation

The following table describes the data source and target mapping parameters.

| Parameter | Meaning | Default value | Description |
|---|---|---|---|
| data.sources.id | Source ID | - | Defines the 'id' for the source dataset to pull data from. For example, sap_raw. |
| data.sources.projectId | Source Project ID | YOUR_SOURCE_PROJECT_ID | References the Google Cloud Project ID with source data. |
| data.sources.datasetId | Source BigQuery Dataset ID | - | References the BigQuery Dataset ID with source data. For example, cortex_sap_raw. |
| data.targets.id | Target ID | - | Defines the 'id' for the target dataset. For example, cortex_data_foundation. |
| data.targets.projectId | Target Project ID | YOUR_TARGET_PROJECT_ID | References the Google Cloud Project ID for the target data. |
| data.targets.datasetId | Target BigQuery Dataset ID | - | References the BigQuery Dataset ID for the target data. For example, cortex_sap_data_foundation. |

Data: Modules

The modules define the structure and components of the Dataform data pipelines.

Data: Modules: Foundation

This section configures the data foundation layer modules, which process data from the raw layer (CDC streams) into a standardized representation of the latest records of the source data. If the source already provides a view of the latest records, or the source system connector performs these transformations, the module can be configured as an external data foundation source.

    modules:
      # List of foundation modules.
      foundation:
        # Unique identifier for the module instance.
        - moduleId: erp
          # Type of the module (namespaced, for example, cortex.sap).
          type: cortex.sap
          # Reference to the source dataset ID.
          dataSourceId: sap_raw
          # Reference to the target dataset ID.
          dataTargetId: sap_foundation
          # Module-specific configuration settings.
          moduleSettings:
            # SAP version (for example, ecc, s4).
            sapVersion: ecc
            # SAP client number.
            mandt: "100"
          # Whether the module is enabled.
          # enabled: true
          # Whether the foundation is external (does not create target dataset).
          # external: false
          # Path to the table settings configuration file.
          # tableSettings: "config/data_foundation/sap/table_settings.yaml"

The following table describes the parameters of each entry in the modules.foundation list.

| Parameter | Meaning | Default value | Description |
|---|---|---|---|
| moduleId | Module Identifier | erp | Unique identifier for a specific data foundation transformation module instance. |
| type | Module Logic Type | cortex.sap | Defines the business logic or template applied (for example, customers, sales_documents). |
| dataSourceId | Source Link | sap_raw | References the 'id' from the data.sources list to pull data from. |
| dataTargetId | Target Link | sap_foundation | References the 'id' from the targets list to push data to. |
| moduleSettings.sapVersion | SAP System Version | ecc | Applicable for SAP data sources only. Determines source-specific logic for ecc (ECC) or s4 (S/4HANA) systems. |
| moduleSettings.mandt | SAP Client (Mandant) | 100 | Applicable for SAP data sources only. The 3-digit SAP client identifier used to filter data rows. |
| enabled | Module enablement | true | Specifies whether the module is enabled. |
| external | External foundation | false | Specifies whether the foundation is external (does not create target dataset). |
| tableSettings | Table settings | config/cortex/data_foundation/{source_system}/table_settings.yaml | Path to the table settings configuration file. |
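
When the source already exposes the latest records, the foundation module can be marked as external. The following is a minimal sketch rather than a confirmed configuration: it assumes that external is set at the module level, next to enabled (as the commented lines in the sample suggest), and that dataSourceId and dataTargetId are still supplied for an external module. Neither assumption is stated on this page.

```yaml
modules:
  foundation:
    - moduleId: erp
      type: cortex.sap
      dataSourceId: sap_raw
      # Assumption: still required when external is true.
      dataTargetId: sap_foundation
      moduleSettings:
        sapVersion: ecc
        mandt: "100"
      # Treat the foundation as external: no target dataset is created.
      external: true
```

All other values are copied from the sample above, so the only change is the uncommented external key.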

Data: Modules: Data products

Data product modules define the aggregations, calculations, and joins necessary to transform raw data into insights that fulfill specific business use cases.

The configuration of a data product sets its unique ID, declares its dependencies, and references both the data foundation module it builds on and the target dataset where the results are stored.

The detailed configuration of each data product is defined in the file referenced by the tableSettings key.

    modules:
      # List of data product modules.
      product:
        # Unique identifier for the data product instance.
        - moduleId: sap_purchasing_organizations
          # Type of the data product (namespaced).
          type: cortex.purchasing_organizations
          # Map of module dependencies.
          dependsOn:
            sapModule: erp
          # Reference to the target dataset ID.
          dataTargetId: product_target
          # Whether the module is enabled.
          # enabled: true
          # Path to the table settings configuration file.
          # tableSettings: "config/cortex/data_product/purchasing_organizations/table_settings.yaml"

The following table describes the parameters of each entry in the modules.product list.

| Parameter | Meaning | Default value | Description |
|---|---|---|---|
| moduleId | Module Identifier | - | Unique identifier for a specific transformation module instance. |
| type | Module Logic Type | - | Defines the business logic or template applied, defined in src/data_modules/{namespace}/data_product folder. |
| dataTargetId | Target Link | sap_foundation | References the 'id' from the targets list to push data to. |
| dependsOn | Upstream Dependency | sapModule: erp | Specifies which foundation module must exist before the product module can be built. |
| enabled | Module enablement | true | Specifies whether the module is enabled. |
| tableSettings | Table settings | "config/{namespace}/data_product/data_product_name/table_settings.yaml" | Path to the table settings configuration file. |
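
The product module sample sets dataTargetId to product_target, while the sources and targets sample on this page defines only sap_foundation. Because modules reference targets by ID, a matching targets entry is presumably needed. The sketch below shows one way to add it. The datasetId value cortex_sap_data_products is a hypothetical name chosen for illustration, and reusing YOUR_TARGET_PROJECT_ID for the second target is also an assumption.

```yaml
targets:
  - id: sap_foundation
    projectId: YOUR_TARGET_PROJECT_ID
    datasetId: cortex7_sap_data_foundation
  # Assumed entry: gives dataTargetId: product_target an ID to resolve to.
  - id: product_target
    projectId: YOUR_TARGET_PROJECT_ID
    # Hypothetical dataset name, not a documented Cortex Framework default.
    datasetId: cortex_sap_data_products
```

Keeping product output in its own dataset separates it from the foundation tables. Pointing dataTargetId at sap_foundation instead would match the default listed in the table.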

Deployment environment

Cortex Framework uses Dataform to orchestrate SQL transformations within BigQuery. The deployment: block defines the Dataform configuration, responsible for the execution of the data pipelines, including the repository project, location, repository name, and the Dataform workspace name.

    deployment:
      targets:
        - type: dataform
          enabled: true
          targetSettings:
            repositoryProjectId: YOUR_REPO_PROJECT_ID
            repositoryRegion: us-central1
            repositoryName: cortex-repository
            workspaceName: dev

The following table describes the parameters of each deployment target (deployment.targets).

| Parameter | Meaning | Default value | Description |
|---|---|---|---|
| type | Deployment type | dataform | Type of the deployment target. |
| enabled | Enabled/disabled | true | Specifies whether the deployment target is enabled. |
| targetSettings.repositoryProjectId | Repository project ID | YOUR_REPO_PROJECT_ID | The Google Cloud Project ID where the Dataform repository is managed. |
| targetSettings.repositoryRegion | Repository region | us-central1 | The Google Cloud region for the Dataform repository (for example, us-central1 or europe-west1). |
| targetSettings.repositoryName | Repository name | cortex-repository | The specific name of the Dataform repository. |
| targetSettings.workspaceName | Workspace name | dev | The specific Dataform workspace used for the deployment cycle. |
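
Putting the sections together, a complete config/config.yaml might look like the following sketch. It is assembled from the samples on this page. Two things are assumptions: sources, targets, and modules are nested under data as shown in the general structure of the Data section, and the product_target entry with its datasetId is a hypothetical addition so that the product module's dataTargetId resolves.

```yaml
buildEnvironment:
  buildProjectId: YOUR_BUILD_PROJECT_ID

data:
  bigQueryLocation: US
  namespaces:
    - name: cortex
      path: cortex
  sources:
    - id: sap_raw
      projectId: YOUR_SOURCE_PROJECT_ID
      datasetId: cortex_sap_raw
  targets:
    - id: sap_foundation
      projectId: YOUR_TARGET_PROJECT_ID
      datasetId: cortex7_sap_data_foundation
    # Hypothetical target so that the product module below has a destination.
    - id: product_target
      projectId: YOUR_TARGET_PROJECT_ID
      datasetId: cortex_sap_data_products
  modules:
    foundation:
      - moduleId: erp
        type: cortex.sap
        dataSourceId: sap_raw        # matches data.sources[].id
        dataTargetId: sap_foundation # matches data.targets[].id
        moduleSettings:
          sapVersion: ecc
          mandt: "100"
    product:
      - moduleId: sap_purchasing_organizations
        type: cortex.purchasing_organizations
        dependsOn:
          sapModule: erp             # matches the foundation moduleId
        dataTargetId: product_target

deployment:
  targets:
    - type: dataform
      enabled: true
      targetSettings:
        repositoryProjectId: YOUR_REPO_PROJECT_ID
        repositoryRegion: us-central1
        repositoryName: cortex-repository
        workspaceName: dev
```

Replace every YOUR_..._PROJECT_ID placeholder before deploying. The inline comments mark the ID cross-references that must stay consistent when entries are renamed.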