This document describes the methods, properties, and configuration options of Dataform core. You can use Dataform core in SQLX and JavaScript files.
assert()
assert
(name: string, query?: AContextable )
Adds a Dataform assertion to the compiled graph.
Available only in the /definitions
directory.
Example:
// definitions/file.js
assert("name").query(ctx => "select 1");
CommonContext
Context methods are available when evaluating contextable SQL code,
such as within SQLX files, or when using a Contextable
argument with Dataform core.
database
() => string
name
() => string
ref
(ref: Resolvable | string[], rest: string[]) => string
References another action, adds it as a dependency of this action, and returns valid SQL for use in a from expression.
This function can be called with a Resolvable object, for example:
${ref({ name: "name", schema: "schema", database: "database" })}
This function can also be called using individual arguments for the "database", "schema", and "name" values.
When only two values are provided, the default database is used and the values are interpreted as "schema" and "name".
When only one value is provided, the default database and schema are used, with the provided value interpreted as "name".
${ref("database", "schema", "name")}
${ref("schema", "name")}
${ref("name")}
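The argument rules above can be sketched in plain JavaScript. The helper `toTarget` and the `defaults` object are hypothetical names written only for illustration; they are not part of Dataform core, which resolves references internally. The fallback to defaults for a partially filled Resolvable object is also an assumption.

```javascript
// Hypothetical helper, not part of Dataform core: shows how the arguments
// passed to ref() are interpreted according to the rules described above.
function toTarget(defaults, first, ...rest) {
  // A Resolvable object is used as given.
  // Assumption: missing fields fall back to the project defaults.
  if (typeof first === "object" && !Array.isArray(first)) {
    return {
      database: first.database ?? defaults.database,
      schema: first.schema ?? defaults.schema,
      name: first.name,
    };
  }
  // Accept either ref("a", "b") or ref(["a", "b"]).
  const parts = Array.isArray(first) ? first : [first, ...rest];
  // Read from the right: name first, then schema, then database.
  const [name, schema, database] = [...parts].reverse();
  return {
    database: database ?? defaults.database,
    schema: schema ?? defaults.schema,
    name,
  };
}

const defaults = { database: "default-db", schema: "default-schema" };
const three = toTarget(defaults, "database", "schema", "name");
const two = toTarget(defaults, "schema", "name");
const one = toTarget(defaults, "name");
```

With three values nothing is defaulted, with two values only the database is defaulted, and with one value both the database and the schema are defaulted.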
resolve
(ref: Resolvable | string[], rest: string[]) => string
Similar to ref, but it does not add the referenced action as a dependency to this action.
self
() => string
Equivalent to resolve(name()). Returns a valid SQL string that can be used to reference the table produced by this action.
schema
() => string
Contextable
Contextable arguments can either pass a plain value for their generic type T
or a function that is called with the context object for this
type of operation.
T | (ctx: Context) => T
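As a minimal sketch of the `T | (ctx: Context) => T` shape, the following plain JavaScript shows how a consumer might accept either form. The helper `resolveContextable` and the stand-in `ctx` object are hypothetical and are not part of Dataform core.

```javascript
// Hypothetical helper, not part of Dataform core: resolves a Contextable
// argument, which is either a plain value or a function of the context.
function resolveContextable(contextable, ctx) {
  return typeof contextable === "function" ? contextable(ctx) : contextable;
}

// Stand-in context object exposing a single method for this example.
const ctx = { name: () => "my_table" };

// A plain value is returned unchanged.
const fromValue = resolveContextable("select 1", ctx);
// A function is called with the context object and its result is used.
const fromFunction = resolveContextable(c => `select * from ${c.name()}`, ctx);
```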
Dataform
Global variable that contains the IProjectConfig
object.
Required for getting IProjectConfig
properties, for example:
dataform.projectConfig.vars.myVariableName === "myVariableValue"
declare()
declare
(dataset: dataform.ITarget)
Declares the dataset as a Dataform data source.
Available only in the /definitions
directory.
Example:
// definitions/file.js
declare({ name: "a-declaration" })
defaultLocation
The defaultLocation
property specifies your default BigQuery
dataset location. Dataform uses this location to process your
code and store the results. This processing location must match the location of
your BigQuery datasets. However, it doesn't need to match the
Dataform repository location.
If you don't set the defaultLocation
property, Dataform
determines the location based on the datasets your SQL query references. This
works as follows:
- If your query references datasets from the same location, Dataform uses that location.
- If your query references datasets from two or more different locations, an error occurs. For details about this limitation, see Cross-region dataset replication.
- If your query doesn't reference any datasets, the default location for Dataform is the US multi-region. To choose a different location, set the defaultLocation property. Alternatively, use the @@location system variable in your query.
For more information, see Specify locations.
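The location rules listed above can be sketched as a small decision function. The name `pickLocation` and its parameters are hypothetical; Dataform applies these rules internally and does not expose such a function. The sketch ignores the @@location system variable.

```javascript
// Hypothetical sketch of the location rules described above; Dataform
// applies these rules internally and exposes no such function.
function pickLocation(defaultLocation, referencedLocations) {
  // If defaultLocation is set, it is the processing location.
  if (defaultLocation) return defaultLocation;
  const unique = [...new Set(referencedLocations)];
  // No referenced datasets: fall back to the US multi-region.
  if (unique.length === 0) return "US";
  // Datasets in two or more locations cannot be processed together.
  if (unique.length > 1) {
    throw new Error(`Datasets span multiple locations: ${unique.join(", ")}`);
  }
  // All referenced datasets share one location.
  return unique[0];
}

const sameLocation = pickLocation(undefined, ["EU", "EU"]);
const noDatasets = pickLocation(undefined, []);
const explicit = pickLocation("asia-northeast1", []);
```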
IActionConfig
Defines Dataform tags and dependencies applied to a workflow action.
tags
string[]
A list of user-defined tags with which the action should be labeled.
dependencies
Dependencies of the action.
disabled
boolean
If set to true, this action won't be run. However, the action can still be depended upon. Useful for temporarily turning off broken actions.
IAssertionConfig
Configuration options for assertion
action types.
database
string
description
string
disabled
boolean
If set to true, this action is not run. The action can still be depended upon. Useful for temporarily turning off broken actions.
hermetic
boolean
If this action depends on data from a source which is not declared as a dependency, then set hermetic to false. Otherwise, set to true.
schema
string
tags
string[]
IBigQueryOptions
BigQuery-specific warehouse options.
additionalOptions
Some options, for example, partitionExpirationDays, have dedicated type/validity checked fields. For such options, use the dedicated fields.
String values must be encapsulated in double-quotes, for example: additionalOptions: {numeric_option: "5", string_option: '"string-value"'}
If the option name contains special characters, encapsulate the name in quotes, for example: additionalOptions: { "option-name": "value" }.
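The quoting rule for string values is easy to get wrong, so here is a small sketch of a bigquery options object written as a plain JavaScript literal. The option names `numeric_option` and `string_option` are the placeholders used in the text, not real BigQuery options, and the value 30 is arbitrary.

```javascript
// Placeholder option names taken from the text above; they are not real
// BigQuery options and only illustrate the quoting rules.
const bigquery = {
  // Prefer the dedicated, type-checked field where one exists.
  partitionExpirationDays: 30,
  additionalOptions: {
    // Numeric value: passed as a string with no inner quotes.
    numeric_option: "5",
    // String value: the inner double quotes are part of the value.
    string_option: '"string-value"',
  },
};
```

The outer single quotes belong to JavaScript. The inner double quotes are part of the option value itself.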
clusterBy
string[]
labels
If the label name contains special characters, for example, hyphens,
then quote its name, for example, labels: { "label-name": "value" }.
partitionBy
string
partitionExpirationDays
number
requirePartitionFilter
boolean
Whether the partitioned table requires a WHERE clause predicate filter that filters the partitioning column.
updatePartitionFilter
string
IColumnsDescriptor
Describes columns in a table.
{ [name]: string | IRecordDescriptor }
IDeclarationConfig
Configuration options for declaration
action types.
columns
database
string
description
string
schema
string
IDependenciesConfig
Defines dependencies of a workflow action.
dependencies
One or more explicit dependencies for this action. Dependency actions will run before dependent actions.
Typically this would remain unset, because most dependencies are declared as a by-product of using the ref function.
hermetic
boolean
Declares whether or not this action is hermetic. An action is hermetic if all of its dependencies are explicitly declared. If this action depends on data from a source which has not been declared as a dependency, then hermetic should be explicitly set to false.
Otherwise, if this action only depends on data from explicitly-declared dependencies, then it should be set to true.
IDocumentableConfig
Defines descriptions of a dataset and its columns.
columns
A description of columns within the dataset.
description
string
A description of the dataset.
INamedConfig
Defines the type and name of a workflow action.
type
string
The type of the action.
name
string
The name of the action.
IOperationConfig
Configuration options for operations
action types.
columns
database
string
description
string
disabled
boolean
If set to true, this action is not run. The action can still be depended upon. Useful for temporarily turning off broken actions.
hasOutput
boolean
Declares that this operations action creates a table that is referenceable using the ref function. If set to true, this action creates a table with its configured name, using the self() context function. For example:
create or replace table ${self()} as select ...
hermetic
boolean
If this action depends on data from a source which is not declared as a dependency, then set hermetic to false. Otherwise, set to true.
schema
string
tags
string[]
IProjectConfig
Contains compilation settings of a Dataform repository.
defaultDatabase
string
defaultSchema
string
string
assertionSchema
string
vars
map (key: string, value: string)
A list of "key": value pairs.
Example: { "name": "wrench", "mass": "1.3kg", "count": "3" }
databaseSuffix
string
schemaSuffix
string
tablePrefix
string
warehouse
string
Must be set to bigquery.
You can set IProjectConfig properties in workflow settings at the repository level.
You can override the defaultSchema and defaultDatabase properties for individual tables.
You can access all IProjectConfig properties in a SQL SELECT statement in a SQLX or JavaScript file.
The following code sample shows the myVariableName custom compilation variable, set in workflow settings with the projectConfig.vars property, accessed in a SELECT statement in a SQLX file:
config { type: "view" }
SELECT ${when(
dataform.projectConfig.vars.myVariableName === "myVariableValue",
"myVariableName is set to myVariableValue!",
"myVariableName is not set to myVariableValue!"
)}
For more information about overriding project configuration settings for individual compilation results, see the projects.locations.repositories.compilationResults#CodeCompilationConfig REST resource in Dataform API.
IRecordDescriptor
Describes a struct, object or record in a table that has nested columns.
bigqueryPolicyTags
string | string[]
For example: "projects/1/locations/eu/taxonomies/2/policyTags/3"
columns
description
string
ITableAssertions
Options for creating assertions as part of a table definition.
nonNull
string | string[]
Column(s) that may never be NULL. If set, the corresponding assertion fails if any row contains NULL values for these column(s).
rowConditions
string[]
If set, the corresponding assertion fails if any row violates any of these condition(s).
uniqueKey
string | string[]
If set, the resulting assertion fails if there is more than one row in the table with the same values for all of these column(s).
uniqueKeys
[]
If set, the resulting assertion fails if there is more than one row in the table with the same values for all of the column(s) in the unique key(s).
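To make the nonNull and uniqueKey conditions concrete, the following sketch checks them over a small in-memory array. The functions `violatesNonNull` and `violatesUniqueKey` and the sample rows are hypothetical. Dataform evaluates assertions against the table in the warehouse rather than over JavaScript arrays, so this only illustrates the failure conditions.

```javascript
// Hypothetical in-memory illustration of the failure conditions for the
// nonNull and uniqueKey options. This is not how Dataform evaluates them.
function violatesNonNull(rows, columns) {
  // Fails if any row has a missing value in any listed column.
  return rows.some(row =>
    columns.some(col => row[col] === null || row[col] === undefined)
  );
}

function violatesUniqueKey(rows, columns) {
  const seen = new Set();
  for (const row of rows) {
    // Two rows collide only if they match on all key columns.
    const key = JSON.stringify(columns.map(col => row[col]));
    if (seen.has(key)) return true;
    seen.add(key);
  }
  return false;
}

const rows = [
  { id: 1, region: "eu", email: "a@example.com" },
  { id: 1, region: "us", email: null },
];
const nullCheck = violatesNonNull(rows, ["email"]);
const idOnly = violatesUniqueKey(rows, ["id"]);
const idAndRegion = violatesUniqueKey(rows, ["id", "region"]);
```

Here the two rows share an id, so a unique key on id alone is violated, while the composite key of id and region is not.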
ITableConfig
Configuration options for table actions, including table, view, and incremental table types.
Extends IActionConfig, IDependenciesConfig, IDocumentableConfig, INamedConfig, and ITargetableConfig.
assertions
If configured, relevant assertions are automatically created and run as a dependency of this table.
bigquery
columns
database
string
description
string
disabled
boolean
If set to true, this action is not run. The action can still be depended upon. Useful for temporarily turning off broken actions.
hermetic
boolean
If this action depends on data from a source which is not declared as a dependency, then set hermetic to false. Otherwise, set to true.
materialized
boolean
Valid only for the view table type. If set to true, a materialized view is created.
protected
boolean
Valid only for the incremental table type. If set to true, running this action ignores the full-refresh option.
This is useful for tables that are built from transient data, so that historical data is never lost.
schema
string
tags
string[]
type
uniqueKey
string[]
If configured, records with matching unique key(s) are updated instead of new rows being inserted.
ITableContext
Context methods are available when evaluating contextable SQL code,
such as within SQLX files, or when using a Contextable
argument with Dataform core.
incremental
() => boolean
name
() => string
ref
(ref: Resolvable | string[], rest: string[]) => string
This function can be called with a Resolvable
object, for example:
${ref({ name: "name", schema: "schema", database: "database" })}
This function can also be called using individual arguments for the "database", "schema", and "name" values.
When only two values are provided, the default database is used and the values are interpreted as "schema" and "name".
When only one value is provided, the default database and schema are used, with the provided value interpreted as "name".
${ref("database", "schema", "name")}
${ref("schema", "name")}
${ref("name")}
resolve
(ref: Resolvable | string[], rest: string[]) => string
Similar to ref, but instead of adding a dependency, it resolves the provided reference so that it can be used in SQL, for example, in a from expression.
self
() => string
Equivalent to resolve(name()). Returns a valid SQL string that can be used to reference the table produced by this action.
when
(cond: boolean, trueCase: string, falseCase: string) => string
Shorthand for an if condition. Equivalent to cond ? trueCase : falseCase. falseCase is optional, and defaults to an empty string.
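The ternary equivalence and the empty-string default can be checked with a stand-in written in plain JavaScript. This `when` is a local re-implementation for illustration only. In a SQLX file the real function is supplied by the context. The `isIncremental` flag and the SQL fragment are hypothetical.

```javascript
// Local stand-in that mirrors the behavior described above for when().
function when(cond, trueCase, falseCase = "") {
  return cond ? trueCase : falseCase;
}

// Hypothetical flag standing in for the result of incremental().
const isIncremental = true;
const clause = when(isIncremental, "where ts > '2024-01-01'");
// With falseCase omitted and a false condition, the result is "".
const skipped = when(false, "where ts > '2024-01-01'");
```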
ITarget
A reference to a table within BigQuery.
database
string
name
string
schema
string
ITargetableConfig
Defines the target database and schema of a workflow action.
database
string
The database in which the output of this action should be created. Must be set to BigQuery.
schema
string
The schema in which the output of this action should be created.
operate()
operate
(name: string, queries?: Contextable )
Defines a SQL operation.
Available only in the /definitions
directory.
Example:
// definitions/file.js
operate("an-operation", ["SELECT 1", "SELECT 2"])
publish()
publish
(name: string, queryOrConfig?: Contextable | ITableConfig)
Creates a table or view.
Available only in the /definitions
directory.
Example:
// definitions/file.js
publish("published-table", {
  type: "table",
  dependencies: ["a-declaration"],
}).query(ctx => "SELECT 1 AS test");
Resolvable
A resolvable can be either the name of a table as a string, or the object that describes the full path to the relation.
string | ITarget
TableType
Supported types of table actions.
Tables of type view will be created as views.
Tables of type table will be created as tables.
Tables of type incremental must include a where clause.
For more information, see Configure an incremental table.