distribution

Usage

view: my_view { 
  derived_table: { 
 distribution 
 : 
 "customer_id" 
 ... 
 } 
 }

Hierarchy

View File

view

derived_table

distribution

- or -

Model File

explore

aggregate_table

materialization

distribution

Default Value

None

Accepts

The name of a column in a derived table or aggregate table

Special Rules

distribution is supported only for Redshift and Aster databases.

Definition

The distribution parameter lets you specify the column from a persistent derived table (PDT) or an aggregate table on which to apply a distribution key. The distribution key spreads the data around a cluster. When two tables are joined by the column specified in the distribution parameter, the database can find the joined data on the same node to minimize internode I/O. The distribution parameter works only with Redshift and Aster databases. For other SQL dialects (such as MySQL and Postgres), use indexes instead.

The distribution parameter works only with tables that are persistent , such as PDTs and aggregate tables. distribution is not supported for derived tables without a persistence strategy.

In addition, the distribution parameter is not supported for derived tables that are defined using create_process or sql_create .

Generally speaking, a distribution key should be applied to the column that will be acting as a foreign key (the column you use to join your table to others). Refer to the documentation for your dialect for best practices.

Examples

Create a customer_order_facts persistent derived table with a distribution key on customer_id :

 view: customer_order_facts {
  derived_table: {
    explore_source: order {
      column: customer_id { field: order.customer_id }
      column: lifetime_orders { field: order.lifetime_orders }
    }
    datagroup_trigger: order_datagroup
    distribution: "customer_id"
  }
}

Create a customer_order_facts derived table based on a SQL query with a distribution key on customer_id :

 view: customer_order_facts {
  derived_table: {
    sql:
      SELECT
        customer_id,
        COUNT(*) AS lifetime_orders
      FROM
        order ;;
    persist_for: "24 hours"
    distribution: "customer_id"
  }
}

distribution Stay organized with collections Save and categorize content based on your preferences.

Usage

Definition

Examples

distribution