The name of a column in a derived table or aggregate table
Special Rules
distributionis supported only for Redshift and Aster databases.
Definition
distributionlets you specify the column from apersistent derived table(PDT) or anaggregate tableon which to apply a distribution key to spread the data around a cluster. When two tables are joined by the column specified in thedistributionparameter, the database can find the joined data on the same node to minimize internode I/O. Currently,distributionworks only with Redshift and Aster databases. For other SQL dialects (such as MySQL and Postgres), useindexesinstead.
Thedistributionparameter works only with tables that arepersistent, such as PDTs and aggregate tables.distributionis not supported forderived tableswithout a persistence strategy.
In addition, thedistributionparameter is not supported for derived tables that are defined usingcreate_processorsql_create.
Generally speaking, a distribution key should be applied to the column that will be acting as a foreign key (the column you use to join your table to others). Refer to the documentation for your dialect for best practices.
Examples
Create acustomer_order_factspersistent derived table with a distribution key oncustomer_id:
[[["Easy to understand","easyToUnderstand","thumb-up"],["Solved my problem","solvedMyProblem","thumb-up"],["Other","otherUp","thumb-up"]],[["Hard to understand","hardToUnderstand","thumb-down"],["Incorrect information or sample code","incorrectInformationOrSampleCode","thumb-down"],["Missing the information/samples I need","missingTheInformationSamplesINeed","thumb-down"],["Other","otherDown","thumb-down"]],["Last updated 2025-07-22 UTC."],[],[],null,["# distribution\n\n\u003cbr /\u003e\n\nUsage\n-----\n\n```\nview: my_view {\n derived_table: {\n distribution: \"customer_id\"\n ...\n }\n}\n```\n\nDefinition\n----------\n\n`distribution` lets you specify the column from a [persistent derived table](/looker/docs/derived-tables#persistent-derived-tables) (PDT) or an [aggregate table](/looker/docs/reference/param-explore-aggregate-table) on which to apply a distribution key to spread the data around a cluster. When two tables are joined by the column specified in the `distribution` parameter, the database can find the joined data on the same node to minimize internode I/O. Currently, `distribution` works only with Redshift and Aster databases. For other SQL dialects (such as MySQL and Postgres), use [`indexes`](/looker/docs/reference/param-view-indexes) instead.\n\n\u003e The `distribution` parameter works only with tables that are [persistent](/looker/docs/derived-tables#adding_persistence), such as PDTs and aggregate tables. `distribution` is not supported for [derived tables](/looker/docs/creating-ndts) without a persistence strategy.\n\u003e\n\u003e In addition, the `distribution` parameter is not supported for derived tables that are defined using [`create_process`](/looker/docs/reference/param-view-create-process) or [`sql_create`](/looker/docs/reference/param-view-sql-create).\n\nGenerally speaking, a distribution key should be applied to the column that will be acting as a foreign key (the column you use to join your table to others). Refer to the documentation for your dialect for best practices.\n\nExamples\n--------\n\nCreate a `customer_order_facts` persistent derived table with a distribution key on `customer_id`: \n\n view: customer_order_facts {\n derived_table: {\n explore_source: order {\n column: customer_id { field: order.customer_id }\n column: lifetime_orders { field: order.lifetime_orders }\n }\n datagroup_trigger: order_datagroup\n distribution: \"customer_id\"\n }\n }\n\nCreate a `customer_order_facts` derived table based on a SQL query with a distribution key on `customer_id`: \n\n view: customer_order_facts {\n derived_table: {\n sql:\n SELECT\n customer_id,\n COUNT(*) AS lifetime_orders\n FROM\n order ;;\n persist_for: \"24 hours\"\n distribution: \"customer_id\"\n }\n }"]]