PARTITIONS view
The INFORMATION_SCHEMA.PARTITIONS
view contains one row for each partition.
Querying the INFORMATION_SCHEMA.PARTITIONS
view is limited to 1000
tables. To get the data about partitions at the project level, you can split the
query into multiple queries and then join the results. If you exceed the limit,
you can encounter an error similar to the following:
INFORMATION_SCHEMA.PARTITIONS query attempted to read too many tables. Please add more restrictive filters.
Required permissions
To query the INFORMATION_SCHEMA.PARTITIONS
view, you need the following
Identity and Access Management (IAM) permissions:
-
bigquery.tables.get
-
bigquery.tables.list
Each of the following predefined IAM roles includes the preceding permissions:
-
roles/bigquery.admin
-
roles/bigquery.dataEditor
-
roles/bigquery.dataViewer
For more information about BigQuery permissions, see Access control with IAM .
Schema
When you query the INFORMATION_SCHEMA.PARTITIONS
view, the query results
typically contain one row for each partition. The exception is when there is
a combination of long-term and active storage tier data in the __UNPARTITIONED__
partition
. In that case,
the view returns two rows for the __UNPARTITIONED__
partition, one for each
storage tier.
The INFORMATION_SCHEMA.PARTITIONS
view has the following schema:
TABLE_CATALOG
STRING
TABLE_SCHEMA
STRING
datasetId
TABLE_NAME
STRING
tableId
PARTITION_ID
STRING
NULL
. For partitioned tables that contain rows with NULL
values in the partitioning column, the value is __NULL__
.TOTAL_ROWS
INTEGER
TOTAL_LOGICAL_BYTES
INTEGER
LAST_MODIFIED_TIME
TIMESTAMP
STORAGE_TIER
STRING
-
ACTIVE
: the partition is billed as active storage -
LONG_TERM
: the partition is billed as long-term storage
Scope and syntax
Queries against this view must include a dataset qualifier. For queries with a dataset qualifier, you must have permissions for the dataset. For more information see Syntax . The following table explains the region and resource scopes for this view:
View name | Resource scope | Region scope |
---|---|---|
[ PROJECT_ID
.] DATASET_ID
.INFORMATION_SCHEMA.PARTITIONS
|
Dataset level | Dataset location |
- Optional:
PROJECT_ID
: the ID of your Google Cloud project. If not specified, the default project is used. -
DATASET_ID
: the ID of your dataset. For more information, see Dataset qualifier .
Examples
Example 1
The following example calculates the number of logical bytes used by each
storage tier in all of the tables in a dataset named mydataset
:
SELECT storage_tier , SUM ( total_logical_bytes ) AS logical_bytes FROM ` mydataset .INFORMATION_SCHEMA.PARTITIONS` GROUP BY storage_tier ;
The results look similar to the following:
+--------------+----------------+ | storage_tier | logical_bytes | +--------------+----------------+ | LONG_TERM | 1311495144879 | | ACTIVE | 66757629240 | +--------------+----------------+
Example 2
The following example creates a column that extracts the partition type from the partition_id
field and aggregates partition information at the table level
for the public bigquery-public-data.covid19_usafacts
dataset:
SELECT table_name , CASE WHEN regexp_contains ( partition_id , '^[0-9]{4}$' ) THEN 'YEAR' WHEN regexp_contains ( partition_id , '^[0-9]{6}$' ) THEN 'MONTH' WHEN regexp_contains ( partition_id , '^[0-9]{8}$' ) THEN 'DAY' WHEN regexp_contains ( partition_id , '^[0-9]{10}$' ) THEN 'HOUR' END AS partition_type , min ( partition_id ) AS earliest_partition , max ( partition_id ) AS latest_partition_id , COUNT ( partition_id ) AS partition_count , sum ( total_logical_bytes ) AS sum_total_logical_bytes , max ( last_modified_time ) AS max_last_updated_time FROM `bigquery-public-data.covid19_usafacts.INFORMATION_SCHEMA.PARTITIONS` GROUP BY 1 , 2 ;
The results look similar to the following:
+-----------------+----------------+--------------------+---------------------+-----------------+-------------------------+--------------------------------+ | table_name | partition_type | earliest_partition | latest_partition_id | partition_count | sum_total_logical_bytes | max_last_updated_time | +--------------+-------------------+--------------------+---------------------+-----------------+-------------------------+--------------------------------+ | confirmed_cases | DAY | 20221204 | 20221213 | 10 | 26847302 | 2022-12-13 00:09:25.604000 UTC | | deaths | DAY | 20221204 | 20221213 | 10 | 26847302 | 2022-12-13 00:09:24.709000 UTC | | summary | DAY | 20221204 | 20221213 | 10 | 241285338 | 2022-12-13 00:09:27.496000 UTC | +-----------------+----------------+--------------------+---------------------+-----------------+-------------------------+--------------------------------+